Minimal required:

```yaml
backplane:
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
worker:
  publicName: "localhost:8981"
```
The configuration can be provided to the server and worker as a CLI argument or through the environment variable `CONFIG_PATH`. For an example configuration containing all of the configuration values, see `examples/config.yml`.
# All Configurations

## Common

Configuration | Accepted and Default Values | Command Line Argument | Description
---|---|---|---
digestFunction | SHA256, SHA1 | | Digest function for this implementation
defaultActionTimeout | Integer, 600 | | Default timeout value for an action (seconds)
maximumActionTimeout | Integer, 3600 | | Maximum allowed action timeout (seconds)
maxEntrySizeBytes | Long, 2147483648 | | Maximum size of a single blob accepted (bytes)
prometheusPort | Integer, 9090 | --prometheus_port | Listening port of the Prometheus metrics endpoint
allowSymlinkTargetAbsolute | boolean, false | | Permit inputs to contain symlinks with absolute path targets
Example:

```yaml
digestFunction: SHA1
defaultActionTimeout: 1800
maximumActionTimeout: 1800
prometheusPort: 9090
server:
  ...
worker:
  ...
```
## Server

Configuration | Accepted and Default Values | Environment Var | Description
---|---|---|---
instanceType | SHARD | | Type of implementation (SHARD is the only one supported)
name | String, shard | | Implementation name
publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers
actionCacheReadOnly | boolean, false | | Allow/Deny writing to action cache
port | Integer, 8980 | | Listening port of the GRPC server
bindAddress | String | | Listening address of the GRPC server; defaults to the Java gRPC default (all interface addresses) if unspecified
casWriteTimeout | Integer, 3600 | | CAS write timeout (seconds)
bytestreamTimeout | Integer, 3600 | | Byte Stream write timeout (seconds)
sslCertificatePath | String, null | | Absolute path of the SSL certificate (if TLS used)
sslPrivateKeyPath | String, null | | Absolute path of the SSL private key (if TLS used)
runDispatchedMonitor | boolean, true | | Enable an agent to monitor the operation store to ensure that dispatched operations with expired worker leases are requeued
dispatchedMonitorIntervalSeconds | Integer, 1 | | Dispatched monitor's lease expiration check interval (seconds)
runOperationQueuer | boolean, true | | Acquire execute request entries cooperatively from an arrival queue on the backplane
ensureOutputsPresent | boolean, false | | Decide if all outputs are also present in the CAS. If any outputs are missing, a cache miss is returned
maxCpu | Integer, 0 | | Maximum number of CPU cores that any min/max-cores property may request (0 = unlimited)
maxRequeueAttempts | Integer, 5 | | Maximum number of requeue attempts for an operation
useDenyList | boolean, true | | Allow usage of a deny list when looking up actions and invocations (for cache-only use it is recommended to disable this check)
grpcTimeout | Integer, 3600 | | GRPC request timeout (seconds)
executeKeepaliveAfterSeconds | Integer, 60 | | Execute keep-alive (seconds)
recordBesEvents | boolean, false | | Allow recording of BES events
clusterId | String, local | | Buildfarm cluster ID
cloudRegion | String, us-east_1 | | Deployment region in the cloud
gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for connections in flight to finish when shutdown signal is received
Example:

```yaml
server:
  instanceType: SHARD
  name: shard
  actionCacheReadOnly: true
  recordBesEvents: true
```
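To serve over TLS, the two SSL options from the table above are set together; a sketch with illustrative certificate paths:

```yaml
server:
  port: 8980
  sslCertificatePath: "/etc/buildfarm/tls/server.crt" # illustrative path
  sslPrivateKeyPath: "/etc/buildfarm/tls/server.key" # illustrative path
```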
### GRPC Metrics

Configuration | Accepted and Default Values | Description
---|---|---
enabled | boolean, false | Publish basic GRPC metrics to a Prometheus endpoint
provideLatencyHistograms | boolean, false | Publish detailed, more expensive to calculate, metrics
labelsToReport | List of Strings, [] | Include custom metrics labels in Prometheus metrics

Example:

```yaml
server:
  grpcMetrics:
    enabled: false
    provideLatencyHistograms: false
    labelsToReport: []
```
### Server Caches

Configuration | Accepted and Default Values | Description
---|---|---
directoryCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the directory cache will hold.
commandCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the command cache will hold.
digestToActionCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the digest-to-action cache will hold.
recentServedExecutionsCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the executions cache will hold.

Example:

```yaml
server:
  caches:
    directoryCacheMaxEntries: 10000
    commandCacheMaxEntries: 10000
    digestToActionCacheMaxEntries: 10000
    recentServedExecutionsCacheMaxEntries: 10000
```
### Admin

Configuration | Accepted and Default Values | Description
---|---|---
deploymentEnvironment | String, AWS, GCP | Specify deployment environment in the cloud
clusterEndpoint | String, grpc://localhost | Buildfarm cluster endpoint for Admin use (this is a full buildfarm endpoint)

Example:

```yaml
server:
  admin:
    deploymentEnvironment: AWS
    clusterEndpoint: "grpc://localhost"
```
### Metrics

Configuration | Accepted and Default Values | Description
---|---|---
publisher | String, aws, gcp, log | Specify publisher type for sending metadata
logLevel | String, INFO, FINEST | Specify log level ("log" publisher only; all Java util logging levels are allowed here)
topic | String, test | Specify SNS topic name for cloud publishing ("aws" publisher only)
topicMaxConnections | Integer, 1000 | Specify maximum number of connections allowed for cloud publishing ("aws" publisher only)
secretName | String, test | Specify secret name to pull SNS permissions from ("aws" publisher only)

Example:

```yaml
server:
  metrics:
    publisher: log
    logLevel: INFO
```

```yaml
server:
  metrics:
    publisher: aws
    topic: buildfarm-metadata-test
    topicMaxConnections: 1000
    secretName: buildfarm-secret
```
## Redis Backplane

Configuration | Accepted and Default Values | Environment Var | Command Line Argument | Description
---|---|---|---|---
type | SHARD | | | Type of backplane. Currently, the only implementation is SHARD utilizing Redis
redisUri | String, redis://localhost:6379 | REDIS_URI | --redis_uri | Redis cluster endpoint. This must be a single URI. It can embed a username/password per RFC 3986 Section 3.2.1, which takes precedence over redisPassword and redisPasswordFile
redisPassword | String, null | | | Redis password, if applicable
redisPasswordFile | String, null | | | File to read for a Redis password. If specified, this takes precedence over redisPassword
redisNodes | List of Strings, null | | | List of individual Redis nodes, if applicable
jedisPoolMaxTotal | Integer, 4000 | | | The size of the Redis connection pool
workersHashName | String, Workers | | | Redis key used to store a hash of registered workers
workerChannel | String, WorkerChannel | | | Redis pubsub channel key where changes of the cluster membership are announced
actionCachePrefix | String, ActionCache | | | Redis key prefix for all ActionCache entries
actionCacheExpire | Integer, 2419200 | | | The TTL maintained for ActionCache entries, not refreshed on getActionResult hit
actionBlacklistPrefix | String, ActionBlacklist | | | Redis key prefix for all blacklisted actions, which are rejected
actionBlacklistExpire | Integer, 3600 | | | The TTL maintained for action blacklist entries
invocationBlacklistPrefix | String, InvocationBlacklist | | | Redis key prefix for blacklisted invocations, suffixed with a tool invocation ID
operationPrefix | String, Operation | | | Redis key prefix for all operations, suffixed with the operation's name
operationExpire | Integer, 604800 | | | The TTL maintained for all operations, updated on each modification
preQueuedOperationsListName | String, {Arrival}:PreQueuedOperations | | | Redis key used to store a list of ExecuteEntry awaiting transformation into QueueEntry
processingListName | String, {Arrival}:ProcessingOperations | | | Redis key of a list used to ensure reliable processing of arrival queue entries with operation watch monitoring
processingPrefix | String, Processing | | | Redis key prefix for operations which are being dequeued from the arrival queue
processingTimeoutMillis | Integer, 20000 | | | Delay (in ms) used to populate processing operation entries
queuedOperationsListName | String, {Execution}:QueuedOperations | | | Redis key used to store a list of QueueEntry awaiting execution by workers
dispatchingPrefix | String, Dispatching | | | Redis key prefix for operations which are being dequeued from the ready-to-run queue
dispatchingTimeoutMillis | Integer, 10000 | | | Delay (in ms) used to populate dispatching operation entries
dispatchedOperationsHashName | String, DispatchedOperations | | | Redis key of a hash of operation names to the worker lease for its execution, which are monitored by the dispatched monitor
operationChannelPrefix | String, OperationChannel | | | Redis pubsub channel prefix suffixed by an operation name
casPrefix | String, ContentAddressableStorage | | | Redis key prefix suffixed with a blob digest that maps to a set of workers with that blob's availability
casExpire | Integer, 604800 | | | The TTL maintained for CAS entries, which is not refreshed on any read access of the blob
subscribeToBackplane | boolean, true | | | Enable an agent of the backplane client which subscribes to worker channel and operation channel events. If disabled, responsiveness of watchers and CAS is reduced
runFailsafeOperation | boolean, true | | | Enable an agent in the backplane client which monitors watched operations and ensures they are in a known maintained, or expirable, state
maxQueueDepth | Integer, 100000 | | | Maximum length that the ready-to-run queue is allowed to reach, to control an arrival flow for execution
maxPreQueueDepth | Integer, 1000000 | | | Maximum length that the arrival queue is allowed to reach, to control load on the Redis cluster
priorityQueue | boolean, false | | | Priority queue type allows prioritizing operations based on Bazel's --remote_execution_priority flag
timeout | Integer, 10000 | | | Default timeout
maxInvocationIdTimeout | Integer, 604800 | | | Maximum TTL (time-to-live, in seconds) of invocationId keys in RedisBackplane
maxAttempts | Integer, 20 | | | Maximum number of execution attempts
cacheCas | boolean, false | | |
Example:

```yaml
backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  priorityQueue: true
```
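For an authenticated Redis deployment, the password options above can be combined with the URI; a sketch (endpoint and secret path are illustrative):

```yaml
backplane:
  type: SHARD
  redisUri: "redis://redis.internal:6379" # illustrative endpoint
  redisPasswordFile: "/run/secrets/redis-password" # illustrative path; takes precedence over redisPassword
```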
### Execution Queues

Configuration | Accepted and Default Values | Description
---|---|---
name | String | Name of the execution queue (ex: cpu, gpu)
allowUnmatched | boolean, true |
properties | List of name/value pairs | Any specification of min/max-cores will be allowed to support CPU controls and worker resource delegation

Example:

```yaml
backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      allowUnmatched: true
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
```
## Worker

Configuration | Accepted and Default Values | Environment Var | Description
---|---|---|---
port | Integer, 8981 | | Listening port of the worker
publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers
root | String, /tmp/worker | | Path for all operation content storage
inlineContentLimit | Integer, 1048567 | | Total size in bytes of inline content for action results, output files, stdout, stderr content
operationPollPeriod | Integer, 1 | | Period between poll operations at any stage
executeStageWidth | Integer, 0 | EXECUTION_STAGE_WIDTH | Number of CPU cores available for execution (0 = system available cores)
executeStageWidthOffset | Integer, 0 | | Offset number of CPU cores available for execution (to allow for use by other processes)
inputFetchStageWidth | Integer, 0 | | Number of concurrently available slots to fetch inputs (0 = system calculated based on CPU cores)
inputFetchDeadline | Integer, 60 | | Limit on time (seconds) for input fetch stage to fetch inputs
linkInputDirectories | boolean, true | | Use an input directory creation strategy which creates a single directory tree at the highest level containing no output paths of any kind, and symlinks that directory into an action's execroot, saving large amounts of time spent manufacturing the same read-only input hierarchy over multiple actions' executions
execOwner | String, null | | Create exec trees containing directories that are owned by this user
hexBucketLevels | Integer, 0 | | Number of levels to create for directory storage by leading byte of the hash (problematic, not recommended)
defaultMaxCores | Integer, 0 | | Constrain all executions to this logical core count unless otherwise specified via min/max-cores (0 = no limit)
limitGlobalExecution | boolean, false | | Constrain all executions to a pool of logical cores specified in executeStageWidth
onlyMulticoreTests | boolean, false | | Only permit tests to exceed the default cores value for their min/max-cores range specification (only works with non-zero defaultMaxCores)
allowBringYourOwnContainer | boolean, false | | Enable execution in a custom Docker container
errorOperationRemainingResources | boolean, false | |
errorOperationOutputSizeExceeded | boolean, false | | Operations which produce single output files which exceed maxEntrySizeBytes will fail with a violation type which implies a user error. When disabled, the violation will indicate a transient error, with the action blacklisted
realInputDirectories | List of Strings, external | | A list of paths that will not be subject to the effects of the linkInputDirectories setting; may also be used to provide writable directories as input roots for actions which expect to be able to write to an input location and will fail if they cannot
gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for operations in flight to finish when shutdown signal is received
createSymlinkOutputs | boolean, false | | Creates SymlinkNodes for symbolic links discovered in output paths for actions. No verification of the symlink target path occurs. BuildStream, for example, requires this
zstdBufferPoolSize | Integer, 2048 | | Specifies the maximum number of zstd data buffers that may be in use concurrently by the filesystem CAS. Increase to improve compressed blob throughput, decrease to reduce memory usage
Example:

```yaml
worker:
  port: 8981
  publicName: "localhost:8981"
  realInputDirectories:
    - "external"
```
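The execution sizing options above can be combined to shape worker load; a sketch with illustrative values:

```yaml
worker:
  executeStageWidth: 0 # 0 = use the system's available cores
  executeStageWidthOffset: 2 # illustrative: leave 2 cores for other processes
  inputFetchStageWidth: 4 # illustrative: 4 concurrent input fetch slots
  defaultMaxCores: 1 # illustrative: one core per action unless min/max-cores is specified
```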
### Capabilities

Configuration | Accepted and Default Values | Description
---|---|---
cas | boolean, true | Enables worker to be a shard of the CAS
execution | boolean, true | Enables worker to participate in execution pool

Example:

```yaml
worker:
  capabilities:
    cas: true
    execution: true
```
### Sandbox Settings

Configuration | Accepted and Default Values | Description
---|---|---
alwaysUseSandbox | boolean, false | Enforce that the sandbox be used on every action.
alwaysUseCgroups | boolean, true | Enforce that actions run under cgroups.
alwaysUseTmpFs | boolean, false | Enforce that the sandbox uses tmpfs on every action.
selectForBlockNetwork | boolean, false | `block-network` enables sandbox action execution.
selectForTmpFs | boolean, false | `tmpfs` enables sandbox action execution.

Example:

```yaml
worker:
  sandboxSettings:
    alwaysUseSandbox: true
    alwaysUseCgroups: true
    alwaysUseTmpFs: true
    selectForBlockNetwork: false
    selectForTmpFs: false
```

Note: In order for these settings to take effect, you must also configure `limitGlobalExecution: true`.
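Putting the note into practice, a worker fragment enabling the sandbox together with the required global execution limit might look like this sketch:

```yaml
worker:
  limitGlobalExecution: true # required for the sandbox settings to take effect
  sandboxSettings:
    alwaysUseSandbox: true
```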
### Dequeue Match

Configuration | Accepted and Default Values | Description
---|---|---
allowUnmatched | boolean, false |
properties | List of name/value pairs | Pairs of provisions available to match against action properties

Example:

```yaml
worker:
  dequeueMatchSettings:
    allowUnmatched: false
    properties:
      - name: "gpu"
        value: "nvidia RTX 2090"
```
### Resources

A list of limited resources available on the worker, to be depleted by actions which execute containing a `resource:<name>` property. To accept such actions, the worker's `dequeueMatchSettings` must either:

- specify `allowUnmatched: true`, or
- contain `resource:<name>` in `properties`, with either a specific limited resource count as the only accepted value for the action property, or `"*"`

Configuration | Accepted Values | Description
---|---|---
name | String | Resource identifier present on worker
amount | Integer | Resource count depleted by actions

Example:

```yaml
worker:
  dequeueMatchSettings:
    properties:
      - name: "resource:special-compiler-license"
        value: "1" # only actions which request one compiler license at a time will be accepted
  resources:
    - name: "special-compiler-license"
      amount: 3
```
### Worker CAS

Unless specified, options are only relevant for the FILESYSTEM type.

Configuration | Accepted and Default Values | Description
---|---|---
type | FILESYSTEM, GRPC | Type of CAS used
path | String, cache | Local cache location relative to the 'root', or absolute
maxSizeBytes | Integer, 0 | Limit for contents of files retained from CAS in the cache; a value of 0 means to auto-configure to 90% of the filesystem space underlying root/path
fileDirectoriesIndexInMemory | boolean, false | Determines if the file directories bidirectional mapping should be stored in memory or in sqlite
skipLoad | boolean, false | Determines if transient data on the worker should be loaded into CAS on worker startup (affects startup time)
target | String, null | For the GRPC CAS type, target for the external CAS endpoint

Example:

This definition will create a filesystem-based CAS file cache at `path: "cache"`, relative to the worker `root`:

```yaml
worker:
  storages:
    - type: FILESYSTEM
      path: "cache"
      maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
      maxEntrySizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
```
This definition elides the FILESYSTEM configuration with '...'; it will read through an external GRPC CAS supporting the REAPI CAS services into its storage, and will attempt to write expiring entries into the GRPC CAS (i.e. pushing new entries into the head of a worker LRU list will drop the entries from the tail into the GRPC CAS):

```yaml
worker:
  storages:
    - type: FILESYSTEM
      ...
    - type: GRPC
      target: "cas.external.com:1234"
```
### Execution Policies

Configuration | Accepted and Default Values | Description
---|---|---
name | String | Execution policy name
executionWrapper | Execution wrapper, containing a path and list of arguments | Execution wrapper: its path and a list of arguments for the wrapper

Example:

```yaml
worker:
  executionPolicies:
    - name: test
      executionWrapper:
        path: /
        arguments:
          - arg1
          - arg2
```