Understanding the KLM
When running a remote cache and execution cluster based on Buildbarn, the Key Location Map (KLM) is a term you will run into, and it is important to take proper care when sizing the KLM and choosing the number of KLM attempts.
If you are just looking for a ballpark number to get started, set the number of get attempts to 16 and the number of put attempts to 64, and use the following table.
| | CAS | AC |
|---|---|---|
| Average Object Size | 125KB | 1KB |
| Storage Size | 500GB | 1GB |
| KLM Entries | 16 000 000 | 4 000 000 |
| KLM Size | 1056MB | 264MB |
These are arbitrarily chosen values which are unlikely to match your actual workload. I recommend reading the rest of the article to understand these settings and how to reason about them. You can then use the Prometheus metrics at the end to validate whether your settings are a good match for your workload.
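If you want to adapt these ballpark numbers to your own workload, the arithmetic is easy to script. Here is a minimal Go sketch, assuming the rule of thumb of four KLM entries per stored object and the roughly 66-byte entry size implied by the table above (the exact entry size may vary between Buildbarn versions):

```go
package main

import "fmt"

const (
	entriesPerObject = 4  // rule of thumb: ~4 KLM entries per stored object
	bytesPerEntry    = 66 // entry size implied by the table above
)

// klmSize derives the KLM entry count and byte size from the storage
// capacity and the average object size of your workload.
func klmSize(storageBytes, averageObjectBytes int64) (entries, sizeBytes int64) {
	objects := storageBytes / averageObjectBytes
	entries = objects * entriesPerObject
	return entries, entries * bytesPerEntry
}

func main() {
	// CAS example from the table: 500GB of storage, 125KB average objects.
	entries, size := klmSize(500e9, 125e3)
	fmt.Printf("CAS: %d entries, %d MB\n", entries, size/1_000_000)
	// AC example from the table: 1GB of storage, 1KB average objects.
	entries, size = klmSize(1e9, 1e3)
	fmt.Printf("AC: %d entries, %d MB\n", entries, size/1_000_000)
}
```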
How does the KLM work?
The KLM is a hash table which describes the position in your storage layer where your desired data is written. It is indexed by hashing the key of your storage data.
Given a limited key space, hash functions will have collisions; it is therefore important that the KLM is significantly larger than required to fit a key for every object. A naive implementation would need an enormous hash table to keep the likelihood of a collision low, but using a technique called Robin Hood hashing this requirement can be kept down to a small factor larger than the size of the key set.
For example, in a blobstore which can fit *n* objects and a KLM which can fit *2n* entries, every hash would have a 50% chance of corresponding to an already occupied slot. With Robin Hood hashing we can repeat this process multiple times by incrementing an attempts counter, giving us multiple possible locations for the same object.
When querying for an object we can then search up to the maximum number of allowed iterations to find it in one of these slots. When inserting we follow a similar procedure, incrementing the number of attempts whenever we encounter a collision, but taking care to insert the younger of the colliding objects in the contested slot and push the older object forward.
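To make this concrete, here is a toy Go sketch of the probing scheme (illustrative only, not Buildbarn's actual implementation): each attempt hashes the key together with the attempt counter, and insertion displaces older entries in favor of younger ones.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type entry struct {
	key      string
	location int    // where the object lives in the blobstore
	age      uint64 // insertion counter; higher means younger
	attempt  uint32 // which probe attempt this entry was stored under
	occupied bool
}

type klm struct {
	slots          []entry
	maxGetAttempts uint32
	maxPutAttempts uint32
	clock          uint64
}

// slot hashes the key together with the attempt number, giving each
// object a different candidate slot per attempt.
func (m *klm) slot(key string, attempt uint32) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h.Write([]byte{byte(attempt), byte(attempt >> 8), byte(attempt >> 16), byte(attempt >> 24)})
	return h.Sum64() % uint64(len(m.slots))
}

// Get probes up to maxGetAttempts candidate slots for the key.
func (m *klm) Get(key string) (int, bool) {
	for a := uint32(0); a < m.maxGetAttempts; a++ {
		e := &m.slots[m.slot(key, a)]
		if e.occupied && e.key == key {
			return e.location, true
		}
	}
	return 0, false
}

// Put inserts the key, displacing older colliding entries Robin
// Hood-style: the younger entry keeps the slot, the older one is
// pushed forward with an incremented attempt counter. If an entry runs
// out of attempts it is silently dropped.
func (m *klm) Put(key string, location int) {
	m.clock++
	cur := entry{key: key, location: location, age: m.clock, occupied: true}
	for cur.attempt < m.maxPutAttempts {
		e := &m.slots[m.slot(cur.key, cur.attempt)]
		if !e.occupied {
			*e = cur
			return
		}
		if e.age < cur.age {
			// The resident entry is older: evict it and find it a new slot.
			*e, cur = cur, *e
		}
		cur.attempt++
	}
	// cur exceeded the put attempt limit and is discarded.
}

func main() {
	m := &klm{slots: make([]entry, 8), maxGetAttempts: 4, maxPutAttempts: 16}
	m.Put("sha256-abc", 42)
	if loc, ok := m.Get("sha256-abc"); ok {
		fmt.Println("found at", loc)
	}
}
```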
The number of attempts the KLM is allowed to use when looking for a slot is described by the two parameters `key_location_map_maximum_get_attempts` and `key_location_map_maximum_put_attempts` of the `LocalBlobAccessConfiguration`.
So, how big should the KLM be?
Given a utilization rate *r*, the chance of finding the object within *k* iterations is 1 − r^k. We can therefore either decrease the utilization rate (by increasing the size of the KLM) or increase the number of attempts.
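As a quick sanity check of the formula, a small Go snippet (a hypothetical helper, it just evaluates 1 − r^k):

```go
package main

import (
	"fmt"
	"math"
)

// hitProbability is the chance of finding an object within k attempts
// at utilization rate r, per the formula above: 1 - r^k.
func hitProbability(r float64, k int) float64 {
	return 1 - math.Pow(r, float64(k))
}

func main() {
	// At 50% utilization (a KLM with twice as many slots as objects),
	// 16 attempts miss with probability 0.5^16, i.e. about 1.5e-5.
	fmt.Printf("hit: %.6f, miss: %g\n", hitProbability(0.5, 16), math.Pow(0.5, 16))
}
```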
Due to its random access nature the KLM greatly benefits from being small enough to fit in memory, even if the KLM itself is disk backed. Should the KLM be too big to fit in memory, it will constantly be paged in and out, which is detrimental to system performance.
Similarly, there also needs to be a maximum number of iterations. In the degenerate case, where the storage fits more objects than the KLM has slots, the algorithm would never terminate since every single slot would be occupied.
A KLM that is too small for the number of iterations used will push out entries prematurely. This is somewhat mitigated by the insertion order, where the oldest entries get pushed out first, since they are the least likely to still be relevant. This gives graceful degradation when your KLM is too small. You should choose a KLM size such that the number of times you reach the maximum number of iterations is acceptably low.
How rare should reaching the maximum number of iterations be?
It should be rare, but most objects that get discarded due to the KLM being full tend to be old and unused. There is, however, a point where it is no longer meaningful to have a larger KLM.
Ultimately, any time you read or write to a disk there is a risk of failure. Popularly this is described as happening due to cosmic radiation, but more realistically it is due to random hardware failures from imperfections in the hardware.
Picking *k* and *r* values that give a risk of data loss below the Uncorrectable Bit Error Rate (UBER) of a disk is simply wasteful; should you wish to reduce the risk below this value, you need to look at mirroring data.
Western Digital advertises that their Gold enterprise NVMe disks have a UBER of 1 in 10^17, i.e. about one error per 10 petabytes of read data, which will serve as a decent baseline.
For a random CAS object of 125KB this corresponds to a failure rate of about 1 in 10^11 reads, giving us this neat graph.
That is, for a KLM using the recommended 16 iterations, giving it more than 5 entries per object in storage is a waste, since you are just as likely to fail to read the object due to disk errors as due to the KLM accidentally pushing it out.
Similarly, for 32 iterations there is no point in having more than 2 entries per object, and for 8 iterations there is no point in having more than 20 entries per object.
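These break-even points are straightforward to reproduce: find the utilization at which the chance of the KLM pushing an object out, r^k, matches the per-read disk failure rate. A short Go sketch, assuming the UBER and average object size above:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		uber       = 1e-17     // advertised uncorrectable bit error rate
		objectBits = 125e3 * 8 // a 125KB CAS object, in bits
	)
	// Probability that reading one object hits a disk error:
	// about 1e-11, i.e. 1 in 10^11 reads.
	diskFailure := objectBits * uber

	// For each get-attempt limit k, find the utilization r at which the
	// KLM push-out probability r^k equals the disk failure probability,
	// and report the corresponding KLM entries per stored object (1/r).
	for _, k := range []int{8, 16, 32} {
		r := math.Pow(diskFailure, 1/float64(k))
		fmt.Printf("k=%2d: break-even at %.1f KLM entries per object\n", k, 1/r)
	}
}
```

For 8 iterations this evaluates to roughly 24 entries per object, in the same ballpark as the rounded figure above.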
As for the number of put iterations, just keep it at 4x the number of get iterations. There is no fancy math here; it just needs to be bigger than the number of get iterations, and it is very cheap since you will put objects a minuscule fraction of the number of times you will get them.
The thought of data randomly getting lost might upset you spiritually, but you can comfort yourself with the fact that you are far more likely to lose data due to an AWS engineer tripping over a cable in the datacenter.
How do I verify if my KLMs are properly sized?
Buildbarn exposes the behavior of the hashing strategy in its Prometheus metrics:
- `hashing_key_location_map_get_attempts`
- `hashing_key_location_map_get_too_many_attempts_total`
- `hashing_key_location_map_put_iterations`
- `hashing_key_location_map_put_too_many_iterations_total`
These metrics expose the number of get and put attempts required, as well as how many times the maximum number of iterations was exceeded. You can read the ratio between how often each number of iterations was required to figure out how full the KLM is, i.e. if gets requiring 2 iterations happen half as often as gets requiring 1 iteration, this implies the KLM is half full.
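For example, with hypothetical sample counts (not real metric output), the utilization estimate is just a ratio:

```go
package main

import "fmt"

// Estimate KLM utilization from the attempt distribution exposed by
// the metrics above: the number of gets needing a second attempt,
// relative to those needing at least one, approximates the
// utilization rate r. The counts below are made-up sample values.
func main() {
	getsNeedingOneAttempt := 1_000_000.0
	getsNeedingTwoAttempts := 500_000.0
	r := getsNeedingTwoAttempts / getsNeedingOneAttempt
	fmt.Printf("estimated KLM utilization: %.0f%%\n", r*100)
}
```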
There are ready-made Grafana dashboards which visualize these metrics in bb-deployments.