Cobuilds (experimental)

Rush's "cobuild" feature (cooperative builds) provides a lightweight solution for distributing work across multiple machines. The idea is a simple extension of what you're already doing: just spawn multiple instances of the same CI pipeline on different machines, allowing them to share work via Rush's build cache.

For example, suppose your job runs rush install && rush build, and we launch this command on two machines. If machine #1 has already built a project, then machine #2 will skip that project, instead fetching the result from the build cache. In this way, the building gets divided between the two pipelines, and with perfect parallelism the build might finish in half the time.

But there is a flaw in this idea: What if machine #2 reaches a project that machine #1 already started building but has not finished yet? This cache miss will cause machine #2 to start building the same project, when it may have been better to work on something else while waiting for machine #1 to finish that project. We can solve this by using a simple key/value store to communicate progress between machines. (In this tutorial we'll use Rush's Redis provider, but if your company already hosts some other service such as Memcached, it's fairly easy to implement your own provider.)
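The "claim before building" idea can be sketched with a toy simulation. This is not how Rush's Redis provider is implemented: here a temporary directory stands in for the shared key/value store, `mkdir` plays the role of an atomic "set if not exists" operation, and the runner and project names are hypothetical. The sketch also ignores dependency ordering and the later cache fetch.

```shell
#!/usr/bin/env bash
# Toy simulation only: a temp directory stands in for the shared key/value store.
set -u

store="$(mktemp -d)"          # stand-in for the key/value service
projects="lib-a lib-b app-c"  # hypothetical project names

# try_claim RUNNER PROJECT: succeeds only for the first runner that asks.
# mkdir is atomic, so two runners can never both claim the same project.
try_claim() {
  if mkdir "$store/$2" 2>/dev/null; then
    echo "$1" > "$store/$2/owner"
    return 0
  fi
  return 1
}

# run_pass RUNNER: walk the project list, building whatever is still unclaimed.
run_pass() {
  for project in $projects; do
    if try_claim "$1" "$project"; then
      echo "$1: building $project"
    else
      echo "$1: $project is claimed by $(cat "$store/$project/owner"), moving on"
    fi
  done
}

# Runner 1 has already started lib-a when runner 2 begins its pass.
try_claim runner-1 lib-a
log="$(run_pass runner-2)"
echo "$log"
```

In this simulation, runner 2 skips lib-a instead of rebuilding it and moves on to the unclaimed projects. In a real cobuild, it would later fetch lib-a's output from the build cache once runner 1 finishes.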

When to use cobuilds?

Without cobuilds, Rush already parallelizes your jobs on a single machine. (This may not be immediately obvious, since Rush's output is "collated" for readability, making it appear as if projects are getting built one at a time.) You can fine-tune the maximum parallelism using the --parallelism command-line parameter, but keep in mind that projects can only build concurrently if they don't depend on each other. Thus, cobuilds will only help if you've already reached the limits of a single machine (considering CPU cores, disk I/O rates, and available memory), and only if further parallelism is actually possible for your monorepo's project dependency graph.
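As a small illustration of the parameter mentioned above, the invocation below caps the build at four concurrent operations. The value 4 is an arbitrary example; a suitable number depends on your machine and your dependency graph.

```shell
# Limit the build to at most 4 concurrent operations (4 is an arbitrary example value)
rush build --parallelism 4
```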

The cobuild feature launches multiple instances of a CI pipeline, under the assumption that machines will be readily available. For example, if your cobuild allocates 4 machines, and your machine pool has 40 machines, then pool contention would not become a concern until 10 pull requests are waiting in the queue. By contrast, an extremely large monorepo might need thousands of machines, at which point it would make more sense to use a "build accelerator" such as BuildXL instead of cobuilds. (There are also plans to integrate Rush with bazel-buildfarm; Bazel is Google's equivalent of BuildXL.) Build accelerators generally require you to replace your CI system with their centralized job scheduler, which manages its own dedicated pool of machines. Such systems require nontrivial maintenance and can have steeper learning curves, so we generally recommend starting with cobuilds.

Before adopting cobuilds, we recommend trying these things first:

Enable the build cache: The build cache is a prerequisite for cobuilds.
Identify bottlenecks: If your monorepo's dependency graph does not actually allow lots of projects to be built in parallel, that must be fixed first before considering distributed builds. You can use Rush's --timeline parameter to identify bottlenecks that are causing too many projects to wait before they can start building. These bottlenecks can be solved by:
- eliminating unnecessary dependencies between projects
- introducing Rush phases to break up build steps into multiple operations
- refactoring code to break up big projects into smaller projects
Upgrade your hardware: If your builds are slow, it can help to add more machines. We generally recommend choosing high-end hardware with the maximum amount of RAM and CPU cores for your plan, based on the typical behavior of rush install and rush build. But every monorepo is different, so collect benchmarks on different hardware configurations to inform your decision. Speeding up the build makes everybody more productive; however, because hardware upgrades usually come from a different budget than engineering salaries, management may sometimes need help to see this connection.
Cache state between runs: CI machines often start rush install && rush build with a completely clean machine image, so nothing is reused from earlier runs. Preserving state can help: for example, rush install time can be improved by using RUSH_PNPM_STORE_PATH to save the PNPM store and restore it between runs. Some environments permit the machine to be reused for multiple jobs, so that other Rush caches are preserved as well.
Consider using a merge queue: If two pull requests are waiting to get merged, normally a CI system will build a hot merge of pr1+main and pr2+main, to ensure that each PR branch is tested with the latest main. However, after pr1+main has merged, we generally won't force pr2+main to be redone with the new main; this lack of safety can occasionally cause build breaks. (For example, suppose pr1 deleted an API, but pr2 added another call to that API.) A "merge queue" (also known as a "commit queue") improves safety by instead building pr1+main and pr1+pr2+main; if the first PR fails, then it will retry with pr2+main. Advanced merge queues support "batches", where they directly test a "train" of pull requests pr1+pr2+main and only test pr1+main if there is a failure. This can speed up builds and/or reduce machine contention, while still guaranteeing safety. GitHub's merge queue doesn't support batches at the time of this writing; however, the Mergify third-party service implements batches and has been tested with Rush.
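Two of the suggestions above can be tried directly from the command line. The following is a minimal sketch; the store path is a hypothetical example, and in practice it would be a directory that your CI system saves and restores between runs.

```shell
# Print a timeline chart after the build, to help spot projects that block others
rush build --timeline

# Keep the PNPM store in a directory that the CI system preserves between runs.
# The path below is a hypothetical example; choose one your CI system can cache.
export RUSH_PNPM_STORE_PATH="$HOME/.cache/rush-pnpm-store"
rush install
```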

Prerequisites
In order to use the cobuild feature, you will need:
The Rush build cache enabled with a cloud storage provider.
A Redis server. If your company uses some other key/value service, you can implement a plugin by following the example of rush-redis-cobuild-plugin. (And consider contributing it back to Rush Stack!)
A CI system that is able to allocate multiple machines when a CI pipeline is triggered. For example, with GitHub Actions, a "workflow" can launch multiple "jobs", each on a "runner" that is a separate machine. With Azure DevOps, "pipelines" can run jobs on multiple "agents" that can be on different machines.
Rush phases are suggested to increase parallelism, but are not required for cobuilds.

Enabling the cobuild feature

Upgrade rushVersion in your rush.json to 5.104.1 or newer.
Create an autoinstaller for the Rush plugin:
```
rush init-autoinstaller --name cobuild-plugin
```
It's also okay to use an existing autoinstaller. For more about Rush plugins and autoinstallers, see Using Rush plugins and Autoinstallers.
Add the @rushstack/rush-redis-cobuild-plugin plugin to the autoinstaller. (We'll use Redis for this tutorial.)
common/autoinstallers/cobuild-plugin/package.json
```
{
  "name": "cobuild-plugin",
  "version": "1.0.0",
  "private": true,
  "dependencies": {
    "@rushstack/rush-redis-cobuild-plugin": "5.104.1"
  }
}
```
👉 IMPORTANT:
Over time, make sure to keep the version of @rushstack/rush-redis-cobuild-plugin in sync with the rushVersion from your rush.json.
Update the autoinstaller's lockfile:
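Assuming the autoinstaller is named cobuild-plugin as in the earlier step, this is typically done with Rush's update-autoinstaller command:

```shell
# Regenerate the lockfile for the "cobuild-plugin" autoinstaller
rush update-autoinstaller --name cobuild-plugin
```

This should create or update the lockfile under common/autoinstallers/cobuild-plugin/, which is then committed to Git along with package.json.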
