Is Bazel right for me?
Nearly as many build tools exist as there are programming languages. C++ has Autotools/Make, CMake and many others. Java has Ant, Maven, Gradle and several more. Haskell has Cabal, Stack, Shake and several more. Each of these originated in a particular language community, but some are generic enough to build code in any language. Is any of them the right choice for your use case? Should you be combining several systems? This document should help you answer those questions.
Rule of thumb
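If several of the following apply, you are probably better off using a Haskell-native build tool such as Cabal or Stack:
your project is an independently publishable single library, or small set of libraries;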
your project is open source code and has at most small static assets (hence publishable on Hackage);
your project is nearly entirely Haskell code with perhaps a little bit of C;
your project has many dependencies on other packages found on Hackage but few if any system dependencies (such as zlib, libpng, etc.).
Bazel works well for the following use cases:
projects that cannot be hosted on Hackage (games with large static assets, proprietary code, etc.);
projects with a very large amount of code hosted in a single repository;
projects in which you or your team are writing code in two or more languages (e.g. Haskell/PureScript, or Haskell/Java, or Haskell/C++/FORTRAN).
For all the benefits it can bring, Bazel also has an upfront cost. Don’t pay that cost if the benefits don’t justify it.
If you don’t have much code to build, any build tool will do. Build issues such as incomplete reproducibility are comparatively easy to debug, and working around build system bugs by wiping the entire build cache first is entirely viable in this case. So you might as well use the lightweight Haskell-native build tools that ship with GHC. You won’t need sandboxed build actions to guarantee build system correctness, completely hermetic builds for good reproducibility, build caching, test result caching or distributed builds for faster build and test times. Those features start to matter for larger projects, and become essential for very large monorepos.
Why exactly do these features matter?
Hermetic builds are builds that do not take any part of the host’s system configuration (set of installed system libraries and their versions, content of /etc, OS version, etc.) as an input. If all build actions are deterministic, hermeticity guarantees that builds are reproducible anywhere, anytime. More developers on a project means more subtly different system configurations to cope with. The more system configurations there are, the more likely it is that the build will fail in one of them but not in others, unless the build is completely hermetic.
Sandboxing build actions guarantees that all inputs to all build actions are properly declared. This helps prevent build system correctness bugs, which are surprisingly and exceedingly common in most non-sandboxing build systems, especially as the build system becomes more complex. When a build system might be incorrect, users regularly have to wipe the entire build cache to work around issues. As the codebase becomes very large, rebuilding from scratch can cost a lot of CPU time.
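To make the idea concrete, here is a minimal sketch of what sandboxing buys you. It is a toy model, not how Bazel implements its sandbox: the names `run_sandboxed` and `concat_action` are hypothetical, and the "sandbox" is simply a temporary directory into which only the declared inputs are copied. An action that reads a file it did not declare fails immediately instead of silently depending on it.

```python
import shutil
import tempfile
from pathlib import Path


def run_sandboxed(action, declared_inputs, workspace):
    """Run `action` in a fresh directory that contains only the declared inputs.

    `action` is a callable that receives the sandbox path and returns a result.
    (Hypothetical helper for illustration; not part of any build tool's API.)
    """
    workspace = Path(workspace)
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp)
        for rel in declared_inputs:
            dest = sandbox / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(workspace / rel, dest)
        # Anything not copied above is invisible to the action.
        return action(sandbox)


def concat_action(sandbox):
    """Toy build action that reads two files relative to the sandbox root."""
    return (sandbox / "a.txt").read_text() + (sandbox / "b.txt").read_text()
```

If the workspace contains both `a.txt` and `b.txt` but only `a.txt` is declared, `run_sandboxed(concat_action, ["a.txt"], workspace)` raises `FileNotFoundError`. Without the sandbox, the same action would succeed and the build system would never learn that it must rerun the action when `b.txt` changes.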
Distributed build caches make building the code from a fresh checkout trivially fast. Continuous integration populates the build cache at every branch push, so that building all artifacts from fresh checkouts seldom needs to actually build anything at all locally. In the common case, builds become network-bound instead of CPU-bound.
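The mechanism behind such a cache can be sketched as a content-addressed store: the key of a build action is a digest of its command line and of the contents of every declared input, so two machines that run the same action on the same inputs compute the same key. The sketch below is an illustration under simplifying assumptions, not Bazel's actual cache protocol; `action_key` and `ActionCache` are hypothetical names, and an in-memory dictionary stands in for the shared remote store.

```python
import hashlib


def action_key(command, input_contents):
    """Digest of the command line plus every declared input (path and bytes)."""
    h = hashlib.sha256()
    h.update(repr(list(command)).encode())
    for path in sorted(input_contents):
        h.update(path.encode())
        h.update(b"\0")  # separator between the path and the content digest
        h.update(hashlib.sha256(input_contents[path]).digest())
    return h.hexdigest()


class ActionCache:
    """Toy cache; `store` stands in for a remote cache shared by CI and developers."""

    def __init__(self):
        self.store = {}
        self.executions = 0  # how many actions actually had to run

    def build(self, command, input_contents, run):
        key = action_key(command, input_contents)
        if key not in self.store:
            # Cache miss: run the action and publish its output.
            self.executions += 1
            self.store[key] = run(input_contents)
        return self.store[key]
```

In this model, a second call to `build` with an identical command and identical input contents returns the stored output without running anything, which is why a fresh checkout whose actions CI has already built needs only downloads. Changing a single byte of any input changes the key and forces a rebuild of that action. The scheme is only sound if the key covers everything the action reads, which is what hermeticity and sandboxing provide.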
Distributed build action execution means that average build times can stay constant even as the codebase grows, because you can seamlessly distribute the build across more machines.
Test result caching is the key to keeping continuous integration times very low. Only those tests that depend on code that was modified need be rerun.
On their own, hermetic and sandboxed builds can already save quite a few headaches. But crucially, without them one can’t even hope to have any of the other features listed above.