The Build System Engineer

Is there a need for a “Build System Engineer”? tl;dr Yes and no. A Build System Engineer’s primary focus is to enable simple and maintainable build systems for local and remote deployments. Today, these skills are shared among roles, like SRE or Software Engineer. Hiring an engineer to take on solely these responsibilities, however, fulfills a critical requirement in today’s software development landscape and enables an organization to scale efficiently.

Today’s Landscape

In my experience, most medium- to large-sized modern software companies run more than one service hosted on a cloud provider. Site Reliability Engineers (SREs) are responsible for maintaining the uptime of services, whether they’re internal, like CI/CD pipelines, or external, like AWS’ S3 or GCP’s Cloud Storage. Sometimes these responsibilities fall on Software Engineers, who tend to be primarily responsible for implementing the business logic, which they ship to another team to deploy. But who is responsible for the interface between these two roles? The interface I’m referring to is the build system – the sticks and glue that pull code together into an application and deploy it on a developer’s machine and on a server.

The Problem: Early Male Pattern Baldness

There are about 100,000 hairs on the average human head, says Google. There are about 1/2 to 1/4 of that number on the average software engineer’s head, according to my own experience. The cause? Poor build systems… well, that and genetics.

make deploy

Joking aside, how many times have you run make deploy (or something similar) to deploy a fully functioning version of an application under development to your local machine? How many times has it resulted in utter frustration and wasted time? What about this command is so powerful, so important, yet so frustrating that almost every engineer can relate to it? It’s the promise of collapsing the complexity of many parts into one repeatable command. At its best, this command represents quick iteration in local development and reliable remote deployment. At its worst, it represents unreliable builds and a mismatch between the local and remote state of an application.

What is a “Build System Engineer”?

Put simply, a Build System Engineer is responsible for ensuring that make deploy is an effortless, ignorable abstraction over deploying a fully functioning application locally and remotely. Realistically, this means developing a build system that:

  • satisfies the existing needs: make build and make deploy reliably and quickly build and deploy the app.
  • is extensible and easily modifiable: if the organization needs to write an app in a new language tomorrow, the build system engineer can quickly extend the build system to work for that language.
  • is highly reliable and reproducible: commands produce the same result every time, locally and remotely.
  • is maintainable: when the engineer responsible for building a system leaves, another engineer can replace them; tools are well known and documentation is comprehensive, detailed, and up to date.
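One way to make the "same result every time" requirement checkable is to fingerprint the build output and compare fingerprints between a local and a remote build. The following is only a minimal sketch of that idea in Python, not part of any particular build tool; the function name digest_build_output and the idea of hashing a whole output directory are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def digest_build_output(output_dir: str) -> str:
    """Return one SHA-256 digest covering every file under output_dir.

    Hypothetical helper: files are visited in sorted relative-path order so
    the digest does not depend on filesystem listing order, and both the
    relative path and the file contents feed into the hash.
    """
    root = Path(output_dir)
    files = sorted(
        (path.relative_to(root).as_posix(), path)
        for path in root.rglob("*")
        if path.is_file()
    )
    hasher = hashlib.sha256()
    for relative, path in files:
        name = relative.encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (path, contents) boundaries unambiguous.
        hasher.update(len(name).to_bytes(8, "big"))
        hasher.update(name)
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()
```

If two builds of the same commit produce different digests, something in the build (timestamps, unpinned dependencies, machine-specific paths) is leaking into the output, which is exactly the kind of problem this role would chase down.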

Tools

These requirements call for knowledge of the following tools and skills:

  • GNU Make or another well-known build automation tool
  • Docker, including Dockerfiles and its interaction with Unix systems
  • CI/CD tool(s) like Concourse, Travis CI, etc.
  • Infrastructure as code tools like Terraform
  • Writing services to mock external dependencies (like Stripe)

These are all aspects of creating a system that works remotely the same way it works locally, allowing for quick and reliable development by many engineers simultaneously across an engineering organization.
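To make the last item on that list concrete, here is a minimal sketch of a mock for an external payment dependency, written with only the Python standard library. Everything in it is an assumption for illustration: the /charges path, the canned response, and the names MockPaymentHandler, start_mock_server, and create_charge are hypothetical and do not mirror Stripe’s real API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class MockPaymentHandler(BaseHTTPRequestHandler):
    """Answers POST /charges with a canned JSON body (hypothetical endpoint)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the request body; the mock ignores it
        if self.path == "/charges":
            status, payload = 200, {"id": "ch_mock_1", "status": "succeeded"}
        else:
            status, payload = 404, {"error": "unknown endpoint"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_mock_server() -> HTTPServer:
    """Start the mock on a free localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), MockPaymentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def create_charge(base_url: str, amount: int) -> dict:
    """Example client code; base_url decides whether it hits the mock."""
    request = Request(
        f"{base_url}/charges",
        data=json.dumps({"amount": amount}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = build_opener(ProxyHandler({}))
    with opener.open(request) as response:
        return json.load(response)
```

A local make deploy could start a service like this and point the application’s base URL at it, so developers never need the real API key on their machines.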

Importance of the Role

Local vs Remote

To prove the importance of this role, let’s take the classic example of running a system locally vs remotely. Say we have two computers: the local computer that a developer writes code on, and the remote computer that production code runs on. Let’s assume that, as development of a service progresses, certain things become accessible only in the remote environment or app. These are things like external APIs that require API keys, like the Stripe API, or repositories that contain container images, like Docker Hub, which require credentials and privileges to access. Sharing credentials for these services across an engineering organization would be both insecure and a source of chaos in those remote services. For example, imagine everyone had access to Docker Hub – over time, this would result in many unnecessary, unmaintained, and forgotten container images, which cost the organization money to keep around, and some of which could pose security risks if they contained sensitive data. This discrepancy between local and remote states creates a significant rift between:

  1. what can be tested by the developer quickly
  2. how the system and tests function on a developer’s local machine

… And we all know what happens when a developer’s machine differs significantly enough from the machine the app is running on - things break (or never work to begin with!).

Wrap-Up

While I don’t think the skills of a good Build System Engineer are beyond the reach of many competent and experienced engineers, today’s systems are sufficiently complex that building and deploying them deserves a dedicated role. In most organizations, the responsibilities of a Build System Engineer are met by the collective efforts of the team that uses the build systems. While this sometimes works, in my experience, it often leads to discrepancies between projects or a lack of recognition for the full amount of effort needed to make a good and scalable development environment. I urge you to consider putting more emphasis on the role I’ve outlined here, and ask yourself, “Can the problems I face be resolved by someone who has the skills outlined here?”