Each of these posts has a remark about “lunar metadata.” This was an idea first proposed to me in the early 2000s (I cannot recall which coworker to credit with this… Heather Wakefield, perhaps?) that we should regard software development as proceeding in phases.

The theoretical principle here, oddly, is down to cybernetics.

If we want to hold some variable at a target value, and it is influenced by some input, control theory tells us we ought to start out by pushing back on any deviation in proportion to its size, scaled by the first derivative of their relationship. If x is one unit too high, and increasing y by one unit increases x by two, apply negative one half to y and see what happens. In this way we “attach a spring” between inputs and outputs, keeping everything in line.

But if the relationship between x and y has even the slightest bit of lag, or is in some interesting way other than linear, this synthetic spring will get bouncy. The greater the lag, or the greater the higher-order derivatives (call them “stiffness” if you like) the more that madness will creep in.
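A minimal sketch of that bounciness, assuming a toy system where x = 2y + 1, the target is x = 0, and the controller only ever sees a reading of x that is `lag` steps old. The function name `simulate` and all of its parameters are made up for illustration; none of this comes from the repository.

```python
def simulate(lag, steps=40, slope=2.0, offset=1.0):
    """Hold x at 0 using a proportional push based on a stale reading of x.

    Toy model (an assumption for illustration): x = slope * y + offset.
    """
    y = 0.0
    history = [slope * y + offset]  # x starts one unit too high
    for _ in range(steps):
        # The controller sees x as it was `lag` steps ago (or the oldest
        # reading available while the history is still short).
        seen = history[-1 - lag] if len(history) > lag else history[0]
        # Push back by the observed error divided by the slope: the "spring".
        y -= seen / slope
        history.append(slope * y + offset)
    return history


for lag in (0, 1, 2):
    tail = simulate(lag)[-12:]
    print(f"lag={lag}: largest |x| over the last 12 steps = {max(abs(v) for v in tail)}")
```

In this toy model, zero lag removes the error in a single step, one step of lag rings forever at constant amplitude, and two steps of lag make the oscillation grow. Real systems will differ in the details, but the shape of the problem is the same.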

For any attempt to control such a system there is a slowest possible change that you can counteract (usually defined by limits on how finely you can measure and how hard you can push the system) and a fastest possible change that you can counteract (usually defined by how quickly you can measure and adjust control). The range between these is your “control bandwidth”. When we build layered systems we generally want the highest control bandwidth in the lowest levels, and the lowest control bandwidth in the highest levels.

When we don’t have enough control bandwidth, our systems blow up or oscillate wildly.

Which is a problem for us.

Whatever we are trying to accomplish with software, the fuck around / find out loop is annoyingly long. In control terms the control bandwidth of software development is really crap. And where is it most crap? In the code we are least able to control. What is the code we are least able to control? The code that is most load bearing, where any adjustment can wreck the world.

We can of course take countermeasures. Broad, intensional testing of low-level code lets us treat its behaviour rather than its embodiment as normative, freeing us up to change it with higher frequency. Skillful, patient, and ethical development practices allow us to better anticipate the results of our changes and to be attentive to the parts of our code where slow degradation might otherwise go unnoticed.

But in the end we are always working at the edges of controllability: If software development reliably produces the results we intend, we do more of it, until it doesn’t; some combination of a Peter Principle and survivorship bias guarantees that in a complex software system nearly every bit of the codebase is collapsing, exploding, or oscillating wildly, because if it weren’t already we would make it so.

Some failures are better than others.

So as long as we are living in e^(kx) what k do we want? Positive k means explosion; we don’t want that. Negative k is collapse, also no good since the whole principle is that people do want changes. Imaginary k means oscillation – things can change but they ultimately roll back around. That sounds tastier.
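A rough sketch of those three regimes, taking “collapse” to mean a negative real part of k (that is my reading, not a definition). The names `classify` and `magnitude` are invented for this illustration. The only library call is `cmath.exp` from the Python standard library.

```python
import cmath


def classify(k, tol=1e-12):
    """Label the long-run behaviour of e^(kx) as x grows."""
    k = complex(k)
    if k.real > tol:
        return "explosion"    # magnitude grows without bound
    if k.real < -tol:
        return "collapse"     # magnitude decays toward zero
    if abs(k.imag) > tol:
        return "oscillation"  # magnitude stays at 1, phase rotates
    return "static"           # k is effectively zero: nothing changes


def magnitude(k, x):
    """Size of e^(kx); only the real part of k affects it."""
    return abs(cmath.exp(complex(k) * x))


for k in (0.5, -0.5, 2j):
    sizes = [round(magnitude(k, x), 3) for x in (0, 5, 10)]
    print(k, classify(k), sizes)
```

For a purely imaginary k the magnitude never leaves 1, and the imaginary part is the angular frequency of the cycle: bigger imaginary k, faster roll-around.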

Big imaginary ks mean high frequencies; we sure would like to do things fast! Can we do that? Well that gets back to control bandwidth – we have to put that imaginary component in the range of our control.

And so we can say – the whole ballgame is control bandwidth. Can we anticipate and detect slow changes; can we quickly match our control to a quickly spiraling project; can we time our interventions to not worsen a pathological explosion?

When we train a junior engineer, we teach them how to push on the code, how to push hard, how to respond quickly, how to correct. It is a discipline of sudden extremes and fast feedback. These are the skills of the linear mode, of seeing where you want things to be and pushing in that direction until they are there.

As we become senior we must shift instead to grasping the intertemporal, oscillatory mode; we anticipate how the code changes, how its uses and intents change, and focus instead on writing the code that someone else will discover themselves needing in the future, positioned where they will be looking at the moment they realize they need it.

Lunar Process

Hence the lunar metaphor. We want our work to be cyclic, so that we can match our cycles of work to the cycles of code and then adjust our cycles to lead or lag those of the code to push and pull its phase and frequency. This will not bring our development process to a perfectly damped static equilibrium, but to a dynamic equilibrium of converging oscillation.

The simplest cycle of practice I have found is to alternate expansion and contraction. These are kinds of PRs we know how to write! Here are the definitions I am currently using:

  • An expansion PR is straightforward expansion. The code does more than it did yesterday. There are new API endpoints, new types and classes, new seams. TODO comments proliferate; functions emerge like mushrooms, novel paths blast into undiscovered functionality territory quite without regard for desire lines. Brainstorming and adding placeholder files are the purest form of expansion.
  • A contraction PR consolidates prior expansions. The code does what it did yesterday but with greater clarity, correctness, or performance. It is better supported by tools; TODOs are resolved, functions and types are pruned to match what their use cases have turned out to be. Documentation and design are contraction; behaviour-driven testing (invariant-driven testing co-located with functional documentation) is the purest form.

There are many other sorts of cyclical development, but keeping this form in mind and alternating batches of PRs between the two is a useful mental discipline. Asking, “is this an expansion day or a contraction day for this project?” will seldom leave you worse off.

Today’s PR expands us into data storage. There’s a lot of trouble brewing there, but today is not the day to explain it. Suffice it to say: We don’t ever want to deal with a filesystem; the right choice is to slap one or two carefully chosen abstractions on and run away. Once we get to perf optimization that will become very false, but that’s trouble we chose not to borrow right now.

This blog post corresponds to repository state post_09

Lunar metadata: This is an expansion phase; the scope of the codebase grows.