How We Build Here
What's changed about writing production software, and what this trail is going to teach you · ~14 min read · Suggested by Bob · engineer, pm
You were hired to write production software. The shape of that work has changed: you'll spend less time typing code and more time briefing, reviewing, and judging the work of agents that type for you. This trail is what we wish someone had handed us on day one.
The shape of the work has shifted up the stack
A few years ago, a senior engineer’s day was mostly typing — read a ticket, write code, run tests, push, repeat — with design squeezed around the edges. The typing is now the cheapest part of most days; what used to be a long afternoon of writing a feature is a short briefing followed by a longer review. We hired you to be a designer, PM, architect, and tech lead with the execution layer delegated to agents — technical depth still matters, especially in the review seat.
The day-to-day shape of the role changed. The work that always mattered most — thinking clearly about what should happen — is now most of what you do.
Outcome-oriented, not code-oriented
Here is the hinge: when many agents are editing in parallel, taste alone cannot be the contract — there are too many actors moving too fast. So we trust gates more than vibes. The gates are the linter’s exact thresholds, the type checker’s strict mode, the test suite, the coverage floor, the vulnerability scanner, the pre-commit hook, CI, and second-model review. Inside them the agent is free; outside them, the build fails. This is a real culture change: we add lint rules instead of lecturing, tighten coverage when we burn ourselves on a regression, and turn the “subjective” preferences that matter enough into deterministic checks — or watch them silently fade.
The frameworks transfer
One surprise was that almost every old discipline of software engineering still applies: specs, tickets, code review, decision records, post-mortems, retros, all of it. What changes is the cast: some participants are agents, some are you working with one, and the speed at which we exercise the rituals goes up. “We don’t need tickets, the agent already knows” is wrong in exactly the way “the senior engineer already knows” was wrong before. Without scaffolding, agents have the memory of a goldfish, and a clear ticket in a tracker the agent can read is how the next session figures out what to do.
You will build your own trust model
There is one part of this work we are deliberately not prescriptive about. Constructured is BYOD — every contributor brings their own laptop, threat model, and comfort with how much rope to give an agent. We can teach you the mechanics: how Claude Code’s sandbox works, how auto mode trades safety for speed, what a permissions list looks like. Where you land on the spectrum is yours; the only wrong answer is to stop thinking about it. Your Box and Your Trust Model covers the mechanics; The Injection Problem is the threat side.
What’s in this trail
The remaining eleven cairns are the practical content, and they build on each other. The required tools (beads, the agent’s memory; timbers, the development ledger; Claude Code; and just) get a cairn each. The trust model and box setup get a cairn. The recommended layer (Codex, quiet power tools, worktrees) gets three. Quality gates get the longest cairn in the trail. From Plan to Pull Request integrates everything in one realistic feature, end to end. If you only have an afternoon, read this cairn; Beads, the Backbone; Timbers, the Ledger; Quality Gates; and From Plan to Pull Request. The rest is the texture.
- Your Box and Your Trust Model — The mechanics of the BYOD trust model: mise, Docker, direnv, sandbox vs. auto mode.
- The Injection Problem — The threat-side companion to the trust-model framing here.
- Beads — The CLI issue tracker we use as the agent's persistent memory across sessions.
- Timbers — The development ledger that captures why a commit happened, not just what changed.
Welcome. Whether you are reading this on day one or day thirty, this trail is for you. Twelve cairns, each twelve to eighteen minutes, written so a person who has shipped software professionally, but never alongside a coding agent, can leave with a working mental model of how we work and why.
This first cairn is the philosophy. The next eleven are the mechanics: the tools, the gates, and the loop from spec to merged pull request. There are no code samples in this one. The argument has to land before any of the rest of it makes sense.
The shape of the work has shifted up the stack
A few years ago, the typical day for a senior engineer on a small product team was roughly: read a ticket, type code, run tests, fix what broke, push, repeat. The word “engineer” pointed at the typing. Design and architecture happened in slower bursts, in between the typing.
That ratio has flipped. The typing is now the cheapest part of most days. What used to be a long afternoon of writing a feature is, increasingly, a short briefing followed by a longer review. Architecture, naming, clarifying the spec, deciding what the change is not allowed to do: those used to be edge activities sandwiched around hours of mechanical work. They are now most of the work.
“Briefing” is not a casual word here. The quality of an agent-assisted feature is gated by the quality of the brief: what the change is, what it is not, which constraints are inviolable, where the decision points are. A vague brief produces vague code at startling speed.
We did not pick this. The economics did. A senior engineer with a competent agent harness can move through several screens of code in the time it used to take to write one. The bottleneck is no longer “how fast can you type” but “how fast can you think clearly about what should happen.”
We hired you to be a designer, a product manager, an architect, and a tech lead with the execution layer delegated to agents. Technical depth still matters — more than ever, in the review seat. The day-to-day shape of the role is what changed.
This is not a downgrade. The work that mattered most has always been the work that came before the keystrokes. The keystrokes used to take the rest of the day; now they don’t, and we get more of the actual job.
Outcome-oriented, not code-oriented
Here is the hinge of everything that follows in this trail. When N agents are editing in parallel — sometimes two of yours, sometimes one of yours and one of a teammate’s, sometimes a subagent you spawned and forgot about — taste alone cannot be the contract. There are too many actors and they move too fast. The rope that holds the system together has to be deterministic.
So we trust gates more than vibes. The gates are: the linter’s exact thresholds, the type checker’s strict mode, the test suite, the coverage floor, the vulnerability scanner, the pre-commit hook, the CI pipeline, and the second-model review. Inside those gates the agent is free; outside them, the build fails and nothing merges.
An obvious worry: if the gates are the contract, do we lose taste? No. We move taste up the stack. The choice of gates is taste. The architecture being reviewed is taste. The brief is taste. What we surrender is the pretense that we will catch every subtle regression by reading every diff line by line; at the speeds we work now, that pretense was already unsafe.
This is a real change in culture, not a rhetorical one. It means we add lint rules instead of lecturing. It means we tighten coverage when we burn ourselves on a regression, instead of writing a memo. It means a “subjective” preference that matters enough has to become a deterministic check, or it will silently fade.
Deterministic constraint: a rule the build can verify, by itself, on every commit, without a human reading the diff. Examples in this trail: golangci-lint’s funlen 60, typescript-eslint’s no-floating-promises (run through ESLint), the Go coverage floor (target 85%, currently enforced at 83%), audit-ci’s vulnerability gate. Counterexamples: code style someone enforces in review, “we usually don’t do it this way,” “I would have written it differently.”
The gates are the trust mechanism. They are why a code change drafted in fifteen minutes by an agent can land on main without a senior engineer reading every line: because every line had to pass an objective bar to get there. The reviewer’s job, human or agent, becomes the things the gates cannot check — architecture fit, naming, the question of whether this change should have happened at all.
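This cairn is about the argument rather than the tooling, so treat what follows as an optional aside. It is a minimal Python sketch of what "a rule the build can verify, by itself" means, not the project's actual gate configuration. The `BuildFacts` and `Gate` names are hypothetical. The thresholds are taken from the definition above, reading funlen 60 as a 60-line cap and simplifying the vulnerability gate to "zero open findings"; both readings are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BuildFacts:
    """Hypothetical measurements a CI job might collect for one commit."""
    coverage_percent: float
    longest_function_lines: int
    open_vulnerabilities: int


@dataclass(frozen=True)
class Gate:
    """A named rule that is decided from measurements alone, with no human in the loop."""
    name: str
    passes: Callable[[BuildFacts], bool]


# Thresholds mirror the examples in the text; the exact readings are assumptions.
GATES = [
    Gate("coverage floor (83%)", lambda f: f.coverage_percent >= 83.0),
    Gate("function length (60 lines)", lambda f: f.longest_function_lines <= 60),
    Gate("no open vulnerabilities", lambda f: f.open_vulnerabilities == 0),
]


def failed_gates(facts: BuildFacts) -> list[str]:
    """Return the names of every gate this commit fails, in declaration order."""
    return [gate.name for gate in GATES if not gate.passes(facts)]


def build_passes(facts: BuildFacts) -> bool:
    """The build passes only when no gate fails."""
    return not failed_gates(facts)
```

The point of the shape is that every rule is a pure function of things the build can measure. A preference such as "I would have written it differently" has no such function, which is why it cannot be a gate until someone turns it into one.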
The frameworks transfer
One surprise when we started building this way was that almost every old discipline of software engineering still applies. Written specs. Lightweight tickets that describe one unit of work. Code review. Recorded architectural decisions. Post-mortems when something breaks. Retros when a sprint ends.
What changes is the cast. Some of the participants are now agents. Some are now you, working with an agent. The artifacts and the rituals look almost identical from the outside; the speed at which they move is what’s different.
The corollary is that “we don’t need tickets, the agent already knows” is wrong, in exactly the same way that “we don’t need tickets, the senior engineer already knows” was wrong before. Without scaffolding, agents have the memory of a goldfish. A clear ticket, in a tracker the agent can read, with a graph of dependencies, is not legacy ceremony; it is how an agent’s next session figures out what to do. We use beads for that, and the next several cairns explain why.
The same logic runs through the rest of the stack. We still write specs because the agent reads them. We still write timbers entries because the next agent reads them. We still review pull requests because the gates do not catch architectural drift. The frameworks transfer; the frequency at which we exercise them goes up.
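As an optional aside in a cairn that is otherwise about the argument: the claim that a ticket with a dependency graph tells the next session what to do can be made concrete in a few lines. The Python sketch below is hypothetical throughout. The `Ticket` class, its fields, and `ready_tickets` are illustrative names and are not the beads data model or CLI, which later cairns cover.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical ticket: one unit of work plus the tickets that block it."""
    id: str
    title: str
    done: bool = False
    blocked_by: list[str] = field(default_factory=list)


def ready_tickets(tickets: dict[str, Ticket]) -> list[str]:
    """IDs of open tickets whose blockers are all done, in insertion order."""
    return [
        t.id
        for t in tickets.values()
        if not t.done and all(tickets[b].done for b in t.blocked_by)
    ]


tickets = {
    t.id: t
    for t in [
        Ticket("T-1", "Write the spec", done=True),
        Ticket("T-2", "Add the endpoint", blocked_by=["T-1"]),
        Ticket("T-3", "Wire the UI", blocked_by=["T-2"]),
    ]
}
print(ready_tickets(tickets))  # ['T-2']
```

A fresh session with no memory of the previous one can run a query like this and learn that the spec is finished, the endpoint is next, and the UI has to wait. That answer comes from the tracker, not from anything the agent remembers.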
You will build your own trust model
There is one part of this work we are deliberately not going to be prescriptive about. Constructured uses BYOD — every contributor brings their own laptop, their own threat model, their own comfort with how much rope to give an agent. We can teach you the mechanics: how Claude Code’s sandbox works, how auto mode trades safety for speed, what a permissions list looks like, how to constrain a subagent. We are not going to tell you where on that spectrum to land.
This is partly humility. The technology is moving fast enough that today’s “obviously safe” pattern may not survive next quarter’s news cycle. It is also partly respect. You will work with an agent in your own way; that style is yours to develop, and the best way for us to support it is to give you the levers and explain what each one does.
No setup is foolproof. An agent with broad permissions can run a malicious tool. An agent with narrow permissions can still leak data through what it writes to a public log. The trust model you build is a moving target, and the only wrong answer is to stop thinking about it.
You will hear an honest data point throughout this trail. The author runs auto mode in both Claude Code and Codex, with a curated permissions list, on a personal laptop he treats as the trust boundary. That is one point in a wide space. Other contributors are more conservative; they keep more interactions gated and live with more friction. Both are correct in the sense that both are thought through. Your Box and Your Trust Model walks through the mechanics so your own version can be thought through too, and The Injection Problem is the deeper read on why the threat surface is real.
What’s in this trail
The remaining eleven cairns are the practical content. We grouped them so you can read them in order — they build on each other — or skim a few and come back. A rough map:
- The tool tour. The Workshop maps every tool we use, where it comes from, and whether it is required. Then four cairns go deep on the required practices: Beads, the Backbone; Timbers, the Ledger; Claude Code as Daily Driver; and Just: One Place to Discover.
- Your Box and Your Trust Model. Your machine and your posture. mise, Docker, direnv, sandbox versus auto mode, BYOD framing.
- The recommended layer. Codex as Second Opinion, Quiet Power Tools, and Working in Parallel (Mostly) cover Codex, the small CLI tools that compound, and worktrees.
- Quality Gates: The Contract That Lets You Move Fast. Exact thresholds, the pre-commit hook, the cardinal rules.
- From Plan to Pull Request is the integration cairn — one realistic feature, end to end, with every prior cairn put to work.
If you only have an afternoon, read this cairn; Beads, the Backbone; Timbers, the Ledger; Quality Gates; and From Plan to Pull Request. That covers the philosophy, the two ledgers, the gates, and the daily loop. The other cairns are the texture.
This trail is the workflow. The companion trail, Osprey Strike — From Emergency to Resolution, is the product — what we actually build, why the architecture looks the way it does, and what an emergency callout is. Read this trail first if you want to start contributing; read that trail when you want to understand the domain you are contributing to.
Summary
- The shape of the work shifted up the stack. Briefing, reviewing, and judging are now the day; typing is the cheapest part of it. We hired you to be a designer, a PM, an architect, and a tech lead with execution delegated to agents.
- Trust the gates, not the vibes. When agents work in parallel, deterministic constraints are the contract that lets us move fast safely. Inside the gates the agent is free; outside them, the build fails.
- The old engineering rituals still apply. Specs, tickets, review, decision records, post-mortems — all of them. The participants are faster; the practices are the same.
- Your trust model is yours to build. Constructured uses BYOD; we will teach you the levers, not where to set them. Your Box and Your Trust Model has the mechanics; The Injection Problem has the threat side.
- The next eleven cairns are the practical content. Tools, gates, daily loop. If you only read five, read 1, 3, 4, 11, 12.
Questions to consider
- What part of the "shape of the work has shifted up the stack" framing matches your experience, and what part doesn't? Where would you push back?
- Pick one piece of code-quality you currently enforce by review or convention. What would it take to turn it into a deterministic check the build can run without you?
- If you had to describe your own current trust model with agents in three sentences, what would it say? Which boundary would you most want to revisit after reading Your Box and Your Trust Model?
Further reading
- Osprey Strike — From Emergency to Resolution — The companion trail. Read this when you want to understand the product domain we are working in: emergency callouts, fiber operations, the architectural choices that shape Strike.
- The Injection Problem — The threat-side companion to this trail's trust-model framing. LLM agents read instructions and data through the same channel; that has consequences for how much rope you give them.
- The Quiet Teammate — A separate cairn on what it is like to work alongside an agent in production for an extended period. Useful background reading for the trust-model section.
- Beads — The CLI issue tracker we use as the agent's persistent memory across sessions. Beads, the Backbone covers it in depth.
- Timbers — The development ledger we use to capture why a commit happened, not just what changed. Timbers, the Ledger covers it in depth.
- Gall's Law (Wikipedia) — The "complex systems evolve from simple ones" principle that anchors the project's prime directive and shows up repeatedly across our cairns.