The execution surplus

AI collapsed the cost of building software. It did not collapse the cost of knowing what to build. Every engineering organisation now sits on an execution surplus: virtually unlimited capacity to produce, with no corresponding increase in the judgement required to produce the right things. Most companies respond by shipping more. The ones that win will ship less.

The throughput trap

Faros AI's productivity report, drawn from telemetry across 10,000 developers on 1,255 teams, quantifies the surplus precisely. Teams with high AI adoption complete 21% more tasks and merge 98% more pull requests. Then the other numbers: PR review time increases 91%, bugs per developer rise 9%, and average PR size inflates by 154%.

The critical finding is what happens when you zoom out. No significant correlation exists between AI adoption and company-level improvements across throughput, DORA metrics, or quality KPIs. The gains at team level simply do not scale when aggregated. The execution bottleneck does not disappear. It migrates upstream to the humans who must review and decide.

That 9% bug increase per developer sounds modest in isolation. The compounding scenario is worse: if teams merge roughly twice as many pull requests and each one carries 9% more defects, total bug surface grows by roughly 116% (1.98 × 1.09 ≈ 2.16). Treat that as an illustrative upper bound, since the report measures bugs per developer rather than per PR. Review time nearly doubling (up 91%) means reviewers become the constraint. Reviewer fatigue leads to rubber-stamping, which lets more bugs ship. The feedback loop is vicious.
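The compounding in the paragraph above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the Faros AI report: the function name is hypothetical, and it assumes the 9% rise applies to each pull request, which is a simplification of a figure the report states per developer.

```python
# Figures quoted in the text. Applying the 9% per pull request is an
# assumption made for illustration; the report states it per developer.
PR_VOLUME_INCREASE = 0.98    # 98% more pull requests merged
DEFECT_RATE_INCREASE = 0.09  # 9% more bugs, read here as a per-PR rate


def compounded_growth(volume_increase: float, rate_increase: float) -> float:
    """Return total growth when a volume increase and a rate increase multiply."""
    return (1 + volume_increase) * (1 + rate_increase) - 1


# Using the reported 98% volume increase.
growth = compounded_growth(PR_VOLUME_INCREASE, DEFECT_RATE_INCREASE)

# Rounding the volume increase up to a full doubling.
growth_if_doubled = compounded_growth(1.0, DEFECT_RATE_INCREASE)

print(f"{growth:.0%}")             # prints 116%
print(f"{growth_if_doubled:.0%}")  # prints 118%
```

Under this assumption the bug surface slightly more than doubles whichever volume figure is used. The point is that the two increases multiply rather than add.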

CodeRabbit's independent analysis confirms the quality dimension. AI-generated pull requests contain roughly 1.7 times more issues than human-written ones (10.83 versus 6.45 issues per PR), with excessive I/O operations appearing eight times more frequently in AI-authored code. Engineers report saving 3.6 hours per week and shipping 60% more PRs, but production-impacting bugs that used to surface once or twice a year can become weekly events at higher volumes. Two-thirds of global organisations now face significant outage risk within the next year, and nearly half estimate that poor software quality costs them over $1 million annually.

The surplus is real. What organisations do with it determines everything.

The wrong thing, faster

Mozilla AI's essay "When Shipping Software Becomes Too Easy" captures the strategic implication: when the cost of building collapses, the cost of building the wrong thing increases dramatically. Frictionless execution means every product decision carries more weight, because the friction that once slowed bad decisions (long development cycles, resource constraints, prioritisation debates) has been removed.

Productboard's analysis reinforces this. When shipping becomes easy, velocity stops being a differentiator. The teams that succeed use AI to amplify judgement rather than bypass it. Most organisations are doing the latter.

The result is perfectly executed mediocrity. Features built on internal biases rather than validated needs. Backlogs full of things nobody asked for, all shipped with impressive velocity. AI does not introduce this failure mode. It removes the natural brake that used to limit its damage. When building was expensive, bad ideas died in prioritisation. When building is nearly free, bad ideas ship.

Taste is the new scarcity

Ravi Mehta, former CPO of Tinder and product leader at Facebook and TripAdvisor, frames the organisational shift in his Atlassian piece. Creating something has always required two things: taste (knowing what you want to make) and craft (the skill to make it). AI is rapidly democratising craft. When anyone can produce polished output, the competitive premium shifts to those with better judgement about what's missing, what feels right, what users will actually love, and when to stop adding.

This is a classic inversion, structurally identical to what happened when the printing press made copying cheap. Before Gutenberg, the scarce resource was reproduction: scribes, materials, time. After Gutenberg, reproduction became abundant and the scarce resource became curation and editorial judgement. The person who decided what was worth printing became more valuable than the person who could copy a manuscript.

We are living through the same transition in software. Execution was scarce; now it is abundant. Judgement felt abundant when execution constrained how much of it you needed. Now that the constraint is gone, judgement is revealed as the binding scarcity it always was. Most organisations are still optimising for the old scarcity, building muscle memory for volume when they need it for discernment.

The judgement pipeline

Harvard Business Review's February 2026 analysis by David Duncan highlights where this gets structurally dangerous. AI now handles the messy, repetitive work that once built judgement in junior employees. Senior people with deep experience get enormous productivity gains. Juniors often cannot tell whether AI-generated output is good or how to improve it.

The mechanism is straightforward. Judgement develops through exposure to consequences: making decisions, seeing them play out, adjusting. When AI handles the work that provided that exposure, junior employees lose the training ground. Organisations risk ending up with managers who have never done the underlying work and increasingly thin leadership pipelines. This is not a future problem. It is happening now, in every organisation that has handed AI tools to junior employees without redesigning how those employees learn.

Duncan argues that keeping humans in the loop is insufficient. Organisations must redesign work to build judgement deliberately: clarifying who makes decisions and why, exposing people to consequences rather than shielding them with AI assistance, and structuring graduated responsibility so that juniors develop the discernment that seniors built the hard way.

The broader pattern from Fortune and CIO reporting confirms this is not confined to engineering. Workers across functions are not getting their time back. Companies use AI-driven productivity gains to demand more output from the same people. What used to be an eight-hour workload becomes something larger. More work, not better work. MIT's research shows that AI adoption tends to hinder productivity in the short term, with measurable declines before organisations gradually shift toward more AI-compatible operations.

Investing the surplus

The companies that will extract real value from this transition share a common discipline: they invest the execution surplus in judgement infrastructure rather than output volume.

Better evaluation frameworks before committing to build. Deeper exploration of whether a problem is worth solving, not whether it can be solved. Deliberate development of taste and discernment at every level, from intern to executive. And the discipline to ship less code while delivering more actual value.

As someone who both writes code and decides what to build, I feel this shift daily. The building part got easier. The deciding part got harder. Nobody trained for it. Engineering culture still rewards output volume: story points completed, PRs merged, features shipped. We need to start rewarding judgement: what we chose not to build, what we killed before it shipped, what we validated before writing a single line of code.

Shipping is table stakes now. The organisations that recognise this earliest will compound their advantage, because good judgement about what to build is itself a learning loop. Better decisions produce better data about what works, which informs the next round of decisions. The organisations that keep optimising for throughput will discover, too late, that they were building the wrong things faster than anyone else.
