
Continuous Improvement Loops

Measurement without action is just surveillance. This is how you close the loop.

The point of measurement is not to know things - it is to improve things. Continuous improvement is the practice of building systematic feedback loops that convert data into action, action into learning, and learning into better performance. This section covers how to build those loops at team, domain, and organisational level.

The Problem With Most Improvement Efforts

Most engineering organisations have retrospectives. Most also have metrics dashboards. Very few have a genuine continuous improvement system. The retrospective produces a list of actions that nobody tracks. The dashboard produces a weekly email that nobody acts on. The two are never connected. The result is an organisation that measures a great deal and improves very little.

Continuous improvement is not a tool. It is a discipline - the systematic practice of noticing, understanding, experimenting, and learning as a regular part of how the organisation operates. It requires a feedback loop that is fast enough to be useful, structured enough to produce learning, and embedded enough to survive the pressure of delivery.

The PDCA Cycle and Its Engineering Application

The Plan-Do-Check-Act cycle, attributed to W. Edwards Deming, is the underlying structure of all improvement work. It is simple enough to be remembered and precise enough to be useful.

  • Plan - identify the problem, form a hypothesis about the cause, design an intervention
  • Do - run the intervention, usually small-scale and time-boxed
  • Check - measure the result against the hypothesis. Did it work? Did it work for the reason you expected?
  • Act - if it worked, standardise. If it did not, learn and repeat

The critical word is "Check". Most engineering teams skip it. They implement a change and assume it worked because nobody complained. Checking means going back to the data and asking whether the metric moved in the predicted direction by the predicted amount.

The PDSA variant replaces "Check" with "Study" - a deliberate acknowledgement that the goal is not just to verify but to understand. Understanding why something worked (or did not) is what produces transferable learning rather than local fixes.

Applying the Cycle in Practice

A cycle does not need to be complex to be valuable. A team that runs a one-week PDSA cycle on standup format, measures whether deployment frequency increased in the following sprint, and documents what they learned - that is a functioning continuous improvement system. It is also far more valuable than a quarterly retrospective with twenty action items nobody tracks.

Cycle length should match the change. Changes to a single ceremony might cycle in one sprint. Changes to the release pipeline might need six to eight weeks. Changes to team structure need quarters. The mistake is running every improvement on the same cadence regardless of what you are changing.

Retrospectives as an Improvement Tool

The retrospective is the most widely used continuous improvement mechanism in engineering. It is also one of the most frequently broken.

A retrospective that produces value has four properties:

It focuses on the system, not the people. The question is not "who failed?" but "what in our process produced this outcome?" This is not soft - it is analytically correct. Individual mistakes are usually the result of system conditions that made the mistake easy to make.

It generates specific, owned, time-boxed actions. "We should improve our PR review process" is not an action. "Alice will draft a PR review checklist by next Tuesday and we will trial it for two sprints" is an action.

It tracks previous actions before generating new ones. Starting a retrospective without reviewing what happened to last time's actions is how you build a culture of performative improvement. Teams quickly learn that actions do not matter because nothing is ever followed up.

It has a nominated facilitator who is not the team lead. When the team lead facilitates, people edit their contributions. Rotate facilitation. The team lead should be a participant, not the person holding the microphone.

Why Most Retrospectives Are Ineffective

The most common failure mode is the retrospective that identifies the same problems every sprint. The team says "communication between frontend and backend is poor." It said the same thing last month. And the month before. This is not a communication problem - it is an improvement system problem. The retrospective is identifying symptoms but the organisation is not giving the team the authority, time, or support to address root causes.

The second most common failure is the action item graveyard. Teams generate eight actions, complete two, carry forward three, and quietly drop three more. The signal this sends is that improvement is performative. The fix is not to generate fewer actions - it is to generate ones that are actually achievable within the team's capacity and authority.

The Improvement Backlog

Treating improvement work the same way you treat product work - with a visible, prioritised backlog - is one of the highest-leverage changes an engineering organisation can make.

An improvement backlog contains:

  • Identified problems with evidence (not just complaints)
  • Hypotheses about root causes
  • Proposed interventions with owners
  • Success criteria - how you will know if it worked
  • Status of in-progress experiments

Making the improvement backlog visible does three things. First, it prevents the same issue from being raised repeatedly without action - once it is in the backlog, the question becomes when it gets prioritised, not whether anyone noticed. Second, it creates accountability for follow-through. Third, it builds organisational memory about what has been tried and what has worked.
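The backlog fields listed above can be captured in a minimal entry that also preserves organisational memory once items close. This is a sketch under assumptions - the schema, status names, and example data are hypothetical, not a prescribed format.

```python
from dataclasses import dataclass

VALID_STATUSES = {"proposed", "prioritised", "in-trial", "standardised", "abandoned"}

@dataclass
class ImprovementItem:
    problem: str                 # identified problem, with evidence
    evidence: list[str]          # data, not just complaints
    root_cause_hypothesis: str
    intervention: str
    owner: str
    success_criteria: str        # how you will know if it worked
    status: str = "proposed"

    def advance(self, new_status: str) -> None:
        if new_status not in VALID_STATUSES:
            raise ValueError(f"Unknown status: {new_status}")
        self.status = new_status

def organisational_memory(backlog: list[ImprovementItem]) -> list[str]:
    """What has been tried and how it ended - closed items, kept visible."""
    return [f"{item.problem}: {item.status}" for item in backlog
            if item.status in {"standardised", "abandoned"}]

# Hypothetical entry
item = ImprovementItem(
    problem="PR reviews routinely take more than two days",
    evidence=["review latency data from the last quarter"],
    root_cause_hypothesis="No shared definition of a reviewable PR",
    intervention="Trial a PR review checklist for two sprints",
    owner="Alice",
    success_criteria="Median review latency under 24 hours",
)
item.advance("in-trial")
```

Closed items are deliberately never deleted - `organisational_memory` is what prevents the same intervention being re-proposed two quarters later.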

Building Improvement Loops at Scale

Individual team retrospectives are not sufficient for an engineering organisation of any size. You need improvement loops at multiple levels.

Team level - sprint retrospectives, focused on the team's own process and practices. Fast cadence, high autonomy, owned and actioned by the team.

Domain or tribe level - monthly or quarterly cross-team retrospectives. Focused on patterns across teams, shared infrastructure, inter-team dependencies. Requires a facilitator and explicit time allocation.

Organisational level - quarterly engineering leadership retrospectives, focused on structural issues, resourcing, tooling investments, and systemic patterns that appear across domains. Connected to budget and planning cycles.

The key discipline is that each level addresses issues at its own level. Team-level retros should not spend time on problems that require organisational authority to fix - those get escalated with evidence. Organisational retros should not spend time on team-specific issues - those get delegated.

The Role of Leadership in Genuine Improvement

Continuous improvement fails when leadership treats it as a team-level activity that does not require their involvement or their change. Most improvement problems in engineering organisations are structural - they cannot be fixed by the teams experiencing them.

If teams repeatedly identify "we do not have enough time to address technical debt" and leadership's response is to generate a new action item for the team, the improvement loop is broken. The loop requires that structural issues reach the people with authority to change the structure.

Leadership's role in continuous improvement is:

  • Creating the psychological safety for teams to name problems honestly
  • Actively reviewing cross-team improvement data, not just delivery metrics
  • Acting on issues that require structural authority, visibly and with feedback
  • Protecting improvement capacity from delivery pressure - if teams are given no time to improve, they never will

The measure of a genuine improvement culture is not whether retrospectives happen. It is whether the organisation is measurably different six months from now than it is today - and whether anyone can point to the specific changes that produced the difference.