Standard: Operational readiness is tested before every major release

Purpose and Strategic Importance

This standard requires operational readiness to be tested before every major release, covering monitoring, alerting, rollback plans, and support handovers. It builds confidence that systems will perform reliably under real conditions.

Aligned to our "Resilience Over Uptime" policy, this standard reduces post-release surprises and enables safer, faster delivery. Without it, releases carry hidden risks that impact users, teams, and operational stability.

Strategic Impact

  • Improved consistency and quality across teams
  • Reduced operational friction and delivery risks
  • Stronger ownership and autonomy in technical decision-making
  • More inclusive and sustainable engineering culture

Risks of Not Having This Standard

  • Slower time-to-value and increased rework
  • Accumulation of inconsistency and process debt
  • Reduced trust in engineering data, systems, or ownership
  • Loss of agility in the face of change or failure

CMMI Maturity Model

Level 1 – Initial

  • People & Culture: Readiness is assumed, not tested. Teams operate with a reactive mindset.
  • Process & Governance: No formal readiness activities before releases. Reliance on hope and heroic recovery.
  • Technology & Tools: Minimal tooling support for pre-release validation. No structured rollout planning.
  • Measurement & Metrics: Readiness checks are not measured or tracked. Post-release incidents are common.

Level 2 – Managed

  • People & Culture: Some teams discuss readiness but lack shared definitions.
  • Process & Governance: Readiness checks (e.g., monitoring, rollback) are applied inconsistently.
  • Technology & Tools: Basic tools support some readiness tasks. Alerting and dashboards exist but are incomplete.
  • Measurement & Metrics: Teams may track readiness activities manually, but data is not analysed systematically.

Level 3 – Defined

  • People & Culture: Readiness is a standard expectation. Teams take pride in releasing stable, supportable features.
  • Process & Governance: Defined checklist covers monitoring, alerting, rollbacks, support handover, and system health.
  • Technology & Tools: Tooling supports structured readiness checks and documentation.
  • Measurement & Metrics: Readiness compliance is tracked pre-release and reviewed post-release.
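
The defined checklist at Level 3 could be modelled as a small data structure. The following is a minimal sketch, not a prescribed implementation: the five areas come from the description above, while the class name `ReadinessChecklist`, its methods, and the idea of attaching an evidence note to each area are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The five areas named in the Level 3 checklist description.
REQUIRED_AREAS = (
    "monitoring",
    "alerting",
    "rollback",
    "support_handover",
    "system_health",
)


@dataclass
class ReadinessChecklist:
    """Hypothetical per-release checklist; names are illustrative only."""

    release: str
    # Maps a completed area to a short evidence note (e.g. a link or comment).
    completed: dict[str, str] = field(default_factory=dict)

    def mark_done(self, area: str, evidence: str) -> None:
        # Reject areas that are not part of the defined checklist.
        if area not in REQUIRED_AREAS:
            raise ValueError(f"unknown readiness area: {area}")
        self.completed[area] = evidence

    def missing(self) -> list[str]:
        # Areas still outstanding, in checklist order.
        return [a for a in REQUIRED_AREAS if a not in self.completed]

    def is_complete(self) -> bool:
        return not self.missing()
```

Recording evidence alongside each area keeps the checklist usable for the post-release review mentioned above, since reviewers can see what was checked and why it was considered done.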

Level 4 – Quantitatively Managed

  • People & Culture: Teams analyse readiness failures and refine pre-release practices.
  • Process & Governance: Readiness is a release gate for major changes. Automated checks validate key conditions.
  • Technology & Tools: Dashboards display readiness status by team and environment.
  • Measurement & Metrics: Metrics show the correlation between readiness and incident trends. Coverage is tracked over time.
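
A release gate backed by automated checks, as described at Level 4, might look like the sketch below. It assumes each check is a plain function returning a pass/fail flag and a short detail string; `run_gate` and `CheckResult` are hypothetical names, and real checks would query your own monitoring or deployment tooling.

```python
from __future__ import annotations

from typing import Callable, NamedTuple


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def run_gate(
    checks: dict[str, Callable[[], tuple[bool, str]]],
) -> tuple[bool, list[CheckResult]]:
    """Run every check and report whether the release may proceed.

    Illustrative sketch: the gate opens only if at least one check is
    configured and all checks pass.
    """
    results: list[CheckResult] = []
    for name, check in checks.items():
        try:
            passed, detail = check()
        except Exception as exc:
            # A check that crashes is treated as a failed check.
            passed, detail = False, f"check raised {exc!r}"
        results.append(CheckResult(name, passed, detail))
    gate_open = bool(results) and all(r.passed for r in results)
    return gate_open, results


if __name__ == "__main__":
    # Placeholder checks; replace with real queries against your tooling.
    gate_open, results = run_gate({
        "alerts_configured": lambda: (True, "alert rules found"),
        "rollback_plan_documented": lambda: (False, "no rollback plan linked"),
    })
    for r in results:
        print(f"{'PASS' if r.passed else 'FAIL'} {r.name}: {r.detail}")
    print("release allowed" if gate_open else "release blocked")
```

Treating an empty check set and a crashing check as failures is a deliberate choice in this sketch: a gate that passes when nothing was verified would hide exactly the risk this standard is meant to surface.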

Level 5 – Optimising

  • People & Culture: Operational readiness is treated as a team learning tool. Scenarios are rehearsed to improve capability.
  • Process & Governance: Readiness is embedded in the Definition of Done. Practices evolve through post-incident and success reviews.
  • Technology & Tools: Simulations, chaos testing, and automated drills validate end-to-end readiness.
  • Measurement & Metrics: Readiness scores are benchmarked across teams and products. Failures trigger continuous improvement loops.

Key Measures

  • % of releases with readiness checks completed
  • Time to resolve gaps identified during readiness
  • % of incidents linked to missed readiness tasks
  • Audit trail completeness for operational handover
  • Maturity score of readiness practices across teams
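
Two of the measures above are simple ratios and could be computed from release and incident records. This is a rough sketch under assumed record shapes: the `Release` and `Incident` classes and their fields are hypothetical, and how an incident gets linked to a missed readiness task is left to your own review process.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Release:
    name: str
    readiness_completed: bool  # assumed flag set when all checks were done


@dataclass
class Incident:
    id: str
    linked_to_missed_readiness: bool  # assumed outcome of incident review


def pct(part: int, whole: int) -> float:
    # Percentage rounded to one decimal place; 0.0 when there is no data.
    return 0.0 if whole == 0 else round(100.0 * part / whole, 1)


def readiness_completion_rate(releases: list[Release]) -> float:
    """% of releases with readiness checks completed."""
    done = sum(1 for r in releases if r.readiness_completed)
    return pct(done, len(releases))


def missed_readiness_incident_rate(incidents: list[Incident]) -> float:
    """% of incidents linked to missed readiness tasks."""
    linked = sum(1 for i in incidents if i.linked_to_missed_readiness)
    return pct(linked, len(incidents))
```

Tracking both figures per period makes it possible to see whether higher readiness completion goes along with fewer readiness-related incidents.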

Associated Policies

  • Resilience Over Uptime

Associated Practices

  • Custom Metrics Instrumentation
  • Runbooks and Playbooks
  • Log Correlation for RCA
  • User Session Replay Tools
  • On-Call Rotation Health Checks
  • Health Checks & Readiness Probes
  • Container Security Scanning
  • Vulnerability Management Dashboards
  • Threat Modelling Workshops
  • Data Encryption-in-Transit & at-Rest
  • Threat Intelligence Feeds
  • Secure API Gateways
  • Shadow Testing in Production
  • Ensemble Testing
  • End-to-End (E2E) Testing
  • Load & Performance Testing
  • Operational KPIs for Dev Teams
  • Design for Failure
  • Observability-Driven Design
  • Sprint Demos for Stakeholders
  • Immutable Infrastructure
  • Secure Code Training
  • Event Sourcing
  • Dependency Management Policies
  • Compliance-as-Code
  • Deployment Freeze Windows
  • Feedback Loops from Ops to Dev
  • Incident Response Playbooks
