Standard: Logging is embedded in design and operations

Purpose and Strategic Importance

This standard ensures that logging is a first-class concern throughout system design, development, and operations. By capturing consistent, structured, and actionable log data at every layer—from infrastructure and middleware to application code and user interactions—teams gain the visibility needed to detect, diagnose, and resolve issues rapidly.

Embedding logging practices up front reduces firefighting, improves system health, and supports compliance and auditability.

Strategic Impact

  • Faster detection and diagnosis through consistent log quality
  • Improved system reliability and proactive remediation
  • Enhanced auditability, compliance, and security
  • Better capacity planning and performance tuning using log insights

Risks of Not Having This Standard

  • Blind spots in production environments
  • Inconsistent log formats impede automation and monitoring
  • Slower incident response and root cause analysis
  • Compliance failures and audit gaps
  • High operational debt from fragmented logging practices

CMMI Maturity Model

Level 1 – Initial

  • People & Culture: Logging is informal and unstructured. Developers rely on print statements and local debugging.
  • Process & Governance: No agreed standards for what to log or how to log it.
  • Technology & Tools: Logs are siloed, not aggregated or searchable. Little or no centralised visibility.
  • Measurement & Metrics: No metrics on log coverage, latency, or quality.

Level 2 – Managed

  • People & Culture: Teams begin to recognise the value of structured logging.
  • Process & Governance: Logging expectations exist but are not uniformly enforced.
  • Technology & Tools: Logs are centralised via a shared platform (e.g. ELK, Loki). Some alerting is enabled.
  • Measurement & Metrics: Key service logs are visible, but coverage and format are inconsistent.

Level 3 – Defined

  • People & Culture: Developers follow agreed logging conventions and schemas.
  • Process & Governance: Logging is embedded in design and reviewed during PRs or architecture gates.
  • Technology & Tools: Logs are emitted in structured formats (e.g. JSON). Standard fields (correlation IDs, timestamps) are mandated.
  • Measurement & Metrics: Log latency and completeness are measured. Incident postmortems include log review.
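
The Level 3 expectations for structured output and mandated fields can be sketched as follows. This is a minimal illustration using only the Python standard library, not a prescribed implementation: the field names, the `correlation_id` context variable, and the `build_logger` helper are assumptions made for the example, and a real schema would be agreed by the teams involved.

```python
import json
import logging
import uuid
from contextvars import ContextVar
from datetime import datetime, timezone

# Hypothetical context variable carrying the correlation ID of the current request.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object with the mandated fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # UTC ISO 8601 timestamp taken from the record's creation time.
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


if __name__ == "__main__":
    # Set once per request, e.g. in middleware; every later log line carries it.
    correlation_id.set(str(uuid.uuid4()))
    log = build_logger("orders")
    log.info("order %s accepted", 42)
```

Holding the correlation ID in a context variable rather than passing it to every call keeps application code unchanged while still letting an aggregation platform join log lines belonging to the same request.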

Level 4 – Quantitatively Managed

  • People & Culture: Teams use logging insights to improve systems and guide decisions.
  • Process & Governance: Log health is part of operational reviews. Playbooks reference key logging signals.
  • Technology & Tools: Logs drive automated alerts, anomaly detection, and dashboarding.
  • Measurement & Metrics: Alert precision, MTTD, and structured log coverage are tracked across teams.

Level 5 – Optimising

  • People & Culture: Logging is treated as a product: teams iterate on its usefulness and design.
  • Process & Governance: Logging practices are continuously refined via feedback loops and operational data.
  • Technology & Tools: Advanced techniques like log sampling, tracing, and ML-based alerting are adopted.
  • Measurement & Metrics: Insights from logs drive architecture decisions and pre-emptive fixes.
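
Log sampling, one of the Level 5 techniques, might look like the sketch below. It assumes a simple probabilistic policy (keep every record at WARNING or above, keep a fixed fraction of lower-severity records); the `SamplingFilter` class and its `rate` parameter are hypothetical names for this example, and other policies such as per-request or rate-limited sampling are equally valid.

```python
import logging
import random
from typing import Optional


class SamplingFilter(logging.Filter):
    """Keep all records at WARNING or above; keep lower levels with probability `rate`."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None) -> None:
        super().__init__()
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate
        # An injectable random source makes the filter reproducible in tests.
        self.rng = rng or random.Random()

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings or errors
        return self.rng.random() < self.rate


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger("checkout")
    logger.addFilter(SamplingFilter(rate=0.1))  # keep roughly 10% of INFO/DEBUG
    for i in range(50):
        logger.info("cart updated %d", i)
    logger.error("payment failed")  # always kept
```

Sampling reduces volume and cost, but it also affects measures such as Audit Completeness, so compliance-relevant events would normally be excluded from any sampling policy.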

Key Measures

  • Structured Coverage: % of systems emitting logs using the agreed schema
  • MTTD (Mean Time to Detect): Average time from fault to detection based on logs
  • Log Latency: Time between an event occurring and it being available for search
  • Audit Completeness: % of compliance-relevant events captured in logs
  • Alert Precision: % of alerts from logs that are valid and require action
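
The measures above reduce to simple ratios and time differences. The sketch below shows one way they could be computed; the `Incident` record, the function names, and the choice to return 0.0 for an empty denominator are assumptions for illustration, and the input counts would come from whatever logging platform and incident tracker are in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


@dataclass
class Incident:
    fault_at: datetime     # when the fault began
    detected_at: datetime  # when it was detected from logs


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 when there is nothing to measure."""
    return 100.0 * part / whole if whole else 0.0


def structured_coverage(systems_on_schema: int, total_systems: int) -> float:
    """% of systems emitting logs using the agreed schema."""
    return percentage(systems_on_schema, total_systems)


def audit_completeness(captured_events: int, required_events: int) -> float:
    """% of compliance-relevant events captured in logs."""
    return percentage(captured_events, required_events)


def alert_precision(actionable_alerts: int, total_alerts: int) -> float:
    """% of log-based alerts that were valid and required action."""
    return percentage(actionable_alerts, total_alerts)


def mean_time_to_detect(incidents: Sequence[Incident]) -> timedelta:
    """Average time from fault to detection; raises StatisticsError if empty."""
    seconds = mean((i.detected_at - i.fault_at).total_seconds() for i in incidents)
    return timedelta(seconds=seconds)


def log_latency(event_at: datetime, searchable_at: datetime) -> timedelta:
    """Time between an event occurring and it being available for search."""
    return searchable_at - event_at


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0, 0)
    incidents = [
        Incident(t0, t0 + timedelta(minutes=4)),
        Incident(t0, t0 + timedelta(minutes=6)),
    ]
    print("MTTD:", mean_time_to_detect(incidents))
    print("Structured coverage:", structured_coverage(45, 60), "%")
    print("Alert precision:", alert_precision(18, 24), "%")
```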

Associated Practices

  • Incident Response Playbooks
