Standard: Internal data is structured, governed, and accessible for safe and effective AI use
Purpose and Strategic Importance
AI systems derive value from data. Without reliable access to relevant internal data, AI initiatives remain superficial — relying on generic models that cannot reflect organisational context, knowledge, or operations. This standard ensures that internal data, knowledge artefacts, and operational signals are structured, catalogued, and made accessible through governed mechanisms that allow AI systems to retrieve, interpret, and reason over them safely and accurately.
Aligned to our "Data-Driven Decision-Making" policy, this standard establishes the conditions necessary for AI to become a genuine organisational capability rather than a collection of isolated experiments. Accessibility must be balanced with governance: poorly managed access risks data leakage, compliance violations, and incorrect AI outputs. At maturity, data becomes an organisational asset that continuously feeds intelligent systems, enhancing both operational efficiency and strategic decision-making.
Strategic Impact
- AI tooling can leverage internal knowledge, reducing reliance on generic external models
- Automation, insight generation, and decision support are grounded in accurate organisational context
- Data quality improvements compound across every downstream AI and analytics use case
- Engineering and product teams move faster when access to relevant data is frictionless and trusted
- Organisational knowledge is preserved, discoverable, and reusable rather than siloed in individuals or teams
Risks of Not Having This Standard
- AI pilots remain disconnected from real organisational knowledge, producing generic or incorrect outputs
- Competitive disadvantage as other organisations leverage internal data to train and tune AI systems effectively
- Shadow data solutions emerge as teams work around access friction, creating quality and compliance risks
- Inability to scale AI use cases beyond small, curated proof-of-concept datasets
- Inconsistent data formats and quality erode trust in AI outputs across the organisation
CMMI Maturity Model
Level 1 – Initial
| Category | Description |
| --- | --- |
| People & Culture | Teams are unaware of internal data assets and their potential for AI use. Data ownership is unclear and rarely discussed. Reliance on external or generic data sources is the norm. |
| Process & Governance | No formal processes exist for data cataloguing, access provisioning, or quality management. Data access is entirely ad hoc and manually negotiated. |
| Technology & Tools | Data is fragmented across departmental systems with no integration layer or mechanisms for AI consumption. Formats are inconsistent and undocumented. |
| Measurement & Metrics | No metrics track data accessibility, discoverability, or quality for AI purposes. The absence of internal data integration is not recognised as a strategic risk. |
Level 2 – Managed
| Category | Description |
| --- | --- |
| People & Culture | Selected teams have access to key datasets, but knowledge is siloed and inconsistently shared. Awareness of internal data value for AI begins to grow in pockets. |
| Process & Governance | Formal request processes exist for data access. Partial documentation of data sources is maintained, though coverage is uneven and slow to update. |
| Technology & Tools | Key datasets can be accessed but integration requires significant manual effort. Data quality varies widely between sources. Security concerns constrain availability, and AI pilots are restricted to small, curated datasets. |
| Measurement & Metrics | Incremental progress toward data-driven AI is visible within individual teams, but uneven capability and shadow data solutions remain endemic. |
Level 3 – Defined
| Category | Description |
| --- | --- |
| People & Culture | Data and AI teams collaborate routinely. Clear permissions and usage policies are understood and followed. Cross-functional responsibility for data accessibility is established. |
| Process & Governance | Internal data is organised, documented, and accessible through defined mechanisms that support AI integration. Catalogues of available datasets are maintained. Ongoing governance ensures data remains usable and compliant. |
| Technology & Tools | Standardised data formats and interfaces are adopted across teams. Integration pipelines for AI applications are in place. Data quality management practices are established and tracked consistently. |
| Measurement & Metrics | Data quality, coverage, and access latency are measured. Gaps in cataloguing and integration are identified and prioritised through regular governance reviews. |
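The dataset catalogue practice at this level can be illustrated with a minimal sketch. The schema and field names below are hypothetical, not a prescribed format; a real catalogue would typically live in a dedicated metadata platform rather than application code.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogueEntry:
    """A minimal, hypothetical catalogue record for an AI-accessible dataset."""
    name: str                      # unique dataset identifier
    owner: str                     # accountable team or data steward
    description: str               # what the data contains and how it is produced
    classification: str            # sensitivity label, e.g. "internal", "confidential"
    refresh_cadence: str           # how often the source is updated
    approved_ai_use: bool = False  # whether governed AI access has been signed off
    tags: list = field(default_factory=list)  # discoverability keywords

# Example entry for an illustrative dataset
entry = CatalogueEntry(
    name="support_tickets_v2",
    owner="customer-operations",
    description="Resolved support tickets with redacted customer fields.",
    classification="internal",
    refresh_cadence="daily",
    approved_ai_use=True,
    tags=["support", "rag-source"],
)
print(entry.name, entry.approved_ai_use)
```

Keeping ownership, classification, and AI-use approval on every entry is what lets access provisioning be governed rather than ad hoc.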
Level 4 – Quantitatively Managed
| Category | Description |
| --- | --- |
| People & Culture | Data is treated as a strategic asset. Cross-functional ownership spans data engineering, product, and AI teams. Data stewardship is a recognised and rewarded engineering discipline. |
| Process & Governance | Automated data ingestion and transformation is standard. Metadata and lineage tracking ensure traceability across all AI-accessible datasets. Privacy and compliance requirements are continuously aligned and audited. |
| Technology & Tools | Integration spans organisational systems with support for real-time or near-real-time data. Continuous quality monitoring is automated, with alerts triggered by data drift or freshness violations. |
| Measurement & Metrics | Significant productivity and insight gains from AI are demonstrable and attributed to improved data accessibility. Data management complexity is acknowledged and managed through platform investment. |
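The automated freshness monitoring described at this level can be sketched as a simple SLA check. The 24-hour threshold and dataset names are illustrative assumptions; a production system would source timestamps from pipeline metadata and route violations to an alerting channel.

```python
from datetime import datetime, timedelta, timezone

def freshness_violations(datasets, max_age_hours=24):
    """Return names of datasets whose last update exceeds the freshness SLA.

    `datasets` maps dataset name -> last-updated timestamp (UTC-aware).
    The 24-hour default is an illustrative SLA, not a prescribed value.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    return [name for name, updated in datasets.items() if updated < cutoff]

now = datetime.now(timezone.utc)
datasets = {
    "orders": now - timedelta(hours=2),      # within SLA
    "inventory": now - timedelta(hours=30),  # stale: should trigger an alert
}
print(freshness_violations(datasets))  # ['inventory']
```

The same pattern extends to schema and drift checks: each check emits violations, and the alerting layer decides who gets paged.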
Level 5 – Optimising
| Category | Description |
| --- | --- |
| People & Culture | Data is treated as a strategic asset that is continuously improved. Refinement of data accessibility, quality, and governance is embedded in engineering culture and rituals. |
| Process & Governance | AI systems have unified access to relevant internal knowledge with minimal manual intervention. Strong safeguards for privacy and security are built into access patterns by default. Compliance is continuous, not periodic. |
| Technology & Tools | Data flows seamlessly into AI systems, enabling real-time insights, automation, and adaptive decision-making across the organisation. Context-aware data retrieval for AI applications is standard. Ongoing investment ensures data integrity as the foundation for all AI capability. |
| Measurement & Metrics | Transformational efficiency gains, enhanced organisational learning, and strong competitive advantage from AI are demonstrable. The organisation continuously benchmarks and improves data platform maturity to sustain AI effectiveness. |
Key Measures
- % of internal datasets catalogued and discoverable through a governed mechanism
- Data freshness and accuracy scores across AI-integrated data sources
- Time to provision access to a dataset for a new AI use case (access lead time)
- % of AI applications relying on internal versus external data sources
- Volume and frequency of shadow data solutions identified and retired
- Data quality incident rate (missing fields, schema violations, stale records) across AI pipelines
- Developer and data consumer satisfaction with internal data accessibility (survey-based)
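Two of these measures can be illustrated with a minimal calculation sketch. The record shapes and sample values below are hypothetical; in practice the inputs would come from the data catalogue and the access-request system.

```python
def catalogued_pct(datasets):
    """% of datasets that are catalogued and discoverable (hypothetical schema)."""
    catalogued = sum(1 for d in datasets if d["catalogued"])
    return round(100 * catalogued / len(datasets), 1)

def median_access_lead_time(days):
    """Median days from access request to provisioned access."""
    s = sorted(days)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

# Illustrative sample data
datasets = [
    {"name": "orders", "catalogued": True},
    {"name": "tickets", "catalogued": True},
    {"name": "telemetry", "catalogued": False},
    {"name": "hr_records", "catalogued": True},
]
print(catalogued_pct(datasets))             # 75.0
print(median_access_lead_time([2, 14, 5]))  # 5
```

A median is used for lead time because a few slow, heavily reviewed requests would otherwise dominate an average.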