Standard: Internal data is structured, governed, and accessible for safe and effective AI use
Purpose and Strategic Importance
AI systems derive value from data. Without reliable access to relevant internal data, AI initiatives remain superficial — relying on generic models that cannot reflect organisational context, knowledge, or operations. This standard ensures that internal data, knowledge artefacts, and operational signals are structured, catalogued, and made accessible through governed mechanisms that allow AI systems to retrieve, interpret, and reason over them safely and accurately.
Aligned to our "Data-Driven Decision-Making" policy, this standard establishes the conditions necessary for AI to become a genuine organisational capability rather than a collection of isolated experiments. Accessibility must be balanced with governance: poorly managed access risks data leakage, compliance violations, and incorrect AI outputs. At maturity, data becomes an organisational asset that continuously feeds intelligent systems, enhancing both operational efficiency and strategic decision-making.
Strategic Impact
- AI tooling can leverage internal knowledge, reducing reliance on generic external models
- Automation, insight generation, and decision support are grounded in accurate organisational context
- Data quality improvements compound across every downstream AI and analytics use case
- Engineering and product teams move faster when access to relevant data is frictionless and trusted
- Organisational knowledge is preserved, discoverable, and reusable rather than siloed in individuals or teams
Risks of Not Having This Standard
- AI pilots remain disconnected from real organisational knowledge, producing generic or incorrect outputs
- Competitive disadvantage as other organisations leverage internal data to train and tune AI systems effectively
- Shadow data solutions emerge as teams work around access friction, creating quality and compliance risks
- Inability to scale AI use cases beyond small, curated proof-of-concept datasets
- Inconsistent data formats and quality erode trust in AI outputs across the organisation
CMMI Maturity Model
Level 1 – Initial
| Category | Description |
| --- | --- |
| People & Culture | Teams are unaware of internal data assets and their potential for AI use. Data ownership is unclear and rarely discussed. Reliance on external or generic data sources is the norm. |
| Process & Governance | No formal processes exist for data cataloguing, access provisioning, or quality management. Data access is entirely ad hoc and manually negotiated. |
| Technology & Tools | Data is fragmented across departmental systems with no integration layer or mechanisms for AI consumption. Formats are inconsistent and undocumented. |
| Measurement & Metrics | No metrics track data accessibility, discoverability, or quality for AI purposes. The absence of internal data integration is not recognised as a strategic risk. |
Level 2 – Managed
| Category | Description |
| --- | --- |
| People & Culture | Selected teams have access to key datasets, but knowledge is siloed and inconsistently shared. Awareness of internal data value for AI begins to grow in pockets. |
| Process & Governance | Formal request processes exist for data access. Partial documentation of data sources is maintained, though coverage is uneven and slow to update. |
| Technology & Tools | Key datasets can be accessed but integration requires significant manual effort. Data quality varies widely between sources. Security concerns constrain availability, and AI pilots are restricted to small, curated datasets. |
| Measurement & Metrics | Incremental progress toward data-driven AI is visible within individual teams, but uneven capability and shadow data solutions remain endemic. |
Level 3 – Defined
| Category | Description |
| --- | --- |
| People & Culture | Data and AI teams collaborate routinely. Clear permissions and usage policies are understood and followed. Cross-functional responsibility for data accessibility is established. |
| Process & Governance | Internal data is organised, documented, and accessible through defined mechanisms that support AI integration. Catalogues of available datasets are maintained. Ongoing governance ensures data remains usable and compliant. |
| Technology & Tools | Standardised data formats and interfaces are adopted across teams. Integration pipelines for AI applications are in place. Data quality management practices are established and tracked consistently. |
| Measurement & Metrics | Data quality, coverage, and access latency are measured. Gaps in cataloguing and integration are identified and prioritised through regular governance reviews. |
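The dataset catalogue practice at this level can be illustrated with a minimal sketch. The schema and field names below are hypothetical, not a prescribed format; a real catalogue would typically live in a dedicated metadata platform rather than application code.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogueEntry:
    """A minimal, hypothetical catalogue record for an AI-accessible dataset."""
    name: str                      # unique dataset identifier
    owner: str                     # accountable team or data steward
    description: str               # what the data contains and how it is produced
    classification: str            # sensitivity label, e.g. "internal", "confidential"
    refresh_cadence: str           # how often the source is updated
    approved_ai_use: bool = False  # whether governed AI access has been signed off
    tags: list = field(default_factory=list)  # discoverability keywords

# Example entry for an illustrative dataset
entry = CatalogueEntry(
    name="support_tickets_v2",
    owner="customer-operations",
    description="Resolved support tickets with redacted customer fields.",
    classification="internal",
    refresh_cadence="daily",
    approved_ai_use=True,
    tags=["support", "rag-source"],
)
print(entry.name, entry.approved_ai_use)
```

Keeping ownership, classification, and AI-use approval on every entry is what lets access provisioning be governed rather than ad hoc.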
Level 4 – Quantitatively Managed
| Category | Description |
| --- | --- |
| People & Culture | Data is treated as a strategic asset. Cross-functional ownership spans data engineering, product, and AI teams. Data stewardship is a recognised and rewarded engineering discipline. |
| Process & Governance | Automated data ingestion and transformation is standard. Metadata and lineage tracking ensure traceability across all AI-accessible datasets. Privacy and compliance requirements are continuously aligned and audited. |
| Technology & Tools | Integration spans organisational systems with support for real-time or near-real-time data. Continuous quality monitoring is automated, with alerts triggered by data drift or freshness violations. |
| Measurement & Metrics | Significant productivity and insight gains from AI are demonstrable and attributed to improved data accessibility. Data management complexity is acknowledged and managed through platform investment. |
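The automated freshness monitoring described at this level can be sketched as a simple SLA check. The 24-hour threshold and dataset names are illustrative assumptions; a production system would source timestamps from pipeline metadata and route violations to an alerting channel.

```python
from datetime import datetime, timedelta, timezone

def freshness_violations(datasets, max_age_hours=24):
    """Return names of datasets whose last update exceeds the freshness SLA.

    `datasets` maps dataset name -> last-updated timestamp (UTC-aware).
    The 24-hour default is an illustrative SLA, not a prescribed value.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    return [name for name, updated in datasets.items() if updated < cutoff]

now = datetime.now(timezone.utc)
datasets = {
    "orders": now - timedelta(hours=2),      # within SLA
    "inventory": now - timedelta(hours=30),  # stale: should trigger an alert
}
print(freshness_violations(datasets))  # ['inventory']
```

The same pattern extends to schema and drift checks: each check emits violations, and the alerting layer decides who gets paged.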
Level 5 – Optimising
| Category | Description |
| --- | --- |
| People & Culture | Data is treated as a strategic asset that is continuously improved. Refinement of data accessibility, quality, and governance is embedded in engineering culture and rituals. |
| Process & Governance | AI systems have unified access to relevant internal knowledge with minimal manual intervention. Strong safeguards for privacy and security are built into access patterns by default. Compliance is continuous, not periodic. |
| Technology & Tools | Data flows seamlessly into AI systems, enabling real-time insights, automation, and adaptive decision-making across the organisation. Context-aware data retrieval for AI applications is standard. Ongoing investment ensures data integrity as the foundation for all AI capability. |
| Measurement & Metrics | Transformational efficiency gains, enhanced organisational learning, and strong competitive advantage from AI are demonstrable. The organisation continuously benchmarks and improves data platform maturity to sustain AI effectiveness. |
Key Measures
- % of internal datasets catalogued and discoverable through a governed mechanism
- Data freshness and accuracy scores across AI-integrated data sources
- Time to provision access to a dataset for a new AI use case (access lead time)
- % of AI applications relying on internal versus external data sources
- Volume and frequency of shadow data solutions identified and retired
- Data quality incident rate (missing fields, schema violations, stale records) across AI pipelines
- Developer and data consumer satisfaction with internal data accessibility (survey-based)
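Two of these measures can be illustrated with a minimal calculation sketch. The record shapes and sample values below are hypothetical; in practice the inputs would come from the data catalogue and the access-request system.

```python
def catalogued_pct(datasets):
    """% of datasets that are catalogued and discoverable (hypothetical schema)."""
    catalogued = sum(1 for d in datasets if d["catalogued"])
    return round(100 * catalogued / len(datasets), 1)

def median_access_lead_time(days):
    """Median days from access request to provisioned access."""
    s = sorted(days)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

# Illustrative sample data
datasets = [
    {"name": "orders", "catalogued": True},
    {"name": "tickets", "catalogued": True},
    {"name": "telemetry", "catalogued": False},
    {"name": "hr_records", "catalogued": True},
]
print(catalogued_pct(datasets))             # 75.0
print(median_access_lead_time([2, 14, 5]))  # 5
```

A median is used for lead time because a few slow, heavily reviewed requests would otherwise dominate an average.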