Chapter 04. Ensuring Data Quality and Ethical Governance
Every AI system begins with data. But trust doesn’t begin at inference; it starts much earlier, in how data is planned, acquired, prepared, and released. Each stage introduces risks, and each decision leaves either a governance trace or a gap.
In this chapter, we follow the data lifecycle defined in ISO/IEC 5259-1, the first part of the international standard series on data quality for analytics and machine learning. The lifecycle includes six core stages:
1. Data Requirements → 2. Data Planning → 3. Data Acquisition → 4. Data Preparation → 5. Data Provisioning → 6. Data Release
These stages are supported by continuous quality feedback loops, including:
- Data Quality Requirements
- Data Quality Assessment (profiling datasets, measuring quality)
- Data Quality Reports
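To make the feedback loop concrete, here is a minimal sketch of how requirements, assessment, and reporting might fit together. It is illustrative only: the function names (`profile_records`, `check_requirements`), the two measures (field completeness and exact-duplicate rate), the thresholds, and the sample records are all assumptions made for this example, not measures or values prescribed by ISO/IEC 5259-1.

```python
from collections import Counter


def profile_records(records, required_fields):
    """Assessment step: compute simple quality measures for a list of dict records.

    Hypothetical helper; assumes every value in a record is hashable.
    """
    total = len(records)
    if total == 0:
        return {"row_count": 0, "completeness": {}, "duplicate_rate": 0.0}

    # Completeness: share of records where the field is present and non-empty.
    completeness = {}
    for field in required_fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        completeness[field] = present / total

    # Duplicate rate: share of records that exactly repeat an earlier record.
    counts = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = sum(n - 1 for n in counts.values())

    return {
        "row_count": total,
        "completeness": completeness,
        "duplicate_rate": duplicates / total,
    }


def check_requirements(profile, min_completeness, max_duplicate_rate):
    """Reporting step: list every measure that misses its (assumed) requirement."""
    findings = []
    for field, ratio in profile["completeness"].items():
        if ratio < min_completeness:
            findings.append(
                f"{field}: completeness {ratio:.2f} is below {min_completeness:.2f}"
            )
    if profile["duplicate_rate"] > max_duplicate_rate:
        findings.append(
            f"duplicate rate {profile['duplicate_rate']:.2f} "
            f"exceeds {max_duplicate_rate:.2f}"
        )
    return findings


# Made-up sample data, for illustration only.
records = [
    {"id": 1, "age": 34, "consent": "yes"},
    {"id": 2, "age": None, "consent": "yes"},
    {"id": 3, "age": 51, "consent": ""},
    {"id": 1, "age": 34, "consent": "yes"},  # exact duplicate of the first record
]

profile = profile_records(records, required_fields=["id", "age", "consent"])
findings = check_requirements(profile, min_completeness=0.9, max_duplicate_rate=0.1)

for finding in findings:
    print(finding)
```

The thresholds passed to `check_requirements` play the role of the quality requirements, `profile_records` is the assessment, and the list of findings is a very small quality report. In practice each of these would be richer and tied to the dataset's intended use.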
🔍 Why We Focus on These Four Phases
While each stage is essential, this chapter focuses on the points where trust most often breaks down in practice, and where AI professionals can take concrete action:
- Acquisition (Section 4.1): Which data gets included in the dataset, and which gets ignored
- Preparation & Metadata (Section 4.2): How consent, lineage, and versioning are tracked
- Provisioning & Use (Section 4.3): When fairness audits catch bias before deployment
- Release & Lifecycle Decay (Section 4.4): How unmonitored data rots over time
Other stages, such as Data Requirements, Data Planning, and Quality Reporting, are equally important for institutional governance and are covered in the Advanced Level.
By the end of this chapter, you’ll understand where trust is built, where it breaks, and how to embed responsible governance into each stage of the AI data lifecycle.