- May 12, 2026
- Reading time: 8min
On 2 August 2026, Article 10 of the EU AI Act comes into full force. For teams building or deploying high-risk AI systems, this is not a distant regulatory milestone but a binding legal obligation, with significant financial penalties for non-compliance under Article 99 of the Act.
What makes Article 10 particularly significant for AI development teams is the specificity of its language. Unlike many regulations that deal in vague principles, Article 10 names the exact data preparation processes it governs: annotation, labelling, cleaning, updating, enrichment and aggregation. In other words, it covers the everyday work of building AI training datasets, and from August 2026 that work is regulated.
This article is written for the teams doing that work: data leads, MLOps engineers, heads of AI, and compliance officers at companies whose AI systems fall into the high-risk category. We will walk through what Article 10 requires, what compliant data preparation actually looks like in practice, and what your team needs to have in place before the deadline.
This article reflects our interpretation of the EU AI Act as a data annotation provider. It does not constitute legal advice. We recommend consulting qualified legal counsel for your specific compliance obligations.
Note: On 7 May 2026, the European Parliament and Council reached a provisional political agreement on the Digital Omnibus on AI, which would defer Annex III high-risk obligations to 2 December 2027. This agreement is not yet formally adopted into law – until that formal adoption, 2 August 2026 remains the legally binding deadline. We recommend continuing compliance preparations under the original deadline, treating the expected extension as a possible benefit rather than a planning assumption.
What is high-risk AI under the EU AI Act?
The EU AI Act classifies AI systems by the risk they pose to individuals’ rights, safety and access to services. Most AI applications remain lightly regulated or unregulated. High-risk systems, however, face a comprehensive set of obligations, and Article 10’s training data requirements apply specifically to this category.
Under Annex III of the Act, a system is classified as high-risk if it operates in one of the following domains:
- Biometric identification and categorisation – facial recognition, emotion recognition, remote biometric ID systems
- Critical infrastructure – AI managing energy grids, water systems, transport networks
- Education and vocational training – systems that determine admission, assess students, or influence professional certification
- Employment and worker management – CV screening, candidate ranking, performance monitoring, hiring and promotion decisions
- Essential private and public services – credit scoring, insurance risk assessment, social benefits eligibility
- Law enforcement – crime analytics, predictive policing, risk assessment tools used by authorities
- Migration and border management – asylum and visa processing, risk assessment at borders
- Administration of justice – AI supporting legal or judicial decisions
The Act also applies to AI systems that profile individuals, meaning automated processing of personal data to assess aspects of a person’s life such as behaviour, location, preferences or economic situation.
One of the most consequential aspects of the regulation is its geographic scope. Any organisation, wherever it is based, must comply if its AI systems are used within the EU or if their outputs affect EU residents.
A quick self-check: if your AI system processes personal data to make or inform decisions affecting people in the EU, assume you are in scope.
What does Article 10 actually require?
Article 10 applies to training, validation and testing datasets, meaning the entire model development lifecycle, not just deployment. The following requirements apply.
Relevance and representativeness – Datasets must reflect the geographic, contextual, behavioural and demographic conditions of real-world deployment. A medical AI trained on data from one hospital type in one country will struggle to satisfy this if deployed across diverse patient populations.
Quality – Datasets must be as complete and error-free as reasonably achievable, with documented quality management. Random sampling and informal checks are no longer sufficient.
Bias examination and mitigation – Providers must actively examine datasets for biases that could affect health, safety or fundamental rights, and document both the examination and any mitigation measures taken. Article 10(5) also permits processing of sensitive data categories specifically to detect and correct bias, under strict conditions.
Documentation of all data preparation processes – This is the requirement most teams underestimate. Article 10(2)(c) explicitly names the operations that must be documented, which brings us to the next section.
The six data preparation processes named in Article 10
Article 10(2)(c) explicitly names six data preparation operations (annotation, labelling, cleaning, updating, enrichment and aggregation) as examples of what must be documented.
Annotation – attaching labels, attributes or relationships to raw data points. In medical AI, this might mean a clinician marking lesion boundaries in an MRI. In automotive AI, it means bounding boxes around pedestrians in video frames.
Annotation must be documented with methodology records, annotator qualifications, and inter-annotator agreement (IAA) scores, the quantitative measure of label consistency across your team.
Labelling – assigning categorical classification labels at scale. Must be reproducible: different labellers given the same data point and guidelines should consistently arrive at the same label. Document your taxonomy, consistency evidence, and any corrections applied.
Cleaning – removal of errors, duplicates and corrupted entries. Requires a documented pipeline: what was removed, by what logic, in what sequence, authorised by whom. Version control is essential.
Updating – retraining on new or revised data must be documented each time: what changed, why, and how the update was validated. This connects directly to preventing model collapse, because dataset quality issues in retraining cycles are one of its primary causes.
Enrichment – adding supplementary data to improve coverage or representativeness. Requires provenance documentation: where did the data come from, under what consent conditions, and does it introduce new biases while addressing existing ones?
Aggregation – combining multiple data sources into a single training corpus. Document the provenance, quality profile and bias implications of each source. This is where compliance failures are most likely to emerge undetected, particularly with synthetic data in the mix.
As our 2026 data annotation trends research shows, these documentation requirements are already reshaping annotation workflows across industries, and teams that have started early are significantly better positioned.
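To make the cleaning requirement above more concrete (what was removed, by what logic, in what sequence, authorised by whom), here is a minimal sketch of a cleaning step that writes its own record. Everything in it is an illustrative assumption rather than a format the Act prescribes: the `CleaningStep` fields, the `drop_exact_duplicates` helper, the record shape with `id` and `text` keys, and the example sign-off address are all hypothetical.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import hashlib
import json


@dataclass
class CleaningStep:
    """One documented cleaning operation (field names are illustrative assumptions)."""
    sequence: int        # position of the step in the pipeline
    rule: str            # the logic applied, in plain language
    removed_ids: list    # which records were removed
    authorised_by: str   # who signed off on the rule
    timestamp: str       # when the step ran (UTC, ISO 8601)


def drop_exact_duplicates(records, sequence, authorised_by):
    """Remove records whose text repeats an earlier record; return kept records and a log entry."""
    seen = set()
    kept, removed_ids = [], []
    for rec in records:
        digest = hashlib.sha256(rec["text"].encode("utf-8")).hexdigest()
        if digest in seen:
            removed_ids.append(rec["id"])
        else:
            seen.add(digest)
            kept.append(rec)
    step = CleaningStep(
        sequence=sequence,
        rule="drop records whose text exactly matches an earlier record",
        removed_ids=removed_ids,
        authorised_by=authorised_by,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    return kept, step


# Hypothetical sample records, for illustration only.
records = [
    {"id": "r1", "text": "pedestrian crossing"},
    {"id": "r2", "text": "cyclist in lane"},
    {"id": "r3", "text": "pedestrian crossing"},
]
kept, step = drop_exact_duplicates(records, sequence=1, authorised_by="data-lead@example.com")
print(json.dumps(asdict(step), indent=2))
```

Storing each step's record alongside a versioned copy of the dataset is one way to keep the removal logic and its sign-off reviewable later. Whether this level of detail is sufficient for your system is a question for your legal counsel.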
Why annotator diversity is a compliance question, not just an ethical one
Article 10’s representativeness requirement has an operational implication that most compliance checklists skip over: the composition of your annotation workforce is itself a risk variable.
Homogeneous annotation teams produce systematically skewed data, not through carelessness, but because any single perspective has structural limits.
A hiring algorithm labelled by annotators from a narrow demographic will learn to replicate the biases of that demographic. A medical imaging model annotated by clinicians from one type of healthcare environment may underperform across others. These are documented failures in deployed systems, and they are precisely what Article 10’s bias and representativeness requirements are designed to prevent.
Humans in the Loop was built from the start around a different approach. As a social enterprise founded in 2017, our workforce draws from conflict-affected and displaced communities across Bulgaria, Turkey, Syria, Iraq, Afghanistan, Lebanon, Yemen, Portugal, Ukraine, Moldova, Kenya and the DRC – with over 50% women annotators. This is not a diversity initiative sitting alongside our work. It is structural to how we operate, grounded in our ethical AI approach and fair work policy.
The annotation quality case for this is substantive: annotators who bring different cultural contexts, different relationships to institutions and different lived experiences of the situations AI systems are being trained to navigate catch what homogeneous teams miss. That is exactly the kind of representativeness Article 10 is asking you to demonstrate.
We explore this in more depth in our piece on how human annotation drives responsible AI, and it is the reason we believe ethical data annotation and regulatory compliance are not separate conversations.
Humans in the Loop provides human (not AI-generated) data annotation by an ethically sourced, geographically diverse workforce. If you’re thinking about how your training data holds up against Article 10’s representativeness and bias requirements, start a free pilot or book a call with our team.
What compliant annotation looks like in practice
A compliant annotation workflow for a high-risk AI system has four non-negotiable elements:
Documented task specification. Before annotation begins, guidelines must define what is being labelled, how edge cases are handled, and what quality standards apply. This document is part of your compliance record.
Qualified annotators. Medical image annotation requires clinical knowledge. Automotive scene annotation requires understanding of road conditions across contexts. Annotator qualifications must be documented, because who labelled your data is part of what regulators will ask about.
Multiple annotators and IAA measurement. A single annotator per data point bakes that individual’s blind spots into the dataset. Multiple independent annotators, compared using Cohen’s Kappa or Fleiss’ Kappa, produce both better labels and the consistency evidence Article 10 compliance requires.
Audit logs. Every annotation decision, correction and quality check should be time-stamped and traceable. The audit log is what allows you, or a regulator, to reconstruct the history of any data point.
None of this is bureaucratic overhead invented by regulation. It is what rigorous annotation practice looks like. The EU AI Act has formalised standards that responsible annotation teams have been working toward for years.
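For the IAA measurement mentioned above, Cohen's Kappa compares the agreement two annotators actually reached with the agreement they would be expected to reach by chance. The following is a minimal pure-Python sketch for two annotators and categorical labels; the function name `cohens_kappa` and the sample labels are our own illustrative assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators on the same six items.
annotator_1 = ["cat", "cat", "dog", "dog", "cat", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog", "cat", "cat"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

In this example the annotators agree on four of six items, yet kappa is only about 0.33, because half of that agreement would be expected by chance. In a production pipeline, a tested library implementation such as scikit-learn's `cohen_kappa_score` is likely the safer choice, and Fleiss' Kappa is the usual extension when more than two annotators label each item.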
Common mistakes AI teams make with training data
Relying on automated labelling without human oversight. Automated tools label patterns; they cannot catch the nuanced errors, culturally specific misclassifications, or edge cases that a trained human annotator would flag. Speed and scale do not substitute for judgement.
Treating bias as a model problem. Article 10 requires bias examination at the data level. If you are only testing model outputs after training, your documentation will not satisfy the regulation, and the underlying problem remains in the dataset.
No IAA measurement. Single-annotator workflows cannot produce the consistency evidence Article 10 requires. If you cannot quantify label consistency, you cannot demonstrate data quality.
No external documentation trail. Internal annotation can be high quality, but without formal records of methodology, annotator profiles and IAA scores, you may find yourself unable to produce what a conformity assessment body expects.
Starting compliance work in July 2026. Building a complete Article 10 audit trail retroactively is a significant undertaking. The teams addressing this now are not just less exposed; they are also producing better training data in the process.
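One way to start building that documentation trail early is the time-stamped, traceable audit log described earlier. The sketch below is a simplified illustration under our own assumptions, not a prescribed format: the helper names `append_event`, `verify_chain` and `history`, the field names, and the sample IDs are all hypothetical. Each entry stores a hash of the previous one, so later edits to the log become detectable.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first entry (an assumption of this sketch)


def _digest(body):
    """Stable SHA-256 over an entry's content."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log, item_id, actor, action, detail):
    """Append a time-stamped event, chained to the previous entry by hash."""
    body = {
        "item_id": item_id,
        "actor": actor,
        "action": action,  # e.g. "annotate", "correct", "qa_check"
        "detail": detail,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    entry = dict(body, entry_hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True if every entry's hash and link to its predecessor are intact."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


def history(log, item_id):
    """Reconstruct the ordered history of a single data point."""
    return [e for e in log if e["item_id"] == item_id]


# Hypothetical events, for illustration only.
log = []
append_event(log, "img_001", "annotator_17", "annotate", {"label": "pedestrian"})
append_event(log, "img_001", "reviewer_03", "correct", {"label": "cyclist"})
append_event(log, "img_002", "annotator_17", "annotate", {"label": "vehicle"})
print([e["action"] for e in history(log, "img_001")])
```

A real deployment would also need durable, access-controlled storage and a way to link actors to documented annotator qualifications. The hash chain only shows that the recorded sequence has not been altered after the fact.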
How to prepare before August 2026: a practical checklist
As of the date of this article (12 May 2026), a provisional political agreement on the Digital Omnibus on AI was reached on 7 May 2026, which would defer Annex III obligations to 2 December 2027. However, this agreement is not yet formally adopted – it must still be endorsed by both co-legislators and published in the Official Journal, which is expected to happen before 2 August 2026. Until that formal adoption, 2 August 2026 remains the legally binding deadline. Do not wind down compliance preparations on the assumption the extension is already in effect. The teams acting now are better positioned under any scenario.
- Map your AI systems against Annex III – identify which are high-risk and therefore in scope.
- Audit training datasets for representativeness across geographic, demographic and contextual dimensions.
- Assess your annotation team’s diversity – this is a compliance input, not just an HR consideration.
- Document all six Article 10 data preparation processes: methodology, personnel, quality standards.
- Implement IAA measurement as a standard pipeline step and retain the scores.
- Conduct and document a bias examination – even a clean result needs to be on record.
- Establish version control and provenance documentation for all dataset sources.
- Have legal counsel review your Article 10 documentation before 2 August 2026.
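As a starting point for the representativeness audit in the checklist above, the sketch below compares each group's share of a dataset with the share you expect in deployment. The function name `representation_gaps`, the 5-percentage-point tolerance, the `region` attribute and the target shares are all illustrative assumptions. Article 10 does not set numeric thresholds, and real target shares would have to come from your own analysis of the deployment context.

```python
from collections import Counter


def representation_gaps(samples, attribute, target_shares, tolerance=0.05):
    """Flag groups whose dataset share differs from the target share by more than `tolerance`."""
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(s[attribute] for s in samples)
    total = sum(counts.values())
    gaps = {}
    for group, target in target_shares.items():
        actual = counts.get(group, 0) / total
        if abs(actual - target) > tolerance:
            gaps[group] = {"dataset_share": round(actual, 3), "target_share": target}
    return gaps


# Hypothetical dataset: 100 samples tagged with a deployment-relevant attribute.
samples = (
    [{"region": "north"}] * 70
    + [{"region": "south"}] * 28
    + [{"region": "east"}] * 2
)
# Hypothetical expected shares in real-world deployment.
target = {"north": 0.40, "south": 0.30, "east": 0.30}
gaps = representation_gaps(samples, "region", target)
for group, info in gaps.items():
    print(group, info)
```

Here `north` is over-represented and `east` is heavily under-represented, while `south` falls within tolerance. A check like this covers one attribute at a time, so intersections of attributes (for example, region combined with age group) would need separate treatment, and the result should be recorded even when no gaps are found.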
The bottom line
The EU AI Act is not asking for the impossible. It is asking for rigour, documentation and accountability in processes that should already be part of responsible AI development. For teams building training datasets informally, the path to compliance requires real investment. For teams already operating to high standards, it is largely about formalising what they do.
At Humans in the Loop, we have been making the case since 2017 that ethical data annotation, done by humans, drawn from diverse communities, with accountability built in, produces better AI. Article 10 has now made that case in law.
Book a call with our team to talk through your training data pipeline or start a free pilot to see our annotation workflow in action.
