Amazon built a machine learning system to automate CV screening, trained on a decade of submitted CVs and hiring outcomes. The dataset reflected a male-dominated industry: most applications and hires during that period came from men, and the model generalised that pattern into a bias against female applicants. In 2015 Amazon discovered that the tool was penalising CVs containing the word 'women's' (as in 'women's chess club') and down-ranking graduates of all-women's colleges. The tool was retrained several times, but new forms of bias kept emerging, and Amazon ultimately scrapped it.
How the Production Safety Framework maps to this failure
A textbook D3 failure: the training data encoded historical discrimination, and no pre-deployment bias audit was performed. The D6 gap compounded the harm: automated ranking without human oversight meant the bias propagated at scale before it was detected internally. This case is foundational to understanding that data provenance (D3) is not just about privacy; it also covers the fairness and representativeness of the data used to train models that affect people.
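To make the missing D3 control concrete, the sketch below shows one minimal form a pre-deployment bias audit can take: compare selection rates across a protected attribute and flag any group whose rate falls below the EEOC four-fifths heuristic relative to the best-performing group. The column names ('gender', 'shortlisted'), the sample data, and the 0.8 threshold are illustrative assumptions for this example; they are not prescribed by the PSF and are not drawn from Amazon's actual pipeline.

```python
# Illustrative only: a minimal pre-deployment bias audit sketch.
# Column names ('gender', 'shortlisted') and the 0.8 threshold are assumptions
# for this example, not prescribed by the PSF or taken from Amazon's system.
import pandas as pd


def selection_rate_ratios(df: pd.DataFrame, group_col: str, outcome_col: str) -> dict:
    """Selection rate per group, divided by the highest group's rate."""
    rates = df.groupby(group_col)[outcome_col].mean()
    reference = rates.max()
    return {group: rate / reference for group, rate in rates.items()}


def audit(df: pd.DataFrame, group_col: str = "gender",
          outcome_col: str = "shortlisted", threshold: float = 0.8) -> bool:
    """Print per-group ratios; return False if any group falls below the threshold."""
    passed = True
    for group, ratio in selection_rate_ratios(df, group_col, outcome_col).items():
        flag = "ok" if ratio >= threshold else "FLAG"
        print(f"{group}: selection-rate ratio {ratio:.2f} [{flag}]")
        passed = passed and ratio >= threshold
    return passed


if __name__ == "__main__":
    # Hypothetical screening output for eight candidates.
    results = pd.DataFrame({
        "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
        "shortlisted": [0, 0, 0, 1, 1, 1, 0, 1],
    })
    audit(results)  # F ratio 0.33 vs M -> flagged under the four-fifths rule
```

A real audit would look at more than one attribute and at ranking position rather than a binary shortlist, but even a check at this level of simplicity would have surfaced the skew before deployment.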
Outcome and aftermath
The system was scrapped. Reuters reported the story in October 2018, drawing significant regulatory attention and public scrutiny to AI in hiring. The EU AI Act now classifies AI systems used in recruitment as high-risk.
The AIDA exam tests PSF knowledge across all 8 domains. Free to take, immediately verifiable.