HR & Employment AI Deployment Playbook
Employment AI is explicitly designated high-risk under the EU AI Act, covering hiring, promotion, and dismissal decisions. NYC Local Law 144 mandates annual independent bias audits for automated employment decision tools used on New York City candidates and employees. The EEOC has confirmed that algorithmic hiring tools can violate Title VII. This playbook covers the bias failure modes and the compliance architecture required before deployment.
The employer liability problem: US courts and the EEOC have confirmed that employers cannot transfer liability for discriminatory AI tools to vendors. If your third-party CV screening tool produces disparate impact, you are liable — even if you did not build the model, were not told how it works, and had no visibility into the training data.
Regulatory Landscape
Employment AI is one of the most heavily regulated AI application areas globally, and the regulatory surface is expanding. NYC Local Law 144 is already in force and enforcement has begun. The EU AI Act's Annex III designation for employment AI means full conformity assessment obligations for EU-deployed systems. State-level laws in Illinois, Maryland, and California add further requirements.
| Framework | Jurisdiction | HR AI Focus | PSF Domains |
|---|---|---|---|
| EU AI Act — Employment (Annex III) | EU | CV sorting, interview selection, promotion/dismissal decisions are listed high-risk in Annex III. Full conformity assessment required. | D1–D8 (all) |
| EEOC AI & Algorithms Guidance | US Federal | AI hiring tools that produce disparate impact on protected categories are actionable under Title VII. Employers retain liability for vendor tools. | D2, D6 |
| NYC Local Law 144 | New York City | Mandatory annual independent bias audit for automated employment decision tools. Results must be publicly posted. | D2, D4, D6 |
| OFCCP & Federal Contractors | US Federal | Federal contractors using AI in hiring must maintain selection rate data by race, sex, and ethnicity. Adverse impact analysis required. | D2, D4 |
| GDPR / UK GDPR — Art 22 | EU / UK | Employment decisions based solely on automated processing require human review, explanation right, and contest right. | D2, D6 |
| Illinois AI Video Interview Act | Illinois | Video interview AI requires pre-interview disclosure and consent, an explanation of the characteristics evaluated, limits on sharing, and deletion within 30 days of a candidate's request. | D2, D3, D6 |
| EU Platform Work Directive | EU | Algorithmic management transparency obligations: limits on worker data monitoring and rights to explanation for automated workplace decisions. | D4, D6 |
HR AI Systems and Their Risk Profiles
| AI System | Primary Risk | Known Cases / Precedent | Severity |
|---|---|---|---|
| CV / Resume Screening | Disparate impact on protected categories from historical hiring bias in training data | Amazon scrapped internal tool (2018) after gender bias discovered | Critical |
| Video Interview AI | Facial expression and tone analysis — no validated link to job performance; cultural and disability bias | HireVue class action threats; Illinois AI Video Interview Act enacted amid the controversy | High |
| Predictive Attrition | Can infer protected characteristics (pregnancy, illness) from behaviour patterns; self-fulfilling prophecy risk | Multiple EEOC investigations into retention prediction tools | High |
| Performance Scoring | Productivity metrics unfairly penalise workers with accommodations, part-time schedules, caregiving responsibilities | Amazon warehouse monitoring systems — congressional scrutiny 2022-23 | High |
| Workforce Planning / RIF AI | AI-assisted layoff selection must not produce disparate impact on protected categories | IBM age discrimination case involved algorithmic workforce management | Critical |
| Benefits Eligibility AI | Automated benefits decisions under ERISA; incorrect processing creates fiduciary liability | UnitedHealthcare AI benefits denial (2023) congressional investigation | High |
The Four Bias Mechanisms in Hiring AI
Understanding why hiring AI produces biased outcomes requires understanding the underlying mechanisms. Bias in hiring AI is not usually a single identifiable error; it emerges from four structural features of how the systems are built and deployed, described below.
Mechanism 1: historical bias. Training data reflects past hiring decisions made by a workforce that was not representative. The model learns that certain profiles lead to success because past hires matching those profiles were given opportunities to succeed, not because the profiles are causally related to performance.
Example: a CV screener trained on successful engineers at a historically male-dominated company learns that male-correlated signals (pronouns in cover letters, certain universities, sports teams) predict hiring, and replicates that pattern.
Mechanism 2: proxy discrimination. The model uses features that appear neutral but correlate with protected characteristics. Gender and race are technically absent from the inputs, yet the outputs produce systematic disparate impact.
Example: ZIP code correlates with race in segregated geographies. A location-aware screening model can discriminate by race without ever receiving race as an input.
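To see the proxy mechanism end to end, here is a toy simulation (all figures invented): candidate skill is identically distributed across two groups, but a ZIP-derived "prestige" feature differs by group, and a screener that never receives group membership still produces divergent selection rates.

```python
import random

random.seed(0)

# Synthetic population: group membership correlates with a ZIP-derived
# "prestige" feature because of residential segregation, not skill.
candidates = []
for _ in range(10_000):
    group = random.choice(["A", "B"])
    prestige = random.gauss(0.6 if group == "A" else 0.4, 0.15)  # proxy feature
    skill = random.gauss(0.5, 0.15)                              # identical across groups
    candidates.append((group, prestige, skill))

# The screener never receives `group`, only the ZIP-derived feature.
def screen(prestige: float, skill: float) -> bool:
    return 0.5 * prestige + 0.5 * skill > 0.55

for g in ("A", "B"):
    members = [c for c in candidates if c[0] == g]
    rate = sum(screen(p, s) for _, p, s in members) / len(members)
    print(f"group {g}: selection rate {rate:.1%}")
# Skill is identically distributed, yet selection rates diverge:
# the ZIP proxy alone produces the disparate impact.
```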
Mechanism 3: feedback loops. The model is retrained on its own decisions. If it screens out a group in round 1, those candidates never get hired, so there is no positive outcome data to correct the initial bias. The bias compounds with each retraining cycle.
Example: an attrition predictor that flags a demographic group as high-risk leads to fewer promotions for that group, which produces more attrition, which reinforces the high-risk classification.
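A minimal sketch of the missing-feedback dynamic, with invented numbers: outcome labels exist only for candidates who pass screening, so a group whose estimate starts below the screening threshold generates no data and its estimate never corrects.

```python
# Toy model, invented numbers: outcome labels exist only for candidates
# who pass screening, so a screened-out group never generates the data
# that would correct the screener's biased estimate of it.
TRUE_RATE, THRESHOLD = 0.50, 0.30        # true success rate is equal for both groups
est = {"A": 0.60, "B": 0.20}             # screener's initial (biased) estimates

for cycle in range(1, 6):
    for group, estimate in est.items():
        if estimate >= THRESHOLD:                          # group passes screening
            est[group] = 0.5 * estimate + 0.5 * TRUE_RATE  # outcomes pull estimate toward truth
        # else: zero hires, zero outcome data, estimate frozen at its biased value
    print(f"cycle {cycle}: " + ", ".join(f"{g}={v:.0%}" for g, v in est.items()))
# Group A converges to the true 50%; group B stays at 20% indefinitely.
```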
Mechanism 4: label bias. The model is trained to predict 'success' as defined by historical performance labels (manager ratings, promotion decisions) that were themselves subject to human bias. The AI learns to replicate biased human judgement.
Example: a performance scoring model trained on manager ratings that systematically rated remote workers lower during 2020-21 learns that remote work correlates with lower performance.
NYC Local Law 144: What Compliance Actually Requires
NYC LL144 applies to any employer or employment agency that uses an "automated employment decision tool" (AEDT) to screen candidates or employees in New York City. An AEDT is defined broadly: any computational process derived from machine learning, statistical modelling, data analytics, or artificial intelligence that issues a simplified output (a score, classification, or recommendation) used to substantially assist or replace discretionary decision-making. This covers most modern CV screening, ranking, and scoring tools.
LL144 compliance checklist:
- Conduct an annual independent bias audit of each AEDT covering selection rates and impact ratios by sex and race/ethnicity (including intersectional categories) and, for scoring tools, scoring rates above the sample median by the same categories
- Publish the bias audit summary on the employer's website with the date of the audit and the name of the independent auditor
- Provide candidates with notice that an AEDT will be used at least 10 business days before use — including the job qualifications and characteristics the tool assesses
- Include instructions for requesting an alternative selection process or reasonable accommodation in the candidate notice (LL144 requires the instructions to be provided, though it does not itself oblige the employer to grant an alternative process)
- Retain the bias audit results and supporting data for at least 3 years after the audit
The audit must be conducted by an independent party; the tool vendor's own bias testing does not satisfy the requirement. The definition of "independent" has been contested; the NYC DCWP guidance specifies that the auditor must not be the employer, the tool developer, or an entity that played a role in developing the AEDT.
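As a rough illustration of the impact-ratio arithmetic described in the DCWP rules (column names are hypothetical, and this internal monitoring does not substitute for the independent audit), a pandas sketch:

```python
import pandas as pd

def selection_impact_ratios(df: pd.DataFrame, category: str, selected: str) -> pd.DataFrame:
    """Selection rate per category and impact ratio vs the most-selected
    category, following the DCWP impact-ratio construction for LL144."""
    rates = df.groupby(category)[selected].mean().rename("selection_rate")
    result = rates.to_frame()
    result["impact_ratio"] = result["selection_rate"] / result["selection_rate"].max()
    return result

def scoring_impact_ratios(df: pd.DataFrame, category: str, score: str) -> pd.DataFrame:
    """For AEDTs that output scores: scoring rate = share of the category
    scoring above the full-sample median, then the same ratio construction."""
    above_median = df[score] > df[score].median()
    rates = above_median.groupby(df[category]).mean().rename("scoring_rate")
    result = rates.to_frame()
    result["impact_ratio"] = result["scoring_rate"] / result["scoring_rate"].max()
    return result

# Hypothetical usage: df has one row per candidate with columns
# "sex_race_category" (intersectional label), "selected" (0/1), "score".
# print(selection_impact_ratios(df, "sex_race_category", "selected"))
```

The published summary must still come from the independent auditor; this arithmetic supports the quarterly internal monitoring recommended later in this playbook.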
PSF Domain Mapping for HR & Employment AI
HR AI systems ingest CVs, cover letters, job descriptions, performance notes, and survey responses — unstructured text written by humans with variable quality. Input governance must cover both the technical (sanitisation, schema) and the ethical (prohibited features).
- Define prohibited inputs: the AI system must not receive, infer from, or be allowed to use protected characteristic proxies (name-derived ethnicity inference, ZIP code, graduation year as an age proxy); see the schema sketch after this list
- Standardise CV and application inputs through structured forms where possible — free-text inputs carry more bias than structured responses
- Log all inputs to AI hiring tools with retention policies that support bias audit requirements (NYC LL144, OFCCP)
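A minimal sketch of what technical prevention of prohibited inputs can look like: an allowlist schema that rejects proxy fields before anything reaches the model. Field names are hypothetical, and the sketch assumes all allowlisted fields are present.

```python
from dataclasses import dataclass

# Allowlisted fields only; everything else is rejected before the model
# sees it. Field names here are hypothetical.
ALLOWED_FIELDS = {"skills", "years_experience", "certifications", "work_history"}
PROHIBITED_FIELDS = {"name", "zip_code", "graduation_year",   # proxies for
                     "date_of_birth", "photo_url"}            # ethnicity, race, age, appearance

@dataclass
class ScreeningInput:
    skills: list[str]
    years_experience: float
    certifications: list[str]
    work_history: list[str]

def sanitise(raw: dict) -> ScreeningInput:
    present_prohibited = PROHIBITED_FIELDS & raw.keys()
    if present_prohibited:
        # Hard-fail (and log): proxy fields must never reach the model.
        raise ValueError(f"prohibited proxy fields in input: {present_prohibited}")
    unknown = raw.keys() - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields outside the approved schema: {unknown}")
    return ScreeningInput(**raw)
```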
Output validation for HR AI must include disparate impact analysis — not just technical output correctness. A CV screener that produces structurally valid output scores but shows 40% lower selection rates for female candidates has passed technical validation and failed legal compliance.
- Implement 4/5ths rule analysis: if any protected group's selection rate is less than 80% of the highest group's rate, the tool has disparate impact under EEOC guidance (see the sketch after this list)
- Validate that output scores are calibrated across demographic groups — model should not assign systematically different score distributions to equivalent profiles from different groups
- NYC LL144: engage an independent auditor before deploying any automated employment decision tool in New York City; publish results publicly
- Never use AI output as the sole basis for an employment decision — maintain documented human review in the decision chain
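A minimal sketch of the 4/5ths rule computation referenced above, with invented counts:

```python
def four_fifths_check(selection_counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """selection_counts maps group -> (selected, total_applicants).
    Returns each group's selection-rate ratio vs the highest-rate group;
    any ratio below 0.8 indicates adverse impact under the 4/5ths rule."""
    rates = {g: sel / total for g, (sel, total) in selection_counts.items()}
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}

ratios = four_fifths_check({
    "male":   (300, 1000),   # 30% selection rate (invented figures)
    "female": (180, 1000),   # 18% selection rate
})
for group, ratio in ratios.items():
    flag = "ADVERSE IMPACT" if ratio < 0.8 else "ok"
    print(f"{group}: ratio {ratio:.2f} ({flag})")
# female ratio = 0.18 / 0.30 = 0.60 -> fails the 4/5ths threshold
```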
Employment AI processes some of the most sensitive personal data: medical information inferred from leave patterns, financial stress inferred from benefit selections, family status inferred from schedule requests, political views inferred from donations. This data requires elevated protection beyond general HR data practices.
- Audit what data the AI system actually processes — vendors often use more data than the stated feature set suggests
- Illinois AI Video Interview Act: notify candidates that AI will analyse their video interview, explain what characteristics it evaluates, obtain consent before the interview, and destroy recordings within 30 days of a candidate's deletion request
- GDPR/UK GDPR: obtain explicit consent or identify a valid legal basis for processing special category data (health, biometric) in HR AI
- Prohibit the AI system from inferring protected characteristics from permitted inputs: technical prevention, not policy prohibition (see the probe sketch after this list)
- Data subject access requests must include AI-generated scores and the basis for automated decisions
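One way to test, rather than merely prohibit, protected-characteristic inference is a leakage probe: train a simple classifier to predict the protected attribute from the features the screening model is allowed to use. A sketch using scikit-learn, assuming demographic labels lawfully collected for audit purposes; the 0.6 threshold is an illustrative policy choice, not a legal standard.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def leakage_auc(X: np.ndarray, protected: np.ndarray) -> float:
    """AUC of a probe predicting the protected attribute from the
    model's permitted input features. ~0.5 means no detectable leakage;
    substantially higher means the permitted features proxy the attribute."""
    probe = LogisticRegression(max_iter=1000)
    scores = cross_val_score(probe, X, protected, cv=5, scoring="roc_auc")
    return float(scores.mean())

# Hypothetical usage, with X = permitted feature matrix and
# protected = binary protected-category labels collected for auditing:
# auc = leakage_auc(X, protected)
# if auc > 0.6:
#     raise RuntimeError(f"permitted features leak protected attribute (AUC={auc:.2f})")
```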
NYC Local Law 144 makes observability a legal requirement: annual bias audits require documented outcome data by demographic group. Even outside New York, the inability to produce selection rate data by protected category is an EEOC and OFCCP compliance gap for federal contractors.
- Log all automated hiring decisions with outcome: final hire decision, protected category data (where lawfully collected), and the application stage at which the AI recommendation was generated (see the record sketch after this list)
- Generate selection rate reports quarterly — do not wait for annual audit to discover disparate impact
- Track the agreement rate between AI recommendations and final human decisions — high agreement indicates the human review step is not providing genuine independence
- Retain bias audit data for the longer of the statutory retention period or the life of any related employment litigation
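A sketch of a decision-log record and the quarterly report built on it, under assumed field names; selection rates by group and the human/AI agreement rate fall directly out of the log.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class HiringDecisionRecord:
    """One log line per AI-assisted decision. Field names are illustrative;
    protected-category data only where lawfully collected."""
    candidate_id: str
    role_id: str
    stage: str                 # e.g. "cv_screen", "interview_select"
    model_version: str
    ai_recommendation: str     # "advance" / "reject"
    ai_score: float
    human_decision: str        # final decision after human review
    protected_category: str | None
    timestamp: str

def log_decision(record: HiringDecisionRecord, path: str = "decisions.jsonl") -> None:
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")

def quarterly_report(records: list[HiringDecisionRecord]) -> dict:
    by_group: dict[str, list[HiringDecisionRecord]] = {}
    for r in records:
        by_group.setdefault(r.protected_category or "undisclosed", []).append(r)
    selection = {g: sum(r.human_decision == "advance" for r in rs) / len(rs)
                 for g, rs in by_group.items()}
    agreement = sum(r.human_decision == r.ai_recommendation for r in records) / len(records)
    return {"selection_rate_by_group": selection,
            "human_ai_agreement_rate": agreement}  # near 1.0 => nominal oversight
```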
HR AI model updates may constitute substantial modifications requiring a fresh conformity assessment under the EU AI Act, a new bias audit under NYC LL144, and updated OFCCP documentation. Model updates in hiring AI cannot be treated as routine software deployments.
- Treat every HR AI model update as a change event that requires bias re-evaluation before production deployment
- Maintain a separate bias evaluation dataset alongside functional evaluation: a new model version must pass bias tests, not just performance tests (see the gate sketch after this list)
- Document the change control pathway for HR AI model updates in the EU AI Act technical documentation (if in scope)
- Canary deployment for hiring AI: pilot a new screening model on a portion of roles before full deployment — compare selection rates between old and new model
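A sketch of a pre-deployment bias gate, assuming a held-out bias evaluation set with demographic labels: the candidate model must clear the 4/5ths threshold and must not regress against the incumbent. Thresholds and interfaces are illustrative policy choices.

```python
def selection_rates(model, bias_eval_set):
    """bias_eval_set: list of (features, group) pairs; model(features) -> bool."""
    totals: dict[str, int] = {}
    selected: dict[str, int] = {}
    for features, group in bias_eval_set:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(model(features))
    return {g: selected[g] / totals[g] for g in totals}

def bias_gate(candidate_model, baseline_model, bias_eval_set,
              min_ratio: float = 0.8) -> bool:
    """Block deployment if the candidate model fails the 4/5ths threshold
    or regresses against the incumbent model on the bias evaluation set."""
    def worst_ratio(model) -> float:
        rates = selection_rates(model, bias_eval_set)
        return min(rates.values()) / max(rates.values())
    return worst_ratio(candidate_model) >= max(min_ratio, worst_ratio(baseline_model))
```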
GDPR Article 22 requires meaningful human review of automated employment decisions. The EU AI Act requires effective human oversight for high-risk systems including employment AI. The standard is genuine independence — not rubber-stamping AI recommendations. Automation bias (defaulting to AI recommendations) renders nominal oversight ineffective.
- Design HR workflows so that the human reviewer evaluates candidates before seeing the AI score, not after; anchoring on the AI score eliminates the value of human review (see the sketch after this list)
- Track reviewer disagreement rates: if human reviewers agree with AI recommendations 95%+ of the time, the oversight is nominal. Investigate root cause.
- Document the human decision basis independently of the AI recommendation — the decision record should be defensible without the AI score
- Provide appeal pathway: candidates must be able to contest AI-informed employment decisions and reach a human reviewer
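One way to enforce the review-before-score ordering structurally rather than by policy: a workflow object that withholds the AI score until the reviewer commits an independent assessment. A minimal sketch.

```python
class BlindReview:
    """Holds the AI score back until the reviewer has committed an
    independent assessment, so the human judgement cannot anchor on it."""

    def __init__(self, candidate_id: str, ai_score: float):
        self.candidate_id = candidate_id
        self._ai_score = ai_score
        self.human_assessment: str | None = None

    def record_human_assessment(self, assessment: str) -> None:
        self.human_assessment = assessment  # documented independently (audit trail)

    @property
    def ai_score(self) -> float:
        if self.human_assessment is None:
            raise PermissionError("AI score is hidden until the human assessment is recorded")
        return self._ai_score

review = BlindReview("cand-123", ai_score=0.82)
review.record_human_assessment("advance: strong systems background")
print(review.ai_score)  # only now visible, for the reconciliation step
```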
HR AI systems are targets for adversarial gaming — candidates who discover the screening criteria attempt to optimise for them. This is not inherently a security problem, but deliberate deception and CV fraud automation are. The system should be hardened against coordinated manipulation.
- Monitor for CV patterns that suggest automated generation or coordinated manipulation, such as identical phrasing or suspiciously uniform formatting (see the similarity sketch after this list)
- Implement integrity checks for reference and credential verification — AI-assisted credential fraud is a documented and growing problem
- Protect the screening model's feature weights — exposure of exact criteria creates an optimisation target for bad-faith applicants
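A crude heuristic sketch for the identical-phrasing signal: word-shingle Jaccard similarity across application texts. The threshold is a tuning choice, not a standard; validate it against known-legitimate CVs first.

```python
def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """All n-word sequences in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_coordinated(cvs: dict[str, str], threshold: float = 0.6) -> list[tuple[str, str, float]]:
    """Pairs of applications whose 5-word shingle overlap exceeds the
    threshold: a crude signal of template reuse or coordinated generation."""
    ids = list(cvs)
    sets = {i: shingles(cvs[i]) for i in ids}
    flagged = []
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            sim = jaccard(sets[ids[i]], sets[ids[j]])
            if sim >= threshold:
                flagged.append((ids[i], ids[j], sim))
    return flagged
```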
The HR tech vendor market is consolidating rapidly. Workday, SAP SuccessFactors, and Oracle HCM have all acquired or built AI hiring features that create deep platform lock-in. A vendor acquisition or AI feature deprecation can disrupt hiring pipelines. Additionally, vendor-provided bias audits may not meet independent audit requirements.
- Ensure your bias audit capability is independent of the AI vendor — a vendor-provided bias report does not satisfy NYC LL144 independent audit requirement
- Evaluate selection criteria portability: if you switch vendors, can you replicate the documented hiring criteria in the new system?
- Contract for data export rights: all candidate assessment data, model scores, and audit logs must be exportable on contract termination
Certify Your Production AI Expertise
The PSF covers the output validation and human oversight domains that underpin employment AI compliance. AIDA is free, and CPAP is recognised by practitioners deploying AI in regulated environments.
You understand the gaps.
Get the credential that proves it.
The AIDA examination tests applied PSF knowledge across all eight domains covered in this playbook. 15 minutes. No charge. Ever.