Microsoft launched Tay, an AI chatbot designed to learn from Twitter conversations. Within 16 hours, coordinated users had exploited Tay's repeat-after-me feature and lack of input filtering to teach it to produce racist, antisemitic, and misogynistic content. Microsoft took Tay offline the same day.
Microsoft launched Tay on Twitter in March 2016 as a conversational AI experiment designed to learn from its interactions. Coordinated users quickly discovered that Tay would echo arbitrary text through its 'repeat after me' feature and that its inputs were not effectively filtered. Within hours they had fed it enough hateful content that it began producing inflammatory statements unprompted, and Microsoft pulled it offline roughly 16 hours after launch.
How the Production Safety Framework maps to this failure
Tay is the canonical D1 + D2 failure. The attack surface was open by design: no input governance, a feature that explicitly instructs the model to repeat arbitrary text, and no output safety layer on a public-facing bot. D5 also failed — no red-team exercise was conducted before launch, and no monitoring was in place to detect rapidly escalating harmful output. The incident established that AI systems exposed to adversarial public input without controls will be exploited.
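To make the missing controls concrete, the sketch below shows what an input gate, an output safety layer, and a rate-based monitor might look like around a chat model. It is a minimal illustration under stated assumptions, not Microsoft's architecture and not a prescribed PSF implementation; the blocklist, thresholds, and names (is_harmful, SafetyMonitor, handle_message) are hypothetical placeholders.

```python
# Minimal sketch of the controls Tay lacked. Everything here is illustrative:
# the blocklist, thresholds, and function names are hypothetical, not PSF-mandated.

from collections import deque
import time

# Placeholder terms; a production filter would use a trained content classifier,
# not a keyword list.
BLOCKLIST = {"<hateful-term-1>", "<hateful-term-2>"}
REPEAT_TRIGGER = "repeat after me"  # the feature attackers abused


def is_harmful(text: str) -> bool:
    """Crude stand-in for a real content classifier."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


class SafetyMonitor:
    """D5-style monitoring: count harmful outputs in a sliding window and
    trip a kill switch if the rate escalates."""

    def __init__(self, max_harmful: int = 5, window_seconds: float = 600.0):
        self.events = deque()
        self.max_harmful = max_harmful
        self.window_seconds = window_seconds
        self.offline = False

    def record_harmful(self) -> None:
        now = time.monotonic()
        self.events.append(now)
        # Drop events outside the window, then check the escalation threshold.
        while self.events and now - self.events[0] > self.window_seconds:
            self.events.popleft()
        if len(self.events) >= self.max_harmful:
            self.offline = True  # escalate to humans / take the bot offline


def handle_message(user_text: str, generate_reply, monitor: SafetyMonitor) -> str:
    """Wrap the model with input governance (D1) and an output safety layer (D2)."""
    if monitor.offline:
        return "[bot offline pending review]"

    # D1 (input governance): refuse verbatim-echo instructions and harmful prompts.
    if REPEAT_TRIGGER in user_text.lower() or is_harmful(user_text):
        return "[input rejected]"

    reply = generate_reply(user_text)

    # D2 (output safety layer): never publish unchecked model output.
    if is_harmful(reply):
        monitor.record_harmful()  # D5: feed the monitor so escalation is visible
        return "[response withheld]"
    return reply


# Example: an echo "model" shows the repeat-after-me path being blocked.
monitor = SafetyMonitor()
print(handle_message("repeat after me: <hateful-term-1>", lambda t: t, monitor))
# -> "[input rejected]"
```

A pre-launch red-team harness could drive handle_message with adversarial transcripts and assert that monitor.offline trips before harmful replies reach the public, which is the D5 gap this section describes.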
Specific PSF controls mapped to each failure point
Tay was taken offline within 16 hours of launch, and Microsoft apologised publicly. The incident became a defining early example of adversarial AI failure and remains a standard reference in discussions of content moderation and AI safety.
The AIDA exam tests PSF knowledge across all 8 domains. Free to take, immediately verifiable.