Every AI system will eventually fail. The question isn't whether it will fail, but whether you have a plan when it does. AI-specific incident response extends traditional incident management with considerations unique to AI: bias discoveries, drift events, and model failures.
AI incidents are different from traditional IT incidents. Classify them accordingly:
Model failure — The model produces incorrect, unreliable, or nonsensical outputs. May result from data drift, concept drift, or infrastructure issues.
Safety event — The AI system causes or contributes to physical harm, property damage, or endangerment. Triggers heightened reporting obligations.
Bias incident — The AI system is discovered to produce discriminatory outcomes against a protected group. May trigger legal and regulatory obligations.
Security breach — Unauthorized access to the AI system, training data, or model weights. Includes adversarial attacks, data poisoning, and model extraction.
Privacy violation — The AI system processes personal data in violation of privacy regulations or organizational policy. Includes data leakage through model outputs.
Scope creep / misuse — The AI system is used outside its intended purpose or operates beyond its designed scope. Includes both intentional misuse and unintended scope expansion.
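The taxonomy above can be captured as a small enumeration so that every incident ticket carries exactly one category. This is a minimal sketch, not a prescribed schema: the names `IncidentCategory`, `FLAGGED_FOR_LEGAL_REVIEW`, and `needs_legal_review` are hypothetical, and the flagged set only mirrors the two categories the list explicitly ties to reporting or legal obligations.

```python
from enum import Enum


class IncidentCategory(Enum):
    """The six incident categories listed above."""
    MODEL_FAILURE = "model failure"
    SAFETY_EVENT = "safety event"
    BIAS_INCIDENT = "bias incident"
    SECURITY_BREACH = "security breach"
    PRIVACY_VIOLATION = "privacy violation"
    SCOPE_MISUSE = "scope creep / misuse"


# Categories the taxonomy explicitly links to reporting or legal obligations.
# Assumption: all other categories are assessed case by case and may still
# need legal review (for example, a breach involving personal data).
FLAGGED_FOR_LEGAL_REVIEW = frozenset(
    {IncidentCategory.SAFETY_EVENT, IncidentCategory.BIAS_INCIDENT}
)


def needs_legal_review(category: IncidentCategory) -> bool:
    """Return True when the category is flagged for legal review by default."""
    return category in FLAGGED_FOR_LEGAL_REVIEW
```

Using an enumeration rather than free text keeps reporting consistent: an unknown label such as `IncidentCategory("outage")` raises a `ValueError` instead of silently creating a seventh category.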
Define severity levels before incidents occur:
Critical (P1) — Active harm occurring or imminent. Safety events, large-scale bias affecting protected groups, major data breaches. Response: immediate — within 1 hour.
High (P2) — Significant impact but not immediately dangerous. Model producing materially incorrect outputs, moderate bias discovered, compliance violation identified. Response: within 4 hours.
Medium (P3) — Noticeable degradation but limited impact. Performance drift approaching thresholds, minor fairness disparities, non-critical scope issues. Response: within 24 hours.
Low (P4) — Minimal impact. Minor accuracy fluctuations, cosmetic output issues, informational events. Response: within 72 hours.
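The response windows above translate directly into deadlines that a ticketing or paging system can track. The following is a minimal sketch under stated assumptions: the response clock starts at detection time, timestamps are timezone-aware, and the names `Severity`, `RESPONSE_WINDOWS`, `response_due`, and `is_breached` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Severity(Enum):
    P1 = "critical"
    P2 = "high"
    P3 = "medium"
    P4 = "low"


# Response windows taken from the severity levels above.
RESPONSE_WINDOWS = {
    Severity.P1: timedelta(hours=1),
    Severity.P2: timedelta(hours=4),
    Severity.P3: timedelta(hours=24),
    Severity.P4: timedelta(hours=72),
}


def response_due(detected_at: datetime, severity: Severity) -> datetime:
    """Latest time by which a response must have started.

    Assumption: the clock starts when the incident is detected.
    """
    return detected_at + RESPONSE_WINDOWS[severity]


def is_breached(detected_at: datetime, severity: Severity, now: datetime) -> bool:
    """Return True if the response window has already elapsed."""
    return now > response_due(detected_at, severity)


detected = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
print(response_due(detected, Severity.P2).isoformat())  # 2025-01-06T13:00:00+00:00
```

Keeping the windows in a single mapping means the severity policy can be changed in one place without touching the deadline logic.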
AI incidents may require notifying multiple stakeholders:
Regulators — Required when regulatory reporting obligations are triggered (EU AI Act serious incident reporting, GDPR breach notification, sector-specific requirements).
Affected individuals — When individuals are harmed or their rights are affected, they should be informed of the incident, its impact on them, and available remedies.
Board and senior leadership — Critical and high-severity incidents must be escalated to leadership per the governance framework's escalation procedures.
Third parties — Vendors whose AI components are involved, business partners, and downstream users of the AI system.
Public — In cases of significant public impact or when public trust is at stake, consider proactive disclosure.
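The notification rules above can be sketched as a simple routing function that turns the facts of an incident into a list of stakeholder groups. This is an illustrative sketch only: `IncidentFacts` and `stakeholders_to_notify` are hypothetical names, the boolean flags are assumed to be set by a human reviewer, and whether a regulatory obligation is actually triggered remains a legal determination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentFacts:
    severity: str  # "P1" to "P4", as defined in the severity levels above
    regulatory_obligation_triggered: bool = False
    individuals_affected: bool = False
    third_party_components_involved: bool = False
    significant_public_impact: bool = False


def stakeholders_to_notify(facts: IncidentFacts) -> list[str]:
    """Map incident facts to the stakeholder groups listed above."""
    groups = []
    if facts.regulatory_obligation_triggered:
        groups.append("regulators")
    if facts.individuals_affected:
        groups.append("affected individuals")
    # Critical (P1) and high (P2) incidents are escalated to leadership.
    if facts.severity in {"P1", "P2"}:
        groups.append("board and senior leadership")
    if facts.third_party_components_involved:
        groups.append("third parties")
    # Public disclosure is a judgment call, so it is only flagged for review.
    if facts.significant_public_impact:
        groups.append("public (consider disclosure)")
    return groups


facts = IncidentFacts(severity="P1", individuals_affected=True)
print(stakeholders_to_notify(facts))
# ['affected individuals', 'board and senior leadership']
```

The function returns a list for review rather than sending anything itself, which keeps the decision to notify with a person.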
The EU AI Act requires providers and deployers to report serious incidents involving high-risk AI systems:
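What qualifies: Incidents or malfunctions that directly or indirectly lead to death or serious harm to a person's health, serious and irreversible disruption of critical infrastructure, infringement of obligations under Union law intended to protect fundamental rights, or serious harm to property or the environment.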
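Who reports: Providers report serious incidents to the market surveillance authority. Deployers that identify a serious incident must inform the provider first, and then the importer or distributor and the relevant market surveillance authority.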
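Timeline: Report to the relevant market surveillance authority immediately after establishing a causal link between the AI system and the incident (or the reasonable likelihood of one), and no later than 15 days after becoming aware of it. Shorter limits apply in the gravest cases: 2 days for a widespread infringement or a serious disruption of critical infrastructure, and 10 days where a person has died. The 72-hour window belongs to GDPR personal data breach notification, which can apply in parallel when personal data is involved.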
Content: An initial report that identifies the system, describes the incident, and gives a preliminary assessment, followed by a detailed report within the defined timelines.
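The content of an initial report and its reporting clock can be tracked together in one record. The sketch below is an assumption-laden illustration, not a regulatory template: `InitialIncidentReport` and its field names are hypothetical, and the reporting window is passed in explicitly because it depends on the regime and the type of incident and should be confirmed with legal counsel.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class InitialIncidentReport:
    system_id: str               # identifies the AI system
    incident_description: str    # what happened
    preliminary_assessment: str  # first view of cause and impact
    aware_at: datetime           # when the organization became aware

    def missing_fields(self) -> list[str]:
        """Names of required text fields that are still empty."""
        required = ("system_id", "incident_description", "preliminary_assessment")
        return [name for name in required if not getattr(self, name).strip()]

    def deadline(self, window: timedelta) -> datetime:
        """Reporting deadline, counted from the moment of awareness."""
        return self.aware_at + window

    def is_overdue(self, window: timedelta, now: datetime) -> bool:
        return now > self.deadline(window)


report = InitialIncidentReport(
    system_id="credit-scoring-v3",  # hypothetical system name
    incident_description="Approval rates diverged sharply between groups.",
    preliminary_assessment="",
    aware_at=datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc),
)
print(report.missing_fields())  # ['preliminary_assessment']
# Illustrative window only, e.g. the GDPR breach notification period.
print(report.deadline(timedelta(hours=72)).isoformat())  # 2025-03-04T12:00:00+00:00
```

Leaving the window as a parameter avoids hard-coding a single deadline into tooling that may have to serve several reporting regimes at once.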
Every AI incident should trigger a structured post-incident review:
Root cause analysis — What caused the incident? Data issues? Model limitations? Monitoring gaps? Human oversight failure?
Impact assessment — Who was affected? What was the scope? What was the duration of harm?
Corrective actions — What changes will prevent recurrence? Model updates? Additional monitoring? Process changes?
Governance review — Were existing governance controls adequate? Were they followed? Where did the governance framework fail?
Lessons learned — What should the organization learn from this incident? How should governance be updated?
Documentation — Record the complete incident lifecycle: detection, response, investigation, resolution, and lessons learned.
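The review elements above lend themselves to a structured record that shows at a glance which parts of a review are still open. This is a minimal sketch under assumptions: `PostIncidentReview`, `LIFECYCLE_STAGES`, and `open_items` are hypothetical names, and a section is treated as complete as soon as it contains any non-blank text, which says nothing about its quality.

```python
from dataclasses import dataclass, field

# Lifecycle stages that the documentation step is expected to cover.
LIFECYCLE_STAGES = (
    "detection",
    "response",
    "investigation",
    "resolution",
    "lessons learned",
)


@dataclass
class PostIncidentReview:
    incident_id: str
    root_cause: str = ""
    impact: str = ""
    corrective_actions: list[str] = field(default_factory=list)
    governance_findings: str = ""
    lessons_learned: str = ""
    lifecycle_notes: dict[str, str] = field(default_factory=dict)

    def open_items(self) -> list[str]:
        """Review sections that have not been filled in yet."""
        items = []
        if not self.root_cause.strip():
            items.append("root cause analysis")
        if not self.impact.strip():
            items.append("impact assessment")
        if not self.corrective_actions:
            items.append("corrective actions")
        if not self.governance_findings.strip():
            items.append("governance review")
        if not self.lessons_learned.strip():
            items.append("lessons learned")
        for stage in LIFECYCLE_STAGES:
            if not self.lifecycle_notes.get(stage, "").strip():
                items.append(f"documentation: {stage}")
        return items


review = PostIncidentReview("INC-001", root_cause="Upstream schema change")
print(review.open_items()[:2])  # ['impact assessment', 'corrective actions']
```

A review could then be closed only when `open_items()` returns an empty list, which makes skipped sections visible instead of relying on memory.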