Day 27 of 30

AI Incident Response and Reporting

Every AI system will eventually fail. The question is not whether it will fail, but whether you have a plan for when it does. AI-specific incident response extends traditional incident management with considerations unique to AI: bias discoveries, drift events, and model failures.

AI Incident Taxonomy

AI incidents are different from traditional IT incidents. Classify them accordingly:

Model failure — The model produces incorrect, unreliable, or nonsensical outputs. May result from data drift, concept drift, or infrastructure issues.

Safety event — The AI system causes or contributes to physical harm, property damage, or endangerment. Triggers heightened reporting obligations.

Bias incident — The AI system is discovered to produce discriminatory outcomes against a protected group. May trigger legal and regulatory obligations.

Security breach — Unauthorized access to the AI system, training data, or model weights. Includes adversarial attacks, data poisoning, and model extraction.

Privacy violation — The AI system processes personal data in violation of privacy regulations or organizational policy. Includes data leakage through model outputs.

Scope creep / misuse — The AI system is used outside its intended purpose or operates beyond its designed scope. Includes both intentional misuse and unintended scope expansion.

Knowledge Check

An AI chatbot begins providing specific medical dosage recommendations, which was never part of its intended functionality. This incident is BEST classified as:

This is scope creep — the AI system has expanded beyond its intended purpose without authorization. It's not a model failure (it's working, just outside scope), not a security breach (no unauthorized access), and not a privacy violation (though it could lead to one).

Incident Severity Classification

Define severity levels before incidents occur:

Critical (P1) — Active harm occurring or imminent. Safety events, large-scale bias affecting protected groups, major data breaches. Response: immediate — within 1 hour.

High (P2) — Significant impact but not immediately dangerous. Model producing materially incorrect outputs, moderate bias discovered, compliance violation identified. Response: within 4 hours.

Medium (P3) — Noticeable degradation but limited impact. Performance drift approaching thresholds, minor fairness disparities, non-critical scope issues. Response: within 24 hours.

Low (P4) — Minimal impact. Minor accuracy fluctuations, cosmetic output issues, informational events. Response: within 72 hours.
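The four levels above can be written down as a small lookup so that response deadlines are computed rather than remembered. This is a minimal sketch: the `Severity` enum and the `response_deadline` and `is_overdue` helpers are hypothetical names introduced for illustration, and only the response windows (1, 4, 24, and 72 hours) come from the lesson.

```python
from datetime import datetime, timedelta
from enum import Enum


class Severity(Enum):
    """Severity levels from the lesson, each mapped to its maximum response window."""
    P1_CRITICAL = timedelta(hours=1)
    P2_HIGH = timedelta(hours=4)
    P3_MEDIUM = timedelta(hours=24)
    P4_LOW = timedelta(hours=72)


def response_deadline(severity: Severity, detected_at: datetime) -> datetime:
    """Latest time by which the response must have started."""
    return detected_at + severity.value


def is_overdue(severity: Severity, detected_at: datetime, now: datetime) -> bool:
    """True once the response window for this severity has passed."""
    return now > response_deadline(severity, detected_at)


if __name__ == "__main__":
    detected = datetime(2025, 1, 6, 9, 0)
    for level in Severity:
        print(level.name, "respond by", response_deadline(level, detected))
```

Keeping the windows in one place means a change to the policy (for example tightening P2) is a one-line edit that every downstream check picks up.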

Stakeholder Notification

AI incidents may require notifying multiple stakeholders:

Regulators — Required when regulatory reporting obligations are triggered (EU AI Act serious incident reporting, GDPR breach notification, sector-specific requirements).

Affected individuals — When individuals are harmed or their rights are affected, they should be informed of the incident, its impact on them, and available remedies.

Board and senior leadership — Critical and high-severity incidents must be escalated to leadership per the governance framework's escalation procedures.

Third parties — Vendors whose AI components are involved, business partners, and downstream users of the AI system.

Public — In cases of significant public impact or when public trust is at stake, consider proactive disclosure.
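The notification rules above can be sketched as a simple routing function. The `IncidentFacts` fields and the `stakeholders_to_notify` function are hypothetical; the conditions mirror the list above, and whether a regulatory obligation is actually triggered remains a legal judgment that code can only record, not make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentFacts:
    """Facts established during triage (field names are illustrative)."""
    severity: str                    # "P1", "P2", "P3", or "P4"
    regulatory_trigger: bool         # a reporting obligation has been identified
    individuals_affected: bool       # people were harmed or their rights affected
    third_party_components: bool     # vendor components or downstream users involved
    significant_public_impact: bool  # public trust is at stake


def stakeholders_to_notify(facts: IncidentFacts) -> list[str]:
    """Return the stakeholder groups that the lesson's rules point to."""
    groups = []
    if facts.regulatory_trigger:
        groups.append("regulators")
    if facts.individuals_affected:
        groups.append("affected individuals")
    if facts.severity in {"P1", "P2"}:
        # Critical and high-severity incidents are escalated to leadership.
        groups.append("board and senior leadership")
    if facts.third_party_components:
        groups.append("third parties")
    if facts.significant_public_impact:
        # The lesson says to *consider* disclosure, so this is a prompt, not a mandate.
        groups.append("public (consider proactive disclosure)")
    return groups


if __name__ == "__main__":
    facts = IncidentFacts("P1", True, True, False, False)
    print(stakeholders_to_notify(facts))
```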

EU AI Act Serious Incident Reporting

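The EU AI Act (Article 73) requires serious incidents involving high-risk AI systems to be reported to the market surveillance authorities:

What qualifies: Incidents or malfunctions that directly or indirectly lead to death or serious harm to a person's health, a serious and irreversible disruption of critical infrastructure, an infringement of Union-law obligations intended to protect fundamental rights, or serious harm to property or the environment.

Who reports: Providers carry the primary reporting obligation. Deployers that identify a serious incident must immediately inform the provider first, and then the importer or distributor and the relevant market surveillance authorities.

Timeline: Report to the market surveillance authority of the Member State where the incident occurred immediately after establishing a causal link (or the reasonable likelihood of one) between the AI system and the incident, and no later than 15 days after becoming aware of it. The limit shortens to 10 days where a person has died and to 2 days for a widespread infringement or a serious and irreversible disruption of critical infrastructure. The 72-hour deadline sometimes quoted here belongs to GDPR personal data breach notification, not to the AI Act.

Content: The initial report may be incomplete: it identifies the system, describes the incident, and gives a preliminary assessment. A complete report follows.

Knowledge Check

A high-risk AI system deployed in the EU causes a serious incident resulting in significant property damage. Under the EU AI Act, when must the initial report be filed?

Serious harm to property falls under the general rule: the report is due immediately after a causal link between the AI system and the incident is established, and no later than 15 days after becoming aware of the incident. This can be an initial, incomplete report, with the complete report following later. Waiting for the investigation to finish is not acceptable for serious incidents.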

Post-Incident Analysis

Every AI incident should trigger a structured post-incident review:

Root cause analysis — What caused the incident? Data issues? Model limitations? Monitoring gaps? Human oversight failure?

Impact assessment — Who was affected? What was the scope? What was the duration of harm?

Corrective actions — What changes will prevent recurrence? Model updates? Additional monitoring? Process changes?

Governance review — Were existing governance controls adequate? Were they followed? Where did the governance framework fail?

Lessons learned — What should the organization learn from this incident? How should governance be updated?

Documentation — Record the complete incident lifecycle: detection, response, investigation, resolution, and lessons learned.
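One way to make the documentation requirement checkable is to store the incident as a record with one entry per lifecycle stage and flag the stages that are still empty. The sketch below is illustrative only: `IncidentRecord`, `document`, and `missing_stages` are hypothetical names, and the stage list is taken from the documentation item above.

```python
from dataclasses import dataclass, field

# Lifecycle stages named in the lesson's documentation requirement.
LIFECYCLE_STAGES = (
    "detection",
    "response",
    "investigation",
    "resolution",
    "lessons_learned",
)


@dataclass
class IncidentRecord:
    """Minimal post-incident record keyed by lifecycle stage."""
    incident_id: str
    notes: dict[str, str] = field(default_factory=dict)

    def document(self, stage: str, text: str) -> None:
        """Attach a write-up to one lifecycle stage."""
        if stage not in LIFECYCLE_STAGES:
            raise ValueError(f"unknown lifecycle stage: {stage}")
        self.notes[stage] = text

    def missing_stages(self) -> list[str]:
        """Stages that have no (or only blank) documentation yet."""
        return [s for s in LIFECYCLE_STAGES if not self.notes.get(s, "").strip()]


if __name__ == "__main__":
    record = IncidentRecord("INC-001")
    record.document("detection", "Drift alert fired on the approval-rate metric.")
    record.document("response", "Model rolled back to the previous version.")
    print(record.missing_stages())
```

A review can then be closed only when `missing_stages()` returns an empty list.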

Real-World Scenario

In February 2024, British Columbia's Civil Resolution Tribunal ruled against Air Canada in Moffatt v. Air Canada after the airline's customer service chatbot misstated its bereavement fare policy. A passenger, Jake Moffatt, asked the chatbot about bereavement travel discounts and was told he could book a full-fare ticket and then apply for the bereavement rate retroactively within 90 days. When Moffatt followed these instructions and submitted a refund request, Air Canada denied it, because its actual policy did not allow bereavement fares to be claimed after travel. The airline's position, as the tribunal characterized it, was that the chatbot was in effect a "separate legal entity" responsible for its own actions. The tribunal rejected this defense, ruling that Air Canada was responsible for all information provided on its website, including chatbot-generated content, and ordered the airline to pay Moffatt damages equal to the fare difference, plus interest and tribunal fees.

This case is a landmark AI incident for several reasons. First, although a small-claims tribunal decision carries limited precedential weight, it signalled clearly that organizations cannot disclaim liability for their AI systems' outputs: the "the chatbot said it, not us" defense failed. Second, it demonstrated a classic AI hallucination incident in a customer-facing deployment: the chatbot generated a plausible but incorrect statement of policy. Third, Air Canada's incident response was inadequate. Rather than acknowledging the error and compensating the customer, the airline attempted to deflect responsibility, escalating a minor incident into a widely reported legal ruling that damaged the company's reputation.

For the AIGP exam, the Air Canada chatbot case illustrates the importance of having an AI-specific incident response plan that includes immediate triage of AI-generated misinformation, clear liability frameworks that acknowledge organizational responsibility for AI outputs, and stakeholder notification protocols that prioritize resolution over deflection. It also demonstrates why generative AI deployments require ongoing accuracy monitoring and human escalation paths for consequential interactions like fare quotes and policy inquiries.

Final Check

A post-incident review reveals that the AI system's monitoring dashboard showed warning signs for two weeks before the incident, but no one investigated. The PRIMARY governance improvement should focus on:

The monitoring detected the issue — the failure was in the human response. The governance improvement should focus on the escalation process: ensuring alerts trigger investigation, defining clear ownership for investigation, and establishing accountability for timely response. More metrics or a better dashboard won't help if alerts are ignored.
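The escalation gap described above can be closed with a rule that any alert left unacknowledged past a set window is pushed to the next level of ownership. This is a minimal sketch under stated assumptions: the `Alert` type, the `alerts_to_escalate` function, and the 24-hour default window are all hypothetical and would be set by the organization's own governance framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Alert:
    """A monitoring alert with a named owner (illustrative fields)."""
    name: str
    owner: str
    raised_at: datetime
    acknowledged_at: Optional[datetime] = None


def alerts_to_escalate(
    alerts: list[Alert],
    now: datetime,
    max_unacknowledged: timedelta = timedelta(hours=24),  # assumed policy value
) -> list[Alert]:
    """Return alerts nobody has acknowledged within the allowed window."""
    return [
        a for a in alerts
        if a.acknowledged_at is None and now - a.raised_at > max_unacknowledged
    ]


if __name__ == "__main__":
    now = datetime(2025, 1, 20, 9, 0)
    alerts = [
        Alert("fairness gap widening", "ml-ops", datetime(2025, 1, 6, 9, 0)),
        Alert("latency spike", "platform", datetime(2025, 1, 20, 8, 0)),
    ]
    for alert in alerts_to_escalate(alerts, now):
        print("Escalate:", alert.name, "owner:", alert.owner)
```

The design choice is that every alert carries an owner, so an ignored warning always has a named person accountable for it.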

"Every AI system will eventually fail. Build incident classification, notification protocols, and post-incident review into your governance framework before you need them. The EU AI Act requires serious incident reports immediately after a causal link is established, and no later than 15 days after becoming aware in the general case."
