All Lessons Course Details All Courses Enroll
Courses/ AIGP Certification Prep/ Day 21
Day 21 of 30

Model Evaluation, Testing, and Validation

⏱ 18 min 📊 Advanced AIGP Certification Prep

Testing isn't just a technical activity — it's a governance activity. The AIGP exam tests whether you can define testing requirements, understand fairness metrics, and know when to apply red teaming.

Fairness Metrics

Multiple mathematical definitions of fairness exist — and they can be mutually exclusive. Governance professionals must understand the tradeoffs:

Demographic parity — The AI system's positive outcome rate should be equal across demographic groups. Example: The hiring AI recommends candidates at the same rate regardless of gender.

Equalized odds — The true positive rate AND false positive rate should be equal across groups. Example: A medical diagnostic AI detects disease equally accurately for all demographic groups.

Individual fairness — Similar individuals should receive similar outcomes. More intuitive but harder to measure — requires defining "similarity."

Predictive parity — The precision (positive predictive value) should be equal across groups. Example: When the AI flags a transaction as fraud, it's equally likely to be actual fraud regardless of the customer's demographic.

Key exam point: These metrics can conflict. A system that achieves demographic parity may not achieve equalized odds. The governance decision is: which fairness metric is most appropriate for this use case?

Knowledge Check
A loan approval AI achieves demographic parity — the approval rate is 40% for all racial groups. However, the false positive rate (approving loans that default) is significantly higher for one group. Which fairness metric reveals this disparity?
Equalized odds considers both true positive and false positive rates across groups. Demographic parity (equal approval rates) is satisfied, but equalized odds reveals that the false positive rates differ — meaning the model's errors are distributed unequally across groups.

Robustness Testing

AI systems must function reliably under adverse conditions:

Adversarial inputs — Deliberately crafted inputs designed to fool the AI. Examples: modified images that cause misclassification, perturbed text that changes sentiment analysis results.

Edge cases — Unusual but legitimate inputs that the AI may not have seen during training. Examples: rare medical conditions, unusual financial transactions.

Distribution shift — The input data in production differs from training data. Examples: seasonal changes in consumer behavior, new product categories.

Stress testing — How does the AI perform under high load, noisy data, or degraded conditions?

Governance requirement: Define robustness testing requirements proportionate to the risk level. High-risk AI systems should undergo comprehensive adversarial testing.

Security Testing for AI

AI introduces unique security vulnerabilities beyond traditional cybersecurity:

Prompt injection — Manipulating AI inputs to bypass safety controls or extract sensitive information. Critical for generative AI systems.

Data poisoning — Contaminating training data to cause the model to learn incorrect patterns. A supply chain attack on AI development.

Model extraction — Using the AI's outputs to reconstruct a copy of the model, potentially stealing trade secrets.

Membership inference — Determining whether a specific data point was in the training dataset, potentially revealing private information.

Model inversion — Using the model to reconstruct training data, potentially exposing personal information.

Knowledge Check
An attacker sends carefully crafted queries to a public AI API and uses the responses to build their own copy of the model. This attack is known as:
Model extraction uses the AI's inputs and outputs to reconstruct a functional copy of the model. This differs from prompt injection (manipulating input behavior), data poisoning (contaminating training data), and membership inference (determining if specific data was used for training).

Red Teaming for Generative AI

Red teaming involves adversarial testing by a dedicated team trying to make the AI system fail or produce harmful outputs.

For generative AI, red teaming focuses on:

- Harmful content generation — Can the system be prompted to produce dangerous, illegal, or toxic content?

- Bias and stereotypes — Does the system generate biased or stereotypical responses?

- Factual accuracy — Does the system confabulate (hallucinate) false information?

- Privacy leakage — Does the system reveal training data or personal information?

- Safety bypass — Can safety filters be circumvented through creative prompting?

Governance requirements for red teaming:

- Red team must be independent from the development team

- Testing must include diverse perspectives (demographics, expertise, languages)

- Results must be documented and addressed before deployment

- Red teaming is ongoing, not just pre-launch

Final Check
A generative AI system passes all standard performance benchmarks with high scores. Is red teaming still necessary?
Benchmarks measure expected performance under normal conditions. Red teaming specifically tests unexpected, adversarial, and edge-case scenarios that benchmarks don't cover. A model can score perfectly on benchmarks while still being vulnerable to prompt injection, bias, or harmful content generation.
🎯
Day 21 Complete
"Fairness metrics can conflict — the governance decision is which metric fits the use case. Red teaming tests adversarial scenarios that benchmarks miss. AI security testing must cover prompt injection, data poisoning, and model extraction."
Next Lesson
Documentation — Model Cards, Data Sheets, and AI Impact Assessments