When an LLM deployed for deviation triage produces an incorrect root cause, the quality team faces a question the existing QMS has no answer for: what kind of error was it?
"The AI was wrong" is the equivalent of classifying every deviation as "equipment malfunction." It tells you nothing actionable and prevents meaningful trending.
Probabilistic AI systems can fail in fundamentally different ways — fabrication, misinterpretation, contextual misapplication, confidence miscalibration, boundary violation, population bias — and each has different root causes, different risk profiles, and different corrective actions.
This paper introduces a two-dimensional error taxonomy: Error Type × Origin. It maps six classes of failure against six origination points (training data, retrieval/RAG, model inference, human-AI interface, agent orchestration, supplier). The structure is FMEA-shaped and anchored in GxP context: ICH Q9(R1), GAMP 5 Second Edition, the GAMP AI Guide, and the FDA's draft credibility assessment framework.
The taxonomy is meant to give quality teams something to hold onto when the failure they're investigating doesn't have a stable identity, which, in probabilistic systems, is most of the time.
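For readers who think in data structures, the Error Type × Origin grid can be sketched as two enumerations plus a record that pairs them. This is only an illustration of the shape, not the paper's implementation: the class names, the `record_id` field, the `trend` helper, and the sample records are all hypothetical, and only the six error types and six origins are taken from the text above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    """The six failure classes named in the post."""
    FABRICATION = "fabrication"
    MISINTERPRETATION = "misinterpretation"
    CONTEXTUAL_MISAPPLICATION = "contextual misapplication"
    CONFIDENCE_MISCALIBRATION = "confidence miscalibration"
    BOUNDARY_VIOLATION = "boundary violation"
    POPULATION_BIAS = "population bias"


class Origin(Enum):
    """The six origination points named in the post."""
    TRAINING_DATA = "training data"
    RETRIEVAL_RAG = "retrieval/RAG"
    MODEL_INFERENCE = "model inference"
    HUMAN_AI_INTERFACE = "human-AI interface"
    AGENT_ORCHESTRATION = "agent orchestration"
    SUPPLIER = "supplier"


@dataclass(frozen=True)
class ClassifiedError:
    """One investigated AI error placed in a single cell of the grid."""
    record_id: str  # hypothetical identifier, e.g. a deviation number
    error_type: ErrorType
    origin: Origin


def trend(errors):
    """Count errors per (error type, origin) cell so they can be trended."""
    return Counter((e.error_type, e.origin) for e in errors)


# Illustrative records only; these are not real deviations.
examples = [
    ClassifiedError("DEV-001", ErrorType.FABRICATION, Origin.RETRIEVAL_RAG),
    ClassifiedError("DEV-002", ErrorType.FABRICATION, Origin.RETRIEVAL_RAG),
    ClassifiedError("DEV-003", ErrorType.POPULATION_BIAS, Origin.TRAINING_DATA),
]

counts = trend(examples)
grid_size = len(ErrorType) * len(Origin)  # number of cells in the grid

for (error_type, origin), n in counts.most_common():
    print(f"{error_type.value} x {origin.value}: {n}")
```

Counting by cell rather than by a single "AI was wrong" bucket is what makes trending possible: two fabrication errors traced to retrieval point to a different corrective action than one population-bias error traced to training data.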
If you've encountered a failure that doesn't fit this structure, or if you think the structure itself needs reconsidering, I want to hear about it. Refinements will be published with attribution.
#AIValidation #GxP #AIGovernance #GAMP5 #RiskManagement #ICHQ9 #LifeSciences #QualityByDesign