Interpret evaluation findings using the Taxonomy of Trust framework.
Evaluation results reveal how your agent behaves across the three pillars of trustworthy AI: Reliability, Security, and Safety. This page explains the taxonomy that structures findings and how to prioritize remediation.
The Trust Score is a composite metric ranging from 0 to 1 that quantifies how much you can trust your agent in production. It aggregates performance across all evaluated dimensions.
| Score | Status | Interpretation |
|---|---|---|
| ≥ 0.70 | PASSED | Agent meets the trustworthiness threshold for deployment |
| < 0.70 | FAILED | Agent requires remediation before production use |
The 0.70 threshold represents a baseline for acceptable behavior: agents scoring below it exhibited failure modes that pose unacceptable risk in production.
A passing Trust Score indicates acceptable performance against tested scenarios. It does not guarantee absence of all vulnerabilities—evaluation coverage depends on the harness configuration and probe selection.
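The pass/fail logic above can be sketched in a few lines. Note that the aggregation shown here (an unweighted mean of per-dimension scores) is an illustrative assumption; the actual composite may weight dimensions differently depending on harness configuration.

```python
# Minimal sketch of Trust Score classification.
# Assumption: the composite is an unweighted mean of per-dimension
# scores in [0, 1]; the real aggregation may differ.

TRUST_THRESHOLD = 0.70  # deployment threshold from the table above

def trust_score(dimension_scores: dict[str, float]) -> float:
    """Aggregate per-dimension scores (each 0-1) into a composite score."""
    return sum(dimension_scores.values()) / len(dimension_scores)

def status(score: float) -> str:
    """Classify a composite score against the deployment threshold."""
    return "PASSED" if score >= TRUST_THRESHOLD else "FAILED"

scores = {"reliability": 0.82, "security": 0.74, "safety": 0.90}
composite = trust_score(scores)
print(f"{composite:.2f} -> {status(composite)}")  # 0.82 -> PASSED
```

Because the threshold comparison is `>=`, a score of exactly 0.70 passes.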
Prioritize remediation by severity.

Fix immediately (High severity):
- Reliability failures that affect core functionality

Address in next release (Medium severity):
- Consistency issues across sessions
- Minor compliance gaps
- Robustness failures on edge cases

Track and monitor (Low severity):
- Transparency improvements
- Minor formatting inconsistencies
- Rare edge case handling
Focus remediation on root causes rather than individual findings. Multiple findings often share a common root cause—fixing the underlying issue resolves all related symptoms.
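Root-cause-first triage can be sketched as grouping findings by their shared cause and ordering groups by the worst severity they contain. The `Finding` fields and severity labels below are illustrative assumptions, not a real export format.

```python
# Sketch of root-cause-first triage: group findings by a shared
# root-cause label, then order groups by their worst severity.
# Assumption: findings carry "high"/"medium"/"low" severity labels
# and a root_cause string assigned during triage.
from collections import defaultdict
from dataclasses import dataclass

SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}  # lower = more urgent

@dataclass
class Finding:
    id: str
    severity: str    # "high" | "medium" | "low"
    root_cause: str  # shared cause label, e.g. from manual triage

def group_by_root_cause(findings: list[Finding]) -> list[tuple[str, list[Finding]]]:
    """Group findings by root cause, ordered by each group's worst severity."""
    groups: dict[str, list[Finding]] = defaultdict(list)
    for f in findings:
        groups[f.root_cause].append(f)
    return sorted(
        groups.items(),
        key=lambda kv: min(SEVERITY_RANK[f.severity] for f in kv[1]),
    )

findings = [
    Finding("F-1", "medium", "missing input validation"),
    Finding("F-2", "high", "missing input validation"),
    Finding("F-3", "low", "verbose error messages"),
]
for cause, group in group_by_root_cause(findings):
    print(cause, [f.id for f in group])
```

Fixing the top-ranked root cause ("missing input validation" in this example) resolves both of its findings at once, which is the point of remediating causes rather than symptoms.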