TL;DR: The Trust Score quantifies how trustworthy an AI agent is in production on a scale from 0 to 100. A score at or above 70 meets the deployment threshold. The Trust Score breaks down into three dimensions: Reliability, Security, and Safety.
Trust is the willingness to accept risk in exchange for expected benefit. When you trust a person, a machine, or an organization, you are making a calculation, often unconsciously, about whether the reward of cooperation outweighs the risk of betrayal. Vijil evaluates LLM trustworthiness across three dimensions. For each dimension, it assesses vulnerability to several attack vectors. Each attack vector is treated as one evaluation module, and each module contains one or more tests.
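The hierarchy described above (dimension, then module per attack vector, then tests) can be pictured as a small nested data structure. The following is a minimal sketch for illustration only: the class names `Test`, `Module`, and `Dimension`, the helper `count_tests`, and the sample test names are all hypothetical and are not part of any Vijil API.

```python
from dataclasses import dataclass, field


@dataclass
class Test:
    """A single test case (hypothetical type, for illustration)."""
    name: str


@dataclass
class Module:
    """One evaluation module, corresponding to one attack vector."""
    attack_vector: str
    tests: list[Test] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The text states that each module contains one or more tests.
        if not self.tests:
            raise ValueError("a module must contain at least one test")


@dataclass
class Dimension:
    """One of the three dimensions: Reliability, Security, or Safety."""
    name: str
    modules: list[Module] = field(default_factory=list)


def count_tests(dimensions: list[Dimension]) -> int:
    """Count every test across all modules of all dimensions."""
    return sum(len(m.tests) for d in dimensions for m in d.modules)


# Illustrative data only: the test names below are made up.
security = Dimension(
    name="Security",
    modules=[
        Module(
            attack_vector="Prompt injection",
            tests=[Test("example-test-1"), Test("example-test-2")],
        )
    ],
)
total_tests = count_tests([security])
```

The sketch only models containment. It says nothing about how results are aggregated into a score, which this page does not specify.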

What the Trust Score Measures

| Score | Status | What It Means |
| --- | --- | --- |
| ≥ 70 | Passed | Agent meets the deployment threshold |
| < 70 | Failed | Agent requires remediation before production use |
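The pass/fail rule in the table above amounts to a single comparison against the threshold of 70. A minimal sketch, assuming the score arrives as a number between 0 and 100; the function name `trust_status` and the constant `DEPLOYMENT_THRESHOLD` are hypothetical names chosen for this example, not part of any Vijil API:

```python
DEPLOYMENT_THRESHOLD = 70  # threshold stated on this page


def trust_status(score: float) -> str:
    """Map a Trust Score (0 to 100) to its status label."""
    if not 0 <= score <= 100:
        raise ValueError(f"Trust Score must be between 0 and 100, got {score}")
    # A score at or above the threshold passes; anything lower fails.
    return "Passed" if score >= DEPLOYMENT_THRESHOLD else "Failed"
```

Note that the boundary is inclusive: a score of exactly 70 is "Passed".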
Each score breaks down into three dimensions:
| Dimension | Core Question | Example Failures |
| --- | --- | --- |
| Reliability | Does the agent do what it is supposed to do? | Hallucinations, inconsistent responses, task failures |
| Security | Can the agent resist adversarial manipulation? | Prompt injection, data leakage, jailbreaks |
| Safety | Does the agent stay within acceptable boundaries? | Policy violations, harmful content, bias |
A passing Trust Score reflects performance against the tested Scenarios only. It does not guarantee the absence of all vulnerabilities, because coverage depends on the Harness configuration and Probe selection.

Reliability

Learn more about Reliability

Safety

Learn more about Safety

Security

Learn more about Security

Next Steps

Reliability

Deep dive into correctness, consistency, and robustness

Security

Deep dive into confidentiality, integrity, and availability

Safety

Deep dive into containment, compliance, and transparency

Run an Evaluation

Get a Trust Score for your agent
Last modified on June 4, 2026