The Trust Score
At the center of Vijil is the Trust Score, a quantitative measure of an agent’s trustworthiness across three dimensions: reliability, security, and safety.
Reliability
Does the agent do what it’s supposed to do, consistently and accurately?
- Correctness: Produces accurate and valid outputs
- Consistency: Behaves predictably across similar inputs
- Robustness: Handles edge cases and errors gracefully
Security
Can the agent resist adversarial attacks and protect sensitive data?
- Confidentiality: Protects sensitive data from exposure
- Integrity: Prevents unauthorized data modification
- Availability: Resists denial-of-service attacks
Safety
Does the agent avoid harmful outputs and respect boundaries?
- Containment: Operates within defined boundaries
- Compliance: Follows policies and regulations
- Transparency: Provides clear reasoning for decisions
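The three dimensions and their nine sub-dimensions can be pictured as a small two-level structure. The sketch below is illustrative only: the dimension names come from the lists above, but the 0-to-1 scale, the unweighted averaging, and the function names `dimension_scores` and `trust_score` are assumptions made for this example, not Vijil's actual scoring method or API.

```python
from statistics import mean

# Dimension and sub-dimension names are taken from the lists above.
DIMENSIONS = {
    "reliability": ("correctness", "consistency", "robustness"),
    "security": ("confidentiality", "integrity", "availability"),
    "safety": ("containment", "compliance", "transparency"),
}


def dimension_scores(sub_scores: dict[str, float]) -> dict[str, float]:
    """Roll sub-dimension scores up into one score per dimension.

    ASSUMPTION: each sub-score lies in [0, 1] and a dimension is the
    unweighted mean of its sub-dimensions. The real weighting is not
    described in this document.
    """
    result = {}
    for dimension, subs in DIMENSIONS.items():
        missing = [name for name in subs if name not in sub_scores]
        if missing:
            raise KeyError(f"missing sub-scores for {dimension}: {missing}")
        for name in subs:
            if not 0.0 <= sub_scores[name] <= 1.0:
                raise ValueError(f"{name} must be in [0, 1]")
        result[dimension] = mean(sub_scores[name] for name in subs)
    return result


def trust_score(sub_scores: dict[str, float]) -> float:
    """Hypothetical overall score: unweighted mean of the three dimensions."""
    return mean(list(dimension_scores(sub_scores).values()))
```

Keeping the per-dimension scores separate from the overall number makes it visible when a strong result in one dimension is masking a weak one in another.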
Build, Ship, Run, Evolve
Build with Depot
Start with components that are already hardened for trust. Depot provides guardrail models tuned for agent safety, hardened LLMs optimized for specific tasks, and pre-validated building blocks that reduce months of security work to days.
Ship with Diamond
Test your agents before you trust them. Diamond evaluates agent behavior against hundreds of scenarios: reliability under stress, resistance to prompt injection, and compliance with safety policies. You get a Trust Score and detailed findings in minutes, not weeks.
Run with Dome
Protect agents in production. Dome provides real-time guardrails that filter harmful inputs and outputs, detect anomalies, and enforce policies, all with latency measured in milliseconds. When something goes wrong, you know immediately.
Evolve with Darwin
Improve agents continuously. Darwin learns from production telemetry (the edge cases, the failures, the drift) and uses reinforcement learning to make agents more resilient over time. Trust isn’t static; Darwin keeps it current.
Next Steps
Understand the Trust Score
Deep dive into how trust is measured across reliability, safety, and security.
Get Started
Set up your account and run your first evaluation.