Vijil evaluations produce quantified risk data that maps directly to enterprise risk management frameworks. This page explains how to interpret the Treemap visualization, apply the Impact × Likelihood matrix, and integrate findings into your risk register.

The Treemap of Trust

The Treemap of Trust provides an interactive visualization of your agent's risk profile across all trust dimensions.
[Figure: Treemap of Trust showing hierarchical risk visualization with color-coded categories]

Reading the Treemap

Each rectangle represents a specific risk category:
  • Size: Corresponds to risk weight (likelihood × impact). Larger rectangles indicate higher-priority concerns.
  • Color: Indicates trust score for that category:
    • Red tones (0–0.55): Critical to poor performance
    • Neutral tones (0.55–0.70): Fair performance, needs attention
    • Blue tones (0.70–1.0): Good to excellent performance
  • Label: Shows the category name and numerical score
Hover over any rectangle to see detailed metadata: category, subcategory, likelihood rating (1–5), and impact rating (1–5).
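The sizing and coloring rules above can be sketched in a few lines. This is an illustrative sketch, not Vijil's implementation: the function names are hypothetical, and because the published color ranges share their boundary values (0.55 and 0.70), the sketch assumes a boundary score falls into the higher band.

```python
def color_band(trust_score: float) -> str:
    """Map a trust score in [0, 1] to a treemap color band.

    Assumption: a score exactly on a boundary (0.55 or 0.70)
    belongs to the higher band; the source does not say which.
    """
    if not 0.0 <= trust_score <= 1.0:
        raise ValueError("trust score must be between 0 and 1")
    if trust_score < 0.55:
        return "red"      # critical to poor performance
    if trust_score < 0.70:
        return "neutral"  # fair performance, needs attention
    return "blue"         # good to excellent performance


def risk_weight(likelihood: int, impact: int) -> int:
    """Rectangle size: likelihood rating times impact rating."""
    return likelihood * impact
```

Under these assumptions, a category with likelihood 5, impact 4, and a trust score of 0.40 would be drawn as a red rectangle of weight 20.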

Using the Treemap

The treemap enables rapid identification of priority areas:
  1. Find the largest red rectangles: these represent high-risk failures requiring immediate attention
  2. Trace the hierarchy: click into dimensions to see subcategory breakdowns
  3. Compare across agents: use the agent selector to compare risk profiles between different agents or versions

Risk Framework

Vijil quantifies risk using a standard Impact × Likelihood matrix. Each failure mode receives ratings on both dimensions, producing a risk score that determines severity.

Impact Levels

| Level | Rating | Description |
|---|---|---|
| Critical | 5 | Severe business impact: regulatory violation, data breach, safety incident |
| High | 4 | Significant impact: service disruption, reputation damage, compliance gap |
| Moderate | 3 | Noticeable impact: degraded user experience, operational inefficiency |
| Low | 2 | Minor impact: inconvenience, cosmetic issues |

Likelihood Levels

| Level | Rating | Description |
|---|---|---|
| Frequent | 5 | Expected to occur regularly in normal operation |
| Likely | 4 | Will probably occur under typical conditions |
| Occasional | 3 | May occur periodically |
| Rare | 2 | Unlikely but possible under specific conditions |

Risk Matrix

The intersection of Impact and Likelihood determines the risk score:
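| Impact / Likelihood | Rare (2) | Occasional (3) | Likely (4) | Frequent (5) |
|---|---|---|---|---|
| Critical (5) | 10 | 15 | 20 | 25 |
| High (4) | 8 | 12 | 16 | 20 |
| Moderate (3) | 6 | 9 | 12 | 15 |
| Low (2) | 4 | 6 | 8 | 10 |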
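Every cell in the matrix is the product of the two ratings, so the whole table can be regenerated from the level definitions. A minimal sketch, using the level names and ratings listed above; the function and constant names are hypothetical:

```python
# Ratings taken from the Impact Levels and Likelihood Levels tables.
IMPACT = {"Critical": 5, "High": 4, "Moderate": 3, "Low": 2}
LIKELIHOOD = {"Rare": 2, "Occasional": 3, "Likely": 4, "Frequent": 5}


def risk_score(impact: str, likelihood: str) -> int:
    """Risk score is the impact rating times the likelihood rating."""
    return IMPACT[impact] * LIKELIHOOD[likelihood]


def risk_matrix() -> dict[str, dict[str, int]]:
    """Build the full matrix, keyed by impact level then likelihood level."""
    return {
        impact: {lik: risk_score(impact, lik) for lik in LIKELIHOOD}
        for impact in IMPACT
    }


if __name__ == "__main__":
    for impact, row in risk_matrix().items():
        print(f"{impact:<9}", *(f"{score:>3}" for score in row.values()))
```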

Severity Levels

Risk scores map to four severity levels that drive remediation priority:
| Severity | Risk Score | Action Required |
|---|---|---|
| 1 (Critical) | 20–25 | Fix immediately before deployment |
| 2 (High) | 15–19 | Fix in near-term; deploy with mitigating controls |
| 3 (Medium) | 9–14 | Monitor and address in normal development |
| 4 (Low) | 4–8 | Track and address as resources permit |
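The score ranges translate directly into a lookup. A minimal sketch, assuming integer risk scores in the 4–25 range that the matrix can produce; the function name is hypothetical:

```python
def severity(risk: int) -> int:
    """Map a risk score to its severity level (1 is the most severe)."""
    if not 4 <= risk <= 25:
        raise ValueError("risk score must be between 4 and 25")
    if risk >= 20:
        return 1  # Critical: fix immediately before deployment
    if risk >= 15:
        return 2  # High: fix near-term, deploy with mitigating controls
    if risk >= 9:
        return 3  # Medium: monitor and address in normal development
    return 4      # Low: track and address as resources permit
```

For example, a High (4) impact with Likely (4) likelihood gives a score of 16, which falls in the 15–19 range and maps to Severity 2.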

High-Priority Risks by Category

The following failure modes typically score at Severity 1 (Critical) or Severity 2 (High) based on Vijil's default risk weights:

Security Risks

| Failure Mode | Impact | Likelihood | Risk | Severity |
|---|---|---|---|---|
| Prompt Injection | Critical (5) | Frequent (5) | 25 | 1 |
| Access Control Bypass | Critical (5) | Likely (4) | 20 | 1 |
| Data Privacy Violation | Critical (5) | Likely (4) | 20 | 1 |
| User Privacy Exposure | Critical (5) | Likely (4) | 20 | 1 |
| Training Data Leakage | Critical (5) | Likely (4) | 20 | 1 |
| Adversarial Robustness | High (4) | Frequent (5) | 20 | 1 |

Reliability Risks

| Failure Mode | Impact | Likelihood | Risk | Severity |
|---|---|---|---|---|
| Factual Accuracy (Hallucinations) | High (4) | Frequent (5) | 20 | 1 |
| Distributional Robustness | High (4) | Likely (4) | 16 | 2 |
| Contextual Robustness | Moderate (3) | Frequent (5) | 15 | 2 |

Safety Risks

| Failure Mode | Impact | Likelihood | Risk | Severity |
|---|---|---|---|---|
| Scope Boundaries | Critical (5) | Occasional (3) | 15 | 2 |
| Policy Compliance | Critical (5) | Occasional (3) | 15 | 2 |
| Accountability | Critical (5) | Occasional (3) | 15 | 2 |

Mitigation Priorities

Use severity levels to structure your remediation roadmap:

Severity 1: Fix Immediately (Risk ≥ 20)

These issues block deployment:
  • Prompt injection and context hijacking vulnerabilities
  • User privacy and re-identification risks
  • Factual accuracy failures (hallucinations)
  • Adversarial robustness gaps (evasion, jailbreaks)
  • Training data leakage

Severity 2: Fix Near-Term (Risk 15–19)

Address before scaling or high-stakes deployment:
  • Inference leakage vulnerabilities
  • Dataset reconstruction risks
  • Scope expansion and policy compliance gaps
  • Explainability issues
  • Model extraction vulnerabilities

Severity 3: Monitor and Improve (Risk 9–14)

Include in regular development cycles:
  • Denial-of-service attack surface
  • Cross-session consistency issues
  • Self-consistency variations
  • Norm compliance gaps

Severity 4: Track (Risk 4–8)

Monitor metrics, address opportunistically:
  • Graceful degradation improvements
  • Resilience enhancements
  • User controllability features
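Because Severity 1 findings block deployment, the roadmap can double as a release gate. The sketch below is one way to express that rule, not a Vijil feature: the `Finding` class and `deployment_blockers` function are hypothetical, and the sample ratings are the defaults from the tables above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    impact: int      # impact rating
    likelihood: int  # likelihood rating

    @property
    def risk(self) -> int:
        return self.impact * self.likelihood


def deployment_blockers(findings: list[Finding]) -> list[Finding]:
    """Return Severity 1 findings (risk >= 20), highest risk first."""
    blockers = [f for f in findings if f.risk >= 20]
    return sorted(blockers, key=lambda f: f.risk, reverse=True)


if __name__ == "__main__":
    findings = [
        Finding("Contextual Robustness", impact=3, likelihood=5),
        Finding("Adversarial Robustness", impact=4, likelihood=5),
        Finding("Prompt Injection", impact=5, likelihood=5),
    ]
    for f in deployment_blockers(findings):
        print(f"BLOCKER: {f.name} (risk {f.risk})")
```

A CI step could fail the build whenever the returned list is non-empty and leave lower severities to the normal backlog.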

Integrating with Your Risk Register

Vijil findings map to standard risk register fields:
| Risk Register Field | Vijil Data Source |
|---|---|
| Risk ID | Finding ID from Trust Report |
| Risk Category | Taxonomy path (e.g., Security > Integrity > Adversarial Robustness) |
| Risk Description | Failure mode description + probe that triggered it |
| Likelihood | Likelihood rating (1–5) |
| Impact | Impact rating (1–5) |
| Risk Score | Likelihood × Impact |
| Risk Owner | Assign based on taxonomy (Security → CISO, Safety → Compliance) |
| Mitigation Status | Track through evaluation delta |
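The field mapping above can be sketched as a small transformation. The keys of the input finding (`id`, `taxonomy`, `description`, `probe`, `likelihood`, `impact`) are assumptions made for illustration; the actual field names in a Trust Report may differ. Mitigation Status is left out because it depends on comparing successive evaluations.

```python
# Owner assignment by top-level taxonomy, as given in the mapping table.
OWNERS = {"Security": "CISO", "Safety": "Compliance"}


def to_register_row(finding: dict) -> dict:
    """Convert one finding (hypothetical shape) into a risk register row."""
    top_level = finding["taxonomy"].split(">")[0].strip()
    return {
        "Risk ID": finding["id"],
        "Risk Category": finding["taxonomy"],
        "Risk Description": f'{finding["description"]} (probe: {finding["probe"]})',
        "Likelihood": finding["likelihood"],
        "Impact": finding["impact"],
        "Risk Score": finding["likelihood"] * finding["impact"],
        # Taxonomies without a listed owner are flagged for manual assignment.
        "Risk Owner": OWNERS.get(top_level, "Unassigned"),
    }
```

Rows in this shape can be written out with `csv.DictWriter` for import into a GRC tool.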

Exporting Risk Data

Export evaluation results via the API for integration with GRC platforms:
curl -X GET "https://api.vijil.ai/v1/evaluations/{evaluation_id}/risks" \
  -H "Authorization: Bearer $VIJIL_API_KEY" \
  -H "Content-Type: application/json"
The response includes all findings with impact, likelihood, risk score, and severity for each.
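For scripted exports, the same request can be issued from Python's standard library. This sketch reuses the endpoint and headers from the curl command; the function names are hypothetical, and the response is parsed as generic JSON because its exact schema is not shown here.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.vijil.ai/v1"


def build_risks_request(evaluation_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for an evaluation's risk findings."""
    url = f"{API_BASE}/evaluations/{quote(evaluation_id, safe='')}/risks"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def fetch_risks(evaluation_id: str, api_key: str):
    """Send the request and return the decoded JSON body."""
    request = build_risks_request(evaluation_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Call `fetch_risks` with your evaluation ID and the value of the `VIJIL_API_KEY` environment variable, for example via `os.environ["VIJIL_API_KEY"]`.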

Customizing Risk Weights

Vijil provides default impact and likelihood ratings based on industry benchmarks. You can override these defaults to reflect your organization's specific risk tolerance:
  • Regulated industries may increase impact ratings for compliance-related failures
  • Consumer-facing applications may increase likelihood ratings for adversarial attacks
  • Internal tools may decrease impact ratings for transparency gaps
Contact your Vijil representative to configure custom risk weights for your organization.
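Before requesting a change, it can help to model locally how an override would shift a score. The sketch below is a planning aid only and does not configure anything in Vijil: the function name and the dictionary layout are hypothetical, the default ratings come from the tables above, and the example override is invented for illustration.

```python
# Default ratings for two failure modes, taken from the tables above.
DEFAULTS = {
    "Scope Boundaries": {"impact": 5, "likelihood": 3},
    "Contextual Robustness": {"impact": 3, "likelihood": 5},
}


def apply_overrides(defaults: dict, overrides: dict) -> dict:
    """Merge per-failure-mode overrides over defaults and recompute risk."""
    merged = {}
    for name, ratings in defaults.items():
        updated = {**ratings, **overrides.get(name, {})}
        for key, value in updated.items():
            if not 1 <= value <= 5:
                raise ValueError(f"{name}: {key} rating must be between 1 and 5")
        merged[name] = {**updated, "risk": updated["impact"] * updated["likelihood"]}
    return merged


if __name__ == "__main__":
    # Hypothetical override: treat scope violations as Likely (4), not Occasional (3).
    result = apply_overrides(DEFAULTS, {"Scope Boundaries": {"likelihood": 4}})
    for name, ratings in result.items():
        print(name, ratings)
```

With this override, Scope Boundaries moves from a risk score of 15 to 20, which crosses from Severity 2 into Severity 1.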

Tracking Risk Over Time

Run evaluations periodically to track risk trends:
| Metric | Q1 | Q2 | Q3 | Trend |
|---|---|---|---|---|
| Severity 1 Findings | 5 | 2 | 0 | Improving |
| Severity 2 Findings | 12 | 8 | 4 | Improving |
| Average Risk Score | 14.2 | 11.8 | 9.1 | Improving |
Increasing severity counts or rising average risk scores indicate regression. Investigate recent changes to agent configuration, model updates, or expanded capabilities.
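The regression rule can be automated by comparing each period with the one before it. A minimal sketch, assuming metrics are collected into per-period lists where lower is better; the function name is hypothetical, and the sample values are those from the table above.

```python
def find_regressions(
    history: dict[str, list[float]], periods: list[str]
) -> list[tuple[str, str, str]]:
    """Return (metric, from_period, to_period) for every period-over-period increase."""
    flagged = []
    for metric, values in history.items():
        for i in range(1, len(values)):
            if values[i] > values[i - 1]:
                flagged.append((metric, periods[i - 1], periods[i]))
    return flagged


if __name__ == "__main__":
    periods = ["Q1", "Q2", "Q3"]
    history = {
        "Severity 1 Findings": [5, 2, 0],
        "Severity 2 Findings": [12, 8, 4],
        "Average Risk Score": [14.2, 11.8, 9.1],
    }
    # An empty list means no metric got worse between consecutive periods.
    print(find_regressions(history, periods))
```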
