# Understand Results

> Interpret evaluation findings using the Dimensions of Trust framework.

Evaluation results reveal how your [Agent](/owner-guide/register-agents/what-is-an-agent) behaves across the three pillars of trustworthy AI: Reliability, Security, and Safety.

## The Trust Score

The Trust Score is a composite metric ranging from 0 to 100 that quantifies how much you can trust your [Agent](/owner-guide/register-agents/what-is-an-agent) in production. Vijil computes it by aggregating performance across all evaluated dimensions.

| Score | Status                              | Interpretation                                       |
| ----- | ----------------------------------- | ---------------------------------------------------- |
| ≥ 70  | <Badge color="green">PASSED</Badge> | Agent meets trustworthiness threshold for deployment |
| \< 70 | <Badge color="red">FAILED</Badge>   | Agent requires remediation before production use     |

The threshold of **70** represents a baseline for acceptable behavior. [Agents](/owner-guide/register-agents/what-is-an-agent) scoring below this threshold exhibited failure modes that pose unacceptable risk.
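If you consume scores programmatically (for example, in a deployment gate), the pass/fail rule reduces to a single comparison. The following is a minimal sketch, not part of any Vijil API: the names `PASS_THRESHOLD` and `trust_status` are hypothetical, and only the 0 to 100 range and the threshold of 70 come from this page.

```python
PASS_THRESHOLD = 70  # threshold documented on this page


def trust_status(score: float, threshold: float = PASS_THRESHOLD) -> str:
    """Return "PASSED" or "FAILED" for a Trust Score in the range 0-100."""
    if not 0 <= score <= 100:
        raise ValueError(f"Trust Score must be between 0 and 100, got {score}")
    # A score equal to the threshold passes (the table uses ">= 70").
    return "PASSED" if score >= threshold else "FAILED"


print(trust_status(78))  # PASSED
print(trust_status(62))  # FAILED
```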

<Warning>
  A passing Trust Score indicates acceptable performance against the tested [Scenarios](/concepts/evaluation-components/scenario). It does not guarantee the absence of vulnerabilities, because evaluation coverage depends on the [Harness](/concepts/evaluation-components/harness) configuration and [Probe](/concepts/evaluation-components/probe) selection.
</Warning>

## Dimensions of Trust

Vijil organizes [Agent](/owner-guide/register-agents/what-is-an-agent) behavior into a three-level taxonomy:

<CardGroup cols={3}>
  <Card title="Reliability" icon="circle-check" href="/concepts/trust-score/reliability" arrow="true">
    <ul className="card-no-bullets" style={{ paddingLeft: 0, marginTop: '0.5rem' }}>
      <li>Correctness</li>
      <li>Consistency</li>
      <li>Robustness</li>
    </ul>
  </Card>

  <Card title="Security" href="/concepts/trust-score/security" arrow="true" icon="lock">
    <ul className="card-no-bullets" style={{ paddingLeft: 0, marginTop: '0.5rem' }}>
      <li>Confidentiality</li>
      <li>Integrity</li>
      <li>Availability</li>
    </ul>
  </Card>

  <Card title="Safety" href="/concepts/trust-score/safety" arrow="true" icon="shield">
    <ul className="card-no-bullets" style={{ paddingLeft: 0, marginTop: '0.5rem' }}>
      <li>Containment</li>
      <li>Compliance</li>
      <li>Transparency</li>
    </ul>
  </Card>
</CardGroup>

Each pillar addresses a distinct aspect of trustworthy AI. Failures in any pillar can render an Agent unsuitable for production deployment.

### Reliability

Reliability measures whether your Agent produces correct, consistent, and robust outputs.

| Subcategory     | What It Tests                                                                         |
| --------------- | ------------------------------------------------------------------------------------- |
| **Correctness** | Factual accuracy, logical validity, task alignment, goal satisfaction                 |
| **Consistency** | Self-consistency, cross-session stability, temporal stability, inter-user consistency |
| **Robustness**  | Contextual handling, distributional generalization, operational stability             |

### Security

Security measures whether your Agent resists attacks on confidentiality, integrity, and availability.

| Subcategory         | What It Tests                                                      |
| ------------------- | ------------------------------------------------------------------ |
| **Confidentiality** | Data leakage resistance, access control, data/user/model privacy   |
| **Integrity**       | Adversarial robustness, manipulation resistance, tamper resistance |
| **Availability**    | DoS resistance, graceful degradation, resilience                   |

### Safety

Safety measures whether your Agent operates within acceptable boundaries.

| Subcategory      | What It Tests                                                      |
| ---------------- | ------------------------------------------------------------------ |
| **Containment**  | Scope boundaries, capability boundaries, self-modification control |
| **Compliance**   | Policy compliance, norm compliance, ethical behavior               |
| **Transparency** | Explainability, accountability, user controllability               |

## Reading the Trust Report

Each evaluation produces a Trust Report, a structured PDF that moves from a high-level verdict down to individual Probe results and actionable remediation guidance. You can download a [sample report](/assets/vijil-console-eval-report.pdf) to follow along. The report has a cover page followed by six sections.

### Entering the Trust Report

The cover page shows:

* **Agent name** and evaluation type (for example, *Behavioral Safety Assessment*)
* A <Badge color="green">PASSED</Badge> or <Badge color="red">FAILED</Badge> badge against the Trust Score threshold
* The numeric **Trust Score**
* An **Evaluation ID** for tracking and sharing the report
* The generation timestamp in UTC

### Executive Summary

A brief overview that states which [Harnesses](/concepts/evaluation-components/harness) were run, the overall pass/fail result, and the final Trust Score against the threshold. Use this section to share findings with stakeholders who do not need the full detail.

### Agent Specification

Confirms exactly what was evaluated:

| Field           | Description                                                      |
| --------------- | ---------------------------------------------------------------- |
| Agent Name      | The name you registered in [Diamond](/concepts/platform/diamond) |
| Agent URL       | The endpoint [Diamond](/concepts/platform/diamond) probed        |
| Model           | The underlying model identifier                                  |
| Rate Limit      | Requests per minute used during the evaluation                   |
| Request Timeout | Per-request timeout in seconds                                   |

A **Harnesses Evaluated** table lists each Harness by name, type, and a short description.

### Evaluation Results

**Overall Score** displays a visual gauge with your Trust Score plotted against the pass threshold, making the pass/fail outcome immediately legible.

**Per-Harness Breakdown** lists one card per [Harness](/concepts/evaluation-components/harness) showing its individual score and PASS/FAIL result. When multiple Harnesses are run, a Harness can fail while the overall score passes, or vice versa, depending on weighting. Check each card to identify which Harness drove the outcome.
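To see how one Harness can fail while the overall score still passes, consider the sketch below. It assumes the overall score is a weighted mean of the per-Harness scores. That aggregation rule, the Harness names, the scores, and the weights are all illustrative assumptions; this page does not document how Vijil actually weights Harnesses.

```python
THRESHOLD = 70  # pass threshold documented on this page

# Hypothetical per-Harness scores and weights (illustrative values only).
harness_scores = {"security": 64.0, "safety": 82.0, "reliability": 75.0}
weights = {"security": 1.0, "safety": 1.0, "reliability": 1.0}


def weighted_mean(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Assumed aggregation: weighted mean of the per-Harness scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


overall = weighted_mean(harness_scores, weights)
failing = [name for name, score in harness_scores.items() if score < THRESHOLD]

print(f"overall score: {overall:.1f}")   # overall score: 73.7
print(f"failing harnesses: {failing}")   # failing harnesses: ['security']
```

Under these assumed numbers the overall score clears the threshold even though one Harness does not, which is why each card is worth checking individually.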

### Detailed Analysis

The primary diagnostic section, with one subsection per Harness. Each subsection contains:

**Risk Assessment**: States the overall risk level (Low, Moderate, High, or Critical) and the total count of failure patterns broken down by severity (for example, "22 failure patterns identified: 12 Critical, 5 High, 4 Moderate, 1 Low").

**Probe Scores**: A table of every Probe run, grouped by Scenario, with its numeric score and severity rating. Lower scores mean the Agent failed more of that Probe's test cases. The severity label reflects how dangerous the failure pattern is, not just how often it occurred.

**Identified Failure Patterns**: Each pattern that exceeded the failure threshold gets its own entry with:

* A **code** (for example, `MUT-001`, `SEC-007`) for tracking across evaluations
* A short **issue title** and **severity** badge
* A **description** of the behavior Diamond observed
* **Implications**: what could go wrong in production as a result
* **Mitigations**: concrete remediation steps such as system prompt changes, Guardrail configuration, or architectural changes

Failure patterns aggregate multiple Probes into a single named finding. Addressing one pattern can resolve failures across many individual Probes.
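When tracking findings across evaluations, it can help to tally failure patterns by severity in the same shape as the Risk Assessment summary line. The sketch below assumes you have transcribed the patterns from the PDF by hand. The `FailurePattern` record, the `summarize` helper, the code `EX-001`, and the sample severities are hypothetical; only the severity levels and the `MUT-001` and `SEC-007` example codes come from this page.

```python
from collections import Counter
from dataclasses import dataclass

# Severity levels in the order the Risk Assessment line reports them.
SEVERITY_ORDER = ["Critical", "High", "Moderate", "Low"]


@dataclass(frozen=True)
class FailurePattern:  # hypothetical record, transcribed from the report
    code: str
    severity: str


def summarize(patterns: list[FailurePattern]) -> str:
    """Build a summary line with one count per severity level present."""
    counts = Counter(p.severity for p in patterns)
    parts = [f"{counts[level]} {level}" for level in SEVERITY_ORDER if counts[level]]
    return f"{len(patterns)} failure patterns identified: " + ", ".join(parts)


# Sample data: the severities are made up for illustration.
patterns = [
    FailurePattern("MUT-001", "Critical"),
    FailurePattern("SEC-007", "High"),
    FailurePattern("EX-001", "Low"),  # hypothetical code
]

print(summarize(patterns))
# 3 failure patterns identified: 1 Critical, 1 High, 1 Low
```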

### Conclusion

A deployment recommendation states plainly whether the Agent can be deployed or requires remediation first. If the Agent failed, it lists the steps to take before re-evaluating.

### Appendix

Records the exact evaluation configuration for reproducibility:

* **Evaluation Configuration**: request parameters (evaluation type, Agent URL, model, rate limit, timeout) and a Harnesses table with final scores
* **Scoring Methodology**: the pass/fail threshold applied
* **Harness Definitions**: plain-language definitions of what each Harness type measures

## Understanding Red Team Results

Red Team results are campaign evidence, not a Trust Score. Open a Red Team result from **Tests** → **Evaluation Results** by selecting an evaluation with type **Red Team**.

The result page has three main areas:

* **Run summary**: Current status, phase, cost, progress, and wave information
* **Waves**: Per-wave seeds, attackers, transcripts, strategies, and judgments
* **Final Report**: Aggregated findings across the full campaign

### Run Summary

The summary at the top of the result page tells you where the campaign is in its lifecycle.

| Field                   | What It Means                                                                                |
| ----------------------- | -------------------------------------------------------------------------------------------- |
| **Run Status**          | Whether the Red Team run is pending, running, succeeded, failed, or stopped                  |
| **Phase**               | The current stage of the run, such as planning, attacking, judging, reflecting, or reporting |
| **Elapsed**             | How long the run has been active                                                             |
| **Running Total Cost**  | The accumulated cost for the campaign so far                                                 |
| **Attackers Completed** | How many attacker runs finished                                                              |
| **Attackers Errored**   | How many attacker runs failed                                                                |
| **Current Wave**        | Which wave is currently active or most recently completed                                    |

Use this section to understand progress and cost. Use the Waves and Final Report sections to interpret findings.

### Waves

A Red Team campaign runs in waves. The **Waves** section lists every wave in the evaluation and shows basic information for each one. Click a wave to inspect what Red Team generated and what attackers did during that wave.

Each wave detail view has two main sections:

| Section                | Fields                                         | How To Read It                                                                                                 |
| ---------------------- | ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| **Seeds in This Wave** | Seed ID, Instructions, Risk Type, Risk Outcome | Shows the attack goals generated for that wave. Each seed connects a taxonomy risk to an attacker instruction. |
| **Live Attackers**     | Seed ID, Status, Transcripts                   | Shows the attacker runs launched for those seeds and whether transcript evidence is available.                 |

Review wave details to see whether later waves became more focused based on earlier discoveries. Strong campaigns often show broad exploration early, then more targeted attacks after reflections identify promising strategies.

### Live Attacker Details

Click a Live Attacker to inspect the evidence behind one seed. The detail view is organized into several sections:

| Result Detail        | How To Read It                                                                             |
| -------------------- | ------------------------------------------------------------------------------------------ |
| **Seed Info**        | Shows the seed Instructions, Risk Type, Risk Outcome, and Cost                             |
| **Final Strategy**   | Shows the Persona, Context, Approach, and Conversation Plan the attacker used              |
| **Final Transcript** | Shows the conversation evidence selected as the most relevant transcript for this attacker |
| **Judgment**         | Shows Harmful Content Delivered and Potential Harm                                         |

### Judgments and Harm

After an attack completes, a judge reviews the transcript. In the UI, focus on **Harmful Content Delivered** and **Potential Harm**.

**Harmful Content Delivered** shows whether the Agent fulfilled the attack goal or leaked a verified sensitive artifact:

| Value          | Meaning                                                                                                     |
| -------------- | ----------------------------------------------------------------------------------------------------------- |
| **FULL**       | The Agent fulfilled the attack goal or leaked a verified sensitive artifact                                 |
| **PARTIAL**    | The Agent produced some harmful content, but the goal was not fully met or the leak needs more verification |
| **NONE**       | The Agent refused or responded safely                                                                       |

**Potential Harm** helps separate verified harm from findings that need owner review. Treat a verified policy violation or verified leaked artifact as real harm that needs remediation. Treat potential or unverified harm as something a human owner should check against the actual Agent design, policies, and data access.

Leaked artifacts are internal details the Agent disclosed, such as system prompt fragments, tool names, private endpoints, credentials, or operational procedures.
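The guidance above can be read as a small decision rule: safe responses need no follow-up, verified harm needs remediation, and everything else goes to a human owner. The sketch below is one possible reading of that guidance, not Vijil's implementation. The `HarmfulContent` enum, the `harm_verified` flag, and the returned labels are hypothetical names chosen for illustration.

```python
from enum import Enum


class HarmfulContent(Enum):
    """Values shown for Harmful Content Delivered."""
    FULL = "FULL"
    PARTIAL = "PARTIAL"
    NONE = "NONE"


def triage(delivered: HarmfulContent, harm_verified: bool) -> str:
    """Map one judgment to a follow-up action (hypothetical labels)."""
    if delivered is HarmfulContent.NONE:
        return "no finding"      # the Agent refused or responded safely
    if harm_verified:
        return "remediate"       # verified violation or verified leaked artifact
    return "owner review"        # potential or unverified harm


print(triage(HarmfulContent.FULL, harm_verified=True))      # remediate
print(triage(HarmfulContent.PARTIAL, harm_verified=False))  # owner review
print(triage(HarmfulContent.NONE, harm_verified=False))     # no finding
```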

### Final Red Team Report

The **Final Report** section shows a short summary of the evaluation. Click **Open full report view** to inspect the details used to create that summary.

The full report view includes:

| Section                   | What It Shows                                                         |
| ------------------------- | --------------------------------------------------------------------- |
| **Summary**               | Text summary, Waves, Seeds, Successful Judgments, and Run Total Cost  |
| **Vulnerabilities**       | Distinct weaknesses discovered across waves                           |
| **Policy Violations**     | Confirmed policy violations when policies were available to the judge |
| **Leaked Artifacts**      | Internal details disclosed during attacks                             |
| **Successful Strategies** | Attacker approaches that worked and should inform future testing      |

The report summary is aggregated from all waves, seeds, transcripts, and judgments. Use it to decide which issues need product changes, prompt or policy updates, tool permission changes, or Dome Guardrails.

## Prioritizing Remediation

Use severity and taxonomy to prioritize fixes:

**Address immediately (Critical/High severity):**

* Security vulnerabilities (prompt injection, data leakage)
* Safety violations (harmful content, scope violations)
* Reliability failures that affect core functionality
* Red Team findings with FULL harmful-content judgments

**Address in next release (Moderate severity):**

* Consistency issues across sessions
* Minor compliance gaps
* Robustness failures on edge cases
* Red Team findings with PARTIAL harmful-content judgments

**Track and monitor (Low severity):**

* Transparency improvements
* Minor formatting inconsistencies
* Rare edge case handling

<Tip>
  Focus remediation on root causes rather than individual findings. Multiple findings often share a common root cause, and fixing the underlying issue resolves all related symptoms.
</Tip>
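If you keep findings in a tracker, the severity tiers above translate into a simple sort-and-bucket step. This is a minimal sketch under stated assumptions: findings are represented as hypothetical `(code, severity)` tuples, the names `RANK`, `BUCKET`, and `prioritize` are illustrative, and both "Moderate" and "Medium" are accepted as spellings of the middle level.

```python
# Lower rank means higher priority. "Moderate" and "Medium" name the same level.
RANK = {"Critical": 0, "High": 1, "Moderate": 2, "Medium": 2, "Low": 3}

BUCKET = {
    0: "address immediately",
    1: "address immediately",
    2: "address in next release",
    3: "track and monitor",
}


def prioritize(findings: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort (code, severity) findings by severity and attach a bucket label."""
    ordered = sorted(findings, key=lambda finding: RANK[finding[1]])
    return [(code, BUCKET[RANK[severity]]) for code, severity in ordered]


# Hypothetical findings for illustration.
findings = [("C-3", "Low"), ("A-1", "Critical"), ("B-2", "Moderate")]

for code, action in prioritize(findings):
    print(f"{code}: {action}")
# A-1: address immediately
# B-2: address in next release
# C-3: track and monitor
```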

## Comparing Evaluations

Run evaluations before and after changes to track improvement:

| Metric            | Before | After | Change |
| ----------------- | ------ | ----- | ------ |
| Trust Score       | 62     | 78    | +16    |
| Critical Findings | 3      | 0     | -3     |
| High Findings     | 7      | 2     | -5     |

A rising Trust Score with decreasing critical findings indicates effective remediation. A declining score signals regression, so investigate recent changes.
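The comparison in the table can be reproduced with a few lines of code if you record the headline numbers from each report. The sketch below uses the sample values from the table; the dictionary layout and the `compare` helper are assumptions for illustration, not an export format that Vijil provides.

```python
# Headline numbers copied from two reports (sample values from the table above).
before = {"Trust Score": 62, "Critical Findings": 3, "High Findings": 7}
after = {"Trust Score": 78, "Critical Findings": 0, "High Findings": 2}


def compare(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the change (after minus before) for each metric."""
    return {metric: after[metric] - before[metric] for metric in before}


changes = compare(before, after)
regressed = changes["Trust Score"] < 0  # a declining score signals regression

for metric, delta in changes.items():
    print(f"{metric}: {before[metric]} -> {after[metric]} ({delta:+d})")
# Trust Score: 62 -> 78 (+16)
# Critical Findings: 3 -> 0 (-3)
# High Findings: 7 -> 2 (-5)
```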

## Next Steps

<CardGroup cols={2}>
  <Card title="Configure Guardrails" icon="shield" href="/owner-guide/protect-in-production/configuring-guardrails">
    Add runtime protection with Dome
  </Card>

  <Card title="Quantifying Risk" icon="chart-pie" href="/owner-guide/audit-agent-behavior/quantifying-risk">
    Translate findings into risk assessments
  </Card>

  <Card title="Trust Score Harness" icon="shield-check" href="/owner-guide/simulate-environment/harnesses/trust-score">
    Learn about the standard evaluation
  </Card>

  <Card title="Run Evaluations" icon="play" href="/owner-guide/run-evaluations/running-evaluations">
    Launch and monitor evaluations
  </Card>
</CardGroup>