This quickstart evaluates your agent against Vijil’s Trust Score harness. You’ll wrap your existing agent function, run an evaluation, and see results—all in about 15 minutes.
## Prerequisites

- Python 3.9+
- A Vijil API key (get one here)
- An OpenAI API key (or another LLM provider)
## Install
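Assuming the client is published on PyPI under the same name as its import (`vijil`), installation is a single pip command:

```shell
pip install vijil openai
```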
## Set Credentials
```shell
export VIJIL_API_KEY="your-vijil-key"
export OPENAI_API_KEY="your-openai-key"
```
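Before kicking off an evaluation, it can help to confirm both keys are actually visible to Python. This small helper is illustrative only, not part of the Vijil SDK:

```python
import os

def check_keys(required=("VIJIL_API_KEY", "OPENAI_API_KEY")):
    """Return the names of any required environment variables that are unset."""
    return [var for var in required if not os.environ.get(var)]
```

An empty return value means both keys are set and the evaluation can proceed.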
## Evaluate Your Agent
Create a file called `evaluate.py`:
```python
from vijil import Vijil
from openai import OpenAI

# Define your agent as a function
def my_agent(prompt: str) -> str:
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

# Evaluate it
vijil = Vijil()
local_agent = vijil.local_agents.create(
    agent_function=my_agent,
    agent_name="my-first-agent",
)
vijil.local_agents.evaluate(
    agent_name="my-first-agent",
    harnesses=["trust_score"],
)
```
Run it:
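```shell
python evaluate.py
```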
The evaluation takes 10–15 minutes. When complete, you’ll see output like:
```
Trust Score: 0.82
├── Reliability: 0.91
├── Security: 0.74
└── Safety: 0.85

High-severity findings: 2
Medium-severity findings: 5
```
## What Just Happened?
1. Vijil wrapped your agent in a temporary HTTP server using ngrok
2. Diamond sent probes — adversarial prompts testing for hallucinations, prompt injection, jailbreaks, and more
3. Detectors analyzed responses — checking if your agent leaked data, followed malicious instructions, or violated safety policies
4. Results were aggregated into a Trust Score with specific findings
Your agent code wasn’t modified. The evaluation ran against your actual implementation.
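The wrapping step above can be sketched with the standard library alone. This is a minimal illustration of exposing an agent function over HTTP, not Vijil's actual implementation; the JSON shape (`prompt` in, `response` out) and the handler names are assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def my_agent(prompt: str) -> str:
    # Stand-in for your real agent function.
    return f"Echo: {prompt}"

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and pass its prompt to the agent.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        reply = my_agent(body["prompt"])
        payload = json.dumps({"response": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

def make_server(port: int = 8000) -> HTTPServer:
    """Bind the agent behind a local HTTP endpoint (port 0 picks a free port)."""
    return HTTPServer(("127.0.0.1", port), AgentHandler)
```

Calling `make_server(8000).serve_forever()` would serve the agent locally; a tunnel such as ngrok then makes that endpoint reachable by the evaluation service.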
## View Detailed Results
Open the Vijil Console to see:
- Per-probe results with the exact prompts and responses
- Failure explanations with remediation guidance
- Comparison with previous evaluations
## What’s Next?
- **Framework Guides**: Integrate with LangChain, Google ADK, or custom frameworks
- **Add Protection**: Block attacks at runtime with Dome guardrails
- **CI/CD Integration**: Run evaluations on every pull request
- **Understanding Results**: Interpret scores and prioritize fixes
Last modified on March 19, 2026