Evaluate Agents

Evaluations send adversarial Probes to your Agent and return a Trust Score. Harnesses define which Probes are sent.

Harnesses

Command	Description
`vijil harness list`	List standard Harnesses
`vijil harness custom-create`	Create a custom Harness
`vijil harness custom-list`	List custom Harnesses
`vijil harness custom-get`	Get a custom Harness
`vijil harness custom-prompts`	Get prompts from a custom Harness
`vijil harness custom-cancel`	Cancel a Harness being generated
`vijil harness custom-delete`	Delete a custom Harness

`vijil harness list`

List standard Harnesses available for Evaluations.

vijil harness list
vijil harness list --json

Standard Harnesses include safety, security, reliability, privacy, toxicity, and ethics.

`vijil harness custom-create`

Create a custom Harness for a specific Agent. Vijil generates Probes based on the Agent’s purpose and any Personas or Policies you attach.

vijil harness custom-create \
  --name "Customer Support Harness" \
  --agent-id "$AGENT_ID"

Flag	Description	Required
`--name`	Harness display name	Yes
`--agent-id`	Agent ID to generate Probes for	Yes
`--description`	Harness description	No
`--persona-ids`	Persona IDs to include (JSON array)	No
`--policy-ids`	Policy IDs to include (JSON array)	No
`--system-prompt`	Agent description or system prompt for generation	No
`--json`	Output as JSON	No
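
Because `--persona-ids` and `--policy-ids` take JSON arrays, the quoting needs care in a shell. The following is a minimal sketch that uses only the flags listed above. The function name and the `PERSONA_ID` and `POLICY_ID` variables are placeholders introduced for illustration, not part of the CLI.

```shell
# Hypothetical helper: wraps `vijil harness custom-create` with one Persona
# and one Policy attached. PERSONA_ID and POLICY_ID are placeholder variables
# that you would set to IDs that already exist in your team.
create_support_harness() {
  vijil harness custom-create \
    --name "Customer Support Harness" \
    --agent-id "$AGENT_ID" \
    --description "Probes for the customer support Agent" \
    --persona-ids "[\"$PERSONA_ID\"]" \
    --policy-ids "[\"$POLICY_ID\"]" \
    --json
}
```

The escaped inner quotes keep each ID a JSON string while still letting the shell expand the variable.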

`vijil harness custom-list`

List custom Harnesses for the active team.

vijil harness custom-list
vijil harness custom-list --agent-id "$AGENT_ID" --status completed

Flag	Description
`--agent-id`	Filter by Agent ID
`--status`	Filter by status
`--limit`	Maximum number of results (default 10)
`--offset`	Number of results to skip
`--json`	Output as JSON
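
`--limit` and `--offset` together allow paging through a long list. The sketch below turns a zero-based page number into an offset. The helper name `list_harness_page` is an assumption made for this example.

```shell
# Hypothetical helper: print one page of custom Harnesses as JSON.
# Usage: list_harness_page PAGE [PAGE_SIZE]   (PAGE starts at 0)
list_harness_page() {
  page=$1
  page_size=${2:-10}            # 10 mirrors the documented default for --limit
  offset=$((page * page_size))  # number of results to skip before this page
  vijil harness custom-list --limit "$page_size" --offset "$offset" --json
}
```

For example, `list_harness_page 0` requests the first 10 results, and `list_harness_page 2 25` skips 50 results and requests up to 25.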

`vijil harness custom-get`

Get a specific custom Harness by ID.

vijil harness custom-get <harness_id>

`vijil harness custom-prompts`

Retrieve the generated Probes for a custom Harness.

vijil harness custom-prompts <harness_id> --json

`vijil harness custom-cancel`

Cancel a Harness that is still being generated.

vijil harness custom-cancel <harness_id>

`vijil harness custom-delete`

Delete a custom Harness.

vijil harness custom-delete <harness_id>
vijil harness custom-delete <harness_id> --yes   # skip confirmation

Evaluations

Command	Description
`vijil eval run`	Start an Evaluation
`vijil eval status`	Check Evaluation status
`vijil eval results-detail`	Get full Evaluation results
`vijil eval list`	List Evaluations
`vijil eval report`	Generate a Trust Report
`vijil eval logs`	Get Evaluation logs
`vijil eval cancel`	Cancel a running Evaluation
`vijil eval delete`	Delete an Evaluation
`vijil eval results-list`	List completed Evaluations
`vijil eval list-all`	List all team Evaluations
`vijil eval summary-get`	Get an Evaluation summary
`vijil eval summary-by-agent`	Get latest summaries per Agent
`vijil eval summary-delete`	Delete an Evaluation summary

`vijil eval run`

Start a trust Evaluation against an Agent.

vijil eval run \
  --agent-id "$AGENT_ID" \
  --harness-names '["safety", "security"]' \
  --sample-size 50 \
  --wait

Flag	Description	Required
`--agent-id`	UUID of the Agent to evaluate	Yes
`--harness-names`	JSON array of Harness names to run	Yes
`--sample-size`	Probes to run per Harness (1–1000); omit to run all	No
`--harness-type`	`standard` (default) or `custom`	No
`--evaluation-id`	Use a specific UUID for the Evaluation	No
`--wait`	Poll until the Evaluation completes	No
`--json`	Output as JSON	No

Use `--sample-size 10` for fast iteration during development. Run the full Harness before releasing to production.
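
One way to keep the quick and full variants side by side is a small wrapper, sketched here. The function name `run_trust_eval` and its `quick` and `full` modes are assumptions for illustration. Every flag it passes comes from the table above.

```shell
# Hypothetical helper: run the safety and security Harnesses against $AGENT_ID.
# Usage: run_trust_eval quick|full
run_trust_eval() {
  case "$1" in
    quick)
      # Development: sample 10 Probes per Harness for a fast signal.
      vijil eval run --agent-id "$AGENT_ID" \
        --harness-names '["safety", "security"]' \
        --sample-size 10 --wait
      ;;
    full)
      # Pre-release: omit --sample-size so every Probe runs.
      vijil eval run --agent-id "$AGENT_ID" \
        --harness-names '["safety", "security"]' \
        --wait
      ;;
    *)
      echo "usage: run_trust_eval quick|full" >&2
      return 2
      ;;
  esac
}
```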

`vijil eval status`

Check the status of an Evaluation.

vijil eval status <evaluation_id>

A successful Evaluation moves through these statuses in order: starting → pending → running → completed → saving → saved. An Evaluation that does not finish ends as failed or canceled.

vijil eval status <evaluation_id> --json
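
If an Evaluation was started without `--wait`, a polling loop along these lines could follow it to the end. This is a sketch under assumptions: it presumes the `--json` output carries a top-level `status` field (not confirmed here), that `jq` is installed, and that `saved`, `failed`, and the cancelled state are the final statuses. Both function names are made up for the example.

```shell
# Hypothetical helper: succeed when the status is one the Evaluation will
# not leave. Both spellings of the cancelled state are accepted.
is_terminal_status() {
  case "$1" in
    saved|failed|canceled|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: poll until the Evaluation reaches a terminal status.
# Assumes a top-level "status" field in the --json output.
# Usage: wait_for_eval EVALUATION_ID [INTERVAL_SECONDS]
wait_for_eval() {
  while :; do
    eval_status=$(vijil eval status "$1" --json | jq -r '.status')
    if [ -z "$eval_status" ] || [ "$eval_status" = "null" ]; then
      echo "could not read a status for $1" >&2
      return 1
    fi
    echo "status: $eval_status"
    if is_terminal_status "$eval_status"; then
      break
    fi
    sleep "${2:-15}"   # default to 15 seconds between checks
  done
  [ "$eval_status" = "saved" ]   # non-zero exit for failed or cancelled runs
}
```

The final test makes the function's exit code usable in scripts: it succeeds only when the Evaluation ends as `saved`.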

`vijil eval results-detail`

Get the full results for a completed Evaluation.

vijil eval results-detail <evaluation_id>
vijil eval results-detail <evaluation_id> --json | jq '.scores'

Returns Trust Scores per Harness, per-Probe results, and identified failure patterns.
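
To keep results for later review or comparison between runs, the JSON output can be written to a file. A minimal sketch follows. The helper name and the `eval-results` default directory are assumptions, not CLI conventions.

```shell
# Hypothetical helper: save full results as JSON, one file per Evaluation.
# Usage: save_eval_results EVALUATION_ID [OUTPUT_DIR]
save_eval_results() {
  out_dir=${2:-eval-results}
  mkdir -p "$out_dir" || return 1
  vijil eval results-detail "$1" --json > "$out_dir/$1.json" || return 1
  echo "$out_dir/$1.json"   # print the path so callers can pass it on
}
```

The printed path can then be handed to `jq` or a diff tool.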

`vijil eval list`

List Evaluation summaries for the active team.

vijil eval list
vijil eval list --agent-id "$AGENT_ID" --status completed

Flag	Description
`--agent-id`	Filter by Agent ID
`--status`	Filter by status (`running`, `completed`, `failed`, `cancelled`)
`--harness-type`	Filter by Harness type (`standard` or `custom`)
`--tested-by`	Filter by tool that ran the Evaluation
`--limit`	Maximum number of results
`--offset`	Number of results to skip
`--json`	Output as JSON

`vijil eval report`

Generate a Trust Report for a completed Evaluation.

vijil eval report <evaluation_id>
vijil eval report <evaluation_id> --force-regenerate

Flag	Description
`--force-regenerate`	Regenerate even if a cached report exists
`--json`	Output as JSON

`vijil eval logs`

Get execution logs for an Evaluation.

vijil eval logs <evaluation_id> --json

`vijil eval cancel`

Cancel a running Evaluation.

vijil eval cancel <evaluation_id>

`vijil eval delete`

Delete an Evaluation and its results.

vijil eval delete <evaluation_id>
vijil eval delete <evaluation_id> --yes   # skip confirmation

`vijil eval results-list`

List completed Evaluations with results.

vijil eval results-list --limit 20 --json

`vijil eval list-all`

List all Evaluations across the team without filtering.

vijil eval list-all --json

`vijil eval summary-get`

Get the summary for a specific Evaluation.

vijil eval summary-get <evaluation_id> --json

`vijil eval summary-by-agent`

Get the latest Evaluation summary for each Agent in the team.

vijil eval summary-by-agent --json

`vijil eval summary-delete`

Delete an Evaluation summary.

vijil eval summary-delete <evaluation_id>
vijil eval summary-delete <evaluation_id> --yes   # skip confirmation
