POST /v1/evaluations
Create Evaluation
curl --request POST \
  --url https://api.example.com/v1/evaluations/ \
  --header 'Content-Type: application/json' \
  --data '
{
  "agent_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "team_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "harness_names": [
    "<string>"
  ]
}
'
{
  "evaluation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "status": "<string>"
}
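The curl example above can be reproduced with Python's standard library. A minimal sketch, using the placeholder base URL and IDs from the example (not real values):

```python
import json
import urllib.request

# Request body matching the curl example above; IDs are the
# placeholder UUIDs from the documentation, not real resources.
payload = {
    "agent_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "team_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "harness_names": ["safety"],
}

req = urllib.request.Request(
    url="https://api.example.com/v1/evaluations/",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending the request requires network access:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["evaluation_id"], body["status"])
```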

Body

application/json

Request model for creating a new evaluation.

agent_id
string<uuid>
required

UUID of the agent from Agent Registry

team_id
string<uuid>
required

UUID of the team that owns this evaluation

harness_names
string[]
required

List of harnesses to run (e.g., ['safety', 'ethics', 'privacy', 'security', 'toxicity'])

Minimum array length: 1
type
string
default:behavioral

Type of evaluation to run. Currently only 'behavioral' is supported.

harness_type
enum<string>
default:standard

Type of all harnesses: 'standard' or 'custom'. All harnesses in harness_names must be of this type.

Available options:
standard,
custom
sample_size
integer | null

Number of prompts to randomly sample per harness. If omitted, all prompts run (~1250 for security). Recommended: 10 for fast iteration, 50 for moderate coverage, 100 for a thorough run.

Required range: 1 <= x <= 1000
evaluation_id
string<uuid> | null

Optional UUID for the evaluation. If provided, this UUID will be used when creating the evaluation in Diamond. If not provided, Diamond will generate one.
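The constraints above (at least one harness name, harness_type one of 'standard' or 'custom', sample_size between 1 and 1000) can be checked client-side before sending. A hypothetical helper, not part of any SDK; field names and limits are taken from the parameter list:

```python
# Illustrative client-side validator for the request body described above.
ALLOWED_HARNESS_TYPES = {"standard", "custom"}

def validate_evaluation_request(body: dict) -> list:
    """Return a list of constraint violations; empty means the body looks valid."""
    errors = []
    # agent_id, team_id, and harness_names are the required fields.
    for field in ("agent_id", "team_id", "harness_names"):
        if field not in body:
            errors.append("missing required field: " + field)
    # harness_names has a minimum array length of 1.
    if "harness_names" in body and len(body["harness_names"]) < 1:
        errors.append("harness_names must contain at least 1 item")
    # harness_type defaults to 'standard' and must be in the enum.
    if body.get("harness_type", "standard") not in ALLOWED_HARNESS_TYPES:
        errors.append("harness_type must be 'standard' or 'custom'")
    # sample_size, when present, must satisfy 1 <= x <= 1000.
    sample_size = body.get("sample_size")
    if sample_size is not None and not 1 <= sample_size <= 1000:
        errors.append("sample_size must be between 1 and 1000")
    return errors
```

Passing a body with all required fields but an empty harness_names list returns ["harness_names must contain at least 1 item"].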

Response

Successful Response

Response model for evaluation creation.

evaluation_id
string<uuid>
required
status
string
required
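A minimal way to deserialize the response body; the dataclass below is illustrative and mirrors the two required fields of the model above:

```python
import json
from dataclasses import dataclass

@dataclass
class CreateEvaluationResponse:
    """Both fields are required in the response model above."""
    evaluation_id: str
    status: str

# Body shaped like the response example shown earlier.
raw = '{"evaluation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a", "status": "<string>"}'
resp = CreateEvaluationResponse(**json.loads(raw))
```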
Last modified on April 21, 2026