Deployment Gates
A deployment gate is a checkpoint where an agent must demonstrate acceptable risk before proceeding. Gates convert subjective “is it ready?” conversations into objective pass/fail decisions.
Gate Structure
Each gate specifies:
| Component | Description |
|---|---|
| Trust Score threshold | Minimum overall score required (e.g., ≥ 0.70) |
| Dimension requirements | Minimum scores per pillar (Reliability, Security, Safety) |
| Severity limits | Maximum allowed findings at each severity level |
| Harness specification | Which evaluation harness must be run |
Example Gate Configuration
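A gate with the four components listed above could be expressed as a small data structure. The following is a minimal sketch in Python, not a product API: the `GateConfig` class, its field names, and the `"standard-harness"` identifier are illustrative assumptions. The threshold values are the ones used in the comparison table in this guide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class GateConfig:
    """Hypothetical container for one deployment gate (names are illustrative)."""

    name: str
    harness: str                     # which evaluation harness must be run
    min_trust_score: float           # minimum overall Trust Score
    min_dimension_scores: dict[str, float] = field(default_factory=dict)
    max_findings: dict[str, int] = field(default_factory=dict)


production_gate = GateConfig(
    name="production",
    harness="standard-harness",      # placeholder identifier, not a real harness name
    min_trust_score=0.70,
    min_dimension_scores={"reliability": 0.65, "security": 0.75, "safety": 0.70},
    max_findings={"critical": 0, "high": 2},
)
```

Keeping the gate in one immutable object makes it easy to store alongside the evaluation results it was applied to.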
Readiness Assessment
Run a Trust Score evaluation and compare results against your gate criteria:
Step 1: Run the Evaluation
Execute your standard harness against the candidate agent version.
Step 2: Compare Against Gates
| Criterion | Gate Requirement | Actual Result | Status |
|---|---|---|---|
| Trust Score | ≥ 0.70 | 0.78 | ✓ Pass |
| Reliability | ≥ 0.65 | 0.72 | ✓ Pass |
| Security | ≥ 0.75 | 0.81 | ✓ Pass |
| Safety | ≥ 0.70 | 0.74 | ✓ Pass |
| Critical findings | 0 | 0 | ✓ Pass |
| High findings | ≤ 2 | 1 | ✓ Pass |
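The comparison in the table above can be automated so that the pass/fail decision is reproducible. This is a minimal sketch assuming the evaluation results are already available as a plain dictionary; the names `GATE`, `RESULTS`, and `compare_against_gate` are hypothetical, and the values are copied from the table.

```python
import operator

# Gate requirements from the table: (comparison, limit) per criterion.
# "Critical findings: 0" is encoded as "<= 0".
GATE = {
    "Trust Score": (">=", 0.70),
    "Reliability": (">=", 0.65),
    "Security": (">=", 0.75),
    "Safety": (">=", 0.70),
    "Critical findings": ("<=", 0),
    "High findings": ("<=", 2),
}

# Actual results from the table.
RESULTS = {
    "Trust Score": 0.78,
    "Reliability": 0.72,
    "Security": 0.81,
    "Safety": 0.74,
    "Critical findings": 0,
    "High findings": 1,
}

_OPS = {">=": operator.ge, "<=": operator.le}


def compare_against_gate(gate, results):
    """Return (gate_passed, rows); each row is (criterion, requirement, actual, passed)."""
    rows = []
    for criterion, (op, limit) in gate.items():
        actual = results.get(criterion)
        # A missing result counts as a failure rather than a silent pass.
        passed = actual is not None and _OPS[op](actual, limit)
        rows.append((criterion, f"{op} {limit}", actual, passed))
    return all(row[3] for row in rows), rows


gate_passed, rows = compare_against_gate(GATE, RESULTS)
for criterion, requirement, actual, passed in rows:
    print(f"{criterion:<18} {requirement:<8} {actual!s:<6} {'Pass' if passed else 'Fail'}")
print("Gate passed:", gate_passed)
```

The gate passes only if every criterion passes, so a single failing dimension blocks deployment even when the overall Trust Score is above its threshold.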
Step 3: Document the Decision
Record the evaluation results, gate configuration, and deployment decision for audit purposes.
Conditional Deployment
When an agent fails gate criteria but business needs require deployment, conditional deployment allows proceeding with compensating controls.
When to Use Conditional Deployment
- Agent scores slightly below threshold but improves on previous version
- Critical business deadline with acceptable residual risk
- Compensating controls adequately mitigate identified failures
Compensating Controls
| Failure Type | Compensating Control |
|---|---|
| Prompt injection vulnerability | Deploy with input guardrails blocking injection patterns |
| Hallucination risk | Add output guardrails for factual verification |
| Scope boundary issues | Restrict agent to narrow use case with limited tools |
| Privacy concerns | Deploy with PII redaction on input and output |
Conditional Deployment Checklist
Before approving conditional deployment:
1. Document residual risk. List each failing criterion and its risk score.
2. Identify compensating controls. Map specific Dome guards to each identified risk.
3. Set monitoring thresholds. Define alert triggers for production behavior.
4. Establish remediation timeline. Commit to addressing root causes by a specific date.
5. Obtain explicit sign-off. Get written approval from both Business Owner and Risk Officer.
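The five checklist steps can be enforced mechanically before a conditional approval is recorded. The sketch below is one possible shape, not a prescribed workflow: `ConditionalRequest`, its fields, and `outstanding_items` are hypothetical names, and the example scores, control text, and date are made-up illustration data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from typing import Optional

REQUIRED_APPROVERS = {"Business Owner", "Risk Officer"}


@dataclass
class ConditionalRequest:
    """Hypothetical record of a conditional deployment request."""

    residual_risks: dict[str, float] = field(default_factory=dict)       # failing criterion -> score
    compensating_controls: dict[str, str] = field(default_factory=dict)  # failing criterion -> control
    monitoring_alerts: list[str] = field(default_factory=list)
    remediation_due: Optional[date] = None
    approvals: set[str] = field(default_factory=set)


def outstanding_items(req: ConditionalRequest) -> list[str]:
    """Return the checklist steps that are still incomplete."""
    missing = []
    if not req.residual_risks:
        missing.append("Document residual risk")
    # Every documented risk needs a mapped control; with no risks documented,
    # the mapping cannot be considered complete either.
    if not req.residual_risks or set(req.residual_risks) - set(req.compensating_controls):
        missing.append("Identify compensating controls")
    if not req.monitoring_alerts:
        missing.append("Set monitoring thresholds")
    if req.remediation_due is None:
        missing.append("Establish remediation timeline")
    if not REQUIRED_APPROVERS <= req.approvals:
        missing.append("Obtain explicit sign-off")
    return missing


request = ConditionalRequest(
    residual_risks={"Security": 0.72},  # illustrative: slightly below a 0.75 threshold
    compensating_controls={"Security": "input guardrails blocking injection patterns"},
    monitoring_alerts=["alert when blocked-injection rate exceeds the agreed limit"],
    remediation_due=date(2025, 6, 30),  # illustrative date
    approvals={"Business Owner"},
)
print(outstanding_items(request))  # ['Obtain explicit sign-off']
```

An empty result means all five steps are complete; anything else should block the conditional approval.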
Sign-Off Workflow
Deployment decisions require alignment between stakeholders with different priorities.
Roles and Responsibilities
| Role | Responsibility | Signs Off On |
|---|---|---|
| Business Owner | Accountable for agent value delivery | Functional readiness, business case |
| Risk Officer | Accountable for risk management | Security, safety, compliance |
| Technical Lead | Accountable for implementation quality | Technical readiness, monitoring |
Sign-Off Process
Documentation Requirements
Every deployment decision should include:
- Evaluation report: Full Trust Report with scores and findings
- Gate criteria: The specific thresholds applied
- Comparison table: Actual vs. required for each criterion
- Decision record: Approved, conditionally approved, or rejected
- Sign-off signatures: Business Owner and Risk Officer approval
- Conditions (if applicable): Compensating controls and remediation timeline
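One way to keep these items together for audit purposes is a single serializable record. The following is a minimal sketch; `DecisionRecord`, its field names, the report path, and the approver names are hypothetical placeholders. The validation rules mirror the list above: a conditional approval must carry conditions, and both required sign-offs must be present.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

DECISIONS = {"approved", "conditionally_approved", "rejected"}
REQUIRED_SIGN_OFFS = {"Business Owner", "Risk Officer"}


@dataclass
class DecisionRecord:
    """Hypothetical audit record for one deployment decision."""

    agent_version: str
    evaluation_report: str             # reference to the full Trust Report
    gate_criteria: dict                # the specific thresholds applied
    comparison: list                   # rows of actual vs. required
    decision: str                      # one of DECISIONS
    sign_offs: dict                    # role -> approver
    conditions: Optional[dict] = None  # compensating controls and remediation timeline

    def __post_init__(self):
        if self.decision not in DECISIONS:
            raise ValueError(f"unknown decision: {self.decision!r}")
        if self.decision == "conditionally_approved" and not self.conditions:
            raise ValueError("conditional approval requires conditions")
        missing = REQUIRED_SIGN_OFFS - set(self.sign_offs)
        if missing:
            raise ValueError(f"missing sign-off from: {sorted(missing)}")


record = DecisionRecord(
    agent_version="v1.3",
    evaluation_report="reports/v1.3-trust-report.json",  # placeholder path
    gate_criteria={"trust_score": 0.70, "security": 0.75},
    comparison=[{"criterion": "Trust Score", "required": ">= 0.70", "actual": 0.78}],
    decision="approved",
    sign_offs={"Business Owner": "A. Example", "Risk Officer": "B. Example"},
)
print(json.dumps(asdict(record), indent=2))
```

Serializing to JSON keeps the record easy to archive next to the Trust Report it refers to.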
Setting Thresholds
Thresholds should reflect your organization’s risk tolerance and the agent’s deployment context.
Threshold Guidelines by Use Case
| Use Case | Trust Score | Security | Notes |
|---|---|---|---|
| Internal tool | ≥ 0.60 | ≥ 0.65 | Lower risk exposure |
| Customer-facing | ≥ 0.70 | ≥ 0.75 | Reputation and compliance risk |
| Regulated industry | ≥ 0.80 | ≥ 0.85 | Regulatory requirements |
| High-stakes decisions | ≥ 0.85 | ≥ 0.90 | Material business impact |
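The guideline table can serve as a lookup for starting thresholds. The sketch below encodes the table values and, as an assumption not stated in the guidelines, takes the strictest value when an agent falls under more than one use case. The names `THRESHOLDS_BY_USE_CASE` and `starting_thresholds` are hypothetical.

```python
# Values copied from the guideline table; keys are illustrative identifiers.
THRESHOLDS_BY_USE_CASE = {
    "internal_tool": {"trust_score": 0.60, "security": 0.65},
    "customer_facing": {"trust_score": 0.70, "security": 0.75},
    "regulated_industry": {"trust_score": 0.80, "security": 0.85},
    "high_stakes_decisions": {"trust_score": 0.85, "security": 0.90},
}


def starting_thresholds(use_cases):
    """Return the strictest thresholds across all applicable use cases.

    Assumption: when several use cases apply, the highest minimum wins.
    Raises KeyError for an unknown use case and ValueError for an empty list.
    """
    if not use_cases:
        raise ValueError("at least one use case is required")
    selected = [THRESHOLDS_BY_USE_CASE[name] for name in use_cases]
    return {
        metric: max(thresholds[metric] for thresholds in selected)
        for metric in ("trust_score", "security")
    }


print(starting_thresholds(["customer_facing", "regulated_industry"]))
# {'trust_score': 0.8, 'security': 0.85}
```

Treat the result as a baseline and adjust it for the agent's specific context.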
Adjusting Thresholds
Raise thresholds when:
- Agent has access to sensitive data or critical systems
- Agent serves external users or customers
- Failures would trigger regulatory or legal consequences
- Agent operates with minimal human oversight
Lower thresholds when:
- Agent handles low-risk, internal tasks
- Strong compensating controls are in place
- Human review is part of every workflow
- Agent operates in a sandbox environment
Tracking Readiness Over Time
Monitor readiness across agent versions to identify trends:
| Version | Date | Trust Score | Critical | High | Status |
|---|---|---|---|---|---|
| v1.0 | Jan 15 | 0.58 | 2 | 8 | Failed |
| v1.1 | Feb 1 | 0.65 | 1 | 5 | Failed |
| v1.2 | Feb 15 | 0.72 | 0 | 2 | Passed |
| v1.3 | Mar 1 | 0.78 | 0 | 1 | Passed |
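Trends in a history like the one above can be computed rather than read by eye. This is a minimal sketch using the table's values; `Evaluation`, `score_deltas`, and `regressions` are hypothetical names, and the definition of a regression (a lower Trust Score or more critical or high findings than the previous version) is an assumption.

```python
from typing import NamedTuple


class Evaluation(NamedTuple):
    version: str
    trust_score: float
    critical: int
    high: int
    passed: bool


# Rows from the tracking table (dates omitted for brevity).
HISTORY = [
    Evaluation("v1.0", 0.58, 2, 8, False),
    Evaluation("v1.1", 0.65, 1, 5, False),
    Evaluation("v1.2", 0.72, 0, 2, True),
    Evaluation("v1.3", 0.78, 0, 1, True),
]


def score_deltas(history):
    """Trust Score change between each pair of consecutive versions."""
    return [
        round(curr.trust_score - prev.trust_score, 2)
        for prev, curr in zip(history, history[1:])
    ]


def regressions(history):
    """Versions that scored lower or had more critical/high findings than their predecessor."""
    flagged = []
    for prev, curr in zip(history, history[1:]):
        if (
            curr.trust_score < prev.trust_score
            or curr.critical > prev.critical
            or curr.high > prev.high
        ):
            flagged.append(curr.version)
    return flagged


print(score_deltas(HISTORY))
print(regressions(HISTORY))
```

A non-empty regression list is a useful trigger for review even when the version still passes its gate.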