Protection Overview

Evaluation catches the vulnerabilities you know to test for. Attackers, however, will try things you did not anticipate, such as new prompt injection techniques, novel encoding tricks, and social engineering patterns that emerge after your last evaluation. Dome is Vijil’s runtime protection system. It intercepts every input and output, applies configurable Guardrails, and blocks attacks before they reach your agent or your users. When Diamond identifies vulnerabilities you cannot immediately fix, Dome provides defense-in-depth while you remediate.

How Dome Works

Dome wraps your agent with configurable Guardrails:

Figure: Dome defense flow, showing the input Guardrail, the agent, and the output Guardrail.

Component	Purpose
Guardrail	Pipeline of Guards (input or output)
Guard	Group of Detectors of one type
Detector	Individual detection method
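The hierarchy in the table can be pictured as three nested types. The following is a minimal sketch only: the class names, fields, and the toy keyword Detector are hypothetical and are not Dome's actual classes or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Hypothetical types for illustration; these are not Dome's classes.
@dataclass
class Detector:
    name: str
    flags: Callable[[str], bool]  # returns True when the text should be flagged


@dataclass
class Guard:
    kind: str  # e.g. "security", "moderation", "privacy"
    detectors: List[Detector] = field(default_factory=list)

    def flagged_by(self, text: str) -> List[str]:
        # Names of the Detectors in this Guard that flag the text.
        return [d.name for d in self.detectors if d.flags(text)]


@dataclass
class Guardrail:
    direction: str  # "input" or "output"
    guards: List[Guard] = field(default_factory=list)

    def scan(self, text: str) -> List[str]:
        # Run every Guard in order and collect the Detectors that fired.
        hits: List[str] = []
        for guard in self.guards:
            hits.extend(guard.flagged_by(text))
        return hits


# A toy Detector standing in for a real model or heuristic.
toy_injection = Detector(
    name="toy-injection-keyword",
    flags=lambda text: "ignore previous instructions" in text.lower(),
)

input_guardrail = Guardrail(
    direction="input",
    guards=[Guard(kind="security", detectors=[toy_injection])],
)
```

In this sketch, `input_guardrail.scan(text)` returns the names of the Detectors that fired, so an empty list means nothing was flagged.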

Protection Types

Security Guards

Detect and block adversarial attacks:

Detector	What It Catches
`prompt-injection-mbert`	Injected instructions in user input
`prompt-injection-deberta-v3-base`	Advanced injection attempts
`encoding-heuristics`	Base64, Unicode, and encoding attacks
`security-embeddings`	Semantic similarity to known attacks
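To give a feel for what an encoding check might look at, here is a deliberately naive heuristic that flags long Base64 runs which decode to printable ASCII. It is an illustrative assumption, not the logic behind `encoding-heuristics`; the function name and the 24-character threshold are made up for this sketch.

```python
import base64
import binascii
import re

# Runs of 24+ Base64 alphabet characters, optionally padded.
# The 24-character threshold is an arbitrary assumption.
_B64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def looks_like_base64_payload(text: str) -> bool:
    """Return True if text contains a long run that decodes to printable ASCII."""
    for match in _B64_RUN.finditer(text):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue  # not a well-formed Base64 length
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue
        # Printable ASCII output suggests hidden text rather than random data.
        if decoded and all(32 <= byte < 127 for byte in decoded):
            return True
    return False
```

A heuristic this simple will miss nested or split encodings and can misfire on long identifiers, which is one reason to pair it with the other Detectors in the table.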

Moderation Guards

Filter harmful and inappropriate content:

Detector	What It Catches
`moderation-flashtext`	Fast keyword-based toxicity
`moderation-deberta`	Neural toxicity classification
`moderations-oai-api`	OpenAI Moderation API
`moderation-llamaguard`	Llama Guard safety model

Privacy Guards

Prevent exposure of sensitive data:

Detector	What It Catches
`privacy-presidio`	PII (names, emails, SSN, etc.)
`detect-secrets`	API keys, passwords, credentials

Quick Start

You can protect your agents with default Guards. The default configuration includes:

Input: Prompt injection detection, encoding heuristics, moderation
Output: Moderation, PII detection
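The input and output split can be sketched as a wrapper around the agent. Everything below is hypothetical: `protect`, the toy checks, and the blocked message are stand-ins for illustration and do not reflect Dome's programmatic interface.

```python
from typing import Callable, List

Check = Callable[[str], bool]  # returns True when the text should be blocked

BLOCKED_MESSAGE = "Request blocked by guardrail."  # hypothetical fallback text


def protect(
    agent: Callable[[str], str],
    input_checks: List[Check],
    output_checks: List[Check],
) -> Callable[[str], str]:
    """Wrap an agent so every input and output passes through checks."""

    def guarded(prompt: str) -> str:
        if any(check(prompt) for check in input_checks):
            return BLOCKED_MESSAGE  # the agent never sees the prompt
        reply = agent(prompt)
        if any(check(reply) for check in output_checks):
            return BLOCKED_MESSAGE  # the user never sees the reply
        return reply

    return guarded


# Toy stand-ins for real Guards.
def toy_injection(text: str) -> bool:
    return "ignore previous instructions" in text.lower()


def toy_pii(text: str) -> bool:
    return "@" in text  # crude email check, for illustration only


def echo_agent(prompt: str) -> str:
    return f"You said: {prompt}"


guarded_agent = protect(echo_agent, [toy_injection], [toy_pii])
```

Calling `guarded_agent("hello")` passes through untouched, while a prompt containing an injection phrase, or a reply containing an email-like string, comes back as the blocked message.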

Configuration Sources

You can pull configurations securely from your registered Vijil agent. Alternatively, you can define configurations using dictionaries or TOML files.
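As a rough idea of what a dictionary-based configuration could contain, the sketch below groups Detector names from this page by direction and Guard type, then checks them for typos. The key names and nesting are assumptions made for illustration; consult the configuration guide for the real schema.

```python
# Detector names taken from the tables on this page.
KNOWN_DETECTORS = {
    "prompt-injection-mbert",
    "prompt-injection-deberta-v3-base",
    "encoding-heuristics",
    "security-embeddings",
    "moderation-flashtext",
    "moderation-deberta",
    "moderations-oai-api",
    "moderation-llamaguard",
    "privacy-presidio",
    "detect-secrets",
}

# Hypothetical schema: the key names and nesting are assumptions.
example_config = {
    "input": {
        "security": ["prompt-injection-deberta-v3-base", "encoding-heuristics"],
        "moderation": ["moderation-flashtext"],
    },
    "output": {
        "moderation": ["moderation-deberta"],
        "privacy": ["privacy-presidio"],
    },
}


def unknown_detectors(config: dict) -> list:
    """Return Detector names in config that are not in KNOWN_DETECTORS."""
    return sorted(
        name
        for guards in config.values()
        for detectors in guards.values()
        for name in detectors
        if name not in KNOWN_DETECTORS
    )
```

Validating names before deployment catches misspelled Detectors early; a TOML file could carry the same structure as tables and arrays.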

Scan Results

Every scan returns a result object that reports whether the content is safe, a safe fallback message to use if it was flagged, and the execution trace. When content is flagged:

is_safe is false, signaling that the content was rejected
The result provides a safe fallback message
The trace shows which Detector flagged it and why
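One way to picture such a result and how calling code might consume it is sketched below. Only `is_safe` comes from the description above; the class names, the other field names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Hypothetical shapes for illustration; only `is_safe` is named on this page.
@dataclass
class TraceEntry:
    detector: str
    flagged: bool
    reason: str = ""


@dataclass
class ScanResult:
    is_safe: bool
    fallback_message: Optional[str] = None
    trace: List[TraceEntry] = field(default_factory=list)


def respond(result: ScanResult, original: str) -> str:
    """Return the original text when safe, otherwise the fallback message."""
    if result.is_safe:
        return original
    return result.fallback_message or "Content blocked."


def flagged_by(result: ScanResult) -> List[Tuple[str, str]]:
    """List (detector, reason) pairs for every trace entry that flagged."""
    return [(entry.detector, entry.reason) for entry in result.trace if entry.flagged]


blocked = ScanResult(
    is_safe=False,
    fallback_message="I can't help with that request.",
    trace=[
        TraceEntry("encoding-heuristics", flagged=False),
        TraceEntry("prompt-injection-mbert", flagged=True, reason="injection pattern"),
    ],
)
```

Keeping the trace alongside the verdict lets logs and dashboards explain a block without rerunning the scan.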

Framework Integrations

Dome is designed to integrate with popular frameworks and runtimes. During the developer access phase, see the framework-specific developer guides for integration patterns.

Performance Options

You can configure Dome to stop processing as soon as the first Guard flags content, or to run Guards concurrently rather than one after another.
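The two execution strategies can be contrasted with a small sketch. The function names and toy Guards are hypothetical, and a thread pool is just one possible way to run checks concurrently; this is not a description of Dome's internals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Guard = Callable[[str], bool]  # returns True when the Guard flags the text


def scan_early_exit(text: str, guards: Dict[str, Guard]) -> List[str]:
    """Run Guards in order and stop at the first one that flags the text."""
    for name, guard in guards.items():
        if guard(text):
            return [name]
    return []


def scan_concurrent(text: str, guards: Dict[str, Guard]) -> List[str]:
    """Run all Guards in a thread pool and report every one that flags."""
    with ThreadPoolExecutor(max_workers=len(guards) or 1) as pool:
        # pool.map preserves the order of guards.values().
        results = list(pool.map(lambda guard: guard(text), guards.values()))
    return [name for name, hit in zip(guards, results) if hit]


# Toy Guards standing in for real detection pipelines.
toy_guards: Dict[str, Guard] = {
    "security": lambda t: "ignore previous instructions" in t.lower(),
    "moderation": lambda t: "idiot" in t.lower(),
}
```

Early exit skips remaining work once content is already blocked, while the concurrent variant reports every Guard that fired, which can be more useful for diagnostics.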

Work in Progress

The programmatic protection capabilities are currently in private preview and subject to change.

Next Steps

Configure Guardrails: Detailed Guard configuration options
Use Guardrails: Runtime patterns and best practices
Custom Detectors: Build your own detection methods
Observability: Monitoring and tracing setup
