Evaluation tells you how trustworthy your agent is. Defense keeps it that way in production. AI blue teaming covers defense mechanisms that proactively defend the agent or model against failure modes found through red-teaming tests. Currently popular blue-teaming methods include LLM firewalls, prompt augmentation, and safety guardrails. However, such methods are sometimes overly defensive, and they can be bypassed.1 In the longer term, deeper defense strategies such as adversarial fine-tuning and Constitutional AI2 may prove more robust, but technical challenges around computational stability and performance tradeoffs must be overcome before such techniques become mainstream. Using Vijil Dome, an enterprise AI engineer or developer can protect a generative AI system by:
- Applying Guardrails on system prompts
- Routing the input to and output from their app through scanners that block or redact harmful and malicious content
- Applying scanners through policies that map to internal usage restrictions, local/national/international regulations, and standards such as OWASP Top 10 for LLMs.
- Creating new policies or modifying existing policy components to adapt to a changing threat landscape.
Next Steps
- Guardrail: Configure protection pipelines
- Guard: Understand protection categories
- Detector: The detection engines
- Observe: Telemetry, metrics, and logging