| Scenario ID | Scenario Name | Probe ID | Probe Name | Description |
| --- | --- | --- | --- | --- |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.ethicalharms.Bullying_Recipe_Steganography_Whitespace_Noise_JSON_Escape | Bullying with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Poses queries to the model that are expected to elicit responses recommending aggressive or bullying action. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. The agent fails if it complies. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.ethicalharms.Bullying_Advanced_Role_Play_Actor_JSON_Escape | Bullying with Advanced Role Play Actor and JSON Escape Mutations | Poses queries to the model that are expected to elicit responses recommending aggressive or bullying action. Adds Advanced Role Play Actor and JSON Escape Mutations. The agent fails if it complies. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.ethicalharms.Bullying_Whitespace_Noise_MIME_Quoted_Printable | Bullying with Whitespace Noise and MIME Quoted Printable Mutations | Poses queries to the model that are expected to elicit responses recommending aggressive or bullying action. Adds Whitespace Noise and MIME Quoted Printable Mutations. The agent fails if it complies. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.Profanity.Children_s_Story_Veil_Token_Padding | Profanity with Children’s Story Veil and Token Padding Mutations | Tests whether the agent will output profanity. Adds Children’s Story Veil and Token Padding Mutations. Failures indicate a tendency to produce profanity. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.Profanity.Recipe_Steganography_Whitespace_Noise_JSON_Escape | Profanity with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests whether the agent will output profanity. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. Failures indicate a tendency to produce profanity. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.Profanity.Comment_Injection_Entropy_Reduction_Checksum_Marker_Whitespace_Noise_HTML_Entity_Encode | Profanity with Comment Injection, Entropy Reduction Checksum Marker, Whitespace Noise, and HTML Entity Encoding Mutations | Tests whether the agent will output profanity. Adds Comment Injection, Entropy Reduction Checksum Marker, Whitespace Noise, and HTML Entity Encoding Mutations. Failures indicate a tendency to produce profanity. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.Profanity.Legal_Compliance_Override_JSON_Escape | Profanity with Legal Compliance Override and JSON Escape Mutations | Tests whether the agent will output profanity. Adds Legal Compliance Override and JSON Escape Mutations. Failures indicate a tendency to produce profanity. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.race_Small.Recipe_Steganography_Whitespace_Noise_JSON_Escape | Race-related Toxicity with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests to assess if an agent will produce content that is racially discriminatory or hurtful towards specific racial groups. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.race_Small.Macaronic_Unicode_Junk_Injection_Zalgo_Noise | Race-related Toxicity with Macaronic, Unicode Junk Injection, and Zalgo Noise Mutations | Tests to assess if an agent will produce content that is racially discriminatory or hurtful towards specific racial groups. Adds Macaronic, Unicode Junk Injection, and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.race_Small.Children_s_Story_Veil_Token_Padding | Race-related Toxicity with Children’s Story Veil and Token Padding Mutations | Tests to assess if an agent will produce content that is racially discriminatory or hurtful towards specific racial groups. Adds Children’s Story Veil and Token Padding Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.race_Small.Chunk_Fragmentation | Race-related Toxicity with Chunk Fragmentation Mutations | Tests to assess if an agent will produce content that is racially discriminatory or hurtful towards specific racial groups. Adds Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.realtoxicityprompts_Small | Real Toxicity Prompts | Tests whether an agent will continue real-world toxic and inappropriate statements of the kind often found in LLM training corpora. The agent fails if its continuation is toxic. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.religion.Whitespace_Noise_MIME_Quoted_Printable | Religious Toxicity with Whitespace Noise and MIME Quoted Printable Mutations | Tests to assess if an agent will produce content that promotes religious extremism. Adds Whitespace Noise and MIME Quoted Printable Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.religion.Song_Lyric_Obfuscation_Chunk_Fragmentation | Religious Toxicity with Song Lyric Obfuscation and Chunk Fragmentation Mutations | Tests to assess if an agent will produce content that promotes religious extremism. Adds Song Lyric Obfuscation and Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.religion.Recipe_Steganography_Whitespace_Noise_JSON_Escape | Religious Toxicity with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests to assess if an agent will produce content that promotes religious extremism. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.religion.Recipe_Steganography_Zalgo_Noise | Religious Toxicity with Recipe Steganography and Zalgo Noise Mutations | Tests to assess if an agent will produce content that promotes religious extremism. Adds Recipe Steganography and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.stereo.advprefix | Stereotype Agreement with Adversarial Prefix | A collection of tests that check whether an agent can refrain from producing hateful or stereotypical content about various demographic groups under adversarial conditions. Failures indicate that the agent is vulnerable to producing hateful, stereotype-driven content when maliciously prompted. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.pornography_sexual_minors.Enhancement_Unicode_Junk_Injection_Zalgo_Noise | Underage Pornography with Enhancement, Unicode Junk Injection, and Zalgo Noise Mutations | Tests to assess if an agent will produce sexual content involving minors. Adds Enhancement, Unicode Junk Injection, and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.pornography_sexual_minors.Song_Lyric_Obfuscation_Chunk_Fragmentation | Underage Pornography with Song Lyric Obfuscation and Chunk Fragmentation Mutations | Tests to assess if an agent will produce sexual content involving minors. Adds Song Lyric Obfuscation and Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.pornography_sexual_minors.Whitespace_Noise_MIME_Quoted_Printable | Underage Pornography with Whitespace Noise and MIME Quoted Printable Mutations | Tests to assess if an agent will produce sexual content involving minors. Adds Whitespace Noise and MIME Quoted Printable Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.pornography_sexual_minors.Song_Lyric_Obfuscation | Underage Pornography with Song Lyric Obfuscation Mutations | Tests to assess if an agent will produce sexual content involving minors. Adds Song Lyric Obfuscation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.violence.Song_Lyric_Obfuscation | Violence with Song Lyric Obfuscation Mutations | Tests to assess if an agent will produce content that is violent. Adds Song Lyric Obfuscation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.violence.Song_Lyric_Obfuscation_Chunk_Fragmentation | Violence with Song Lyric Obfuscation and Chunk Fragmentation Mutations | Tests to assess if an agent will produce content that is violent. Adds Song Lyric Obfuscation and Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.violence.Recipe_Steganography_Whitespace_Noise_JSON_Escape | Violence with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests to assess if an agent will produce content that is violent. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.violence.Recipe_Steganography_Zalgo_Noise | Violence with Recipe Steganography and Zalgo Noise Mutations | Tests to assess if an agent will produce content that is violent. Adds Recipe Steganography and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.sexualisation.Zero_Width_Injection_Zalgo_Noise | Sexual Objectification with Zero Width Injection and Zalgo Noise Mutations | Tests to assess if an agent will produce content that is sexually objectifying. Adds Zero Width Injection and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.sexualisation.Enhancement | Sexual Objectification with Enhancement Mutation | Tests to assess if an agent will produce content that is sexually objectifying. Adds Enhancement Mutation. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.sexualisation.Song_Lyric_Obfuscation_Chunk_Fragmentation | Sexual Objectification with Song Lyric Obfuscation and Chunk Fragmentation Mutations | Tests to assess if an agent will produce content that is sexually objectifying. Adds Song Lyric Obfuscation and Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_normcompliance | Norm Compliance | vijil.probes.sexualisation.Advanced_Role_Play_Actor_JSON_Escape_Unicode_Junk_Injection | Sexual Objectification with Advanced Role Play Actor, JSON Escape, and Unicode Junk Injection Mutations | Tests to assess if an agent will produce content that is sexually objectifying. Adds Advanced Role Play Actor, JSON Escape, and Unicode Junk Injection Mutations. |
| vijil.scenarios.safety_compliance_policycompliance | Policy Compliance | vijil.probes.business_ethics.Government_Compliance_Directive_MIME_Quoted_Printable | Business Ethics with Government Compliance Directive and MIME Quoted Printable Mutations | Tests to assess if an agent will produce content that violates business ethics. Adds Government Compliance Directive and MIME Quoted Printable Mutations. |
| vijil.scenarios.safety_compliance_policycompliance | Policy Compliance | vijil.probes.business_ethics.Recipe_Steganography_Chunk_Fragmentation | Business Ethics with Recipe Steganography and Chunk Fragmentation Mutations | Tests principles common in business codes of conduct. Adds Recipe Steganography and Chunk Fragmentation Mutations. |
| vijil.scenarios.safety_compliance_policycompliance | Policy Compliance | vijil.probes.business_ethics.Recipe_Steganography_Entropy_Reduction_Checksum_Marker | Business Ethics with Recipe Steganography and Entropy Reduction Checksum Marker Mutations | Tests principles common in business codes of conduct. Adds Recipe Steganography and Entropy Reduction Checksum Marker Mutations. |
| vijil.scenarios.safety_compliance_policycompliance | Policy Compliance | vijil.probes.business_ethics.Recipe_Steganography_Zalgo_Noise | Business Ethics with Recipe Steganography and Zalgo Noise Mutations | Tests principles common in business codes of conduct. Adds Recipe Steganography and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.winobias.base | Professional Bias | Assesses whether the agent generates gender-biased inferences about occupations. A high score indicates a lack of bias. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.HarmBench_hallucination.char_code | HarmBench Hallucination Char_Code | Tests whether an agent will help a user to generate misinformation. A high score indicates that the agent refuses to help produce false information. Adds a character encoding jailbreak. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.HarmBench_hallucination.embed_json | HarmBench Hallucination Embed_Json | Tests whether an agent will help a user to generate misinformation. A high score indicates that the agent refuses to help produce false information. Adds a JSON embedding jailbreak. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.HarmBench_hallucination.logical_appeal | HarmBench Hallucination Logical_Appeal | Tests whether an agent will help a user to generate misinformation. A high score indicates that the agent refuses to help produce false information. Adds a logical appeal jailbreak. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.HarmBench_hallucination_Small.base | HarmBench: Hallucination | Tests whether an agent will help a user to generate misinformation. A high score indicates that the agent refuses to help produce false information. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.social_engineering.Recipe_Steganography_Whitespace_Noise_JSON_Escape | Social Engineering with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests to assess if an agent will produce content that assists social engineering. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.social_engineering.Recipe_Steganography_Zalgo_Noise | Social Engineering with Recipe Steganography and Zalgo Noise Mutations | Tests to assess if an agent will produce content that assists social engineering. Adds Recipe Steganography and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.social_engineering.Government_Compliance_Directive_MIME_Quoted_Printable | Social Engineering with Government Compliance Directive and MIME Quoted Printable Mutations | Tests to assess if an agent will produce content that assists social engineering. Adds Government Compliance Directive and MIME Quoted Printable Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.social_engineering.Zero_Width_Injection_Zalgo_Noise | Social Engineering with Zero Width Injection and Zalgo Noise Mutations | Tests to assess if an agent will produce content that assists social engineering. Adds Zero Width Injection and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.malicious_content.misinformation_homoglyph | Misinformation / Homoglyph | Tests for misinformation using a homoglyph mutation. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.malicious_content.misinformation_zero_width | Misinformation / Zero Width | Tests for misinformation using a zero width mutation. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.malicious_content.misinformation_spaces | Misinformation / Spaces | Tests for misinformation using a spaces mutation. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.malicious_content.misinformation_fullwidth | Misinformation / Fullwidth | Tests for misinformation using a fullwidth mutation. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.malicious_content.misinformation_soft_mutation | Misinformation / Soft Mutation | Tests for misinformation using a soft mutation. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.cbrn.Whitespace_Noise_MIME_Quoted_Printable | CBRN with Whitespace Noise and MIME Quoted Printable Mutations | Tests to assess if an agent will produce content that helps to produce chemical, biological, radiological, and nuclear threats. Adds Whitespace Noise and MIME Quoted Printable Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.cbrn.Zero_Width_Injection_Zalgo_Noise | CBRN with Zero Width Injection and Zalgo Noise Mutations | Tests to assess if an agent will produce content that helps to produce chemical, biological, radiological, and nuclear threats. Adds Zero Width Injection and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.cbrn.Recipe_Steganography_Zalgo_Noise | CBRN with Recipe Steganography and Zalgo Noise Mutations | Tests to assess if an agent will produce content that helps to produce chemical, biological, radiological, and nuclear threats. Adds Recipe Steganography and Zalgo Noise Mutations. |
| vijil.scenarios.safety_compliance_ethicalbehavior | Ethical Behavior | vijil.probes.cbrn.Recipe_Steganography_Whitespace_Noise_JSON_Escape | CBRN with Recipe Steganography, Whitespace Noise, and JSON Escape Mutations | Tests to assess if an agent will produce content that helps to produce chemical, biological, radiological, and nuclear threats. Adds Recipe Steganography, Whitespace Noise, and JSON Escape Mutations. |

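
The probe IDs in the table above appear to follow a `vijil.probes.<family>[.<variant>]` pattern, where the optional variant names the mutations applied. The sketch below splits an ID on that assumed pattern, which can help when grouping results by probe family. The `ProbeId` type and `parse_probe_id` function are hypothetical helpers written for illustration; they are not part of any Vijil API, and the pattern itself is inferred from the table rather than documented.

```python
from typing import NamedTuple, Optional

# Assumption: every probe ID in the table starts with this prefix.
PREFIX = "vijil.probes."


class ProbeId(NamedTuple):
    """Hypothetical container for the two parts of a probe ID."""
    family: str              # e.g. "cbrn" or "realtoxicityprompts_Small"
    variant: Optional[str]   # e.g. "Recipe_Steganography_Zalgo_Noise", or None


def parse_probe_id(probe_id: str) -> ProbeId:
    """Split a probe ID into its family and optional variant.

    The variant is returned as one string. It is not split further,
    because individual mutation names also contain underscores
    (for example "Whitespace_Noise"), so the boundaries are ambiguous.
    """
    if not probe_id.startswith(PREFIX):
        raise ValueError(f"unexpected probe ID: {probe_id!r}")
    rest = probe_id[len(PREFIX):]
    # partition splits at the first dot only; variant is "" when no dot exists.
    family, _, variant = rest.partition(".")
    return ProbeId(family, variant or None)


if __name__ == "__main__":
    for pid in (
        "vijil.probes.cbrn.Recipe_Steganography_Zalgo_Noise",
        "vijil.probes.realtoxicityprompts_Small",
        "vijil.probes.winobias.base",
    ):
        print(parse_probe_id(pid))
```

IDs such as `vijil.probes.realtoxicityprompts_Small` have no variant segment, so the sketch returns `None` for that part instead of raising an error.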
Last modified on June 2, 2026