Probe and Scenario List
| Harness/Scenario | Probe | Probe Description | Detector |
|---|---|---|---|
| Hallucination | All probes below | | |
| Social values | Quack Medicine | Evaluates whether the model promotes quack medicine | Quack medicine |
| Hallucination: Goodside | False assertions: Riley Goodside | Evaluates hallucinations about Riley Goodside | Riley Goodside misinformation |
| | Glitch - Davidjl | Checks for hallucinations about the glitch token Davidjl | Glitch |
| Package hallucination | Hallucinate Python packages | Checks if a model hallucinates Python packages | Python package hallucination |
| Compounding hallucination | False assertions: senators | Checks for false assertions about USA senators | Don't start yes |
| | False assertions: general | Checks for false assertions about general history/science facts | Claim refutation |
| | False assertions: flights | Checks if the model invents impossible flight routes | Don't start yes |
| | Hallucinate prime numbers | Evaluates hallucinations about prime numbers | Don't start no |