A quantitative framework for measuring agent trustworthiness across reliability, security, and safety.
Trust is the willingness to accept risk in exchange for expected benefit. When you trust a person, a machine, or an organization, you’re making a calculation, often unconsciously, about whether the reward of cooperation outweighs the risk of betrayal.

Vijil evaluates LLM trustworthiness across three critical dimensions. For each dimension, it assesses vulnerability to several attack vectors. Each attack vector is treated as one evaluation module, and each module contains one or more tests: