Working with Detections¶
Evaluations tightly couple detectors with probes: a detector is used only to evaluate responses to the prompts in probes that reference it. The detection endpoint lets you run detectors in a more atomic manner. Using this functionality, you can evaluate any payload with our detectors, as long as the payload contains the fields required to run the detector.
You can create, view, and summarize detections with the Vijil Python client.
List Detectors¶
List all supported detectors with the detections.list_detectors method:
client.detections.list_detectors()
Create Detections¶
You can use the detections.create method to run a detector on a list of inputs.
client.detections.create(
    detector_id="llm.AnswerRelevancy",
    detector_inputs=[
        {"question": "How do I tie my shoes?", "response": "To tie your shoes you should first use your laces."},
        {"question": "How do I tie my shoes?", "response": "George Washington was the first president of the USA."}
    ]
)
# If successful, returns a dictionary with the following format:
# {'id': YOUR_GUID, 'status': 'CREATED'}
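A detection runs asynchronously, so the returned id can be used to poll until the detection finishes. Below is a minimal polling sketch; the wait_for_detection helper and the "COMPLETED"/"FAILED" terminal status names are assumptions for illustration, not part of the client API.

```python
import time

def wait_for_detection(fetch_status, timeout=300, interval=5):
    """Poll `fetch_status` (a zero-argument callable returning a status
    string) until a terminal state is reached or the timeout elapses.
    The terminal status names used here are illustrative assumptions."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval)
    raise TimeoutError("detection did not finish in time")

# Usage sketch (assumes `client` and a GUID returned by detections.create):
# result = client.detections.create(detector_id="llm.AnswerRelevancy", detector_inputs=[...])
# wait_for_detection(lambda: fetch_current_status(result["id"]))  # hypothetical status lookup
```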
The relevant parameters are as follows:

- detector_id: Identifier for the detector to be called. A full list of detectors is available here.
- detector_inputs: List of dicts supplying the structured payload to the detector. The keys of each dict should map to the permitted input fields for that detector.
- detector_params: To use an LLM-based detector with a non-default LLM, supply a dictionary with (1) a hub name, (2) a model name, and (3) the name of an API key for that hub that you have saved in Evaluate.
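Since each detector accepts only specific input fields, it can be useful to check a payload before calling detections.create. Below is a minimal sketch, assuming the required fields for llm.AnswerRelevancy are question and response (as in the example above); the missing_fields helper is illustrative and not part of the client.

```python
def missing_fields(detector_inputs, required):
    """Return, for each input dict, the set of required keys it lacks."""
    return [set(required) - set(item) for item in detector_inputs]

inputs = [
    {"question": "How do I tie my shoes?", "response": "Use your laces."},
    {"question": "How do I tie my shoes?"},  # missing "response"
]
print(missing_fields(inputs, required=("question", "response")))
# → [set(), {'response'}]
```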
Below is an example of using detector_params. Suppose you want to use Llama 3.3 70B Instruct on Together as the evaluator LLM for the answer relevancy metric (it uses gpt-4o on OpenAI by default). Given that you have a Together API key saved, you can use the following code:
client.detections.create(
    detector_id="llm.AnswerRelevancy",
    detector_inputs=[
        {"question": "How do I tie my shoes?", "response": "To tie your shoes you should first use your laces."},
        {"question": "How do I tie my shoes?", "response": "George Washington was the first president of the USA."}
    ],
    detector_params={
        "hub": "together",
        "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo"
    }
)
Summarize Detections¶
Get summary scores for a completed detection, including status and related metadata, with the detections.describe method:
client.detections.describe(detection_id='22df0c08-4fcd-4e3d-9285-3a5e66c93f54')
The method takes an optional format argument. Permitted values are dataframe (the default) and list, which return a pandas data frame and a list of dicts, respectively.
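When working with the list format, the per-input records can be aggregated with plain Python. A minimal sketch, assuming each record carries a numeric score field (the exact field names returned by describe may differ):

```python
def mean_score(records, field="score"):
    """Average a numeric field across detection records, skipping
    records where the field is absent or None."""
    values = [r[field] for r in records if r.get(field) is not None]
    return sum(values) / len(values) if values else None

# records = client.detections.describe(detection_id="...", format="list")
records = [{"score": 1.0}, {"score": 0.0}, {"score": None}]
print(mean_score(records))  # → 0.5
```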
Supported fields for Detectors¶
The table below gives the required set of input fields for each detector.
| Detector ID | Required fields |
|---|---|
| All other detectors | |