Running Evaluations
Run an Evaluation
    agentv eval evals/my-eval.yaml
Results are written to .agentv/results/eval_<timestamp>.jsonl.
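Because each run writes a new timestamped file, it can help to locate the most recent one. The following is a minimal sketch, not part of agentv: the helper names `latest_results` and `count_records` are hypothetical, the newest file is chosen by modification time rather than by parsing the timestamp in the filename, and the record count assumes one JSON record per non-empty line.

```shell
# Hypothetical helper: print the most recently modified results file.
# Defaults to the .agentv/results directory named in the docs above.
latest_results() {
  dir="${1:-.agentv/results}"
  # ls -t sorts newest first; errors are suppressed when no file matches.
  ls -t "$dir"/eval_*.jsonl 2>/dev/null | head -n 1
}

# Hypothetical helper: count non-empty lines in a JSONL file.
# Assumes one JSON record per line; the record schema is not inspected.
count_records() {
  grep -c . "$1"
}
```

For example, `count_records "$(latest_results)"` would print how many records the latest run produced, under the assumptions above.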
Common Options
Override Target
Run against a different target than specified in the eval file:
    agentv eval --target azure_base evals/**/*.yaml

Run Specific Eval Case
Run a single eval case by ID:
    agentv eval --eval-id case-123 evals/my-eval.yaml

Dry Run
Test the harness flow with mock responses (does not call real providers):
    agentv eval --dry-run evals/my-eval.yaml

Output to Specific File
    agentv eval evals/my-eval.yaml --out results/baseline.jsonl

Validate Before Running
Check eval files for schema errors without executing:
    agentv validate evals/my-eval.yaml

All Options
Run agentv eval --help for the full list of options, including workers, timeouts, output formats, and trace dumping.
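The validate and eval commands described earlier can be chained so that an eval only runs when its file passes validation. This is a rough sketch under one unconfirmed assumption: that `agentv validate` exits with a non-zero status when it finds schema errors. The function name `validate_then_run` is hypothetical and is not part of agentv.

```shell
# Hypothetical wrapper: validate an eval file, then run it only on success.
# Assumes `agentv validate` returns a non-zero exit status on schema errors.
validate_then_run() {
  eval_file="$1"
  if agentv validate "$eval_file"; then
    agentv eval "$eval_file"
  else
    echo "validation failed, skipping run: $eval_file" >&2
    return 1
  fi
}
```

Calling `validate_then_run evals/my-eval.yaml` would then behave like a plain `agentv eval` for a valid file and stop early otherwise.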