supabase/apps/studio/evals
Matt Rossman 4553f09bb5 feat(assistant): hallucination scorers + corrective measures for storage versioning answers (#41655)
* feat: "Docs Faithfulness" scorer

* feat: test case for storage object restoration

* feat: "Factuality" scorer

* feat: "Factuality" -> "Correctness"

* feat: update Storage recovery test case

* feat: finishReason in task output

* feat: encourage parallel tool calls + docs search, discourage superfluous context gathering

* prompt tuning (tool selection strategy)

* add data recovery section in chat prompt

* test: S3 versioning support correctness

* refactor: derive stepsSerialized/textOnly from shared steps data

* fix: input in correctness scorer
2026-01-15 14:22:46 +07:00