LLM-as-judge
Reduce bias
- blind the judge (remove model names and other identifiers from the answers)
- use pairwise comparisons
- multiple judges
- calibrate with human labels
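
A minimal sketch of how the four steps above could fit together. Everything named here is hypothetical: the `Judge` callable signature, the `[REDACTED]` token, and the helper names are illustration only, not a real library API. Randomizing which answer is shown first is an added assumption, not something the list above states.

```python
import random
from collections import Counter
from typing import Callable, Sequence

# Hypothetical judge interface: (prompt, answer_a, answer_b) -> "A" or "B".
Judge = Callable[[str, str, str], str]


def blind(text: str, identifiers: Sequence[str]) -> str:
    """Replace known identifiers (e.g. model names) with a neutral token."""
    for name in identifiers:
        text = text.replace(name, "[REDACTED]")
    return text


def pairwise_vote(
    prompt: str,
    answer_x: str,
    answer_y: str,
    judges: Sequence[Judge],
    identifiers: Sequence[str],
    rng: random.Random,
) -> str:
    """Return "x", "y", or "tie" by majority vote over blinded pairs."""
    x = blind(answer_x, identifiers)
    y = blind(answer_y, identifiers)
    votes: Counter = Counter()
    for judge in judges:
        # Assumption: show the pair in random order for each judge.
        swapped = rng.random() < 0.5
        a, b = (y, x) if swapped else (x, y)
        verdict = judge(prompt, a, b)
        if verdict not in ("A", "B"):
            continue  # ignore malformed verdicts
        picked_first = verdict == "A"
        # Slot A holds y when swapped, x otherwise.
        winner = "y" if picked_first == swapped else "x"
        votes[winner] += 1
    if votes["x"] == votes["y"]:
        return "tie"
    return "x" if votes["x"] > votes["y"] else "y"


def agreement(judge_labels: Sequence[str], human_labels: Sequence[str]) -> float:
    """Fraction of items where the judge verdict matches the human label."""
    if len(judge_labels) != len(human_labels) or not human_labels:
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)
```

`agreement` is only a simple calibration check: a low score against human labels suggests the judge prompt or the judge model needs revisiting before its verdicts are trusted.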
Store
Judge prompts + model version.
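
One possible way to store these alongside each verdict, sketched with the standard library only. The `JudgeRecord` fields, the JSON-lines layout, and the added prompt hash are assumptions for illustration, not a prescribed schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class JudgeRecord:
    """Hypothetical record of one judgment and how it was produced."""
    judge_prompt: str   # full text of the prompt sent to the judge
    model_version: str  # version string of the judge model
    verdict: str        # e.g. "A" or "B"

    def prompt_hash(self) -> str:
        # A hash makes it easy to spot when the judge prompt has changed.
        return hashlib.sha256(self.judge_prompt.encode("utf-8")).hexdigest()


def to_json_line(record: JudgeRecord) -> str:
    """Serialize a record as one JSON line, including the prompt hash."""
    payload = asdict(record)
    payload["prompt_sha256"] = record.prompt_hash()
    return json.dumps(payload, sort_keys=True)
```

Appending one such line per judgment to a log file keeps each verdict traceable to the exact prompt and model version that produced it.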