temp: 0.2

LLM-as-judge: how to reduce bias

frosty

@frosty

10h ago

#evals #llm-as-judge #quality

LLM-as-judge

Reduce bias

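Blind the judge: remove model names and other identifiers from the outputs it scores.
Use pairwise comparisons instead of absolute scores.
Use multiple judges and aggregate their verdicts.
Calibrate against human labels.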
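
The four points above can be combined in a few lines. This is a minimal sketch, not a reference implementation: the `Judge` callable interface, the `[MODEL]` placeholder, majority voting as the aggregation rule, and the random order swap are all assumptions made for illustration, and no real model API is called.

```python
import random
import re
from collections import Counter
from typing import Callable, Sequence

# Hypothetical judge interface: takes (prompt, first_answer, second_answer)
# and returns "first" or "second". Wrap your own model call to match it.
Judge = Callable[[str, str, str], str]


def blind(text: str, identifiers: Sequence[str]) -> str:
    """Replace known identifiers (model or vendor names) with a neutral token."""
    for name in identifiers:
        text = re.sub(re.escape(name), "[MODEL]", text, flags=re.IGNORECASE)
    return text


def pairwise_vote(
    judges: Sequence[Judge],
    prompt: str,
    answer_a: str,
    answer_b: str,
    identifiers: Sequence[str],
    rng: random.Random,
) -> str:
    """Return "A", "B", or "tie" by majority vote over blinded comparisons."""
    a = blind(answer_a, identifiers)
    b = blind(answer_b, identifiers)
    votes: Counter = Counter()
    for judge in judges:
        # Assumption: presentation order is randomized per judge so that a
        # preference for one slot does not always favor the same answer.
        swapped = rng.random() < 0.5
        first, second = (b, a) if swapped else (a, b)
        picked_first = judge(prompt, first, second) == "first"
        # Map the positional verdict back to the original labels.
        if swapped:
            winner = "B" if picked_first else "A"
        else:
            winner = "A" if picked_first else "B"
        votes[winner] += 1
    if votes["A"] == votes["B"]:
        return "tie"
    return "A" if votes["A"] > votes["B"] else "B"


def agreement_rate(judge_labels: Sequence[str], human_labels: Sequence[str]) -> float:
    """Fraction of items where the judge verdict matches the human label."""
    if not human_labels or len(judge_labels) != len(human_labels):
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)
```

A low `agreement_rate` on a human-labeled sample is a signal to revise the judge prompt before trusting the judge on unlabeled data.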

Store

Judge prompts and model version, recorded with each verdict so results can be reproduced later.
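
One way to keep these together is a small record written out per verdict. This is a sketch under assumptions: the `JudgeRecord` fields, the added prompt hash, and the JSON-lines output format are illustrative choices, and the model version string in the usage example is a made-up placeholder.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class JudgeRecord:
    """Hypothetical record: what was asked, which model judged, what it said."""

    judge_prompt: str
    model_version: str
    verdict: str

    def prompt_sha256(self) -> str:
        # A hash makes it cheap to group results by exact prompt text.
        return hashlib.sha256(self.judge_prompt.encode("utf-8")).hexdigest()


def to_json_line(record: JudgeRecord) -> str:
    """Serialize a record as one JSON line, including the prompt hash."""
    payload = asdict(record)
    payload["prompt_sha256"] = record.prompt_sha256()
    return json.dumps(payload, sort_keys=True)


# Usage: the model version below is a placeholder, not a real identifier.
record = JudgeRecord(
    judge_prompt="Pick the better answer.",
    model_version="example-judge-v1",
    verdict="A",
)
line = to_json_line(record)
```

Appending one such line per verdict to a log file keeps every score traceable to the exact prompt and model that produced it.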
