LLM eval basics: golden sets and rubric scoring

frosty

@frosty

1 min read

10h ago

#evals #testing #ci

Golden sets

What

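A golden set is a curated collection of inputs, each paired with the output you expect, used as a fixed reference when checking model responses.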

Tips

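Include edge cases, not just typical inputs.
Version the set, so a score change can be traced to a specific revision.
Run it in CI, so regressions show up on every change.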
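
To make the tips concrete, here is a minimal sketch of a versioned golden set with a check that a CI job could call. Everything named here is an assumption for illustration: `GoldenCase`, `GOLDEN_SET_VERSION`, `run_golden_set`, and the `fake_model` stand-in are hypothetical, and exact match after normalization is just one simple way to compare outputs.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical version label; bump it whenever the cases change.
GOLDEN_SET_VERSION = "v1"


@dataclass(frozen=True)
class GoldenCase:
    case_id: str
    prompt: str
    expected: str


# Illustrative cases only: one typical input and one edge case.
GOLDEN_SET = [
    GoldenCase("typical", "What is 2 + 2?", "4"),
    GoldenCase("edge-empty", "", "Please provide a question."),
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't fail a case."""
    return " ".join(text.lower().split())


def run_golden_set(model: Callable[[str], str], cases: list[GoldenCase]) -> list[str]:
    """Return the ids of cases whose output does not match the expected output."""
    failures = []
    for case in cases:
        if normalize(model(case.prompt)) != normalize(case.expected):
            failures.append(case.case_id)
    return failures


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, used here so the sketch runs offline."""
    if not prompt.strip():
        return "Please provide a question."
    return "4"


if __name__ == "__main__":
    failed = run_golden_set(fake_model, GOLDEN_SET)
    print(f"golden set {GOLDEN_SET_VERSION}: {len(failed)} failing case(s) {failed}")
    # A non-zero exit code is what makes a CI job fail.
    raise SystemExit(1 if failed else 0)
```

In a real pipeline you would swap `fake_model` for your actual model call and probably load the cases from a file kept under version control.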

Rubrics

Score each response on three criteria: accuracy, completeness, and clarity, each on a 1–5 scale.
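
A rubric score like this can be sketched as a small data type that rejects out-of-range values. The `RubricScore` name and the unweighted average in `overall` are assumptions for illustration; the rubric itself only names the three criteria and the 1–5 range, so weight or combine them however suits your use case.

```python
from dataclasses import dataclass

CRITERIA = ("accuracy", "completeness", "clarity")


@dataclass(frozen=True)
class RubricScore:
    accuracy: int
    completeness: int
    clarity: int

    def __post_init__(self) -> None:
        # Reject anything outside the 1-5 scale.
        for name in CRITERIA:
            value = getattr(self, name)
            if not isinstance(value, int) or not 1 <= value <= 5:
                raise ValueError(f"{name} must be an integer from 1 to 5, got {value!r}")

    def overall(self) -> float:
        """Unweighted average of the three criteria (an assumed way to combine them)."""
        return sum(getattr(self, name) for name in CRITERIA) / len(CRITERIA)


if __name__ == "__main__":
    score = RubricScore(accuracy=5, completeness=4, clarity=3)
    print(score.overall())
```

Keeping the per-criterion scores, rather than only the average, makes it easier to see which dimension regressed.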
