Rubric-based AI grading for open-ended work.
Rubrics make AI grading inspectable. Evalysis applies criteria separately, cites evidence, handles partial credit, and routes uncertain work to humans.
A practical overview.
Rubrics prevent generic scoring
The same response can deserve different scores under different rubrics. AI grading should follow the program's own criteria, not a generic idea of quality.
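As a rough illustration of that point, the sketch below scores one set of criterion-level judgments under two rubrics that weight the criteria differently. The `Criterion` class, the `total_score` function, the rubric names, and all numbers are hypothetical and made up for illustration; this is not Evalysis's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric row: a criterion name and the points it is worth (hypothetical)."""
    name: str
    max_points: int


def total_score(ratings, rubric):
    """Sum the credit earned on each criterion the rubric lists.

    `ratings` maps a criterion name to the share of credit earned (0.0 to 1.0).
    """
    return sum(ratings[c.name] * c.max_points for c in rubric)


# Illustrative judgments for a single response: weak evidence, strong otherwise.
ratings = {"evidence": 0.5, "organization": 1.0, "mechanics": 1.0}

# Two made-up 10-point rubrics that value the same criteria differently.
science_rubric = [
    Criterion("evidence", 6),
    Criterion("organization", 2),
    Criterion("mechanics", 2),
]
writing_rubric = [
    Criterion("evidence", 2),
    Criterion("organization", 4),
    Criterion("mechanics", 4),
]

print(total_score(ratings, science_rubric))  # 7.0 out of 10
print(total_score(ratings, writing_rubric))  # 9.0 out of 10
```

The response itself does not change between the two calls; only the rubric does, which is why a single generic quality score cannot stand in for either one.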
Anchors tune severity
Approved anchor examples help align the score scale, edge cases, partial credit, and feedback tone before the system is used at scale.
Criteria create useful reports
Criterion-level scores make it easier to review decisions, identify class patterns, and explain why a score changed after adjudication.
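A minimal sketch of what criterion-level records might look like, assuming each record carries points, a quoted evidence span, and a confidence value. The field names, the 0.7 review threshold, and the sample data are all assumptions for illustration, not a description of how Evalysis stores or routes results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CriterionResult:
    """One criterion's outcome for one response (hypothetical schema)."""
    criterion: str
    points: float
    max_points: float
    evidence: str      # span quoted from the response to justify the score
    confidence: float  # 0.0 to 1.0; how sure the grader is (assumed field)


def needs_human_review(results, threshold=0.7):
    """Return the criteria whose confidence falls below the threshold."""
    return [r.criterion for r in results if r.confidence < threshold]


def class_pattern(all_results):
    """Average share of credit earned per criterion across a class."""
    shares = {}
    for results in all_results:
        for r in results:
            shares.setdefault(r.criterion, []).append(r.points / r.max_points)
    return {name: mean(values) for name, values in shares.items()}


# Made-up results for two responses.
student_a = [
    CriterionResult("claim", 2, 2, "Plants need light to grow", 0.9),
    CriterionResult("evidence", 1, 4, "my plant died", 0.55),
]
student_b = [
    CriterionResult("claim", 1, 2, "Light matters", 0.95),
    CriterionResult("evidence", 3, 4, "the shaded group grew 2 cm less", 0.85),
]

print(needs_human_review(student_a))        # ['evidence']
print(needs_human_review(student_b))        # []
print(class_pattern([student_a, student_b]))  # {'claim': 0.75, 'evidence': 0.5}
```

Because each criterion keeps its own evidence and confidence, a reviewer can check only the flagged criterion, and a later adjudication can be explained by pointing at the single record that changed.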
Library topics that support this page.
Rubric-based LLM evaluation
How rubric-based LLM evaluation connects AI benchmarks, LLM-as-judge methods, automated scoring, expert rubrics, and psychometric validation.
Constructed-response scoring
How AI scoring applies to short answers, science explanations, math work, evidence-based writing, partial credit, and rubric-based constructed responses.
Automated essay scoring with LLMs
Research and practical guidance on automated essay scoring, AI essay grading, rubric alignment, human agreement, feedback, and fairness.