The infrastructure behind serious AI scoring.
Operators and technical buyers need to know how submissions enter the platform, how rubrics and anchors shape scoring, what the APIs return, and how cloud, private cloud, and on-prem deployments differ.
What the platform actually does.
Evalysis turns messy student work into reviewable scoring data, feedback, analytics, and audit records that teams can operate at classroom, district, or exam-program scale.
Submission intake
Accept pasted responses, PDFs, scans, photos, worksheets, diagrams, speech files, and exported answer data. Normalize everything into a response package before scoring.
Scoring workspace
Rubric, anchors, calibration samples, response viewer, criterion-level scores, confidence, feedback, and human review queues in one operations surface.
Agent orchestration
Specialist raters, critics, adjudicators, calibrators, fairness reviewers, and audit loggers are assembled according to item type and risk level.
Reporting layer
Produce per-student feedback, class summaries, item analytics, confidence bins, score distributions, subgroup checks, and replayable audit artifacts.
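The intake step above, where every submission is normalized into a response package before scoring, might look roughly like the sketch below. It is illustrative only: the `ResponsePackage` fields, the `normalize_submission` helper, and the extension-to-modality table are hypothetical names chosen for this example, not the platform's actual schema.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Optional

# Hypothetical mapping from file extension to a coarse modality label.
MODALITY_BY_SUFFIX = {
    ".pdf": "document",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".wav": "speech",
    ".mp3": "speech",
    ".csv": "answer_export",
}


@dataclass
class ResponsePackage:
    """Assumed common shape that every submission is reduced to before scoring."""
    student_id: str
    item_id: str
    modality: str
    source_name: Optional[str] = None
    text: Optional[str] = None


def normalize_submission(student_id, item_id, *, pasted_text=None, filename=None):
    """Wrap either pasted text or an uploaded file name in a ResponsePackage."""
    if pasted_text is not None:
        return ResponsePackage(student_id, item_id, "text", text=pasted_text.strip())
    if filename is None:
        raise ValueError("provide pasted_text or filename")
    # Compare extensions case-insensitively; unrecognized types are labeled "unknown".
    suffix = PurePath(filename).suffix.lower()
    modality = MODALITY_BY_SUFFIX.get(suffix, "unknown")
    return ResponsePackage(student_id, item_id, modality, source_name=filename)
```

Keeping one package shape for every modality means the later scoring and reporting steps never have to branch on how the work arrived.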
Built for reviewers, scoring leaders, and technical teams.
The workspace should feel closer to an assessment operations console than a chatbot. Teachers and subject-matter experts (SMEs) see the response and feedback; program teams see routing, item behavior, and quality control.
Response, rubric, decision, report.
Every workflow is organized around four objects: the original student response, the criteria being applied, the scored decision, and the reporting package that makes the decision usable.
Use the UI, the API, or both.
Some customers want a hosted reviewer console. Others want Evalysis embedded inside an LMS, exam platform, or internal scoring pipeline. The platform supports both while preserving the scoring contract.
Try Now API
The cloud trial follows the production contract: upload student work, pass rubric context, and receive a structured SME scoring report.
Batch scoring API
Submit a class set or exam batch, track job status, and retrieve scores, feedback, routing decisions, and audit IDs.
Reporting API
Pull score distributions, item stats, confidence bands, human-review queues, and trace artifacts into your own systems.
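A client consuming these APIs could turn the structured report into typed values along the following lines. This is a sketch under stated assumptions: the field names are taken from the sample response that follows, scores are assumed to always arrive as "earned / possible" strings, and treating a non-empty `escalation` list as a review flag is this example's own choice. `parse_report` and `CriterionScore` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CriterionScore:
    name: str
    earned: int
    possible: int


def parse_fraction(value: str) -> Tuple[int, int]:
    """Split a string such as "4 / 5" into (4, 5)."""
    earned, possible = (int(part.strip()) for part in value.split("/"))
    return earned, possible


def parse_report(payload: str) -> dict:
    """Convert the JSON report into numbers and a simple review flag."""
    raw = json.loads(payload)
    earned, possible = parse_fraction(raw["score"])
    criteria: List[CriterionScore] = [
        CriterionScore(c["name"], *parse_fraction(c["score"]))
        for c in raw["criteria"]
    ]
    return {
        "earned": earned,
        "possible": possible,
        "confidence": raw["confidence"],
        "criteria": criteria,
        # Assumption: any escalation note means a human should look at it.
        "needs_review": bool(raw.get("escalation")),
        "audit_id": raw["audit_id"],
    }


# Same shape as the sample response on this page; the audit ID is left elided.
SAMPLE = """{
  "score": "4 / 5",
  "confidence": "Medium-high",
  "criteria": [{ "name": "Reasoning", "score": "3 / 4" }],
  "feedback": "Student-facing feedback",
  "escalation": ["Review source use"],
  "audit_id": "trace_..."
}"""

report = parse_report(SAMPLE)
```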
{
  "score": "4 / 5",
  "confidence": "Medium-high",
  "criteria": [{ "name": "Reasoning", "score": "3 / 4" }],
  "feedback": "Student-facing feedback",
  "escalation": ["Review source use"],
  "audit_id": "trace_..."
}
Cloud for speed, on-prem for control.
Deployment is a product decision, not a footnote. High-stakes exams often need stronger data boundaries, local audit custody, and program-controlled operations.
Cloud
For tutoring schools, internal benchmarks, curriculum teams, and quick pilots. Upload rubrics and responses, then use the hosted workspace and APIs.
Private cloud / VPC
For institutions that need SSO, role controls, private storage, network boundaries, and controlled integration with existing data systems.
On-prem
A major option for sensitive exams. Keep responses, rubrics, anchor sets, logs, and scoring outputs inside the customer-controlled environment.
Zero-shot
Use the rubric directly and grade immediately. Best for formative use, pilots, and lower-stakes feedback.
Human-in-loop
Select representative samples for teachers or scoring leaders to label, align the panel, then score the rest with confidence routing.
Fine-tuned
Tune to approved anchors and program-specific criteria, then deliver a full alignment and technical report.
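Confidence routing, as mentioned for the human-in-loop mode, can be pictured as a simple threshold split. The sketch below is an assumption-heavy illustration: the ordered label scale, the default threshold, and the `route` function are all hypothetical, and only the "Medium-high" label appears in the sample response on this page.

```python
# Hypothetical ordered scale, lowest to highest confidence.
CONFIDENCE_ORDER = ["Low", "Medium", "Medium-high", "High"]


def route(scored, min_confidence="Medium-high"):
    """Split scored responses into auto-release and human-review lists.

    `scored` is an iterable of dicts with "response_id" and "confidence" keys.
    An unrecognized confidence label raises ValueError rather than being guessed.
    """
    threshold = CONFIDENCE_ORDER.index(min_confidence)
    auto_release, human_review = [], []
    for item in scored:
        rank = CONFIDENCE_ORDER.index(item["confidence"])
        if rank >= threshold:
            auto_release.append(item["response_id"])
        else:
            human_review.append(item["response_id"])
    return auto_release, human_review


batch = [
    {"response_id": "r1", "confidence": "High"},
    {"response_id": "r2", "confidence": "Medium"},
    {"response_id": "r3", "confidence": "Medium-high"},
]
auto_release, human_review = route(batch)
```

Raising the threshold sends more responses to reviewers, which is the kind of trade-off a higher-stakes program would likely tune.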
Every score should be inspectable.
The trace is the connective tissue between product and validation: it shows what source material was reviewed, which agents disagreed, why the final score landed where it did, and what should be reviewed by a human.
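One way to picture such a trace is as a record of the sources reviewed plus each agent's vote, from which disagreement and a review flag can be derived. Everything in this sketch is hypothetical: the `ScoringTrace` and `AgentVote` shapes, the lower-median adjudication rule, and the spread threshold are stand-ins for illustration, not the platform's actual adjudication logic.

```python
from dataclasses import dataclass
from statistics import median_low
from typing import List


@dataclass
class AgentVote:
    agent: str
    score: int
    rationale: str


@dataclass
class ScoringTrace:
    audit_id: str
    sources_reviewed: List[str]
    votes: List[AgentVote]

    def final_score(self) -> int:
        # Assumed adjudication rule: take the lower median of the agent scores.
        return median_low(vote.score for vote in self.votes)

    def disagreement(self) -> int:
        # Spread between the highest and lowest agent score.
        scores = [vote.score for vote in self.votes]
        return max(scores) - min(scores)

    def needs_human_review(self, max_spread: int = 1) -> bool:
        # Flag the response when agents differ by more than the allowed spread.
        return self.disagreement() > max_spread


trace = ScoringTrace(
    audit_id="trace_example",  # placeholder ID for illustration
    sources_reviewed=["passage", "rubric", "student_response"],
    votes=[
        AgentVote("rater_a", 4, "Claim is supported by passage evidence."),
        AgentVote("rater_b", 2, "Evidence is quoted but not explained."),
        AgentVote("critic", 4, "Reasoning links evidence to the claim."),
    ],
)
```

Storing the rationale alongside each vote is what would let a reviewer see why the final score landed where it did, not just the number.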
Essay · evidence-based response (Grade 9 RLA)
Prompt: 'Drawing on the passage, explain whether the narrator's choice was justified.'
