AI aptitude testing for hiring

Stop hiring on hunches. See how candidates actually think with AI.

Calibr puts candidates in a realistic three-pane workspace — tasks, editor, AI chat — and scores every interaction. You get an evidence-linked report instead of a vibe.

How it works

Three layers of judgement, not one end-of-test pass.

  1. Per-event judges

    A cheap classifier scores every message, edit, and AI exchange the moment it happens. Nothing is reconstructed after the fact.

  2. Rolling summarizer

    A scheduled pass condenses signals into a moving picture of how the candidate is working, not just what they produced.

  3. Final evaluator

    On submit, the top model writes an evidence-linked report. Every score points back to a specific message or event you can replay.
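For readers who want a concrete picture, the three layers can be sketched roughly as follows. This is an illustrative toy, not Calibr's implementation: the names `Event`, `judge_event`, `rolling_summary`, and `final_report` are hypothetical, and a word-count heuristic stands in for the real classifier and models.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Event:
    id: str    # hypothetical identifier used to link evidence
    kind: str  # "message", "edit", or "ai_exchange"
    text: str


@dataclass
class Judged:
    event: Event
    score: int  # 0-100


def judge_event(event: Event) -> Judged:
    """Layer 1: score a single event as soon as it arrives.

    Assumption: a word-count heuristic replaces the real classifier.
    """
    score = min(100, 10 * len(event.text.split()))
    return Judged(event, score)


def rolling_summary(judged: list[Judged], window: int = 3) -> int:
    """Layer 2: condense the most recent scores into one moving signal."""
    recent = judged[-window:]
    return round(mean(j.score for j in recent)) if recent else 0


def final_report(judged: list[Judged]) -> dict:
    """Layer 3: on submit, produce a score that points back to an event id."""
    if not judged:
        return {"score": 0, "evidence": []}
    strongest = max(judged, key=lambda j: j.score)
    return {
        "score": round(mean(j.score for j in judged)),
        "evidence": [strongest.event.id],
    }


# Usage: judge events as they happen, summarize periodically, report on submit.
session = [
    Event("e1", "message", "fix bug"),
    Event("e2", "ai_exchange", "ask the AI to explain the stack trace"),
    Event("e3", "edit", "verify claim against docs"),
]
judged = [judge_event(e) for e in session]
summary = rolling_summary(judged, window=2)
report = final_report(judged)
```

The point of the sketch is the shape: scores are attached to events at the moment they occur, so the final report can cite event ids instead of reconstructing the session afterwards.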

What we measure

Six dimensions, scored 0–100, every score backed by evidence.

Creativity

Does the candidate explore unusual framings before committing?

Collaboration balance

Are they driving the AI, or being driven by it?

AI core skill

Quality of prompting, decomposition, and tool use under pressure.

Source criticism

Do they verify the AI's claims, or accept them at face value?

Steering

Course-correction when the AI goes off-track.

Output sharpness

Final artifact quality — clarity, correctness, defensibility.

FAQ

Common questions.

Can candidates cheat by using AI?
We expect them to. The point is to measure how well they work with AI on realistic problems — not to catch them using it.
How long does a test take?
Tests are configured per role. Most run 45–90 minutes. Candidates can pause and resume.
What does the report look like?
Six dimension scores plus a weighted total, each with quoted evidence and links to the exact message or event. You can replay the session.
Is this a coding test?
No. Tasks are realistic knowledge work — debugging, writing, analysis, planning. Coding is one type of task among many.

Want to try Calibr on a real role?

We're onboarding a small set of design partners. Tell us the role and we'll build a test for it.

hello@calibr.so