AI aptitude testing for hiring
Calibr puts candidates in a realistic three-pane workspace — tasks, editor, AI chat — and scores every interaction. You get an evidence-linked report instead of a vibe.
How it works
A cheap classifier scores every message, edit, and AI exchange the moment it happens. Nothing is reconstructed after the fact.
A scheduled pass condenses signals into a moving picture of how the candidate is working — not just what they produced.
On submit, a top-tier model writes an evidence-linked report. Every score points back to a specific message or event you can replay.
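For the technically curious, the three stages above can be pictured roughly like this. It is a toy sketch, not Calibr's implementation: the `Event`, `Session`, and `score_event` names are hypothetical, the keyword heuristic only stands in for the classifier, and the averaging window is an arbitrary choice made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Event:
    """One candidate interaction: a message, an edit, or an AI exchange."""
    event_id: str
    kind: str  # hypothetical labels: "message", "edit", "ai_exchange"
    text: str
    score: float = 0.0


def score_event(event: Event) -> float:
    """Stand-in for the cheap classifier: a toy keyword heuristic, not a real model."""
    verification_words = ("test", "verify", "check", "why")
    hits = sum(word in event.text.lower() for word in verification_words)
    return min(1.0, hits / 2)


@dataclass
class Session:
    events: list[Event] = field(default_factory=list)
    snapshots: list[float] = field(default_factory=list)

    def record(self, event: Event) -> None:
        """Stage 1: score the event the moment it arrives."""
        event.score = score_event(event)
        self.events.append(event)

    def condense(self, window: int = 3) -> float:
        """Stage 2: scheduled pass that averages the most recent scores."""
        recent = self.events[-window:]
        snapshot = mean(e.score for e in recent) if recent else 0.0
        self.snapshots.append(snapshot)
        return snapshot

    def report(self) -> dict:
        """Stage 3: on submit, tie the overall score to replayable event IDs."""
        if not self.events:
            return {"overall": 0.0, "evidence": []}
        overall = mean(e.score for e in self.events)
        strongest = sorted(self.events, key=lambda e: e.score, reverse=True)[:2]
        return {"overall": overall, "evidence": [e.event_id for e in strongest]}


session = Session()
session.record(Event("e1", "message", "Write the parser for me"))
session.record(Event("e2", "ai_exchange", "Why does this fail? Let me check with a test"))
session.condense()
print(session.report())
```

The point of the shape is that scores are attached to events as they arrive, so the final report can cite event IDs rather than reconstructing evidence afterwards.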
What we measure
Does the candidate explore unusual framings before committing?
Are they driving the AI, or being driven by it?
Quality of prompting, decomposition, and tool use under pressure.
Do they verify the AI's claims, or accept them at face value?
Course-correction when the AI goes off-track.
Final artifact quality — clarity, correctness, defensibility.
FAQ
We're onboarding a small set of design partners. Tell us the role and we'll build a test for it.
hello@calibr.so