Calibration Drift in AI IELTS Mocks: Why Scores Creep Up
Mock inflation · Model drift · May 2026
Direct answer
Calibration drift is when AI IELTS mock bands rise without real skill gains. Causes include model updates, repeated prompts you have already practised, chat history that “knows” your weaknesses, and tools that default to encouragement. After six weeks on one app, AI mocks commonly show Band 7 while official or human mocks stay at Band 6. Reset with blind tasks, fresh sessions, and a fixed offset from human checks; see score inflation over time.
Three drivers of calibration drift
- Model drift: vendor updates change strictness without warning.
- Prompt leakage: you recognise topics from earlier mocks.
- Session bias: long chats reward polish, not timed first drafts.
Signs your mocks have drifted
| Signal | Likely cause |
|---|---|
| +0.5 band in 3 weeks, same errors | Tool or prompt change |
| AI 7, human mock 6 | Inflation—false confidence |
| Scores vary ±1 in new chats | No fixed rubric state |
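The three signals in the table can be checked mechanically from a score log. The following is a minimal sketch, not a definitive detector: the `MockResult` record, the function names, and the thresholds (21 days, 0.5 band, 1.0 band) are assumptions taken from the table's wording, and the "same errors" condition still needs a human to judge.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class MockResult:
    """One AI-scored mock (hypothetical record layout)."""
    taken_on: date
    ai_band: float


def rapid_rise(results: List[MockResult],
               window_days: int = 21,
               min_gain: float = 0.5) -> bool:
    """True if any later mock within the window scores min_gain or more higher."""
    ordered = sorted(results, key=lambda r: r.taken_on)
    for i, early in enumerate(ordered):
        for late in ordered[i + 1:]:
            days_apart = (late.taken_on - early.taken_on).days
            if days_apart <= window_days and late.ai_band - early.ai_band >= min_gain:
                return True
    return False


def human_gap(ai_band: float, human_band: float, min_gap: float = 1.0) -> bool:
    """True if the AI band exceeds the human or official band by min_gap or more."""
    return ai_band - human_band >= min_gap


def unstable_rescoring(bands: List[float], max_spread: float = 1.0) -> bool:
    """True if the same piece of work, rescored in new chats, spreads too widely."""
    return max(bands) - min(bands) >= max_spread
```

A rise flagged by `rapid_rise` is only suspicious if your recurring errors have not changed, so treat a `True` result as a prompt to get a human check rather than as proof of drift.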
Monthly recalibration protocol
- Complete one blind Writing Task 2 and one Speaking Part 2, with no outlines.
- Score each in a new session; log the tool name and date.
- Compare the result to a human mock or your last official band.
- Apply the difference as an offset to later AI scores; track it on the calibration guide.
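The offset step is simple arithmetic, sketched here under stated assumptions: the offset is taken as the mean of (human band minus AI band) over your blind checks, and adjusted scores are rounded to the nearest half band and kept within 0 to 9. The function names are hypothetical, and averaging is one reasonable choice rather than a prescribed method.

```python
import math
from statistics import mean
from typing import List, Tuple


def fixed_offset(pairs: List[Tuple[float, float]]) -> float:
    """Mean of (human_band - ai_band) over blind checks.

    Each pair is (ai_band, human_band) for the same blind task.
    A negative result means the AI tool is scoring too generously.
    """
    if not pairs:
        raise ValueError("need at least one (ai_band, human_band) pair")
    return mean([human - ai for ai, human in pairs])


def round_to_half_band(value: float) -> float:
    """Round to the nearest 0.5, with exact quarter points rounding up (assumption)."""
    return math.floor(value * 2 + 0.5) / 2


def adjusted_band(ai_band: float, offset: float) -> float:
    """Apply the offset to a new AI score and keep it on the 0-9 band scale."""
    return max(0.0, min(9.0, round_to_half_band(ai_band + offset)))


# Example: the AI said 7 twice; human checks said 6 and 6.5.
offset = fixed_offset([(7.0, 6.0), (7.0, 6.5)])
print(offset, adjusted_band(7.0, offset))
```

Recompute the offset after each monthly blind check instead of reusing an old one, since a model update can change how generous the tool is.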
Key takeaways
- Drift means higher AI scores without examiner-level improvement.
- Model updates and familiar prompts are the main hidden drivers.
- Blind first drafts in fresh sessions slow inflation.
- Human or official checks set the offset—AI alone cannot.
FAQ
Calibration drift occurs when the same quality of work scores higher over time because the tool, the prompt, or your familiarity changed, not because your IELTS skill improved.
Recalibrate at least monthly: one blind timed task per skill, scored in a fresh session and compared to a human mock or a past official result.
Scores can change even when your work does not, because model updates can shift baselines overnight. Log the tool version and apply a fixed offset after blind checks.
Reset mock inflation before you book the exam.