Why AI Speaking Scores Differ From Writing
Cross-skill AI scoring · May 2026
Direct answer
AI Speaking and Writing scores often disagree because they measure different inputs and rubrics—not because one tool is "wrong." Speaking models weight fluency, hesitation, and pronunciation from audio; Writing models read text for Task Response and essay structure. The same student can sound Band 7 in chat but write Band 6 essays with partial prompt answers. Never blend the two into one headline band.
Why the same student gets different AI bands
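- Modality: Speaking is scored from audio timing; Writing is scored from text structure.
- Criteria: Speaking leans on Fluency and Coherence (FC) and Pronunciation (PR); Writing leans on Task Response (TR) and Coherence and Cohesion (CC), so the weighting differs.
- Model bias: LLMs tend to reward fluent-sounding output and under-penalise weak TR.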
Speaking vs Writing at a glance
| Factor | Speaking AI | Writing AI |
|---|---|---|
| Input | Audio + transcript | Essay text only |
| Top leak | Hesitation, short answers | Partial prompt coverage |
| Inflation risk | Clear pronunciation | Fluent grammar, weak TR |
How to use both scores without false confidence
Never average Speaking and Writing AI bands into an "overall." Score each skill on its own rubric, then read the related guides "Band score range explained" and "Why AI and examiner scores disagree."
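One way to make the "no blended band" rule concrete is to keep each skill's estimates in its own record and never compute a cross-skill mean. The sketch below is illustrative only: the `SkillReport` class, the `summarise` helper, and the band values are hypothetical, and the criterion labels simply reuse the FC, PR, TR, and CC abbreviations from this article. It is not how any particular scoring tool works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillReport:
    """AI band estimates for one skill, kept on that skill's own rubric."""
    skill: str                  # e.g. "Speaking" or "Writing"
    criteria: dict[str, float]  # criterion label -> estimated band (hypothetical values)

    def weakest(self) -> tuple[str, float]:
        """Return the lowest-scoring criterion, i.e. the likeliest leak."""
        return min(self.criteria.items(), key=lambda kv: kv[1])


def summarise(reports: list[SkillReport]) -> list[str]:
    """Build one line per skill; deliberately no cross-skill average."""
    lines = []
    for report in reports:
        name, band = report.weakest()
        lines.append(f"{report.skill}: weakest criterion {name} at {band:.1f}")
    return lines


# Made-up example numbers, not real student data.
speaking = SkillReport("Speaking", {"FC": 7.0, "PR": 7.0})
writing = SkillReport("Writing", {"TR": 5.5, "CC": 6.0})

for line in summarise([speaking, writing]):
    print(line)
```

Because the summary reports the weakest criterion per skill rather than a single number, a strong Speaking estimate cannot mask a Task Response problem in Writing.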
Key takeaways
- Speaking and Writing AI measure different evidence—do not expect parity.
- High Speaking AI + low Writing AI usually means TR/CC leaks, not "bad luck."
- Score timed originals in each skill separately.
- Cross-check with human-marked mocks before booking a test date.
FAQ
Models overweight fluency and pronunciation; they under-penalise missing essay prompt parts and weak argument development.
Trust the lower skill on criterion-locked feedback—examiners also score skills separately.
Use rubric-native tools per skill; generic chat often inflates whichever output sounds more "native-like."
Score Speaking and Writing on their own rubrics—not one blended guess.