Mistral IELTS Writing Evaluation Limits

Open-model traps · Writing rubrics · May 2026

Direct answer

Mistral rewrites English capably but is unreliable for stable IELTS Writing evaluation. It invents band labels without a fixed weighting of the four criteria (TR/CC/LR/GRA), rewards polished surface language over Task Response depth, and gives the same essay a different score when you change the prompt. Use Mistral for brainstorming and grammar explanation, not for “am I Band 7?” decisions. Pair it with rubric-strict tools and blind timed tasks.
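One way to see the re-scoring problem is to submit the same essay under several prompt wordings and measure how far the returned bands drift. The sketch below assumes you have already collected the bands by hand; the sample scores and the 0.5-band tolerance are illustrative assumptions, not measured results or an official threshold.

```python
from statistics import mean

def band_spread(scores):
    """Return (mean, spread) for bands given to ONE essay under different prompts."""
    return mean(scores), max(scores) - min(scores)

def is_stable(scores, tolerance=0.5):
    """Assumed rule of thumb: treat the scorer as stable if bands stay within `tolerance`."""
    return band_spread(scores)[1] <= tolerance

# Hypothetical bands returned for the same essay under four prompt wordings.
scores = [6.5, 7.5, 6.0, 7.0]
avg, spread = band_spread(scores)
print(f"mean={avg:.2f} spread={spread:.1f} stable={is_stable(scores)}")
```

A spread of a full band or more on an unchanged essay suggests the number reflects the prompt more than the writing.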


    The scoring pipeline: answers → band

    Input: Your 40 responses after a full or sectional mock
    Check: Exact match to key (spelling, limits, format)
    Output: Raw score + estimated band + item-level misses
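The pipeline above can be sketched as a small key-matching function. This is a minimal illustration, not any tool's actual implementation: the names `score_listening` and `estimate_band` are hypothetical, case-insensitive matching is an assumption, and the band table is a toy placeholder because conversion tables vary by test form.

```python
def score_listening(responses, key):
    """Compare responses to the key; return (raw_score, missed_item_numbers).

    responses: dict of item number -> answer string
    key:       dict of item number -> set of accepted answers
    """
    misses = []
    for item in sorted(key):
        given = responses.get(item, "").strip().lower()
        accepted = {a.strip().lower() for a in key[item]}
        if given not in accepted:  # exact match only: one wrong letter is a miss
            misses.append(item)
    return len(key) - len(misses), misses

def estimate_band(raw, band_table):
    """Return the band for the highest threshold that `raw` reaches, else None."""
    for min_raw, band in sorted(band_table, reverse=True):
        if raw >= min_raw:
            return band
    return None

# Toy three-item key; a real Listening test has 40 items.
key = {1: {"library"}, 2: {"9.30", "9:30"}, 3: {"Tuesday"}}
responses = {1: "libary", 2: "9:30", 3: " tuesday "}
raw, misses = score_listening(responses, key)

# Placeholder thresholds for the toy key only, NOT an official conversion table.
toy_table = [(3, 9.0), (2, 7.0), (0, 5.0)]
print(raw, misses, estimate_band(raw, toy_table))
```

Keeping the list of missed item numbers alongside the raw score is what makes the item-level review possible.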

    See how AI evaluates Listening accuracy and how examiners mark Listening.

    Rules that silently change your band

    Rule             Effect
    Spelling         One letter wrong = zero for that item
    Word limit       Extra words often void the answer
    Transfer errors  Right on paper, wrong on answer sheet
    Homophones       See “understand but miss answers”
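Two of the rules above, spelling and word limit, are mechanical enough to check in code. The sketch below labels a wrong answer by the rule it most likely broke. The function name `classify_answer`, the 0.8 similarity cutoff for "near-miss spelling", and the label strings are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

def classify_answer(given, accepted, max_words):
    """Label an answer as 'correct', 'word_limit', 'spelling', or 'other'."""
    given = given.strip().lower()
    accepted = {a.strip().lower() for a in accepted}
    if given in accepted:
        return "correct"
    if len(given.split()) > max_words:
        return "word_limit"  # extra words can void an otherwise right answer
    # Assumed heuristic: a close-but-wrong string counts as a spelling slip.
    if any(SequenceMatcher(None, given, a).ratio() >= 0.8 for a in accepted):
        return "spelling"
    return "other"

print(classify_answer("libary", {"library"}, max_words=2))
print(classify_answer("the big library", {"library"}, max_words=2))
print(classify_answer("museum", {"library"}, max_words=2))
```

Every label other than "correct" still scores zero for the item; the labels only tell you which habit to fix.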

    Calibrate Listening scores before test day

    1. Score only full timed tests with one listen per section.
    2. Log misses by type: spelling, distraction, pace—not “bad luck.”
    3. Compare three tests; bands should trend, not jump on easier audio.
    4. Use AI calibration with official practice tests as anchors.
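Steps 2 and 3 above amount to simple bookkeeping, sketched here under stated assumptions. The miss log, the band values, and the helper names `tally_misses` and `max_jump` are hypothetical examples, not data from real tests.

```python
from collections import Counter

def tally_misses(miss_log):
    """Count misses by type from (item_number, miss_type) pairs."""
    return Counter(kind for _, kind in miss_log)

def max_jump(bands):
    """Largest band change between consecutive tests (needs at least two bands)."""
    return max(abs(later - earlier) for earlier, later in zip(bands, bands[1:]))

# Hypothetical log from one timed test.
miss_log = [(3, "spelling"), (7, "distraction"), (12, "spelling"), (31, "pace")]
print(tally_misses(miss_log).most_common())

# Hypothetical estimated bands from three full timed tests.
bands = [6.0, 6.5, 6.5]
print(max_jump(bands))
```

A large jump between consecutive tests is a cue to check whether the audio was easier, rather than evidence of real improvement.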

    Key takeaways

    FAQ

    Yes—unless the item lists acceptable variants, spelling must match the key.
    Conversion tables vary by form; compare full tests over time.
    It scores after submission—you still need timed audio practice.

    Score Listening on keys—not on how easy the audio felt.
