Mistral AI has quietly rolled out Mistral Small 3.2, a refined successor to its popular Small 3.1 model. Released on June 20, 2025, this update focuses on improving instruction following, reducing repetition errors, and strengthening function-calling robustness, making it an even more reliable choice for developers running large language models locally (simonwillison.net, huggingface.co).
Mistral AI recommends running Small 3.2 with a low temperature—around 0.15—to strike the best balance between creativity and reliability. A suggested system prompt reminding the model of its knowledge cutoff (“last updated on 2023-10-01”) is also provided for more consistent outputs (simonwillison.net).
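To see why a temperature as low as 0.15 favors reliability over creativity, here is a minimal sketch of temperature sampling (illustrative only, not vLLM's actual implementation):

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax. Lower temperature
    sharpens the distribution toward the top-scoring token."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=None):
    """Draw one token index from the temperature-adjusted distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

With logits `[2.0, 1.0, 0.5]`, temperature 1.0 leaves the top token around 63% probability, while 0.15 pushes it above 99%, which is why low-temperature outputs are far more repeatable.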
| Metric | Small 3.1 | Small 3.2 |
|---|---|---|
| Wildbench v2 Instruction Following | 55.60% | 65.33% |
| Arena Hard v2 Instruction Following | 19.56% | 43.10% |
| Internal Accuracy (IF) | 82.75% | 84.78% |
| Infinite Generation Rate (Lower Is Better) | 2.11% | 1.29% |
| MMLU Pro (5-shot CoT) | 66.76% | 69.06% |
| MBPP Plus – Pass@5 | 74.63% | 78.33% |
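The MBPP Plus row reports pass@5. For context, the standard unbiased pass@k estimator (introduced for the HumanEval/Codex evaluations; whether Mistral's harness uses exactly this estimator is an assumption) can be computed as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the tests. Computed as 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # samples must include at least one correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 generations of which 1 is correct, `pass_at_k(10, 1, 5)` gives 0.5: half of all 5-sample draws include the single passing solution.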
These gains show that although Small 3.2 is only a "minor" version bump, its real-world usability improves noticeably, especially for code generation and instruction-driven tasks (huggingface.co).
You can find Mistral Small 3.2 on Hugging Face under the mistralai/Mistral-Small-3.2-24B-Instruct-2506 repository:
https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
The model is available in FP16 and FP8, and quantized GGUF builds make it feasible to run on machines with as little as 16 GB of RAM (simonwillison.net).
Mistral recommends using vLLM (>=0.9.1) for best performance:
```bash
pip install --upgrade vllm

vllm serve mistralai/Mistral-Small-3.2-24B-Instruct-2506 \
  --tokenizer_mode mistral \
  --config_format mistral \
  --load_format mistral \
  --enable-auto-tool-choice \
  --limit_mm_per_prompt 'image=10' \
  --tensor-parallel-size 2
```
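Once the server is up, it exposes an OpenAI-compatible API (at `http://localhost:8000/v1` by default). A minimal sketch of building a chat request that applies the recommended low temperature and a knowledge-cutoff system prompt (the exact prompt wording below is illustrative, not Mistral's official text):

```python
import json

MODEL = "mistralai/Mistral-Small-3.2-24B-Instruct-2506"

def build_chat_request(user_message: str, temperature: float = 0.15) -> dict:
    """Assemble a chat-completion payload for vLLM's
    OpenAI-compatible /v1/chat/completions endpoint."""
    return {
        "model": MODEL,
        "temperature": temperature,
        "messages": [
            # Illustrative system prompt reminding the model of its
            # 2023-10-01 knowledge cutoff, per Mistral's recommendation.
            {"role": "system",
             "content": "You are Mistral Small 3.2, last updated on 2023-10-01."},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("Write a haiku about local LLMs.")
print(json.dumps(payload, indent=2))
# POST this to http://localhost:8000/v1/chat/completions,
# e.g. requests.post(url, json=payload).
```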
Running on GPU in fp16/bf16 requires roughly 55 GB of GPU memory. Alternatively, you can use the transformers library with minimal code changes.
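A quick back-of-the-envelope check makes these memory figures plausible (a sketch only; the real footprint also depends on KV cache, context length, and runtime buffers):

```python
def model_memory_gb(n_params: float, bits_per_param: int,
                    overhead: float = 1.0) -> float:
    """Rough weight-only memory estimate in GB (decimal). Activations,
    KV cache, and runtime buffers come on top of this."""
    return n_params * (bits_per_param / 8) * overhead / 1e9

# 24B parameters in fp16/bf16: ~48 GB of weights alone, consistent
# with the ~55 GB total once runtime overhead is included.
print(model_memory_gb(24e9, 16))  # 48.0
# A 4-bit quantization shrinks the weights to ~12 GB, which is why a
# 16 GB machine becomes feasible.
print(model_memory_gb(24e9, 4))   # 12.0
```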
With its 24 billion parameters, Mistral Small remains one of the most accessible yet capable open-source LLMs, striking a balance between performance and hardware requirements. Version 3.2's refinements make it an even stronger candidate for instruction-driven workflows, function calling, and code generation.
As the open-source community continues to push boundaries, having a dependable, locally runnable model like Mistral Small 3.2 is invaluable for both hobbyists and enterprise teams.