Microsoft recently unveiled Phi‑4‑mini‑flash‑reasoning, a compact yet powerful AI model designed for advanced mathematical and logical reasoning in constrained environments like mobile and edge devices. This 3.8 billion‑parameter transformer delivers next‑generation speed and efficiency while retaining strong reasoning capabilities. (Microsoft Azure)
Phi‑4‑mini‑flash‑reasoning isn’t just fast—it’s also accurate:
| Benchmark | Phi‑4‑mini‑flash‑reasoning | Phi‑4‑mini‑reasoning | Larger models (approx.) |
|---|---|---|---|
| AIME24 | 52.29% | 48.13% | ~53–55% |
| AIME25 | 33.59% | 31.77% | – |
| Math500 | 92.45% | 91.20% | ~92–93% |
| GPQA‑Diamond | 45.08% | 44.51% | ~47–49% |
These results show it rivals much larger models in mathematical and graduate‑level problem‑solving (Hugging Face).
Phi‑4‑mini‑flash‑reasoning is available now on Azure AI Foundry and Hugging Face.
Model cards, code samples, and a technical paper are available for deeper exploration. The model integrates with existing inference frameworks such as vLLM, and it supports Flash‑Attention along with common tools like PyTorch and Hugging Face Transformers (Hugging Face).
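As a concrete illustration, inference through Hugging Face Transformers might look like the following. This is a minimal sketch, not official sample code: the model id `microsoft/Phi-4-mini-flash-reasoning` reflects the public model card naming, the `solve` helper is hypothetical, and running it assumes enough memory for a 3.8 billion‑parameter model.

```python
# Sketch: querying Phi-4-mini-flash-reasoning via Hugging Face Transformers.
# Assumption: model id "microsoft/Phi-4-mini-flash-reasoning" and sufficient
# GPU/CPU memory; "solve" is an illustrative helper, not part of any API.

def build_messages(problem: str) -> list:
    """Wrap a math problem in the chat format the tokenizer's template expects."""
    return [{"role": "user", "content": problem}]

def solve(problem: str, max_new_tokens: int = 512) -> str:
    # Heavy imports are deferred so the sketch can be read without the deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-4-mini-flash-reasoning"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    # Apply the model's chat template and generate a reasoning trace + answer.
    input_ids = tokenizer.apply_chat_template(
        build_messages(problem), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

The same chat‑message format works unchanged when the model is served through a vLLM OpenAI‑compatible endpoint, which is part of what makes the integration seamless.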
Microsoft emphasizes trust and safety, aligning the model with methods such as Supervised Fine‑Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF). This work reflects Microsoft's broader AI principles: accountability, transparency, fairness, privacy, and security (Microsoft Azure).
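To make one of these alignment methods concrete, here is a sketch of the per‑pair DPO objective. This is an illustrative implementation of the published DPO loss in general, not Microsoft's training code; the function name and scalar inputs are hypothetical.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Direct Preference Optimization loss for one preference pair (a sketch).

    logp_* are summed token log-probabilities of the preferred (chosen) and
    dispreferred (rejected) responses under the policy being trained;
    ref_logp_* are the same quantities under a frozen reference model.
    beta controls how strongly preferences are enforced.
    """
    # Margin: how much more the policy favors the chosen response than the
    # reference model does, minus the same quantity for the rejected response.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the margin: the loss shrinks as the policy
    # learns to prefer the chosen response relative to the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Training minimizes this loss over many human‑labeled preference pairs, which steers the model toward preferred (safer, more helpful) responses without a separate reward model.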