Alibaba’s Qwen team has released Qwen3-Coder-Next, a new open-weight coding model that achieves remarkable efficiency through an ultra-sparse Mixture-of-Experts (MoE) architecture — activating only 3 billion of its 80 billion total parameters per forward pass. Released in February 2026 under the Apache 2.0 license, the model is purpose-built for coding agents and local development, scoring 70.6% on SWE-Bench Verified while delivering throughput comparable to models with 10–20× more active parameters.
Qwen3-Coder-Next is built on the Qwen3-Next-80B-A3B-Base foundation, which introduces a hybrid attention and MoE design that dramatically reduces inference cost without sacrificing capability. Its 48-layer architecture follows a specific layout: 12 repeating blocks, each containing three Gated DeltaNet layers followed by one Gated Attention layer, with every layer paired with an MoE block.
The MoE configuration is notably sparse: out of 512 total experts, only 10 are activated per token (plus 1 shared expert), with a compact expert intermediate dimension of just 512. The attention layers use 16 query heads with only 2 key-value heads (grouped query attention), keeping memory bandwidth low during inference. The model natively supports a 256K token context window (262,144 tokens), making it well-suited for large codebases and long agentic reasoning chains.
| Specification | Detail |
|---|---|
| Total Parameters | 80B (79B non-embedding) |
| Activated Parameters | 3B per token |
| Architecture | Hybrid Attention + MoE |
| Context Length | 262,144 tokens |
| Total Experts | 512 (10 activated + 1 shared) |
| License | Apache 2.0 |
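
The layer layout and sparsity figures above can be sketched in a few lines of Python (the names are illustrative, not actual config keys from the released checkpoint):

```python
# Sketch of the 48-layer hybrid layout: 12 repeating blocks of
# [DeltaNet, DeltaNet, DeltaNet, Attention], each layer paired with an MoE block.
NUM_BLOCKS = 12
BLOCK_PATTERN = ["gated_deltanet"] * 3 + ["gated_attention"]
layers = BLOCK_PATTERN * NUM_BLOCKS

assert len(layers) == 48
print(layers.count("gated_deltanet"))   # 36 linear-attention layers
print(layers.count("gated_attention"))  # 12 full-attention layers

# Expert-level sparsity: 10 routed + 1 shared expert active out of 512 routed.
experts_active = 10 + 1
expert_fraction = experts_active / 512
print(f"{expert_fraction:.1%} of experts active per token")
```

The takeaway: three-quarters of the layers use linear-time DeltaNet attention, and only about 2% of the routed experts fire per token, which is how the model keeps per-token compute so low.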
What sets Qwen3-Coder-Next apart from standard instruction-tuned models is its agentic training methodology. Rather than relying on parameter scaling alone, the Qwen team built around 800,000 verifiable tasks paired with executable environments, enabling the model to learn from real environment interactions and reinforcement learning signals.
This training approach focuses on skills critical for real-world agent use: long-horizon reasoning, complex tool usage, and recovery from execution failures. The result is a model that handles dynamic, multi-step coding tasks — not just isolated completions — making it well-suited for integration into CLI and IDE-based agent frameworks such as Claude Code, Qwen Code, Cline, Kilo, and Trae.
On SWE-Bench Verified (using the SWE-Agent scaffold), Qwen3-Coder-Next scores 70.6%, placing it competitively among frontier coding agents. It also achieves 62.8% on SWE-Bench Multilingual and 44.3% on the more demanding SWE-Bench Pro, all while activating just 3B parameters — a fraction of what competing models require.
The practical implications of Qwen3-Coder-Next’s architecture are significant. By activating only 3B parameters per forward pass, the model achieves roughly 10× higher throughput compared to dense models of equivalent quality — a major advantage for developers running local inference or managing agent fleets at scale.
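
A quick back-of-envelope calculation makes the efficiency claim concrete (a sketch; real throughput also depends on memory bandwidth, batch size, and kernel efficiency):

```python
# Fraction of weights touched per forward pass: 3B active of 80B total.
TOTAL_PARAMS_B = 80
ACTIVE_PARAMS_B = 3

pct_active = 100 * ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"{pct_active}% of parameters active per token")  # 3.75%

# Naive compute ratio vs. a dense model of the same total size:
speedup = TOTAL_PARAMS_B / ACTIVE_PARAMS_B
print(f"~{speedup:.0f}x fewer weight FLOPs per token")
```

In practice the realized speedup is lower than the raw FLOP ratio, since all 80B parameters must still reside in memory and routing adds overhead, but the per-token compute reduction is what drives the throughput advantage.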
The model is deployable via popular serving frameworks. With SGLang (v0.5.8+) and vLLM (v0.15.0+), Qwen3-Coder-Next can be served with tool-calling support enabled out of the box using the `qwen3_coder` parser. Four model variants are available on Hugging Face, including quantized (FP8) and base versions, with AMD Instinct GPU support available from day one.
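
A launch might look like the following. This is a sketch: the Hugging Face repo ID is a placeholder, and flag names follow current vLLM/SGLang conventions, so check your installed version's documentation:

```shell
# vLLM (v0.15.0+), OpenAI-compatible server with tool calling enabled.
# "Qwen/Qwen3-Coder-Next" is an assumed repo ID -- substitute the actual one.
vllm serve Qwen/Qwen3-Coder-Next \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

# SGLang (v0.5.8+) equivalent:
python -m sglang.launch_server \
  --model-path Qwen/Qwen3-Coder-Next \
  --tool-call-parser qwen3_coder
```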
Recommended inference parameters are `temperature=1.0`, `top_p=0.95`, and `top_k=40`. The model operates in non-thinking mode only — there are no `<think>` reasoning blocks — keeping outputs clean and directly usable by agent scaffolds.
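
With those settings, a chat request to an OpenAI-compatible endpoint (as exposed by vLLM or SGLang) might be assembled like this; the model name and prompt are illustrative placeholders:

```python
# Request body using the recommended decoding parameters.
# POST this to {base_url}/v1/chat/completions on your running server.
payload = {
    "model": "Qwen3-Coder-Next",  # placeholder; use your served model name
    "messages": [
        {"role": "user", "content": "Refactor this function to remove the nested loops."}
    ],
    # Recommended sampling settings:
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 40,  # accepted as an extra body field by vLLM/SGLang servers
}
print(payload["temperature"], payload["top_p"], payload["top_k"])
```

Because there are no `<think>` blocks to strip, the assistant message content can be passed straight to an agent scaffold without post-processing.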
For the open-source AI community, Qwen3-Coder-Next represents an important step toward making frontier-grade coding agents viable on consumer and prosumer hardware. Its Apache 2.0 license permits commercial use for enterprises and indie developers alike, lowering the barrier to building sophisticated, locally hosted AI coding workflows.
