On April 29, 2025, Alibaba’s Qwen team released Qwen3, adding switchable reasoning to its open-weight lineup. The family included six dense models (0.6B, 1.7B, 4B, 8B, 14B, 32B) and two mixture-of-experts models: Qwen3-30B-A3B (30 billion total, 3 billion activated) and the flagship Qwen3-235B-A22B (235 billion total, 22 billion activated).
The defining feature was a built-in choice between two behaviors. In thinking mode “the model takes time to reason step by step before delivering the final answer,” ideal for hard problems, while in non-thinking mode “the model provides quick, near-instant responses, suitable for simpler questions where speed is more important than depth.” Users could switch dynamically with /think and /no_think commands, and the team said performance improvements were “directly correlated with the computational reasoning budget allocated,” letting developers trade cost against depth.
Qwen3 mattered because it folded the reasoning-model paradigm into a widely used open family with a simple per-query toggle, mirroring the hybrid approach of closed models like Claude 3.7 Sonnet. It made budgeted inference-time reasoning available to anyone running open weights.