Think Deeper, Act Faster with Qwen3: Open-Weight Models with Advanced Features
The Qwen Team has launched Qwen3, the latest in their large language model series, offering competitive performance in coding, math, and general tasks. With open-weight models, hybrid thinking modes, and support for 119 languages, Qwen3 aims to advance global AI research and development. This release enhances efficiency and versatility, empowering researchers and developers.
Open-Weight Models
Qwen3 includes two Mixture of Experts (MoE) models, Qwen3-235B-A22B (235 billion total parameters, 22 billion activated) and Qwen3-30B-A3B (30 billion total, 3 billion activated), along with six dense models: Qwen3-32B, 14B, 8B, 4B, 1.7B, and 0.6B. All are released as open weights under the Apache 2.0 license. Qwen3-30B-A3B outperforms QwQ-32B with roughly ten times fewer activated parameters, and Qwen3-4B rivals Qwen2.5-72B-Instruct. Both post-trained models, such as Qwen3-30B-A3B, and their pre-trained base versions are available on Hugging Face, ModelScope, and Kaggle.
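For readers who want to try the checkpoints, the sketch below loads a Qwen3 model with the Hugging Face transformers library; the model ID, precision, and generation settings are illustrative assumptions, not prescriptions from the Qwen team.

# Minimal sketch: load an open-weight Qwen3 checkpoint with transformers.
# The model ID and settings here are illustrative choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest dense model; swap in e.g. "Qwen/Qwen3-30B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the precision stored in the checkpoint
    device_map="auto",    # spread weights across available devices
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))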
Hybrid Thinking Modes
Qwen3 features two problem-solving modes: Thinking Mode, for step-by-step reasoning on complex tasks, and Non-Thinking Mode, for near-instant responses to simple queries. Users can also switch modes mid-conversation with “/think” and “/no_think” prompts, and combining the two modes gives stable control over the thinking budget: performance scales with the reasoning compute allocated, so cost and inference quality can be balanced per task.
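As a sketch of both switches, the snippet below assumes the enable_thinking flag that the Qwen3 chat templates on Hugging Face expose through apply_chat_template (an interface not described in this article) and reuses the tokenizer and model objects from the loading example above.

# Hard switch: turn off step-by-step reasoning for a quick factual answer.
messages = [{"role": "user", "content": "What is the capital of Indonesia?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,   # Non-Thinking Mode (assumed template flag)
    return_tensors="pt",
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

# Soft switch: with thinking enabled by default, append /think or /no_think to a
# user turn to steer the behavior for that turn in a multi-turn conversation.
messages = [{"role": "user",
             "content": "Prove that the sum of two even integers is even. /think"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,    # Thinking Mode: expect a longer, reasoned response
    return_tensors="pt",
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))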
119 Languages Supported
Qwen3 supports 119 languages and dialects, including Indo-European (e.g., English, Russian), Sino-Tibetan (e.g., Chinese), Afro-Asiatic (e.g., Arabic), Austronesian (e.g., Indonesian), Dravidian (e.g., Tamil), Turkic (e.g., Turkish), and others like Japanese and Swahili. This multilingual capability supports global applications, from chatbots to research.
Robust Pretraining
Qwen3’s pretraining dataset spans 36 trillion tokens across 119 languages, nearly double Qwen2.5’s 18 trillion. It includes web content, PDF-extracted text via Qwen2.5-VL, and synthetic math and coding data from Qwen2.5-Math and Qwen2.5-Coder. The three-stage process involved: (1) pretraining on over 30 trillion tokens at a 4K context length to build basic language skills and general knowledge; (2) an additional roughly 5 trillion tokens with a higher proportion of knowledge-intensive data such as STEM, coding, and reasoning material; and (3) high-quality long-context data to extend the context window to 32K tokens.
Post-Training Pipeline
Qwen3’s four-stage post-training process enables hybrid capabilities: (1) a long chain-of-thought cold start on diverse reasoning data spanning math, coding, logic, and STEM; (2) reasoning-focused reinforcement learning with rule-based rewards; (3) thinking-mode fusion, which blends instruction-following and other non-thinking abilities into the reasoning model; and (4) general reinforcement learning across a broad range of everyday tasks to strengthen overall capabilities and correct undesired behaviors.
Qwen3’s open-weight models, hybrid thinking, and multilingual support empower researchers and developers. Open-sourcing fosters collaboration, driving AI innovation. Explore Qwen3 on Qwen Chat or Hugging Face to see its potential.
Future Plans
Qwen3 is a step toward AGI and ASI, with plans to refine architectures, scale data, increase model size, extend context length, broaden modalities, and advance reinforcement learning for long-horizon reasoning. The focus will shift to training agents with environmental feedback.
About the Author
Mia Cruz
Mia Cruz is an AI news correspondent based in the United States of America.