Meta Launches Llama 4: Smarter, Faster, Multimodal AI for Everyone

Liang Wei


Updated: April 7, 2025

Recently, Meta announced the release of Llama 4, its next-generation suite of open-weight, multimodal AI models designed to power more personalized and intelligent experiences. The first two models in the series, Llama 4 Scout and Llama 4 Maverick, are now available for download, while two more advanced models, Llama 4 Behemoth and Llama 4 Reasoning, remain in training.

Introducing Llama 4 Scout and Llama 4 Maverick

Built on a mixture-of-experts (MoE) architecture, these models balance efficiency and performance:

Llama 4 Scout (17B active parameters, 16 experts) is optimized for speed and efficiency: it fits on a single NVIDIA H100 GPU (with Int4 quantization) while supporting an industry-leading 10 million token context window. It outperforms comparable models like Gemma 3 and Mistral 3.1 in reasoning, coding, and multimodal tasks.
Llama 4 Maverick (17B active parameters, 128 experts) rivals GPT-4o and Gemini 2.0 Flash in benchmarks and is much smaller and more efficient than DeepSeek v3, delivering strong performance in reasoning, coding, and image understanding at a fraction of the computational cost.
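
To put the single-GPU claim for Scout in perspective, here is a rough back-of-the-envelope sketch. It rests on two figures that are assumptions here rather than numbers from this article: roughly 109B total parameters for Scout and 80 GB of memory on one H100. It counts weight storage only and ignores the KV cache, activations, and runtime overhead, so treat it as an illustration, not a deployment guide.

```python
# Back-of-the-envelope estimate of weight storage at different precisions.
# ASSUMPTIONS (not stated in the article): ~109B total parameters for
# Llama 4 Scout and 80 GB of memory on a single H100.
TOTAL_PARAMS = 109e9
GPU_MEMORY_GB = 80

# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


def fits_on_gpu(total_params: float, precision: str,
                gpu_memory_gb: float = GPU_MEMORY_GB) -> bool:
    """True if the weights alone fit within the given GPU memory."""
    needed = weight_memory_gb(total_params, BYTES_PER_PARAM[precision])
    return needed <= gpu_memory_gb


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
        verdict = "fits" if fits_on_gpu(TOTAL_PARAMS, precision) else "does not fit"
        print(f"{precision}: {gb:.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

The estimate uses total rather than active parameters because every expert in an MoE model has to be resident in memory, even though only some of them run for a given token.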

Both models are natively multimodal, integrating text and vision seamlessly through early fusion, enabling richer interactions with images and videos.

The Power Behind Llama 4: Llama 4 Behemoth

The smaller models were distilled from Llama 4 Behemoth, an MoE model with 288B active parameters that is still in training. According to Meta, early benchmarks show it surpassing GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused tasks. Meta plans to share more details as training progresses.

Key Innovations

Efficiency & Scalability: The MoE architecture activates only a subset of parameters for each token, reducing inference costs.
Extended Context: Llama 4 Scout’s 10M-token support enables deep analysis of long documents, codebases, and user activity logs.
Improved Training Techniques: Meta introduced MetaP, a method for optimizing hyperparameters, and refined reinforcement learning to enhance reasoning and coding abilities.
Bias Mitigation: Llama 4 shows reduced political bias compared to Llama 3, with refusal rates on debated topics dropping from 7% to under 2%.
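
To make the mixture-of-experts idea concrete, the following is a minimal sketch of generic top-k expert routing for a single token. It is not Meta's implementation: the article does not describe Llama 4's router, and the names moe_forward and make_expert, along with the toy experts that simply scale their input, are hypothetical stand-ins chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert feed-forward network."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, router_weights, experts, top_k=1):
    """Route one token vector to its top_k experts and mix their outputs.

    Returns (output vector, indices of the experts that actually ran).
    """
    # Router: one dot-product score per expert, turned into probabilities.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm  # renormalize gates over the chosen experts
        y = experts[i](x)
        output = [o + gate * v for o, v in zip(output, y)]
    return output, chosen


if __name__ == "__main__":
    random.seed(0)
    dim, num_experts = 4, 16
    experts = [make_expert(scale=i + 1) for i in range(num_experts)]
    router = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
    token = [0.5, -1.0, 0.25, 2.0]
    out, used = moe_forward(token, router, experts, top_k=1)
    print(f"experts run for this token: {len(used)} of {num_experts}")
```

With top_k=1 and 16 experts, only one expert's computation is performed per token, which is where the savings come from: compute scales with the active parameters rather than the total.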

Safety & Accessibility

Meta continues its commitment to open AI development, releasing safeguards like:

Llama Guard (input/output safety filtering)
Prompt Guard (jailbreak detection)
CyberSecEval (cybersecurity risk assessment)

Developers can download Llama 4 Scout and Maverick today on llama.com and Hugging Face, with integrations rolling out across WhatsApp, Messenger, Instagram, and Meta.AI.

Download the models and explore the future of AI.


About the Author

Liang Wei

Liang Wei is our AI correspondent from China