DeepSeek V3.1: Entering the Agent Era with Hybrid Inference

Ryan Chen

Translate this article

Updated:

August 23, 2025

DeepSeek has announced the release of DeepSeek-V3.1, presenting it as its first step into what the company calls the “agent era.” The update builds on the earlier V3 model while focusing on efficiency, long-context handling, and stronger agent-style capabilities.

Hybrid Inference: Two in One

DeepSeek-V3.1 introduces a dual-mode design:

Non-Think mode for direct, fast responses.
Think mode for more deliberate reasoning.

Users can toggle between these modes through the new “DeepThink” button, tailoring the model’s behavior to different task needs.

Performance and Agent Skills

According to the release, DeepSeek-V3.1-Think delivers answers faster than its predecessor DeepSeek-R1-0528. Post-training improvements are also highlighted, with the model showing stronger tool-use capabilities and greater ability to complete multi-step agent tasks.

The company reports better benchmark results, including improved scores on SWE and Terminal-Bench, and more efficient reasoning in complex search tasks.

API and Developer Updates

Developers gain access to two API endpoints:

deepseek-chat (non-thinking mode)
deepseek-reasoner (thinking mode)
Both support 128K context, strict function calling (currently in Beta), and compatibility with the Anthropic API format.
Pricing remains unchanged until September 5, 2025 (16:00 UTC), after which off-peak discounts will end.

Model and Open-Source Availability

DeepSeek states that V3.1 builds on 840 billion tokens of continued pretraining for long-context extension. Updates include a new tokenizer and chat template.

For researchers and developers, open-source weights are available on Hugging Face:

While DeepSeek is a relatively new player in a field already populated by established names, V3.1 represents a clear move toward positioning itself in the AI agent landscape. With hybrid inference, extended context, and open-source availability, the release gives both users and developers an opportunity to experiment with agent-oriented workflows.

airesearch and innovation

About the Author

Ryan Chen

Ryan Chan is an AI correspondent from Chain.