Google Unveils Gemini 2.5 Computer Use: A Pro Model Built for UI Agent Control
Google has introduced Gemini 2.5 Computer Use, a specialized version of its Gemini 2.5 Pro model built for agents that interact directly with user interfaces on web and mobile platforms. The model posts strong results on web control benchmarks and runs with lower latency than competing systems.
Key Capabilities
How It Functions
Gemini 2.5 Computer Use runs in a loop: it captures a screenshot, reviews the history of prior actions, predicts the next action, executes it, and then repeats the cycle with a fresh screenshot. Its action library covers standard UI behaviors such as typing, scrolling, selecting from dropdown menus, and handling logins. For sensitive operations, it requests explicit user confirmation before proceeding.
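The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation or the Gemini API: the names Action, FakeBrowser, run_agent, and scripted_predict are hypothetical, the "model" is a scripted stand-in, and the "browser" merely records which actions it was asked to run.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical action record; not the real API's schema."""
    name: str                          # e.g. "type", "scroll", "done"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # set for sensitive operations


@dataclass
class FakeBrowser:
    """Stand-in environment; a real agent would drive a browser here."""
    log: list = field(default_factory=list)

    def screenshot(self) -> bytes:
        # Placeholder bytes that change as more actions are executed.
        return f"screen-after-{len(self.log)}-actions".encode()

    def execute(self, action: Action) -> None:
        self.log.append(action.name)


def run_agent(predict, env, confirm, max_steps=10):
    """Capture, predict, (confirm), execute, repeat until 'done'."""
    history = []
    for _ in range(max_steps):
        shot = env.screenshot()              # 1. capture the screen
        action = predict(shot, history)      # 2. predict from screen + history
        if action.name == "done":
            break
        # 3. sensitive actions run only with explicit user approval
        if action.needs_confirmation and not confirm(action):
            history.append((action, "declined"))
            continue
        env.execute(action)                  # 4. execute the action
        history.append((action, "executed"))
    return history


def scripted_predict(shot, history):
    """Scripted stand-in for the model: walks through a fixed plan."""
    plan = [
        Action("type", {"text": "user@example.com"}),
        Action("submit_login", needs_confirmation=True),
        Action("scroll", {"direction": "down"}),
    ]
    return plan[len(history)] if len(history) < len(plan) else Action("done")
```

Calling run_agent(scripted_predict, FakeBrowser(), confirm=lambda a: False) walks through the plan but skips the login submission, because the confirmation callback declines it. Recording declined actions in the history is a design choice of this sketch, so that the predictor can see the refusal and move on instead of proposing the same step again.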
Benchmark Results
Testing shows Gemini 2.5 Computer Use outperforming competing models in several key areas.
Early Applications
Google has already deployed the model internally in projects such as UI testing, Firebase Agent, Project Mariner, and AI Mode. Early adopters such as Poke.com, Autotab, and the Google Payments team are also using it to improve automation reliability and recovery rates.
Availability
Gemini 2.5 Computer Use is now available in public preview through the Gemini API in Google AI Studio and Vertex AI. Developers can experiment with it in Browserbase or build custom agents with Playwright, either locally or in cloud-based setups.