
Meta’s New AI Model Can Isolate Any Object in an Image or Video—Just Ask It

Ryan Chen


Updated:
November 21, 2025

Telling a computer to find a specific object in a photo or video clip has always required technical skill. Whether for a creative project, an e-commerce listing, or scientific research, precisely isolating elements has been a complex task. Meta’s latest release, the Segment Anything Model 3 (SAM 3), is designed to change that by making advanced visual recognition as simple as typing a phrase.


Announced alongside an interactive demo website called the Segment Anything Playground, SAM 3 allows users to identify, outline, and track objects using intuitive prompts. In a significant move for open science, Meta is also publicly releasing the model's core components for researchers and developers.


Moving Beyond a Fixed List of Objects

Unlike earlier systems trained to recognize a limited set of common items, SAM 3 understands a vast vocabulary. This shift allows it to find nuanced objects based on natural language.

  1. Text and Image Prompts: You can ask it to find "the red umbrella" or "a person sitting on a bench" using a text prompt. Alternatively, you can provide an example image of an object and have SAM 3 find all similar instances in your media.
  2. Unified Model for Multiple Tasks: SAM 3 is a single model that handles detection (finding the object), segmentation (outlining its precise shape), and tracking (following it across a video); a rough sketch of what that unified output could look like follows this list.
  3. Performance Leap: Meta reports that SAM 3 delivers a twofold performance gain over existing systems on its new benchmark for this type of task.
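To make the unified detect/segment/track idea concrete, here is a minimal, self-contained Python sketch. The `TrackedObject` structure and the `find_all` function are hypothetical stand-ins invented for this example, not Meta's actual SAM 3 API; they only illustrate what a concept-prompt query over a short clip could return.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class TrackedObject:
    """Hypothetical result record combining detection, segmentation, and tracking."""
    object_id: int                  # stable identity across frames (tracking)
    frame_index: int                # which frame of the clip this entry describes
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) bounding box (detection)
    mask: np.ndarray                # boolean pixel mask of the object's shape (segmentation)
    score: float                    # model confidence for this instance

def find_all(prompt: str, frames: List[np.ndarray]) -> List[TrackedObject]:
    """Stand-in for a concept-prompt query such as 'the red umbrella'.

    A real model would return every matching instance in every frame; this stub
    returns an empty list so the example stays runnable without model weights.
    """
    return []

# Usage: query a short clip and inspect the per-frame, per-object results.
frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(8)]
for obj in find_all("a person sitting on a bench", frames):
    print(obj.object_id, obj.frame_index, obj.box, obj.score)
```

The point of the sketch is the shape of the output: one record per object per frame, so a single query covers finding, outlining, and following an object at the same time.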

From the Lab to Real-World Applications

This technology is already being integrated into practical tools and features:

  1. Simplified Video Editing: In Instagram’s Edits app, creators will soon be able to apply dynamic effects to specific people or objects with a single tap, streamlining a traditionally complex process.
  2. Enhanced Shopping: The new "View in Room" feature on Facebook Marketplace, powered by SAM 3 and its 3D counterpart, helps shoppers visualize how home decor items will look and fit in their actual space.
  3. Scientific Research: In partnership with conservation groups, Meta used SAM 3 to help create a public dataset of wildlife videos, with every animal in every frame identified and outlined—a valuable resource for ecologists.

Try It Yourself with the Segment Anything Playground

For those without a technical background, the easiest way to experience this technology is through the new Segment Anything Playground.

  1. User-Friendly Interface: The platform allows anyone to upload an image or video and experiment with prompts.
  2. Practical and Creative Templates: Users can quickly try pre-set actions like pixelating faces or license plates, or apply fun effects like motion trails and spotlights to specific objects.
  3. No Expertise Needed: The site is designed to be the simplest way for the public to interact with and understand the capabilities of cutting-edge visual AI.

For developers, the release of the model's building blocks, training code, and datasets provides a powerful foundation for building new applications and conducting further research. While challenges remain—such as handling highly specialized terminology—SAM 3 represents a significant leap toward making sophisticated visual analysis an intuitive and accessible tool for everyone.
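As a hedged illustration of the kind of application a developer could build on top of these released components, the sketch below applies a face- or license-plate-style blur to the region covered by a segmentation mask. The mask here is synthetic, the helper name `blur_masked_region` is made up for this example, and in practice the mask would come from a model’s output rather than being drawn by hand.

```python
import numpy as np
import cv2  # OpenCV, installable with `pip install opencv-python`

def blur_masked_region(image: np.ndarray, mask: np.ndarray, ksize: int = 31) -> np.ndarray:
    """Blur only the pixels covered by a boolean segmentation mask.

    `image` is an HxWx3 uint8 frame; `mask` is an HxW boolean array such as one
    a segmentation model might produce for "a face" or "a license plate".
    """
    blurred = cv2.GaussianBlur(image, (ksize, ksize), 0)  # ksize must be odd
    out = image.copy()
    out[mask] = blurred[mask]                             # replace masked pixels only
    return out

# Demo with synthetic data; a real pipeline would feed in model-produced masks per frame.
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
mask = np.zeros((480, 640), dtype=bool)
mask[100:200, 150:300] = True
anonymized = blur_masked_region(frame, mask)
```

Because a mask pins down the exact pixels an object occupies, the same pattern extends to effects like the spotlights or motion trails mentioned above, applied only inside the masked region.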

AI · Research and Innovation · Emerging Trends

About the Author

Ryan Chen

Ryan Chen is an AI correspondent from China.

