Introducing Stable Virtual Camera: Multi-View Video Generation with 3D Camera Control
Stability AI has announced the release of Stable Virtual Camera, a multi-view diffusion model that transforms 2D images into immersive 3D videos with realistic depth and perspective, without complex reconstruction or scene-specific optimization. Stable Virtual Camera combines the familiar control of traditional virtual cameras with the power of generative AI to deliver more precise, intuitive control over 3D video outputs.
Unlike traditional 3D models, Stable Virtual Camera generates novel views of a scene from one or more input images at user-specified camera angles, delivering seamless trajectory videos across dynamic camera paths.
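To make the idea of a user-specified camera path concrete, here is a minimal sketch of how an orbit trajectory could be described as a list of camera poses. It is illustrative only and is not the model's input format: the function names `look_at` and `orbit_path`, the OpenGL-style axis convention, and the default radius and height are all assumptions made for this example.

```python
import math


def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Build a 4x4 camera-to-world matrix as nested lists.

    Assumed convention (not taken from the release): the camera looks
    down its local -Z axis, with +X to the right and +Y up.
    The view direction must not be parallel to `up`.
    """
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def cross(a, b):
        return [
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0],
        ]

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    forward = normalize(sub(target, eye))
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)  # unit length: right and forward are orthonormal
    back = [-f for f in forward]

    # Columns are the camera's right, up, and back axes, then its position.
    return [
        [right[0], true_up[0], back[0], eye[0]],
        [right[1], true_up[1], back[1], eye[1]],
        [right[2], true_up[2], back[2], eye[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def orbit_path(num_views, radius=2.0, height=0.5, target=(0.0, 0.0, 0.0)):
    """Return `num_views` poses evenly spaced on a circle around `target`."""
    poses = []
    for i in range(num_views):
        angle = 2.0 * math.pi * i / num_views
        eye = [
            target[0] + radius * math.sin(angle),
            target[1] + height,
            target[2] + radius * math.cos(angle),
        ]
        poses.append(look_at(eye, target))
    return poses


poses = orbit_path(num_views=8)
print(len(poses), "camera poses on a 360-degree orbit")
```

Each pose is one requested viewpoint. A different path, such as a spiral or a dolly move, would only change how `eye` is computed for each step.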
Capabilities
Stable Virtual Camera offers advanced capabilities for generating 3D videos, including:
Research and Model Architecture
Stable Virtual Camera has achieved state-of-the-art results on novel view synthesis (NVS) benchmarks, outperforming models like ViewCrafter and CAT3D. Its success lies in its multi-view diffusion model architecture, which is trained with a fixed sequence length but can accommodate variable input and output lengths during sampling. This flexibility comes from a two-pass procedural sampling process, which keeps results smooth and consistent.
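One way to picture how a model with a fixed sequence length can serve a longer list of requested views is to plan the work in two passes: first generate a sparse set of anchor views, then fill in the views between them in small chunks. The sketch below only plans which frame indices go into each pass. It is an assumption-laden illustration, not the released sampler: the function name `plan_two_pass`, the even spacing of anchors, and the choice to condition each chunk on its two neighbouring anchors are all hypothetical.

```python
def plan_two_pass(num_targets, num_inputs, context_len):
    """Plan anchor and fill-in passes for a fixed-length model.

    num_targets: number of views requested along the camera path
    num_inputs:  number of input images supplied by the user
    context_len: views the model handles in one forward pass (fixed)

    Returns a dict with the pass-1 anchor indices and the pass-2 chunks.
    This is a hypothetical schedule for illustration only.
    """
    slots = context_len - num_inputs  # target views that fit beside the inputs
    if slots < 2 or context_len < 3:
        raise ValueError("context_len too small for this planning scheme")

    # Everything fits in one pass: no second pass is needed.
    if num_targets <= slots:
        return {"anchors": list(range(num_targets)), "chunks": []}

    # Pass 1: evenly spaced anchors, always including the first and last view.
    anchors = [(i * (num_targets - 1)) // (slots - 1) for i in range(slots)]

    # Pass 2: fill the gaps between neighbouring anchors. Each chunk holds
    # two conditioning anchors plus at most (context_len - 2) new views.
    max_new = context_len - 2
    chunks = []
    for left, right in zip(anchors, anchors[1:]):
        between = list(range(left + 1, right))
        for start in range(0, len(between), max_new):
            chunks.append({
                "condition": [left, right],
                "generate": between[start:start + max_new],
            })
    return {"anchors": anchors, "chunks": chunks}


plan = plan_two_pass(num_targets=20, num_inputs=1, context_len=8)
print("pass 1 anchors:", plan["anchors"])
for chunk in plan["chunks"]:
    print("pass 2:", chunk["generate"], "conditioned on", chunk["condition"])
```

Because every fill-in chunk shares its endpoints with views produced in the first pass, neighbouring chunks are tied to the same anchors, which is one plausible reason a two-pass scheme would help consistency over long paths.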
Limitations
Stable Virtual Camera is at an early stage, so it may produce lower-quality results in certain scenarios. Input images featuring humans, animals, or dynamic textures like water may lead to degraded outputs.
Availability
Stable Virtual Camera is available for research use under a Non-Commercial License. You can download the weights on Hugging Face (https://huggingface.co/stabilityai/stable-virtual-camera) and access the code on GitHub (https://github.com/Stability-AI/stable-virtual-camera).
About the Author
Liang Wei