Meta has unveiled CoTracker 3, a point tracking model capable of jointly tracking up to 70,000 points on a single GPU. Its transformer-based architecture improves accuracy even in difficult scenarios such as occlusions. CoTracker 3 is relevant to applications in augmented reality, robotics, and video analysis.
CoTracker 3 by Meta Tracks 70,000 Points
Meta has recently unveiled CoTracker 3, an advanced model for point tracking in video sequences, introduced by researchers led by Nikita Karaev. It uses a transformer-based architecture to track a large number of points simultaneously while maintaining accuracy and efficiency.
CoTracker 3 is notable for its ability to jointly track up to 70,000 points on a single GPU. Unlike traditional tracking methods that follow each point independently, CoTracker 3 considers the relationships between different points, which significantly improves performance. The model can keep tracking points even when they are occluded or leave the camera’s view, a substantial advance in computer vision.
The core insight behind CoTracker 3 is its emphasis on joint tracking. By accounting for the dependencies among tracked points, the model can maintain continuity and accuracy in challenging scenarios. This innovative approach contrasts sharply with older models that often struggle with occlusions or when points exit the frame.
Meta has released a demo on Hugging Face.
CoTracker 3 operates using a transformer architecture that processes video frames in overlapping short windows. It employs techniques like token proxies to enhance memory efficiency and speed. The model is trained using a semi-supervised approach, generating pseudo-labels from real videos to improve its training data without extensive human annotation. This allows CoTracker 3 to learn effectively from smaller datasets while still achieving high performance.
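To make the idea of overlapping short windows concrete, here is a minimal sketch in plain Python of how a video's frame indices could be split into such windows. It is an illustration only, not Meta's implementation: the function name `overlapping_windows`, the window length of 16 frames, and the half-window overlap are all assumptions made for this example.

```python
from typing import Iterator, Tuple


def overlapping_windows(num_frames: int, window: int = 16) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame ranges, end exclusive.

    Assumption for this sketch: consecutive windows share half their frames,
    so information from one window can be carried into the next.
    """
    if window < 2 or window % 2:
        raise ValueError("window must be an even number >= 2")
    if num_frames <= 0:
        return
    stride = window // 2  # hypothetical choice: slide by half a window
    start = 0
    while True:
        end = min(start + window, num_frames)
        yield start, end
        if end >= num_frames:
            break  # the last window reaches the end of the video
        start += stride


# Example: a 40-frame clip split into 16-frame windows.
windows = list(overlapping_windows(num_frames=40, window=16))
for start, end in windows:
    print(f"frames {start}..{end - 1}")
```

Because each window only looks at a short span of frames, memory use depends on the window length rather than on the length of the whole video.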
The model is designed as an online algorithm, meaning it processes data causally, maintaining tracks over long periods even when points are temporarily obscured or outside the field of view. This capability is crucial for applications requiring long-term tracking.
Image: CoTracker successfully tracks the driver’s head even when it exits the camera view
In benchmark tests against other state-of-the-art trackers, CoTracker 3 has demonstrated superior performance on metrics such as Average Jaccard and Occlusion Accuracy. Its ability to maintain accurate tracks over extended periods sets a new standard in point tracking technology.
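For readers unfamiliar with these two metrics, the sketch below shows one common way they are computed for a single tracked point, following the definitions used in point tracking benchmarks such as TAP-Vid. It is a simplified illustration, not the official evaluation code: the function names, the pixel thresholds of 1, 2, 4, 8 and 16, and the toy data are assumptions made for this example.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def occlusion_accuracy(pred_visible: Sequence[bool], gt_visible: Sequence[bool]) -> float:
    """Fraction of frames where predicted visibility matches the ground truth."""
    if not gt_visible:
        raise ValueError("need at least one frame")
    matches = sum(p == g for p, g in zip(pred_visible, gt_visible))
    return matches / len(gt_visible)


def average_jaccard(
    pred_xy: Sequence[Point],
    gt_xy: Sequence[Point],
    pred_visible: Sequence[bool],
    gt_visible: Sequence[bool],
    thresholds: Sequence[float] = (1, 2, 4, 8, 16),  # assumed pixel thresholds
) -> float:
    """Jaccard score TP / (TP + FP + FN), averaged over distance thresholds."""
    scores = []
    for t in thresholds:
        tp = fp = fn = 0
        for (px, py), (gx, gy), pv, gv in zip(pred_xy, gt_xy, pred_visible, gt_visible):
            within = ((px - gx) ** 2 + (py - gy) ** 2) ** 0.5 < t
            if pv and gv and within:
                tp += 1  # visible in both and close enough
            else:
                if pv:
                    fp += 1  # predicted visible, but occluded or too far off
                if gv:
                    fn += 1  # truly visible, but missed or too far off
        denom = tp + fp + fn
        # Assumption: a threshold with nothing to score counts as perfect.
        scores.append(tp / denom if denom else 1.0)
    return sum(scores) / len(scores)


# Toy data for one point over four frames (made up for illustration).
gt_xy = [(10.0, 10.0), (12.0, 10.0), (14.0, 10.0), (16.0, 10.0)]
pred_xy = [(10.0, 10.0), (12.0, 13.0), (14.0, 10.0), (16.0, 10.0)]
gt_visible = [True, True, False, True]
pred_visible = [True, True, False, False]

oa = occlusion_accuracy(pred_visible, gt_visible)
aj = average_jaccard(pred_xy, gt_xy, pred_visible, gt_visible)
print(f"Occlusion Accuracy: {oa:.2f}, Average Jaccard: {aj:.2f}")
```

Occlusion Accuracy only checks whether the tracker knows when a point is hidden, while Average Jaccard rewards a tracker for getting both visibility and position right, which is why the two are usually reported together.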
The advancements brought by CoTracker 3 are vital for numerous applications requiring precise motion tracking, including robotics, augmented reality, and video analysis. By improving how we track moving objects in videos, this technology not only enhances existing capabilities but also opens new avenues for research and practical applications in computer vision.
The potential applications of CoTracker 3 are vast. Its ability to track numerous points simultaneously while maintaining high accuracy suggests it could serve as a foundational tool for future developments in areas like 3D reconstruction and complex visual tasks. The simplicity and efficiency of its design may well inspire further innovations in point tracking, making it an exciting area for current research.
This post was last modified on October 17, 2024 4:52 am