
Meta’s CoTracker 3 Sets a New Benchmark in Video Motion Tracking by Tracking 70K Points on a Single GPU

Meta has unveiled CoTracker 3, a motion tracking model capable of jointly tracking up to 70,000 points on a single GPU. Its transformer-based architecture improves accuracy in difficult scenarios such as occlusions. The model targets applications in augmented reality, robotics, and video analysis.

Meta has recently unveiled CoTracker 3, a model for tracking points through video sequences, developed by researchers led by Nikita Karaev. It utilises a transformer-based architecture that lets it track a very large number of points simultaneously while maintaining accuracy and efficiency.

What’s New:

CoTracker 3 can jointly track up to 70,000 points on a single GPU. Unlike traditional methods that track each point independently, CoTracker 3 considers the relationships between different points, which significantly improves performance. The model can keep tracking points even when they are occluded or leave the camera’s view.

Key Insight:

The core insight behind CoTracker 3 is its emphasis on joint tracking. By accounting for the dependencies among tracked points, the model can maintain continuity and accuracy in challenging scenarios. Older models that track points one at a time often struggle with occlusions or with points that exit the frame.
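To make the idea of exploiting dependencies between points concrete, here is a deliberately simple toy in Python. It is not how CoTracker 3 works internally: the model learns these relationships with a transformer, whereas this sketch just moves an occluded point by the mean displacement of the points that are still visible. The function name `propagate_occluded` and the point labels are hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def propagate_occluded(prev: Dict[str, Point],
                       curr_visible: Dict[str, Point]) -> Dict[str, Point]:
    """Estimate current positions for every point known in the previous frame.

    Visible points keep their observed position. Points missing from
    `curr_visible` (treated as occluded) are shifted by the average
    displacement of the visible points. This is a toy stand-in for the
    learned joint reasoning described in the article.
    """
    shared = [name for name in prev if name in curr_visible]
    if not shared:
        # No motion evidence at all: hold every point where it was.
        return dict(prev)

    mean_dx = sum(curr_visible[n][0] - prev[n][0] for n in shared) / len(shared)
    mean_dy = sum(curr_visible[n][1] - prev[n][1] for n in shared) / len(shared)

    estimate = {}
    for name, (x, y) in prev.items():
        estimate[name] = curr_visible.get(name, (x + mean_dx, y + mean_dy))
    return estimate


# Points "a" and "b" are observed moving by (+2, +1); "c" is occluded.
previous = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (5.0, 5.0)}
observed = {"a": (2.0, 1.0), "b": (12.0, 1.0)}
print(propagate_occluded(previous, observed)["c"])  # (7.0, 6.0)
```

A tracker that handled "c" in isolation would have no evidence about its motion while it is hidden, which is the weakness of independent tracking that the article describes.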

Meta has released a demo on Hugging Face.

How This Works:

CoTracker 3 uses a transformer architecture that processes video frames in short, overlapping windows. It employs techniques like token proxies to improve memory efficiency and speed. The model is trained with a semi-supervised approach: pseudo-labels are generated from real videos, which expands the training data without extensive human annotation. This allows CoTracker 3 to learn effectively from smaller datasets while still achieving high performance.
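The overlapping-window scheme can be sketched in a few lines of Python. The window length and stride below are illustrative values, not CoTracker 3's actual settings, and `overlapping_windows` is a hypothetical helper rather than part of the released code.

```python
from typing import List, Tuple


def overlapping_windows(num_frames: int, window: int,
                        stride: int) -> List[Tuple[int, int]]:
    """Return half-open [start, end) frame ranges that cover the video.

    Consecutive ranges overlap by `window - stride` frames, so information
    from one window can be carried into the next.
    """
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("require 0 < stride <= window")
    if num_frames <= 0:
        return []

    spans = []
    start = 0
    while True:
        end = min(start + window, num_frames)
        spans.append((start, end))
        if end == num_frames:
            break  # the last window reaches the end of the video
        start += stride
    return spans


# 20 frames, 8-frame windows advancing by 4 frames (half overlap).
print(overlapping_windows(20, window=8, stride=4))
# [(0, 8), (4, 12), (8, 16), (12, 20)]
```

With a stride of half the window length, every frame except those at the very start and end of the video is seen by two windows.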

The model is designed as an online algorithm, meaning it processes frames causally, using only the current and earlier frames. It maintains tracks over long periods even when points are temporarily obscured or outside the field of view. This capability is crucial for applications requiring long-term tracking.

Image: CoTracker successfully tracks the driver’s head even when it exits the camera view

Result:

In benchmark tests against other state-of-the-art trackers, CoTracker 3 has demonstrated superior performance on metrics such as Average Jaccard and Occlusion Accuracy. Its ability to maintain accurate tracks over extended periods sets a new standard in point tracking technology.
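The article does not define these two metrics, so the following Python sketch is an assumption based on how point-tracking benchmarks in the TAP-Vid style commonly compute them. Occlusion Accuracy is the fraction of visibility predictions that are correct. Average Jaccard counts a prediction as a true positive only if the point is visible, predicted visible, and within a pixel threshold, then averages the Jaccard score over several thresholds. The threshold values and the toy data are illustrative, not results from CoTracker 3.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
THRESHOLDS = (1, 2, 4, 8, 16)  # pixel thresholds; assumed convention


def occlusion_accuracy(gt_visible: Sequence[bool],
                       pred_visible: Sequence[bool]) -> float:
    """Fraction of samples whose visible/occluded flag is predicted correctly."""
    correct = sum(g == p for g, p in zip(gt_visible, pred_visible))
    return correct / len(gt_visible)


def average_jaccard(gt_points: Sequence[Point], pred_points: Sequence[Point],
                    gt_visible: Sequence[bool], pred_visible: Sequence[bool],
                    thresholds: Sequence[float] = THRESHOLDS) -> float:
    """Jaccard score TP / (ground-truth positives + FP), averaged over thresholds."""
    gt_positives = sum(gt_visible)
    scores = []
    for t in thresholds:
        true_pos = false_pos = 0
        for (gx, gy), (px, py), gv, pv in zip(gt_points, pred_points,
                                              gt_visible, pred_visible):
            within = (gx - px) ** 2 + (gy - py) ** 2 < t * t
            if pv and gv and within:
                true_pos += 1
            elif pv:
                # Predicted visible, but truly occluded or too far away.
                false_pos += 1
        denom = gt_positives + false_pos
        # Degenerate case with nothing to score; 0.0 is an assumed convention.
        scores.append(true_pos / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: four samples of ground truth versus prediction.
gt_pts = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred_pts = [(10.5, 10.0), (23.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
gt_vis = [True, True, False, True]
pred_vis = [True, True, True, False]

print(occlusion_accuracy(gt_vis, pred_vis))  # 0.5
print(round(average_jaccard(gt_pts, pred_pts, gt_vis, pred_vis), 2))  # 0.38
```

Because Average Jaccard penalises both position errors and wrong visibility flags, a tracker has to get both right to score well, which is why it is often reported alongside Occlusion Accuracy.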

Why This Matters:

The advancements brought by CoTracker 3 are vital for numerous applications requiring precise motion tracking, including robotics, augmented reality, and video analysis. By improving how we track moving objects in videos, this technology not only enhances existing capabilities but also opens new avenues for research and practical applications in computer vision.

We’re Thinking:

The potential applications of CoTracker 3 are vast. Its ability to track numerous points simultaneously while maintaining high accuracy suggests it could serve as a foundational tool for future developments in areas like 3D reconstruction and complex visual tasks. The simplicity and efficiency of its design are likely to inspire further innovations in point-tracking technology, making it an exciting area of current research.


This post was last modified on October 17, 2024 4:52 am

Bilal Abbas

Bilal Abbas holds a Master’s in International Relations from Jamia Millia Islamia, Delhi, and a Bachelor’s in Economics from the University of Lucknow. A creative yet logical thinker, Bilal is deeply curious about the intricacies of the global economy and international politics. His interest in technology has led him to explore and write on fintech topics, blending his academic expertise with a passion for innovation. Bilal also finds joy in nature and appreciates the serenity of greenery. In his leisure time, Bilal can be found sketching, or immersed in a good book.
