NVIDIA showcased groundbreaking AI innovations at CVPR 2024, earning Best Paper nominations and an Innovation Award. Explore new developments in image generation, 3D scene editing, and autonomous driving.
Discover NVIDIA's Latest AI Advancements in Image Generation and Self-Driving Tech
At the Computer Vision and Pattern Recognition Conference (CVPR), held this week in Seattle, NVIDIA researchers presented new work on visual generative AI models and methods. The projects range from custom image generation and 3D scene editing to visual language understanding and perception for self-driving cars.
Of more than 50 NVIDIA research projects, two papers were shortlisted for CVPR's Best Paper awards. One examines the training of diffusion models, and the other deals with high-definition (HD) maps for self-driving cars. NVIDIA also won the End-to-End Driving at Scale category of the CVPR Autonomous Grand Challenge, in a field of more than 450 competitors worldwide, and received an Innovation Award from CVPR.
Jan Kautz, VP of learning and perception research at NVIDIA, stated that “Artificial intelligence, and generative AI, in particular, represents a pivotal technological advancement. At CVPR, NVIDIA Research is sharing how we’re pushing the boundaries of what’s possible — from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars.”
Among the highlighted projects is JeDi, a new technique for quickly customizing diffusion models, which are currently the leading approach to text-to-image generation. Instead of fine-tuning a model on a specific object or character, which would require numerous pictures, a user can supply just a few reference images of that object or character and have the model depict it.
Another contribution is FoundationPose, a foundation model that can estimate and track the 3D pose of objects in videos without being trained separately on each object. The model has become a reference point for the task and could extend to applications in AR, robotics, and beyond.
NVIDIA researchers also presented NeRFDeformer, a method for transforming a 3D scene captured by a Neural Radiance Field (NeRF) using a single photograph. The technique could simplify the editing of 3D scenes for graphics, robotics, and digital twin applications.
To advance visual language understanding, NVIDIA and MIT introduced a new family of models named VILA. VILA is positioned as a foundation model for image and video understanding and for the reasoning needed to relate images and text, a capability the researchers illustrated by having it interpret memes.
NVIDIA's research at CVPR spans many areas, including more than a dozen papers on new methods for autonomous vehicle (AV) perception, mapping, and planning. Sanja Fidler, vice president of AI research at NVIDIA, also spoke about vision language models (VLMs) in the context of self-driving cars.
The work NVIDIA presented at CVPR points to potential applications of generative AI across many industries. These advances could make creators more productive, speed up work in manufacturing and healthcare technology, and improve self-driving vehicles and robotics. To NVIDIA, the conference is the factor that can offer an opportunity
This post was last modified on June 18, 2024 12:01 pm