OpenAI recently announced a flagship model that can reason across audio, vision, and text in real time. Read this article to know about GPT-4o, its capabilities, evaluations and more.
All About OpenAI's GPT-4o
OpenAI recently launched GPT-4o, an iteration of the GPT-4 model. To advance AI technology and ensure it is accessible and beneficial to everyone, GPT-4o will be rolling out more intelligence and advanced tools to ChatGPT for free. This updated model “is much faster” and improves “capabilities across text, vision, and audio,” OpenAI CTO Mira Murati said in a livestream announcement on Monday. It’ll be free for all users, and paid users will continue to “have up to five times the capacity limits” of free users, Murati added.
Also, OpenAI CEO Sam Altman posted that the model is “natively multimodal,” which means the model could generate content or understand commands in voice, text, or images.
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction, it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs. This newest flagship model provides GPT-4-level intelligence but is much faster and improves on its capabilities across text, voice, and vision. Also, t can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.
Also, in the future, improvements will allow for more natural, real-time voice conversation and the ability to converse with ChatGPT via real-time video. For example, you could show ChatGPT a live sports game and ask it to explain the rules to you. We plan to launch a new Voice Mode with these new capabilities in alpha in the coming weeks, with early access for Plus users as we roll out more broadly.
Developers can also now access GPT-4o in the API as a text and vision model. It is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo. We plan to launch support for GPT-4o’s new audio and video capabilities to a small group of trusted partners in the API in the coming weeks.
Google AI Essentials Course and Certification: Check Fees, Modules, Trainers, and How to Enroll?
Sam Altman states that GPT-4o is fast, smart, fun, natural, and helpful. In his blog, he said that this new model is a key part of a mission to put very capable AI tools in the hands of people for free (or at a great price).
Secondly, the new voice (and video) mode of the GPT-4o is the best computer interface. It feels like AI from the movies, and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.
OpenAI is the brainchild of GPT-4o. It is making more capabilities available for free in ChatGPT. Anyone with access to ChatGPT can switch to GPT-4o in the API. The benefits and features of GPT-4o are available in three different tiers, such as:
Free Tier | Limit access to messages using advanced tools. |
Plus and Team | 5x greater message limits than free users |
Enterprise | High-speed access to GPT-4o and GPT-4oEnterprise-grade security and privacy features Higher message limits. |
GPT-4o has safety built into its design across modalities, through techniques such as filtering training data and refining the model’s behaviour post-training. It has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias, fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities.
What is GPT-4 Turbo in the OpenAI API? How To Access It?
This post was last modified on May 14, 2024 3:27 am
Rish Gupta is an Indian entrepreneur who serves as the chief executive officer (CEO) of…
Are you looking to advance your engineering career in the field of robotics? Check out…
Artificial intelligence is a topic that has recently made internet users all over the world…
Boost your learning journey with the power of AI communities. The article below highlights the…
Demystify the world of Artificial Intelligence with our comprehensive AI Glossary and Terminologies Cheat Sheet.…
Scott Wu is the co-founder and Chief Executive Officer of Cognition Labs, an artificial intelligence…