OpenAI introduced new AI reasoning models called o3 and o4-mini. These models are intended to pause and consider problems before answering.
The company says o3 is its most advanced reasoning model to date, outperforming its earlier models on benchmarks that measure ability in math, coding, science, reasoning, and visual understanding. OpenAI says o4-mini offers a competitive balance of price, speed, and performance, three factors developers often weigh when choosing an AI model to power their apps.
Unlike earlier reasoning models, o3 and o4-mini can generate responses using ChatGPT’s web browsing, Python code execution, image processing, and image generation capabilities. Subscribers to OpenAI’s Pro, Plus, and Team plans can now access the models, as well as a variant of o4-mini called “o4-mini-high,” which takes extra time to craft replies in order to improve its reliability.
With the new models, OpenAI is attempting to outpace Google, Meta, xAI, Anthropic, and DeepSeek in the fiercely competitive global AI race. Although OpenAI was the first to release an AI reasoning model, o1, rivals quickly followed with models of their own that matched or exceeded OpenAI’s lineup. As AI labs try to squeeze more performance out of their systems, reasoning models have begun to take center stage in the field.
o3 almost wasn’t launched in ChatGPT. In February, OpenAI CEO Sam Altman signaled that the company planned instead to devote its resources to a more sophisticated alternative that incorporated o3’s technology. Competitive pressure, however, appears to have ultimately changed OpenAI’s mind.
According to OpenAI, o3 achieves state-of-the-art performance on SWE-bench Verified (without custom scaffolding), a test that measures coding ability, scoring 69.1%. The o4-mini model performs similarly, scoring 68.1%. By comparison, Claude 3.7 Sonnet scored 62.3% on the test, while OpenAI’s next-best model, o3-mini, scored 49.3%.
OpenAI says o3 and o4-mini are its first models that can “think with images.” Users can upload images to ChatGPT, such as PDF schematics or whiteboard sketches, and the models will analyze them during their “chain-of-thought” phase before responding. Thanks to this new capability, o3 and o4-mini can understand blurry and low-quality images, and can zoom in on or rotate images as they reason.
In addition to processing images, o3 and o4-mini can search the web for current information and execute Python code directly in the browser via ChatGPT’s Canvas feature.
All three models, o3, o4-mini, and o4-mini-high, are available through ChatGPT as well as OpenAI’s developer-facing endpoints, the Chat Completions API and the Responses API, letting engineers build applications on the company’s models at usage-based rates.
Given its improved performance, OpenAI is charging developers a relatively modest price for o3: $10 per million input tokens (roughly 750,000 words, longer than The Lord of the Rings series) and $40 per million output tokens. For o4-mini, OpenAI charges $1.10 per million input tokens and $4.40 per million output tokens, the same price as o3-mini.
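To make the per-token rates concrete, here is a minimal cost-estimation sketch. The prices are hardcoded from the figures quoted above, not fetched from OpenAI; actual billing may differ, and the request sizes in the example are made up for illustration.

```python
# Estimate request cost from OpenAI's quoted per-million-token prices.
# Prices hardcoded from the article; check OpenAI's pricing page for current rates.

PRICES_USD_PER_1M = {  # model -> (input price, output price)
    "o3": (10.00, 40.00),
    "o4-mini": (1.10, 4.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_price, out_price = PRICES_USD_PER_1M[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical request: 2,000 input tokens, 500 output tokens.
print(estimate_cost("o3", 2000, 500))       # -> 0.04 (4 cents)
print(estimate_cost("o4-mini", 2000, 500))  # -> 0.0044 (under half a cent)
```

The same prompt is roughly nine times cheaper on o4-mini than on o3 at these rates, which is the price/performance trade-off the article describes.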
OpenAI says o3-pro, a version of o3 that uses more compute to produce its answers, will roll out exclusively to ChatGPT Pro subscribers in the coming weeks.
OpenAI CEO Sam Altman has said o3 and o4-mini may be the company’s last standalone AI reasoning models in ChatGPT before GPT-5, which the company says will unify its reasoning models with more traditional models like GPT-4.1.