OpenAI, the developer of ChatGPT, unveiled the advanced reasoning artificial intelligence (AI) model 'o3' on the 20th (local time).
'o3' is an upgraded version of the reasoning model 'o1' that OpenAI released this past September, and OpenAI also introduced a smaller model called 'o3 mini.' The naming skipped 'o2': OpenAI explained, "We decided not to use the name 'o2' out of respect for the British telecommunications brand O2."
Sam Altman, CEO of OpenAI, said, "We plan to launch 'o3 mini' by the end of January next year, with 'o3' to follow shortly after." The models will be made available in preview form to some researchers for safety testing starting today.
'o3,' which focuses on reasoning ability, is trained to think before responding, just like 'o1.' It can reason about and plan tasks, and OpenAI emphasized that it helps find solutions to tasks that unfold over longer time horizons. Like 'o1,' 'o3' takes anywhere from a few seconds to several minutes to respond, but OpenAI claims it is more reliable in fields such as physics, mathematics, and other sciences.
Moreover, OpenAI reported that 'o3' approaches artificial general intelligence (AGI) under certain conditions. When given a prompt, 'o3' pauses before responding, working through a chain of reasoning related to the prompt and explaining its logic, then delivers what it judges to be the most accurate answer.
Notably, 'o3' introduces an adjustable reasoning time. Users can set the model's computing time, in effect its 'thinking time,' to low, medium, or high, with longer computation yielding better performance.
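For a sense of how such a setting might look to a developer, here is a minimal sketch in Python. It assumes OpenAI's Python SDK and its reasoning_effort parameter, and uses 'o3-mini' as an illustrative model name; none of these specifics for 'o3' were confirmed in the announcement itself.

```python
# Minimal sketch: requesting more or less "thinking time" per call.
# Assumes the OpenAI Python SDK; the model name and the availability of
# the reasoning_effort parameter for 'o3' are assumptions, not details
# confirmed by the announcement.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3-mini",          # illustrative model name
    reasoning_effort="high",  # "low" | "medium" | "high": more effort means slower but better answers
    messages=[
        {"role": "user", "content": "How many prime numbers are there below 100?"}
    ],
)
print(response.choices[0].message.content)
```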
OpenAI described 'o3' as far outperforming other models on benchmarks (standardized performance measurements). On a benchmark of programming tasks it scored 22.8 percentage points higher than 'o1,' and on a competitive-coding rating scale it reached 2,727 points, well above the 2,400 mark that places an engineer in the 99.2nd percentile.
On this year's American Invitational Mathematics Examination (AIME), it missed only one question, scoring 96.7%, and on GPQA Diamond, a graduate-level biology, physics, and chemistry test, it scored 87.7%, according to OpenAI.
The release suggests that competition with AI models from Google and Meta will intensify. Google announced 'Gemini 2.0' earlier this month; it runs twice as fast as its predecessor and, Google noted, can 'think, remember, plan, and even take action on your behalf.' Meta Platforms is also scheduled to release 'Llama 4' next year.