An image of Newton's prism experiment generated by OpenAI's new image model. /Courtesy of OpenAI

On the 25th (local time), OpenAI, the developer of ChatGPT, unveiled a new image generation model that is more sophisticated and intuitive than existing image generation AIs. The model is the first to integrate text and images, and it can understand complex requests and generate images that reflect the user's intent.

OpenAI announced that the newly launched 'ChatGPT-4o Image Generation' model combines the text-based linguistic intelligence of GPT-4o with image generation capabilities. According to OpenAI, this allows the AI to understand the meaning of text in depth while creating visually sophisticated images.

The existing 'DALL-E' model required users to write detailed prompts to get the images they wanted, but the new model can recognize the user's intent from simple instructions and still generate complex images. It can create images that previous AIs struggled with, such as a bicycle with triangular wheels.

In particular, text rendering within images has been greatly improved, allowing the model to accurately produce posters featuring various types of whales or cartoons with dialogue. The previous model often produced distorted text or misread the relationship between objects and text, but the new model is said to have overcome these limitations.

The model can also generate, on request, images that explain natural laws or mathematical formulas, as well as menus, business logos, and more. It supports transparent backgrounds, enabling practical uses such as sticker production.

OpenAI emphasized that the model is not a simple upgrade of the existing 'DALL-E' but was developed on an entirely different technological foundation, resulting in significant improvements in both functionality and performance. The new model supports various languages, including Korean, and is available to free users as well as paid ChatGPT Pro users.