The competition in generative artificial intelligence (AI) has shifted from text to video. Google has stepped up its rivalry with OpenAI's Sora by unveiling its video generation AI, Veo 2. Veo 2 aims for market leadership by offering finer video control than existing models and support for 4K (3840x2160) resolution.
Google unveiled Veo 2 on the 16th (local time) through its official blog, stating that the model pushes past the previous limits of video generation AI technology. The original Veo was introduced by Google DeepMind in May and can create and edit videos from text-based commands. It was also launched for enterprise customers on the 4th.
Google noted that Veo 2 is more realistic and expressive than the first-generation Veo. In particular, it said that the details of physical movement and human expression have been enhanced. Previous models often produced errors in scenes involving human activity, such as walking, or moving objects. The new model reduces these instances of "unnatural motion" and "hallucination."
Additionally, Veo 2 can render what users ask for with greater precision. When the effect of a specific camera lens or a cinematic direction is included in the prompt, the output reflects it.
Google emphasized that "Veo 2 understands the language of cinematography." For instance, prompts such as "35mm film," "slow motion," and "shallow depth of field" can produce output that resembles film footage. The model also supports up to 4K resolution and can generate longer videos.
For now, however, Veo 2 is available only to a limited number of users through Google's VideoFX platform. Because its features are aimed at video professionals and creators, it is expected to generate new demand in fields such as YouTube content and advertising.
Earlier, OpenAI's Sora, officially released on the 9th, was well received by users. On the 12th, a worldwide surge of Sora users led to access problems with OpenAI's ChatGPT.
Sora is available to all ChatGPT Plus and Pro subscribers. It can generate videos up to one minute long from text prompts, and its strength is the ability to create complex scenes and interactions among multiple characters from simple commands. While it focuses on generating popular and creative content, its maximum resolution is limited to Full HD (1920x1080).
Meanwhile, the generative video AI market is crowded, with Google and OpenAI joined by Meta and a range of startups. Meta entered the AI video market by unveiling "Movie Gen" in October. In addition, emerging companies such as Runway and Pika Labs are introducing innovative video generation tools and expanding the market.