On the 29th, Naver Cloud, as the lead operator of the Ministry of Science and ICT's independent AI foundation model project, unveiled the first outcome of the "Omni foundation model" development task.
The newly unveiled "HyperCLOVA X SEED 8B Omni" is Korea's first foundation model built on a native omni-modal architecture, meaning a single model learns and understands diverse data types such as text, images, and audio. This enables the AI to integrate context and solve problems in complex environments that mix spoken and written language with visual and auditory input.
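To make the idea concrete, the following is a minimal, hypothetical sketch of what a native omni-modal design implies: one shared backbone attends over tokens from every modality, rather than bolting separate encoders onto a text-only model. All names and dimensions here (OmniModalSketch, the 768- and 128-dimensional features) are illustrative assumptions, not details of HyperCLOVA X.

```python
# Hypothetical sketch (not Naver's actual code): in a "native omni-modal"
# model, every modality is projected into one shared token space and a
# single transformer jointly attends over the mixed sequence.
import torch
import torch.nn as nn

class OmniModalSketch(nn.Module):
    def __init__(self, d_model: int = 256, vocab_size: int = 32000):
        super().__init__()
        # Each modality gets a light projection into the shared token space...
        self.text_embed = nn.Embedding(vocab_size, d_model)
        self.image_proj = nn.Linear(768, d_model)   # e.g. image patch features
        self.audio_proj = nn.Linear(128, d_model)   # e.g. mel-spectrogram frames
        # ...but one backbone processes all of them together.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, text_ids, image_feats, audio_feats):
        tokens = torch.cat([
            self.text_embed(text_ids),
            self.image_proj(image_feats),
            self.audio_proj(audio_feats),
        ], dim=1)                      # one interleaved multimodal sequence
        return self.backbone(tokens)   # cross-modal attention in every layer

# Toy usage: 8 text tokens, 4 image patches, and 6 audio frames in one pass.
model = OmniModalSketch()
out = model(torch.randint(0, 32000, (1, 8)),
            torch.randn(1, 4, 768),
            torch.randn(1, 6, 128))
print(out.shape)  # torch.Size([1, 18, 256])
```

The design choice this illustrates is the contrast with pipeline approaches: because all modalities share one sequence, context from an image or audio clip can directly shape how the model interprets the accompanying text.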
To maximize the potential of omni-modal AI, Naver Cloud plans to go beyond conventional text-and-image training centered on internet documents, securing data that reflects real-world contexts and applying it to the model. The model can also understand, generate, and edit content by combining text and images, achieving multimodal generation performance at a global level.
With "HyperCLOVA X SEED 32B Think," Naver Cloud also verified the feasibility of an omni-modal agent with complex problem-solving skills, combining vision, audio, and tool-use capabilities in a reasoning AI. The model demonstrated strong performance in real-world use and proved its capabilities with top-tier scores across various subjects.
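The agent pattern described above can be pictured as a simple loop: a reasoning model chooses a tool (vision, audio, and so on), the tool's output is fed back into the context, and the cycle repeats until the model can answer directly. Everything in this sketch (fake_reason, the toy TOOLS table) is a hypothetical stand-in, not Naver Cloud's implementation.

```python
# Hypothetical sketch of a tool-using reasoning agent (not Naver's API):
# the model alternates between reasoning and tool calls until it answers.
from typing import Callable

# Assumed toy tools; a real agent would wire in vision/audio/search backends.
TOOLS: dict[str, Callable[[str], str]] = {
    "describe_image": lambda arg: f"(caption for {arg})",
    "transcribe_audio": lambda arg: f"(transcript of {arg})",
}

def fake_reason(context: str) -> tuple[str, str]:
    """Stand-in for the reasoning model: returns (action, argument)."""
    if "caption" not in context:
        return "describe_image", "photo.jpg"
    if "transcript" not in context:
        return "transcribe_audio", "clip.wav"
    return "answer", "combined answer drawn from image + audio context"

def agent_loop(question: str, max_steps: int = 5) -> str:
    context = question
    for _ in range(max_steps):
        action, arg = fake_reason(context)
        if action == "answer":
            return arg
        # Tool output is appended so later reasoning steps can build on it.
        context += " " + TOOLS[action](arg)
    return "gave up"

print(agent_loop("What is happening in this photo and audio clip?"))
```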
Based on this AI technology, Naver Cloud plans to roll out AI agents across search, commerce, content, and public and industrial settings, building a technology ecosystem to realize "AI for everyone."