Deputy Prime Minister and Minister of Science and ICT Bae Kyung-hoon, Senior Presidential Secretary for AI Future Planning Ha Jung-woo, and other attendees pose for a photo at the Independent AI Foundation Model project briefing at the COEX Auditorium in Gangnam-gu, Seoul, on the afternoon of the 30th. /Courtesy of News1

Debate continues over the qualifications of the elite teams participating in the government's "national AI model project," a core government initiative aimed at building an AI model unique to Korea. On the 30th, after the first project briefing hosted by the Ministry of Science and ICT, allegations were raised that the Upstage elite team had copied a Chinese model; the party that raised the suspicion apologized, and that controversy was settled for the moment. This time, however, a claim has emerged that the Naver elite team fine-tuned the vision encoder of a Chinese model.

According to the industry on the 6th, Naver Cloud's "HyperCLOVA X Seed 32B Sync" model faced allegations that the cosine similarity and Pearson correlation coefficient between its vision encoder weights and those of Chinese company Alibaba's Qwen 2.5 model were as high as 99.51% and 98.98%, respectively.
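For reference, weight-similarity claims of this kind are typically backed by flattening corresponding parameter tensors from the two published checkpoints and comparing them directly. The sketch below is a minimal, hypothetical illustration of such a comparison; it is not the analysts' actual script, and the tensor shapes and loading steps are assumptions. Values near 100% indicate the parameters are nearly identical.

```python
# Illustrative sketch only: how cosine similarity and Pearson correlation
# between two models' corresponding weight tensors are commonly computed.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two flattened weight tensors."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation coefficient of the flattened weight values."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# Hypothetical stand-ins for one encoder layer from each checkpoint.
# In practice the tensors would be loaded from the released model weights
# (e.g. via safetensors or Hugging Face) and matched layer by layer.
rng = np.random.default_rng(0)
w_model_a = rng.normal(size=(1024, 1024))
w_model_b = w_model_a + rng.normal(scale=0.01, size=w_model_a.shape)  # near-copy

print(f"cosine similarity:  {cosine_similarity(w_model_a, w_model_b):.4%}")
print(f"pearson correlation: {pearson_correlation(w_model_a, w_model_b):.4%}")
```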

Naver Cloud acknowledged using Chinese open-source technology. The company said, "For this model, we strategically adopted a verified external encoder in consideration of compatibility with the global technology ecosystem and efficient optimization of the overall system," adding, "This is not due to a lack of technological self-reliance, but a high-level engineering judgment to enhance the completeness and stability of the overall model by leveraging a standardized, high-performance module." It added, "In the global AI industry, this approach has become a universal design standard for system scalability."

Naver Cloud emphasized, "A vision encoder serves as the optic nerve that converts visual information into signals a model can understand, and Naver possesses proprietary vision technologies such as VUClip." It continued, "The core contribution of the released model lies not in simple component assembly, but in the completion of an integrated architecture," adding, "Designing a system to understand and generate text, speech, and images simultaneously within a single organic structure is the most essential and difficult challenge of Multimodal AI." Naver said it has transparently disclosed these technical choices and license information through Hugging Face and technical reports. It said there was no intention to misrepresent the model's performance or exaggerate its technical contribution; rather, the aim was to share the results of considering which technical path could yield the most efficient and powerful performance.

Some say the controversy grew because the government did not define clear criteria for what counts as "from scratch" in AI model development. "From scratch" means developing an AI model entirely from the beginning, without building on existing models.

Naver Cloud emphasized, "The foundation model is the core area that interprets input information, reasons, and produces results, corresponding to the 'brain' responsible for thinking and identity in humans," adding, "Naver has developed this core engine 100% with its own technology from the from-scratch stage."

Lee Seung-hyun, vice president of FortyTwoMaru, said in a post on GitHub, "We need to move beyond fruitless disputes and establish clear standards for what constitutes true technological sovereignty."

The national AI model project is a key initiative of the Lee Jae-myung administration that concentrates resources such as graphics processing units (GPUs) and data on selected teams. The five elite teams are led by Naver Cloud, NC AI, SK Telecom, LG AI Research Institute, and Upstage. The Ministry of Science and ICT plans to begin narrowing the field with a first evaluation this month, then conduct evaluations every six months and select the final one or two teams as national AI models in 2027.
