In the first round of its "national AI model project," the government eliminated Naver Cloud and NC AI from among the five participating elite teams. The government initially planned to narrow the five teams to four, but two teams were eliminated at once. The NC AI elite team fell short in the benchmark, expert, and user evaluations, and Naver Cloud was cut for failing to meet originality requirements. With only Upstage, SK Telecom, and LG AI Research Institute remaining as elite teams, the government said it would hold a repechage. Concerns over fairness and equity appear inevitable.
Second Vice Minister Ryu Je-myung of the Ministry of Science and ICT said at a briefing on the "independent AI foundation model project first-stage evaluation results" held at Government Complex Seoul on the afternoon of the 15th, "The aim of compressing competition among a small number is not so much to pick two final corporations, but to design a framework that creates the fiercest competitive environment so we can produce many results in a short period." The national AI model project is a core government program that concentrates support such as graphics processing units (GPUs) and data with the goal of building a uniquely Korean AI model to rank among the top three AI countries worldwide.
The government said it would give another chance not only to Naver Cloud and the NC AI elite team, which were eliminated in the first evaluation, but also to Motif Technologies, Kakao, KT, Konan Technology, the KAIST consortium, and all other corporations that were not selected as elite teams. Any additional elite teams selected will, like those that passed the first stage, receive GPU and data support and be allowed to use the K-AI corporate designation.
Some say the government gave Naver Cloud a face-saving exit, since it did not announce company-by-company rankings for the detailed evaluation. Earlier this month, Naver Cloud faced allegations that the image and audio encoders of its "HyperCLOVA X Seed 32B Sync" model copied "Qwen 2.5 ViT" from China's Alibaba. Naver said, "It is not a matter of lacking technological self-reliance; we strategically adopted external encoders, using standardized high-performance modules to raise the completeness and stability of the overall model." Naver Cloud added, "We respect the Ministry of Science and ICT's decision and will not consider reapplying."
Below are key questions and answers with Second Vice Minister Ryu Je-myung of the Ministry of Science and ICT and Kim Kyung-man, director general of the ministry's Artificial Intelligence Policy Bureau.
◇ "New elite team contest to proceed quickly... same GPU and data period will be given"
―I'm curious about the criteria and timing for recruiting one additional team.
"Because an unexpected vacancy arose for the fourth spot, we will complete the administrative procedures for soliciting a new corporation to participate in stage two as quickly as possible. We will give opportunities not only to the (eliminated) corporations that failed to join stage two, but also to the 10 consortia that did not join the stage one evaluation and to capable corporations that can newly form a consortium."
―You are effectively holding a repechage, but any additionally selected elite team will start late. What accommodations will you provide?
"We will design and provide the same government-supported GPUs and data so that corporations additionally joining can use them for the entire project period. Even if the evaluation period or the project end point differs, we will have them complete stage two over the same duration. Even if there is a difference in start order, we plan to manage the period difference flexibly."
―What is the schedule for the second evaluation, and what is the future roadmap?
"The three teams that passed the first-stage evaluation will be able to start stage two immediately. The government has leased GPUs and is providing them for participating corporations to use, but if the three elite teams (that passed stage one) wait for the additional elite team, we would have to idle leased GPU resources, which would waste the budget. (However, for the additional elite team selected) we are designing the project participation period and the total amount of GPUs provided so they can compete under the same conditions as the three teams that started earlier."
◇ "Naver fell short of technical requirements, submitted explanation materials after project ended"
―Please explain in more detail why Naver Cloud was eliminated.
"Naver stated in its own technical report that it used the weights of a previously released open-source model as-is for its image and audio encoders. The call for proposals included the basic conditions an independent AI foundation model project must meet. What we essentially aim for in an independent AI foundation model project is something designed and trained from scratch, and (Naver Cloud) did not meet this, which became the reason for elimination. Many evaluators also noted that it fell short of the technical requirements the project aims for. Using external encoders is a common method during development, but in this case the encoders were frozen, with weights that could not be updated, so using external encoders and their weights as-is was internally deemed hard to recognize as an independent foundation model. Even if an existing open-source model is used, its weights should be reset and retrained on self-secured data, and that experience should be demonstrated and verified. That said, it is true that they used open source without copyright issues."
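The distinction the ministry draws here — a frozen encoder whose imported weights never change during training, versus weights one updates on self-secured data — can be sketched in a few lines. This is a hypothetical, framework-agnostic illustration in plain Python; the names (`Param`, `sgd_step`) are not from any real library or from the project itself.

```python
# Illustrative sketch: a "frozen" parameter keeps its imported weights
# fixed, while a trainable parameter is updated from its gradient.
# Param and sgd_step are hypothetical names for illustration only.

class Param:
    def __init__(self, value, frozen=False):
        self.value = value      # current weight
        self.grad = 0.0         # gradient from backprop (set elsewhere)
        self.frozen = frozen    # True -> external weights reused as-is

def sgd_step(params, lr=0.1):
    """Apply one gradient-descent update, skipping frozen parameters."""
    for p in params:
        if p.frozen:
            continue            # frozen encoder weights never change
        p.value -= lr * p.grad

# An external encoder "used as-is" vs. a self-trained component:
encoder_w = Param(0.5, frozen=True)   # imported open-source weights
decoder_w = Param(0.2, frozen=False)  # weights trained on one's own data

encoder_w.grad, decoder_w.grad = 1.0, 1.0
sgd_step([encoder_w, decoder_w])

print(encoder_w.value)  # 0.5 — unchanged by training
print(decoder_w.value)  # 0.1 — updated by training
```

In the ministry's framing, the evaluation question was whether the encoder weights sat on the "frozen" side of this divide for the whole project, rather than being reinitialized and retrained.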
―Did Naver make a prior inquiry about the encoder issue?
"No. For reference, after the controversy arose, Naver sent an explanation. There were points open to multiple interpretations depending on one's judgment or perspective. The important thing is that the explanation was not made before the project ended on Dec. 31; it was sent while the subsequent evaluation was underway, so we did not reflect it in the evaluation. We judged that incorporating materials received after the process had closed would pose a procedural problem. Naver explained that it has its own encoders and that the encoder currently used accounts for a fairly small share of the project. There are various differences in views, and experts see it differently."
―Specific criteria like whether external encoders can be used likely were not included in the first call for proposals. Will you provide clear guidelines for judging originality going forward?
"It is fair to say that worldwide, including among global corporations, there are no corporations that do not use open source. This should never be seen as demonizing open source. Strategically using open source appropriately at each stage, based on licensing conditions, is naturally accepted in the global AI ecosystem. However, what we want to attempt in the independent AI foundation model project is to design the model ourselves even so; using pre-trained weights as-is, even from open source, is in a way free-riding on others' experience. We want to undertake that experience anew, the learning experience itself. Only then can we build more competitive AI models, even when leveraging open source in the future. Because AI competition moves fast, rather than waiting to start such a project until everything was certain, we began quickly despite uncertainties and are making improvements. During development, the government has managed the project in continual coordination, and operators have proceeded in communication with the government. As for the evaluation, consensus on criteria and methods was reached as much as possible in consultation with the elite teams, and the evaluation was conducted at that agreed level."
◇ Fairness controversy expected over repechage
―Was there originally a minimum passing threshold? You said Naver Cloud and NC AI can reapply and come back. If they return or a new team joins, will there be no penalties in stage two?
"The aim of compressing competition among a small number is not so much to pick two final corporations, but to design a framework that creates the fiercest competitive environment so we can produce many results in a short period. We judged that even corporations not directly participating would be spurred to chase the technology. For example, Motif Technologies and KT did not advance to the first-round finals, but they accelerated development and posted top-tier results on the global AI benchmarking site Artificial Analysis. With that in mind, we intend to ensure the stage one results do not affect stage two at all, so participants can make a fresh start and try again."
―Won't the repechage raise fairness concerns for other participating elite teams? It could be read as giving someone among the two eliminated operators another chance. Isn't selecting an additional elite team wasteful?
"We are also running two consortia that pursue foundation model projects in specialized fields, leveraging available resources, for corporations that could not participate in stage one. Our GPU resources and budget are limited, but the intent behind the stage one evaluation was to let as many AI corporations as possible use GPUs in some way and create conditions for them to join AI development. We gained a lot in the process. This is absolutely not an approach cobbled together to favor a particular corporation. We have discussed this individually with participating corporations, and there is consensus that the achievements generated by the elite teams should not become the property of any individual corporation; many AI service corporations in Korea should be able to use them through open-source contributions. Instead of the term repechage, we would appreciate it if you could call it a re-leap program, a catch-up program, or a re-challenge program."
―Naver Cloud was not the only one with a "from scratch" issue. Were the same standards applied to other elite teams, and was there any disagreement?
"In the case of Upstage, evaluators pointed out issues with reference citations. In setting standards for an independent foundation model, we believed that even when using publicly available open source under ethical standards, one should transparently state how it was used and what was changed, and that when such matters undergo technical verification, Korea's AI ecosystem will advance further. From that perspective, Upstage's failure to cite references does not really meet our standards. However, experts did not view it as a defect large enough to determine pass or fail. SK Telecom also drew some criticism in that regard. But that was not an absolute evaluation criterion."
―Will the evaluation criteria be the same in stage two?
"There will likely be no change to the broad framework of benchmark, expert, and user evaluations. But for the "from scratch" component, we plan to collect more input from academia, industry, and experts to refine differentiated scoring by degree. Still, uncertainties will remain. If we secure additional GPUs, we could supply more. Nothing is decided, but we might raise targets and respond dynamically to the competitive landscape among global AI corporations."