AI startup Upstage is facing allegations that its "Solar Open 100B" model, unveiled on the 30th at the "National AI Model Project first briefing" hosted by the Ministry of Science and ICT, was copied from a Chinese model. In response, Upstage chose to address the claims directly at a public verification session on the afternoon of the 2nd.
Upstage had drawn attention as the only startup consortium among the five elite teams competing to be selected for the national AI model. The National AI Model Project is a key initiative of the Lee Jae-myung administration that concentrates resources such as GPUs and data. The other four elite teams are led by large corporations: Naver Cloud, NC AI, SK Telecom, and LG AI Research Institute. For that reason, attention focused on what results the Upstage consortium, an alliance of startups, would deliver. The Ministry of Science and ICT plans to eliminate one team this month in the first evaluation, then hold evaluations every six months and select the final one or two teams in 2027.
◇ Solar Open 100B model, suspicion of similarity to a Chinese AI model
The controversy over Upstage allegedly copying a Chinese model began with a point raised by Go Seok-hyun, CEO of AI startup PsionicAI. On the 1st, Go wrote on his Facebook page, "It is quite regrettable that a model suspected to be the result of copying and fine-tuning a Chinese model was submitted to a project funded by taxpayers' money." He attached a GitHub report comparing the performance of Upstage's "Solar Open 100B" model submitted at the first briefing with the Chinese AI model (GLM 4.5 Air).
The GitHub report said the LayerNorm component of the neural network in Upstage's AI model is 96.8% identical to that of the Chinese model. It also said the two models share the same "MoE" (mixture-of-experts) structure, arguing that the blueprint of Upstage's AI model is effectively the same as the Chinese model's. The report states, "This 'selective preservation' is decisive evidence of derivation."
Lee Min-seok, a professor in the School of Software at Kookmin University, said, "It appears the Upstage model was trained from scratch," adding, "Those claiming similarity between Upstage and the Chinese model cite cosine similarity, but what they measured is LayerNorm, which is used to adjust parameters, not the parameters themselves, so it is wrong to judge whether a model is from scratch that way." "From scratch" means developing an AI model independently from the very beginning. Lee explained that building an AI model, simply put, involves designing the architecture (structure) and training it to fill in the parameters, and that architectures are mostly similar to one another.
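Lee's point about cosine similarity can be illustrated with a small sketch. It uses synthetic vectors only, not the actual Solar or GLM weights, and it does not settle the dispute either way. The assumption, labeled as such, is that normalization gain vectors typically start at 1.0 and move only slightly during training, while ordinary weights are centered on zero. The helper names, the dimension and the drift values are all hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def fake_norm_gain(rng, dim, drift=0.1):
    """Hypothetical normalization gain: starts at 1.0 and drifts slightly."""
    return [1.0 + rng.gauss(0.0, drift) for _ in range(dim)]


def fake_dense_weights(rng, dim, scale=0.02):
    """Hypothetical zero-centered weights, as in a randomly initialized layer."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]


DIM = 4096
# Two independent random generators stand in for two unrelated models.
rng_a = random.Random(1)
rng_b = random.Random(2)

norm_similarity = cosine_similarity(
    fake_norm_gain(rng_a, DIM), fake_norm_gain(rng_b, DIM)
)
dense_similarity = cosine_similarity(
    fake_dense_weights(rng_a, DIM), fake_dense_weights(rng_b, DIM)
)

print(f"norm gains:    {norm_similarity:.3f}")
print(f"dense weights: {dense_similarity:.3f}")
```

In this toy setting the two unrelated gain vectors score close to 1 simply because both sit near 1.0, while the two unrelated zero-centered vectors score close to 0. That is why a high cosine similarity between normalization layers alone is weak evidence that one model was derived from another.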
◇ Upstage rebuts at verification session: "A model baked in our own way"
In response to Go's claim, Upstage CEO Kim Sung-hoon said, "Thank you for your interest in the Solar model," and added, "On the 2nd, we will invite you and industry experts to our office to explain the entire training process, address the concerns you raised in detail, and show that the model was built from scratch."
Ahead of the public verification session on the 2nd, Go said, "In part of the Solar (Open 100B) code, the copyright of China's ZhipuAI (which created GLM 4.5 Air) is specified," adding, "Based on our internal analysis and others' analyses, we judge it to be true that this Solar model took over and used most of the GLM model's training code as is."
Using open-source (publicly available information) to develop an AI model is not in itself a problem. However, experts say verification is necessary because developing a national AI model with taxpayers' money is a different matter. Lee Kyung-jun, a professor at the College of Business Administration and Department of Big Data Application at Kyunghee University, said, "It is unlikely that Upstage would have done something that fails to meet standards under public scrutiny, but verification is necessary to clearly resolve the controversy."
At the public verification session held near Gangnam Station in Seoul on the 2nd, CEO Kim said, "The decisive criterion for determining whether a model is from scratch is whether training started with the model's weights randomly initialized, and Solar is a model baked in our own way from the beginning." As for the Chinese company's (ZhipuAI) copyright notice appearing in the code, he said it was a working-level mistake unrelated to the substance of the matter.
◇ "Innovation is forged through transparent, rigorous verification"
Whatever the truth, the allegations are expected to weigh on Upstage. At the public verification session, Kim addressed Go, saying, "If we fail to resolve the (copying controversy) issue, it could significantly affect the government's review, so I ask you to issue a public apology."
In response, on the 3rd, Go wrote on Facebook, "Immediately after the (Upstage Solar) model was released, during our internal analysis, we identified indications that could be interpreted as similar to a particular model (GLM) and some structural and statistical characteristics, for which the links to the references at the time were not clear. Given that the model is being discussed at a national level, we concluded it would be advisable to promptly bring the matter into the public domain before conducting further verification and cross-verification," adding, "We sincerely apologize for causing unnecessary confusion and controversy by making it public without rigorous verification, as it is difficult to draw conclusions based on the similarity of LayerNorm values alone." However, Go did not make it clear in the apology that Upstage's model is different from the Chinese model.
Bae Kyung-hoon, Deputy Prime Minister and Minister of Science and ICT, said of the controversy, "Innovation is forged through transparent and rigorous verification," adding, "There is no innovation without growing pains, and the current debate is an essential process that Korea's AI must go through to leap higher."