
When you ask an AI to draw a picture, it may produce a hand with three fingers or a distorted face. A South Korean research team has proposed design principles for Generative AI that can reduce such errors.

Researchers led by professors Yun Sung-hwan and Yoo Jae-joon at the Ulsan National Institute of Science and Technology (UNIST) Graduate School of Artificial Intelligence said on the 22nd that they had proven through theory and experiments that an AI model's robustness and generalization performance can be improved at the same time. The study was accepted to the International Conference on Computer Vision (ICCV) 2025, which opened on the 19th in Hawaii, United States.

Image-generating AI relies on diffusion models, such as DALL·E, the model behind image generation in ChatGPT. These diffusion models are vulnerable to problems such as errors that accumulate when generation is shortened to only a few steps, quantization errors introduced when a model is compressed to run on small devices, and adversarial attacks that degrade the output by planting subtle perturbations in the input.
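To make the nature of these disturbances concrete, the sketch below, a hypothetical illustration not taken from the study, simulates 8-bit weight quantization and shows that compressing a model amounts to adding a small perturbation to its trained parameters; the array `w` merely stands in for a layer's weights.

```python
import numpy as np

# Hypothetical illustration (not the paper's code): symmetric uniform int8
# quantization of a weight tensor, showing that compression acts as a small
# perturbation added to the trained parameters.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=10_000).astype(np.float32)  # stand-in weights

scale = np.abs(w).max() / 127.0            # one scale factor for the tensor
w_q = np.round(w / scale).astype(np.int8)  # quantize to 8-bit integers
w_hat = w_q.astype(np.float32) * scale     # dequantize back to float

print("max  |w - w_hat|:", np.abs(w - w_hat).max())   # bounded by scale / 2
print("mean |w - w_hat|:", np.abs(w - w_hat).mean())
```

Whether such a small shift in the weights (or in the input) matters depends on how sensitive the trained model is to perturbations, which is exactly the question the next paragraphs address.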

The researchers viewed these issues as stemming from insufficient generalization performance in AI. Generalization performance refers to a model's ability to operate reliably even with new data or in environments not used for training.

The researchers found the solution in the shape of the valley around the minimum of the loss function. A loss function numerically expresses the difference between the AI's prediction and the correct answer; the lower the value, the better the model has learned. During training, the model's parameters move in a direction that reduces this loss, and if the minimum it reaches sits in a narrow, steep valley, performance collapses easily under small perturbations. Conversely, if it reaches a wide, flat minimum, performance remains stable even in new situations or under interference.
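As a rough illustration of why the valley's shape matters (a toy example, not from the paper), the snippet below compares a steep and a flat one-dimensional loss with the same minimum: the same small parameter perturbation raises the loss a hundred times more in the steep valley.

```python
# Hypothetical toy example: two 1-D losses, both minimized at w = 0.
def sharp(w):   # narrow, steep valley: large curvature at the minimum
    return 50.0 * w ** 2

def flat(w):    # wide, flat valley: small curvature at the minimum
    return 0.5 * w ** 2

eps = 0.1  # a small perturbation of the parameter (or of the data it must fit)
print("loss increase at sharp minimum:", sharp(eps) - sharp(0.0))  # 0.5
print("loss increase at flat  minimum:", flat(eps) - flat(0.0))    # 0.005
```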

In experiments, among learning algorithms that seek flat minima, sharpness-aware minimization (SAM) proved the most effective.
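For readers curious what this looks like in practice, the following is a minimal sketch of a single SAM update in the spirit of Foret et al.'s original formulation; it is not the UNIST team's implementation, and the function name, the `rho` value, and the surrounding training details are illustrative assumptions.

```python
import torch

def sam_step(model, loss_fn, data, target, base_opt, rho=0.05):
    """One sharpness-aware minimization (SAM) update (illustrative sketch)."""
    # 1) Ordinary gradient at the current weights.
    loss = loss_fn(model(data), target)
    loss.backward()

    # 2) Climb to the (approximately) worst nearby point within radius rho.
    grad_norm = torch.norm(
        torch.stack([p.grad.norm() for p in model.parameters() if p.grad is not None])
    )
    eps = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)          # w -> w + e (ascent step)
            eps[p] = e
    model.zero_grad()

    # 3) Gradient at the perturbed weights, then undo the perturbation.
    loss_fn(model(data), target).backward()
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)          # restore the original weights

    # 4) Update the original weights with the "sharpness-aware" gradient.
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()
```

The key point is step 2: the weights are first nudged toward the locally worst direction, so the gradient used for the actual update reflects the loss over a neighborhood rather than at a single point, which steers training toward flat minima.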

The researchers said, "Beyond simply improving image quality, it is meaningful in that we presented design principles for Generative AI that can be trusted and used across various industries and real-world environments," adding, "It will serve as a foundation for enabling large-scale generative models like ChatGPT to train stably even with small amounts of data."

References

arXiv (2025), DOI: https://doi.org/10.48550/arXiv.2503.11078
