KT and Korea University launch KSAFE-MM to benchmark Korea-centric AI safety

KT said on the 16th that it has released KSAFE-MM, a safety benchmark for multimodal large language models (MLLMs) co-developed with Korea University.

KSAFE-MM consists of two parts: "KSAFE-MM-G," which adapts globally common risks to the Korean cultural context, and "KSAFE-MM-C," which covers issues specific to Korean society, such as jeonse fraud and the Dokdo dispute. With a total of 14,135 evaluation samples, it is Korea's largest Korean-language multimodal safety evaluation dataset. The benchmark was used to evaluate 12 global MLLMs, including Google Gemma and Naver HyperCLOVA X.

KT employees develop KSAFE-MM. /Courtesy of KT

A key feature is its automated, general-purpose pipeline (a workflow spanning data collection through deployment). Existing benchmarks rely mainly on manual review, which is costly and inefficient.

KSAFE-MM implements a four-stage automated pipeline that covers the entire process: collecting sensitive topics from local communities, generating template-based queries (the questions users input into an AI model), generating synthetic images, and generating jailbreak queries designed to bypass AI safety mechanisms or ethical constraints.
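To make the four stages concrete, the following is a minimal Python sketch of how such a pipeline could be wired together. It is not KT's or Korea University's implementation: every function name, the `Sample` record, the keyword-matching rule, and the placeholder wrapper tag are assumptions made for illustration, and the image and jailbreak stages are stand-ins that only return text rather than calling any real model.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical benchmark entry (field names are illustrative)."""
    topic: str
    query: str
    image_prompt: str      # text a text-to-image model would receive
    jailbreak_query: str


def collect_topics(posts, keywords):
    """Stage 1: keep each keyword that appears in at least one community post."""
    return [kw for kw in keywords
            if any(kw.lower() in post.lower() for post in posts)]


def generate_queries(topic, templates):
    """Stage 2: fill every query template with the topic."""
    return [template.format(topic=topic) for template in templates]


def generate_image_prompt(topic):
    """Stage 3 (stand-in): return a prompt instead of calling an image model."""
    return f"An illustrative image related to {topic}"


def generate_jailbreak_query(query):
    """Stage 4 (stand-in): tag the query with a placeholder wrapper."""
    return f"[JAILBREAK-WRAPPER] {query}"


def build_benchmark(posts, keywords, templates):
    """Run the four stages in order and collect the resulting samples."""
    samples = []
    for topic in collect_topics(posts, keywords):
        for query in generate_queries(topic, templates):
            samples.append(Sample(
                topic=topic,
                query=query,
                image_prompt=generate_image_prompt(topic),
                jailbreak_query=generate_jailbreak_query(query),
            ))
    return samples


# Toy inputs: only "jeonse fraud" is mentioned in the posts.
posts = ["A thread about jeonse fraud cases", "Today's weather"]
keywords = ["jeonse fraud", "Dokdo"]
templates = ["What is {topic}?", "Summarize recent news on {topic}."]

samples = build_benchmark(posts, keywords, templates)
print(len(samples))  # prints 2
```

In this sketch, moving to another language or culture would mean swapping the posts, keywords, and templates while leaving the stage functions unchanged, which is the kind of reuse the article describes.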

As a result, it offers a standard framework for quickly building safety benchmarks that reflect local characteristics without relying on experts from a specific cultural sphere, which lowers cost and improves efficiency. The joint research team from KT and Korea University said that a pilot experiment (JSAFE-MM-C), which applied the same pipeline to Japanese, showed the approach can be carried over to other cultures.

KT released the research results and benchmarks on arXiv and Hugging Face.
