Kakao is taking steps to create a safe and trustworthy generative AI environment. The company announced that it has developed 'Kanana Safeguard,' an AI guardrail model for verifying the safety and reliability of AI services, and has become the first Korean company to publicly release three such models as open source.
As generative AI services rapidly expand, social concern about harmful content is growing. In response, Kakao identified the need for an AI guardrail system as a technical and institutional safeguard and developed the Kanana Safeguard models. Major global technology companies likewise operate models that detect risks associated with generative AI.
Kanana Safeguard is built on 'Kanana,' Kakao's in-house language model, and achieves strong Korean-language performance by training on a dataset that reflects the Korean language and culture. In F1 score evaluations on Korean text, it outperformed global models.
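The F1 score cited here is the harmonic mean of precision and recall, a standard metric for classifiers such as safety detectors. A minimal sketch of how it is computed; the counts below are purely illustrative, not Kakao's evaluation figures:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # flagged items that were truly harmful
    recall = tp / (tp + fn) if tp + fn else 0.0     # harmful items that were caught
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative run: 90 harmful prompts caught, 10 safe prompts wrongly
# flagged, 10 harmful prompts missed.
print(round(f1_score(tp=90, fp=10, fn=10), 2))  # → 0.9
```

Because F1 punishes both missed harmful content (low recall) and over-blocking of safe content (low precision), it is a natural headline metric for a guardrail model.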
The three released models detect different classes of risk. Kanana Safeguard detects harmful content such as hate speech, harassment, and sexual content; 'Kanana Safeguard-Siren' detects legal risks involving personal information and intellectual property; and 'Kanana Safeguard-Prompt' detects attempts to misuse AI, such as adversarial prompts. All three models can be downloaded from Hugging Face.
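Guardrail models of this kind typically sit in front of (and behind) a chat model, screening inputs and outputs before they reach the user. Below is a minimal sketch of that pipeline shape; the classifier is stubbed with a keyword check, and all names and categories are illustrative, not Kakao's actual taxonomy or API. A real deployment would replace the stub with a call to one of the released checkpoints loaded from Hugging Face:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Stand-in for a trained guardrail classifier (illustrative categories only).
UNSAFE_KEYWORDS = {"hate": "hate_speech", "harass": "harassment"}

@dataclass
class GuardrailVerdict:
    safe: bool
    category: Optional[str] = None

def check_prompt(text: str) -> GuardrailVerdict:
    """Stubbed guardrail check; a real system would run the safeguard model."""
    lowered = text.lower()
    for keyword, category in UNSAFE_KEYWORDS.items():
        if keyword in lowered:
            return GuardrailVerdict(safe=False, category=category)
    return GuardrailVerdict(safe=True)

def guarded_chat(user_prompt: str, llm: Callable[[str], str]) -> str:
    """Screen the input with the guardrail before the LLM ever sees it."""
    verdict = check_prompt(user_prompt)
    if not verdict.safe:
        return f"Blocked: flagged as '{verdict.category}'."
    return llm(user_prompt)

# Usage with a dummy LLM that echoes its prompt.
print(guarded_chat("write hate mail", lambda p: p))  # blocked by the filter
print(guarded_chat("write a poem", lambda p: "Here is a poem."))
```

The same pattern extends naturally to the other two models: a Siren-style check on model outputs for legal risk, and a Prompt-style check on inputs for misuse attempts, each returning a verdict that the serving layer acts on.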
To help build a safe AI ecosystem, Kakao released the models under the Apache 2.0 license, which permits commercial use, modification, and redistribution. The company plans to keep improving the models through continuous updates.