An LLM agent plays a game in Orak./Courtesy of KRAFTON

KRAFTON announced the release of 'Orak', a benchmark for evaluating the gaming performance of artificial intelligence (AI) agents based on large language models (LLMs).

Orak is designed to quantitatively analyze AI's situational awareness, judgment, and decision-making in popular game environments across six genres: action, adventure, RPG, simulation, strategy, and puzzle.

The benchmark reflects the AI design experience KRAFTON accumulated while developing the Co-Playable Character (CPC) jointly with NVIDIA. It assesses how well LLM-based AI interprets complex game contexts and makes decisions in them, and its iterative validation makes it possible to experiment with new AI-driven gameplay experiences.

In addition, Orak delivers game information to the model as text through the Model Context Protocol (MCP) and converts the language model's responses into game actions, so that the AI can operate like a game player.
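The loop described above (game state rendered as text, a model reply, and a conversion of that reply into a game action) can be illustrated with a minimal Python sketch. It is not Orak's code and does not implement MCP itself: the toy grid game, the action names, and every function here (`state_to_text`, `parse_action`, `apply_action`, `stub_model`, `run_episode`) are hypothetical, and `stub_model` merely stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical action set for a toy grid game; not taken from Orak.
MOVES = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
    "wait": (0, 0),
}


@dataclass
class GameState:
    """Toy game state: player and goal positions on a grid."""
    player: Tuple[int, int]
    goal: Tuple[int, int]


def state_to_text(state: GameState) -> str:
    """Describe the game state as plain text for the language model."""
    return (
        f"Player is at {state.player}. Goal is at {state.goal}. "
        f"Reply with one of: {', '.join(MOVES)}."
    )


def parse_action(reply: str) -> str:
    """Map a free-form model reply to a valid action, defaulting to 'wait'."""
    words = reply.lower().replace(".", " ").replace(",", " ").split()
    for word in words:
        if word in MOVES:
            return word
    return "wait"


def apply_action(state: GameState, action: str) -> GameState:
    """Return the state that results from taking the action."""
    dx, dy = MOVES[action]
    x, y = state.player
    return GameState(player=(x + dx, y + dy), goal=state.goal)


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption); always chooses to move right."""
    return "I will move Right."


def run_episode(state: GameState, model: Callable[[str], str], max_steps: int) -> GameState:
    """Alternate text observation, model reply, and game action until the goal or the step limit."""
    for _ in range(max_steps):
        if state.player == state.goal:
            break
        reply = model(state_to_text(state))
        state = apply_action(state, parse_action(reply))
    return state


final = run_episode(GameState(player=(0, 0), goal=(2, 0)), stub_model, max_steps=5)
print(final.player)
```

Falling back to 'wait' on an unparseable reply is one possible design choice; a benchmark could just as well count such replies as errors.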

KRAFTON plans to use the benchmark to extend its AI research beyond the gaming industry into other fields, and will also release a dataset for fine-tuning language models.

Lee Kang-wook, head of KRAFTON's deep learning division, said, 'Orak is a game-specific language model benchmark that encapsulates KRAFTON's prior research and expertise, and we are also planning a competition based on it in which participants compete on their LLM agent designs.' He added, 'We will continue to advance LLM technology optimized for gaming to lead innovation in AI-driven gameplay experiences.'

※ This article has been translated by AI.