LG AI Research on the 9th unveiled EXAONE 4.5, a multimodal AI model that simultaneously understands and reasons over text and images.

EXAONE 4.5 is a vision-language model (VLM) that integrates a self-developed vision encoder and a large language model (LLM) into a single architecture, building on the technological capabilities the institute has accumulated since it developed Korea's first multimodal AI model, EXAONE 1.0, in December 2021.

STEM benchmark performance comparison between EXAONE 4.5 and global peer models. / Courtesy of LG AI Research

A defining feature of EXAONE 4.5 is its "real-world reasoning ability," which reads and analyzes the complex unstructured data encountered at industrial sites. It goes beyond merely recognizing objects in photos: it consolidates the text and visual information contained in complex design drawings, financial statements, and various technology contracts to grasp context. This is a key step toward evolving AI into "physical intelligence" that can solve real challenges at industrial sites, beyond data in virtual worlds.

In performance evaluations, EXAONE 4.5 outperformed global competitors such as OpenAI's "GPT-5 mini" and Anthropic's "Claude Sonnet 4.5" in average scores across 13 metrics measuring AI vision processing and reasoning capabilities. In particular, it posted 77.3 points on STEM (science, technology, engineering and mathematics) benchmarks, demonstrating world-class competitiveness. It also outperformed Google's latest model in coding benchmarks and complex chart analysis.

It also delivered efficiency gains. EXAONE 4.5 reduced its parameter count to 3.3 billion, one-seventh the size of the previous model, yet maintained equivalent text reasoning performance by applying high-speed inference techniques such as a hybrid attention architecture. Supported languages were expanded to six, including Korean and English as well as Spanish, Japanese and Vietnamese.

LG AI Research has released EXAONE 4.5 on the global platform Hugging Face for research and education in order to expand the AI ecosystem. The goal is to expand the modalities of the domestic AI foundation model project "K-EXAONE," with a plan to advance it into "physical intelligence" that understands and judges not only speech and video but also the physical environment.

The institute will also continue collaborating with relevant institutions on training with high-quality data, guided by its self-designed AI risk classification system, so that the model comes to deeply understand Korea's history and cultural context.

Lee Jin-sik, head of the EXAONE Lab at LG AI Research, said, "EXAONE 4.5 is a signal flare announcing the entry into the multimodal era, perfectly understanding visual information beyond text," adding, "We will expand the scope of understanding to speech and video, and further to the physical environment, to build AI that can make practical judgments and take action on industrial sites."

※ This article has been translated by AI.