
President Lee Jae-myung, in his first policy briefing since taking office, said the Hangul Word Processor (HWP) format of Hancom Office is not suitable for artificial intelligence (AI) use and called for countermeasures. The HWP document format, used mainly by the government and public institutions, has long been seen as an obstacle to using public data because it is poorly suited to AI training. Attention is now on whether discussions about improving it will gain traction.

◇ President Lee: "HWP data that AI cannot read… please find a technical solution"

According to industry sources on the 18th, President Lee, after receiving a report from National Data Minister Ahn Hyeong-jun at the 2026 policy briefing held at the Sejong Convention Center on the 11th, pointed out HWP's compatibility problems with public data and demanded countermeasures.

President Lee said, "As the importance of data grows, the core of an AI society is ultimately data," and added, "What matters is what quality of data we create and how we use it." He continued, "Government official documents are the highest-quality asset from a data perspective, but most are written in HWP with various techniques applied, so machines cannot read them, can they?" and asked, "How do we solve this?"

In response, Minister Ahn said, "We need to standardize so AI can read it," and added, "We are preparing to convert HWP into PDF files and other formats to make it 'machine readable.'" President Lee then asked, "If you convert to PDF, can everything be read?" Ahn answered, "In general cases it can, but when it is opaque, it cannot be read even as a PDF file, so another technical conversion is needed." President Lee said, "If we fully leverage technology, there will be a way," and ordered, "It cannot be done by hand; please find a solid technical solution."

The remark marks the first time the president has publicly pointed out that high-quality data, central to the AI era, is not being properly used because of outdated government practices. It is being interpreted as a strong message that the entire document system must be fundamentally improved to make data more usable.

◇ Most public documents are in HWP… "Even ChatGPT can't read them"

Most public documents produced by the government and public institutions are currently distributed as files with the HWP extension. According to a survey on "AI use in the public sector" conducted by Democratic Party of Korea lawmaker Wi Seong-gon from Sept. 17 to Oct. 6 of last year among civil servants in central ministries and metropolitan and basic local governments, 91.1% of 14,208 public officials nationwide said they mainly use HWP and PDF for administrative documents such as reports and plans. In the early days of public-sector computerization, the government encouraged the use of Hancom Office to foster the domestic software (SW) industry. As a result, although public institutions are not required to use HWP, its use continues to this day as a matter of practice.

However, as AI technology has advanced and the importance of data has grown, calls are increasing for changes that would boost the use of public data. Public documents are cited as the data AI developers need most. They are written in refined language, and policies and administrative flows are laid out systematically by cause and effect, making them well suited to improving AI's understanding of domestic information and the Korean language. But HWP is a closed document format with a strong focus on security, and critics say that when data is extracted for AI training, context can break or only meaningless binary information remains, undermining its usefulness.

To address this, Hancom in 2021 changed the default format of Hancom Office documents from the closed HWP to the open HWPX. HWPX is a machine-readable format that allows data classification and extraction without separate processing. The government also switched its document storage standard from HWP to HWPX starting in 2021. Still, limits are reported in the field. Past documents already created in HWP remain in use, and users who do not have a version of Hancom Office that supports HWPX face restrictions on data use.
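The difference between the two formats can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this article: that binary HWP 5.x files are OLE compound files beginning with the signature D0 CF 11 E0 A1 B1 1A E1, and that HWPX files are ZIP packages whose parts are XML. The function names `sniff_format` and `extract_xml_text` are hypothetical, and the sketch simply gathers text from every XML part it finds rather than following the actual HWPX schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed container signatures (not taken from the article).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file, assumed for binary HWP 5.x
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header, assumed for HWPX


def sniff_format(data: bytes) -> str:
    """Guess the container type from the first bytes of a file."""
    if data.startswith(OLE_MAGIC):
        return "hwp-binary"
    if data.startswith(ZIP_MAGIC):
        return "hwpx-or-other-zip"
    return "unknown"


def extract_xml_text(data: bytes) -> list:
    """Collect non-empty element text from every XML part of a ZIP package.

    This ignores the real HWPX schema; it only shows that an XML-in-ZIP
    package can be read with standard tools.
    """
    texts = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in sorted(zf.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            root = ET.fromstring(zf.read(name))
            for element in root.iter():
                if element.text and element.text.strip():
                    texts.append(element.text.strip())
    return texts
```

A caller would read a file's bytes, pass them to `sniff_format`, and call `extract_xml_text` only on ZIP-based files. Files identified as binary HWP would still need a dedicated parser or a conversion step, which is the gap described above.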

Global AI companies are focusing on internationally used document formats rather than HWP. In fact, ChatGPT, the most widely used generative AI service in Korea, cannot read the HWP format directly and requires a separate conversion process. For this reason, critics say, if public documents continue to be produced and distributed mainly in HWP, points of connection with the global AI ecosystem will inevitably be limited.

Hancom's position is that it is addressing related issues through the shift to HWPX and technological advances. A Hancom official said, "The current default storage format HWPX is an XML-based structure that follows international standards and is suitable for AI use, and we provide tools free of charge to convert HWP files to HWPX," adding, "We also provide 'Hancom Data Loader' technology that can directly extract text and document structure information from HWP binary files without a separate conversion process."

The official added, "For large language model (LLM) training, it is less that a specific format is technically impossible and more a matter of timing, depending on the support priorities and strategies of LLM companies," and said, "In fact, Google's Gemini 3.0 supports not only HWPX but also the HWP format, so data compatibility will improve going forward."

※ This article has been translated by AI.