The Personal Information Protection Commission (PIPC) announced on the 19th that it has published the "Guide for Generating and Utilizing Synthetic Data" to support the safe generation and use of synthetic data, a technology drawing attention as a way to strengthen personal information protection.
Synthetic data is artificial data generated by algorithms that learn the statistical characteristics and patterns of original data. Because it does not include personally identifiable information, it has the advantage that data can be shared and used safely. Following the May announcement of a reference model for generating synthetic data, the PIPC uses this guide to spell out the generation process and how to comply with relevant laws.
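The idea of learning statistical characteristics and then generating new records can be illustrated with a minimal Python sketch. It is not taken from the guide or the PIPC reference model: the function names, the sample ages, and the choice of a single Gaussian fit are all assumptions made for illustration. Real generators typically model joint distributions across many columns, and their output still has to pass the kind of safety and usability verification the guide describes.

```python
import random
import statistics


def fit_gaussian(values):
    """Learn two summary statistics (mean, standard deviation) from the original column."""
    return statistics.mean(values), statistics.stdev(values)


def generate_synthetic(values, n, seed=0):
    """Draw n new values from the fitted distribution instead of copying original records."""
    mu, sigma = fit_gaussian(values)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical original data: ages of ten individuals (made up for illustration).
original_ages = [23, 35, 41, 29, 52, 38, 47, 31, 44, 36]

# Synthetic ages follow a similar distribution but are sampled, not copied.
synthetic_ages = generate_synthetic(original_ages, n=1000)

print(round(statistics.mean(original_ages), 1), round(statistics.mean(synthetic_ages), 1))
```

Matching summary statistics alone does not show that the synthetic data is safe; it only shows that some usefulness of the original is preserved, which is why safety and usability are verified as separate checks.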
The guide divides the generation and use of synthetic data into five stages: ▲preparatory work ▲synthetic data generation ▲safety and usability verification ▲review committee evaluation ▲utilization and safe management. It includes detailed procedures and checklists for each stage so that they can be applied immediately in the field. It also provides guidance on the procedures and precautions for generating and using unstructured synthetic data (images).
Yang Cheong-sam, director of the PIPC's Personal Information Policy Bureau, said, "It is significant that experts from industry, academia, and the legal field actively participated in preparing this systematic guide," adding, "Through this guide, we expect to clarify the standards and procedures for utilizing synthetic data, thereby alleviating the difficulties faced in industry and research settings."