Researchers at the Electronics and Telecommunications Research Institute (ETRI) discuss the realistic character speech video generation framework technology. / Courtesy of ETRI

The Electronics and Telecommunications Research Institute (ETRI) announced on the 15th that it has implemented an artificial intelligence (AI) avatar that speaks as naturally as a real person, generated from just one portrait photograph.

Existing AI voice assistants and navigation systems have only been able to recognize and carry out commands. The newly developed technology, by contrast, accurately renders mouth shapes and facial expressions, so that using it feels like conversing with a real person. This opens up scenarios such as an AI driver conversing naturally with the human driver, or making eye contact and communicating with pedestrians.

The research team implemented the AI avatar based on a unique algorithm that selectively learns and synthesizes facial areas closely related to speech, such as the lips and jaw. This method reduced unnecessary data learning and enabled more refined representations of lip shapes, teeth, and skin wrinkles.
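The idea of concentrating learning on speech-related regions can be illustrated with a toy example. The sketch below is not ETRI's algorithm, whose details the announcement does not disclose; it only assumes one common way to express "selective" learning, namely a reconstruction error that counts only pixels inside a mask. The function names, the mask layout, and the weights are all hypothetical.

```python
# Hypothetical sketch: weight reconstruction error toward speech-related regions.
# The mask layout, weights, and function names are illustrative assumptions,
# not ETRI's published method.

def make_region_mask(height, width, box):
    """Return a height x width grid of 0/1 flags; 1 marks the speech-related box.

    box is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def region_weighted_l1(pred, target, mask, inside_weight=1.0, outside_weight=0.0):
    """Weighted mean absolute error between two grayscale images (lists of rows)."""
    total = 0.0
    weight_sum = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            w = inside_weight if m else outside_weight
            total += w * abs(p - t)
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Toy 4x4 "face": the lower-middle 2x2 block stands in for the lips and jaw.
target = [[0.5] * 4 for _ in range(4)]
pred = [row[:] for row in target]
pred[0][0] = 1.0  # error in the forehead area (outside the mask)
pred[3][1] = 0.0  # error in the mouth area (inside the mask)

mask = make_region_mask(4, 4, box=(2, 1, 4, 3))
loss = region_weighted_l1(pred, target, mask)
print(loss)
```

With the default weights, the forehead error is ignored and only the mouth-area error contributes to the loss, which is the sense in which a masked objective avoids spending capacity on regions unrelated to speech.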

ETRI said the technology outperformed methods presented at major international conferences, such as the Conference on Computer Vision and Pattern Recognition (CVPR) and the AAAI Conference on Artificial Intelligence (AAAI), in clarity, naturalness, and lip synchronization.

The technology can be used in a range of industries, including autonomous vehicles, kiosks, bank counters, news broadcasting, and advertising models. It has strong potential to become a core technology in the digital human industry, going beyond simple information delivery to engage people emotionally.

Yoon Dae-seop, head of the Mobility UX Research Lab at ETRI, said, "As mobility technology advances, the elderly and socially vulnerable groups may become marginalized. I hope that AI avatar technology will evolve into smart mobility services that everyone can easily use."

Choi Dae-woong, the lead researcher at ETRI, said, "We plan to further enhance the technology so that AI avatars can converse and move naturally like real people," and added, "In the future, we will implement interactions that can stand in for some human staff in taking orders and handling consultations."

Currently, this technology is registered on the ETRI technology transfer site as the "realistic character speech video generation framework technology." The research team plans to actively pursue technology transfer and commercialization strategies for various industries.
