A month has passed since the Chinese artificial intelligence (AI) startup DeepSeek drew attention for its low-cost, high-efficiency AI model, but core figures, including founder Liang Wenfeng (40), have reportedly focused on technology development while refraining from external activities.
On the 27th, Hong Kong's South China Morning Post (SCMP) reported, citing sources, that DeepSeek is prioritizing the development of artificial general intelligence (AGI) over attracting investment or expanding into new businesses.
A source familiar with DeepSeek's internal affairs noted that Liang Wenfeng is focused on maximizing AI performance and efficiency with minimal resources. "It will take time to see how effective DeepSeek's strategy is," the source said, adding, "The scaling law remains important in AI research, and it is difficult to maintain a lead through algorithmic innovation alone."
The scaling law refers to the observation that AI model performance improves as the amount of data and computation increases. Unlike Silicon Valley companies that deploy high-performance graphics processing units (GPUs) in bulk, DeepSeek caused a sensation in the AI industry with techniques that maximize efficiency, such as "Mixture of Experts" (MoE) and "Sparse Attention."
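For readers unfamiliar with the term, the sketch below illustrates the core idea behind Mixture of Experts in plain NumPy: a small router sends each token to only a few of the available expert networks, so compute per token grows with the number of experts actually used rather than the total. All function names and toy dimensions here are illustrative assumptions, not DeepSeek's implementation, whose production MoE layers are far more elaborate.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) router weights
    expert_ws: list of (d_model, d_model) weight matrices, one per expert
    Only k of n_experts run per token, so compute scales with k, not n.
    """
    logits = x @ gate_w                          # (tokens, n_experts) router scores
    topk = np.argsort(logits, axis=1)[:, -k:]    # indices of each token's k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        idx = topk[t]
        # Softmax over the selected experts' scores only.
        w = np.exp(logits[t, idx] - logits[t, idx].max())
        w /= w.sum()
        for j, e in enumerate(idx):
            out[t] += w[j] * (x[t] @ expert_ws[e])
    return out

# Toy usage: 4 tokens, 8-dim model, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
gate_w = rng.standard_normal((8, 4))
expert_ws = [rng.standard_normal((8, 8)) for _ in range(4)]
print(moe_forward(x, gate_w, expert_ws).shape)  # (4, 8)
```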
In an interview with the Chinese tech media 36kr last July, Liang Wenfeng emphasized, "DeepSeek's goal is not to maximize profits through technological innovation, but to ensure that everyone can benefit from AI." Asked when AGI might be achieved, he answered, "Whether it takes two, five, or ten years, it will happen within our lifetime."
SCMP noted that while DeepSeek has emerged as one of the most closely watched companies in China, it is minimizing external exposure, even turning away visits from investors. Liang Wenfeng has likewise made no public appearances over the past month. His only public outing was a symposium for private enterprises hosted by President Xi Jinping on the 17th, where state media coverage recorded nothing more than a brief handshake with Xi.
DeepSeek has maintained a cautious stance, declining to comment on new product release schedules or internal operations, and has not officially addressed rumors. Reuters recently reported that DeepSeek had initially planned to launch the successor to its AI model "R1" in early May but is accelerating development to release it sooner.
DeepSeek released its large language model (LLM) "V2" in May last year, updated it to "V3" seven months later, and on January 20 this year introduced "R1," a reasoning model built on V3.
Meanwhile, in contrast to its minimal public exposure, DeepSeek's research team has been actively engaging with the developer community. This week, DeepSeek shared its research results by releasing three open-source code repositories related to AI model development.
Additionally, according to local media, some members of the DeepSeek research team attended a closed session of a global developer conference held in Shanghai last week. On the 18th, DeepSeek released a paper, co-authored by Liang Wenfeng and the research team, introducing a sparse attention mechanism called "Native Sparse Attention" (NSA) that enables fast training and inference over long contexts.
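As a rough illustration of what "sparse attention" means, the sketch below implements a simple sliding-window variant in NumPy, where each query attends only to a fixed number of recent tokens instead of the full sequence, cutting the cost of long inputs. This is a generic textbook pattern offered purely as background, not NSA itself, which combines several mechanisms beyond a local window.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """Each query attends only to the last `window` keys, not all of them.

    q, k, v: (seq_len, d) arrays. Full attention costs O(seq_len^2);
    restricting each query to a local window makes it O(seq_len * window).
    """
    seq_len, d = q.shape
    out = np.zeros_like(v)
    for i in range(seq_len):
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)  # scores within the window only
        weights = np.exp(scores - scores.max())      # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

# Toy usage: a 16-token sequence with 8-dim heads.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
print(sliding_window_attention(q, k, v).shape)  # (16, 8)
```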