With the spread of generative artificial intelligence (AI), cyberattacks aimed at uncovering how AI models work internally are on the rise.
Google Threat Intelligence Group (GTIG) and DeepMind released an "AI threat tracking report" on the 13th, stating that cases of model extraction and distillation attacks targeting generative AI were identified in the fourth quarter of last year.
Model extraction and distillation attacks analyze an AI model's reasoning and thought process in order to replicate the model or infer its internal structure. The report noted that a model manipulated by an attacker can behave in unintended ways and cause disruption.
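To make the attack class concrete, the following is a minimal, hypothetical sketch of how distillation works in principle: an attacker repeatedly queries a target ("teacher") model and trains a smaller "student" model to imitate the collected outputs. The function names below are placeholders for illustration only, not any real API.

```python
# Conceptual sketch of model distillation. The attacker never sees the
# teacher's weights; it only observes prompt/response pairs from an API.

def query_teacher(prompt: str) -> str:
    """Placeholder for a call to a commercial generative-AI API (hypothetical)."""
    raise NotImplementedError("stand-in for an API call, e.g. a chat completion")

def train_student(pairs: list[tuple[str, str]]) -> None:
    """Placeholder for fine-tuning a smaller model on the collected pairs (hypothetical)."""
    raise NotImplementedError("stand-in for a fine-tuning step")

def distill(prompts: list[str]) -> None:
    # Build a synthetic dataset of (prompt, teacher_output) pairs...
    pairs = [(p, query_teacher(p)) for p in prompts]
    # ...then fit the student model to reproduce the teacher's behavior.
    train_student(pairs)
```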
However, the report assessed that most cases confirmed so far involved private companies or researchers seeking to replicate a model's logic rather than outright cyberattacks. The main target was Google's generative AI model Gemini.
The report also said that it has become routine for state-backed attack groups from countries including China and Russia to use AI across their threat activities.
APT42, an Iranian government-backed hacking group, used AI models to refine targeted social engineering techniques, while the North Korea-backed hacking group UNC2970 used Gemini to plan and conduct reconnaissance during attacks targeting the defense industry.
Google's analysis of English- and Russian-language dark web communities found that threat actors tend to use commercial AI models rather than develop their own. The report explained that the recent rise in API key theft and misuse reflects this trend.