Google says hackers are using AI to find zero-days and build malware
- Google says attackers are using AI for zero-days and reconnaissance.
- Report highlights AI-linked zero-days, Android malware, and AI supply chain attacks.
Google threat researchers say attackers are expanding their use of generative AI in cyber operations. The activity spans vulnerability research, malware development and attacks on AI software supply chains.
The findings come from Google Threat Intelligence Group, which based its report on Mandiant incident response work, Gemini-related abuse investigations, and proactive threat research. The group said AI is being used as an operational tool by threat actors and as a target through compromised AI-related software components.
Zero-day raises concern
One of the report’s main findings involves a zero-day exploit that GTIG assessed was likely developed with AI assistance. The vulnerability affected a popular open-source, web-based system administration tool and allowed two-factor authentication to be bypassed, although valid user credentials were still required.
GTIG said the exploit appeared in a Python script prepared by cybercrime actors for a planned mass exploitation operation. The group worked with the affected vendor to disclose the vulnerability and said its early discovery may have prevented broader use.
John Hultquist, chief analyst at Google Threat Intelligence Group, said AI-assisted vulnerability discovery is already under way.

“There’s a misconception that the AI vulnerability race is imminent. The reality is that it’s already begun,” Hultquist said. “For every zero-day we can trace back to AI, there are probably many more out there.”
Hultquist said state-linked actors are using the technology, and warned that criminal groups should not be underestimated given their track record of attacks.
The report said the exploit did not rely on common memory corruption or input validation flaws. Instead, it involved a logic error tied to a hard-coded trust assumption, a class of issue that can be difficult for traditional scanners to detect.
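To make that class of bug concrete, here is a minimal, hypothetical Python sketch of a hard-coded trust assumption in a two-factor check. It is not the actual vulnerability; the toy user store, function names, and the `X-Internal-Request` header are all invented for illustration.

```python
# Hypothetical sketch of a hard-coded trust assumption in a 2FA check.
# This is NOT the vulnerability from the report; the names and the
# "X-Internal-Request" header are invented for illustration.

USERS = {"alice": {"password": "hunter2", "otp": "123456"}}  # toy store

def check_password(username, password):
    user = USERS.get(username)
    return user is not None and user["password"] == password

def check_otp(username, otp):
    return otp is not None and USERS[username]["otp"] == otp

def verify_login(username, password, otp, headers):
    # Valid credentials are still required, as in the reported exploit.
    if not check_password(username, password):
        return False
    # The flaw: a hard-coded assumption that this header only ever comes
    # from a trusted internal proxy. Any client can set it, so sending it
    # skips the second factor entirely. No memory corruption, no bad input
    # parsing -- just a broken trust decision, which is why signature- and
    # fuzzing-based scanners tend to miss it.
    if headers.get("X-Internal-Request") == "true":
        return True
    return check_otp(username, otp)

# 2FA bypassed with valid credentials but no OTP:
assert verify_login("alice", "hunter2", None, {"X-Internal-Request": "true"})
```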
GTIG said several artefacts in the exploit pointed to AI involvement, including detailed instructional comments, a hallucinated CVSS score, and an unusually uniform Python structure. The group said it did not believe Gemini was used in that case.
State-linked groups
State-linked groups have also used AI to support vulnerability research. GTIG said clusters associated with China and North Korea have used expert-style prompts, security datasets, and automated workflows to assist code review and exploit testing.
In one case, UNC2814 prompted Gemini to act as a senior security auditor or C/C++ binary security expert while researching embedded device targets, including TP-Link firmware and Odette File Transfer Protocol implementations.
Threat actors also tested a GitHub-hosted vulnerability repository known as “wooyun-legacy.” The repository contains more than 85,000 vulnerability cases collected by the Chinese bug bounty platform WooYun between 2010 and 2016.
The report said APT45 sent thousands of repetitive prompts to analyse CVEs and validate proof-of-concept exploits. GTIG said the activity involved AI-supported vulnerability research at scale.
Malware developers lean on AI for evasion
AI is also being used to support obfuscation and defence evasion. GTIG identified several malware families or tools with LLM-enabled obfuscation features, including PROMPTFLUX, HONESTCUE, CANFAIL, and LONGSTREAM.
PROMPTFLUX previously used the Gemini API to generate code, while HONESTCUE requested VBScript obfuscation and evasion techniques from the model. The report said these techniques support just-in-time code modification intended to complicate static detection.
GTIG also said APT27, a China-linked threat actor, used Gemini to accelerate development of a fleet management application likely connected to an operational relay box network. The tool supported mobile Wi-Fi and router device types. GTIG said this suggested it could provide residential IP addresses to obscure intrusion activity.
Russia-linked intrusion activity targeting Ukrainian organisations included CANFAIL and LONGSTREAM, according to the report. GTIG said both families used LLM-generated decoy code to hide malicious functionality.
In CANFAIL, researchers found unused code blocks that appeared to have been generated as filler content. LONGSTREAM also contained inactive administrative routines. These included repeated checks for daylight saving status that were unrelated to the downloader’s main function.
PROMPTSPY points to autonomous malware activity
The report also detailed PROMPTSPY, an Android backdoor first identified by ESET. Public reporting had already noted its use of the Google Gemini API to help pin a malicious app in the Android recent apps list.
GTIG said its review found additional AI-enabled functions. PROMPTSPY includes a module called “GeminiAutomationAgent,” which sends the visible Android user interface hierarchy to the Gemini 2.5 Flash Lite model and receives structured JSON instructions for actions like clicks and swipes.
The malware uses Android’s Accessibility API to map on-screen elements. The model response gives the malware action types and coordinates, allowing it to simulate gestures based on the device’s current state.
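The loop GTIG describes follows the same structured-output pattern used by legitimate computer-use agents: serialise the current UI state, send it to the model, and validate the JSON that comes back before acting on it. The Python sketch below shows what validating one such instruction might look like; the schema and field names are assumptions for illustration, not PROMPTSPY's actual format.

```python
import json

# Hypothetical sketch of the structured-output loop described above: the
# model returns JSON naming an action and screen coordinates, and the
# client validates it before simulating a gesture. The schema here is an
# assumption for illustration, not PROMPTSPY's actual format.

SCREEN_W, SCREEN_H = 1080, 2400
ALLOWED = {"click", "swipe"}

def parse_action(raw):
    """Validate one model response into a well-formed action dict."""
    action = json.loads(raw)
    if action.get("type") not in ALLOWED:
        raise ValueError(f"unknown action type: {action.get('type')!r}")
    x, y = action["x"], action["y"]
    # Coordinates must fall inside the screen bounds reported in the
    # UI-hierarchy snapshot that was sent to the model.
    if not (0 <= x < SCREEN_W and 0 <= y < SCREEN_H):
        raise ValueError("coordinates outside the reported screen bounds")
    return action

# The kind of response such a loop would consume:
print(parse_action('{"type": "click", "x": 540, "y": 1200}'))
```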
PROMPTSPY can also capture authentication gestures, including PINs or lock patterns, for replay. The malware includes persistence mechanisms that make uninstallation harder, including an invisible overlay placed over the uninstall button to intercept touch events.
The backdoor can be relaunched through Firebase Cloud Messaging when a device becomes inactive. Its command-and-control infrastructure, Gemini API keys, and VNC relay server can also be updated without redeploying the malware.
Google said it disabled assets linked to the activity. The report also said no apps containing PROMPTSPY were found on Google Play based on current detection, and that Google Play Protect blocks known versions by default on Android devices with Google Play Services.
Attacks beyond malware
Beyond malware, attackers continue to use LLMs for research and reconnaissance. Observed activity included prompts to build organisational hierarchies and identify third-party relationships. Other prompts sought information about finance and human resources teams.
In one targeted case, a threat actor tried to identify the exact make and model of a computer used by a high-value target by asking an LLM to examine photos. GTIG said such environmental details can support tailored exploit development or follow-on activity.
The report also described a suspected China-linked actor using agentic security tools against a Japanese technology company and an East Asian cybersecurity platform. The actor used Hexstrike and Strix to support reconnaissance and vulnerability validation.
Information operations actors have also used AI for research, content creation, and localisation. GTIG said it had not identified breakthrough capabilities from these uses, but observed AI-generated political satire, synthetic media, and narrative audio from actors linked to Russia, Iran, China, and Saudi Arabia.
GTIG cited activity tied to the pro-Russia “Operation Overload” campaign, where suspected AI voice cloning was used to impersonate real journalists. The report said the content appeared to combine authentic video, edited footage, and fabricated audio to deliver false messages.
Threat actors are also building ways to access AI models at scale while avoiding account limits and enforcement. GTIG observed custom middleware, proxy relays, anti-detect browsers, account-pooling services, and automated registration pipelines used to obtain anonymised access to LLM services.
One China-linked cluster, UNC6201, attempted to use a public Python script for LLM account registration and cancellation. Another cluster, UNC5673, used tools such as Claude-Relay-Service and CLI-Proxy-API to aggregate accounts across multiple model providers.
AI software supply chains become targets
The report said AI systems are also becoming targets through their surrounding software layers. GTIG said frontier models remain difficult to compromise directly, but orchestration tools, open-source wrappers, API connectors, and skill configuration files create exposure points.
In February 2026, VirusTotal researchers reported risks in the OpenClaw AI agent ecosystem, including malicious and insecure skill packages. GTIG said some packages masqueraded as OpenClaw skills while containing hidden routines to run unauthorised code and commands on host systems.
The risk was tied to the elevated access granted to AI agent skills. Malicious skills could execute code, download payloads, discover local data, and exfiltrate information, while poorly secured legitimate skills could expose credentials or authentication data.
OpenClaw has since partnered with VirusTotal to add automated scanning to ClawHub, its public skill marketplace. The scanning uses VirusTotal's Code Insight capability to assess code behaviour and either approve, warn on, or block submitted skills.
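As a rough illustration of how such a gate could map analysis results to those three outcomes, here is a hypothetical Python sketch. The `analyse_skill` scoring function is a crude stand-in for real behavioural analysis and is not VirusTotal's actual API.

```python
from enum import Enum

class Decision(Enum):
    APPROVE = "approve"
    WARN = "warn"
    BLOCK = "block"

# Hypothetical scoring: real scanners weigh many behavioural signals.
# analyse_skill() is a stand-in, not VirusTotal's actual API.
def analyse_skill(source):
    """Return a 0.0-1.0 risk score for a skill's source code."""
    suspicious = ("subprocess", "eval(", "base64.b64decode", "urlretrieve")
    hits = sum(1 for marker in suspicious if marker in source)
    return min(1.0, hits / len(suspicious))

def gate(source):
    risk = analyse_skill(source)
    if risk >= 0.75:
        return Decision.BLOCK   # e.g. hidden download-and-execute routines
    if risk >= 0.25:
        return Decision.WARN    # surfaced to the user before install
    return Decision.APPROVE

print(gate("import subprocess\nsubprocess.run(['sh'])"))  # Decision.WARN
```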
The report also covered TeamPCP, also tracked as UNC6780, which claimed responsibility in late March 2026 for supply chain compromises involving GitHub repositories and GitHub Actions. The affected projects included Trivy, Checkmarx, LiteLLM, and BerriAI.
GTIG said TeamPCP gained access through compromised PyPI packages and malicious pull requests. The actor then embedded the SANDCLOCK credential stealer to extract cloud secrets, including AWS keys and GitHub tokens, from build environments.
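A common defensive counterpart to this kind of theft is scanning build environments for secret-shaped values. The simplified Python sketch below checks environment variables against two widely published token formats; production secret scanners cover far more patterns and sources.

```python
import os
import re

# Simplified defensive sketch: flag environment variables whose values
# match widely published secret formats. Real secret scanners cover many
# more token types, plus files and build logs.
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

def scan_environment():
    findings = []
    for name, value in os.environ.items():
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                findings.append((name, label))
    return findings

for var, kind in scan_environment():
    print(f"possible {kind} exposed in ${var}")
```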
The report highlighted the LiteLLM compromise because the project is used as an AI gateway for integrating multiple LLM providers. GTIG said stolen AI API secrets could support further system access and traditional intrusion activity.
Google said it is responding to these risks through product safeguards, abuse investigations, account enforcement, threat intelligence, and AI-assisted defensive tooling. The report cited Big Sleep, an AI agent from Google DeepMind and Google Project Zero that searches for software vulnerabilities, and CodeMender, an experimental agent designed to help fix code vulnerabilities.