Artificial Intelligence (AI) has become deeply embedded in almost every major industry, with around 78% of organizations reportedly using AI in at least one business function. Among the main consumers are sectors such as healthcare, telecommunications, finance, automotive and manufacturing, with AI being used to assist diagnostics and medical imaging, fraud detection, quality control, customer service and more. The global adoption of AI supports the automation of critical business-related tasks and streamlines operations, with one-third of companies using generative AI, such as large language models (LLMs) and image generators, for tasks like content creation and coding assistance, reducing the likelihood of human error and increasing efficiency and innovation. However, as the adoption of AI grows, so too does the attack surface for cyber threats, leaving many organizations exposed to risks that challenge their security safeguards.
In 2024, 87% of global organizations faced an AI-powered cyberattack, a number that is only expected to increase. This is also reflected in our open-source intelligence research, where we are regularly coming across instances of AI-powered cyberattacks. This blogpost will provide a summary of the common types of threats observed by the Silobreaker Analyst Team, looking at both vulnerabilities within AI systems that could be exploited, as well as malicious use of AI by threat actors.

Adversarial AI Threats
AI and machine learning (ML) systems have introduced new and unique vulnerabilities that can be exploited by threat actors, enabling them to compromise such systems’ integrity, often without having to breach the underlying infrastructure. New vulnerabilities are discovered regularly, in particular prompt injection flaws, which are considered the largest threats to LLMs. Prompt injections allow for the input of a carefully crafted prompt that encourages an AI model to ignore its instructions or safety measures and execute a malicious action. A recent example is the discovery of an SQL injection vulnerability in Anthropic’s SQLite Model Context Protocol (MCP) implementation, identified in June 2025, that allows an attacker to embed a prompt to call powerful tools like email, database, and cloud APIs, to steal data or move laterally, all without detection. Anthropic’s SQLite MCP server has been forked over 5,000 times, meaning the unpatched code is present in thousands of downstream agents.
Instances of malware using prompt injection have also been observed, an example being the recently discovered experimental malware, Skynet. The malware embeds natural-language text into its code to manipulate AI models into misclassifying it as benign. While the malware is still rudimentary and the prompt injection attempt failed at the time of discovery, the malware’s existence displays threat actors’ intentions in exploiting AI for detection evasion.

Threat actors can also employ ‘indirect’ prompt injection attacks, instead hiding malicious instructions on a website or within an email that is later executed by the AI model when prompted. Similarly, a threat actor could employ an adversarial input attack by crafting malicious input data that is designed to trick an AI model into making an incorrect decision. For example, an attacker could add subtle perturbations, such as editing pixels or adding stickers to visual content, causing the AI model to perceive the content for something it is not. This type of attack can have significant impacts on multiple sectors, notably healthcare, with an adversarial input attack potentially modifying medical images, thereby causing AI systems to misdiagnose or misclassify patients’ conditions. Alternatively, such an attack could impact the automotive industry, causing recognition systems to misinterpret road signs, lane markings, or the presence of vehicles, increasing the risk of crashes, and their potential impact on human lives.
Another threat is data poisoning, which involves the contamination of training data, typically by inserting malicious or misleading examples into an AI’s training set, or manipulating the data collection process, to influence and exploit the model’s behavior. An example of this is ConfusedPilot, a method that targets Retrieval Augmented Generation-based AI systems. ConfusedPilot can add malicious content to documents that may be referenced by AI, causing them to repurpose misinformation and potentially compromise decision-making processes.
Threat actors could also employ model inversion attacks against AI models, which involves querying a ML model and analyzing its outputs to reconstruct sensitive information from the training data. Threat actors could abuse such data to extract personal information belonging to individuals or organizations, which can be used for further malicious activities. A similar attack is model stealing, where a threat actor could send a high number of queries to a proprietary ML service and use the outputs to reverse engineer a copy of the model. While the stolen model may not be an exact replica, it can mimic functionality, enabling attackers to discover new vulnerabilities or bypass regular usage restrictions.
Malicious uses of AI
While threat actors are actively seeking ways to exploit vulnerabilities within AI models, they are also attempting to use AI to facilitate cyberattacks, notably to increase their scale, automate tasks, and improve campaign success. Cybercriminal groups, nation-state advanced persistent threats (APTs), and hacktivists alike are adapting to the growing AI landscape, adding such tools into their attack arsenal in varying degrees. While leveraging AI can benefit experienced and known threat actors, new and low-skilled actors can also utilize AI capabilities to launch sophisticated cyberattacks with little effort.
AI phishing
A major exploitable feature of generative AI models, particularly LLMs, is their ability to produce human-like text, with phishing attacks having reportedly increased by over 1,200% since the rise of generative AI. Instead of manually writing convincing phishing emails or smishing messages, threat actors can use AI tools to automate the production of messages tailored to the targeted organization or individual, often by scraping data from social media and professional networking sites. The messages are increasingly becoming contextually relevant and grammatically correct, while also leveraging the target’s writing style, making such lures more believable and capable of evading spam detection tools. The recently developed AI chatbot, Venice AI, for example, has gained significant interest among the hacking community for its lack of content moderation and use of open-source language models. Venice AI can write a convincing and grammatically correct phishing email within seconds, only requiring an attacker to insert a phishing link.
Threat actors can also combine phishing techniques with adversarial AI threats, such as prompt injection, causing AI models to misinterpret malicious content based on prompts provided by an attacker. For example, a threat actor could prompt an AI model to suggest malicious URLs, recommend hacking tools, or provide disinformation when asked certain questions. Netcraft researchers outlined this after discovering 34% of domains suggested by LLMs were potentially harmful, with Perplexity AI observed recommending a spoofed domain for a UK financial institution instead of the official site.
AI use by nation-state actors
North Korean IT workers in particular have leveraged AI to improve the quantity and quality of their operations by using real and AI-enhanced images, as well as resumes, to create convincing worker profiles. IT workers are also experimenting with other AI technologies, such as voice-changing software, which could be used to trick interviewers into thinking they are communicating with legitimate interviewees. Such use of AI, known as deepfakes, has also become increasingly prevalent in other campaigns, often also incorporating AI-generated imagery and videos. Threat actors can leverage deepfakes to trick users in phishing-style attacks, but also to alter individual opinions, as has been seen in various influence operations.

The use and abuse of AI in disinformation campaigns has been more prevalent in general. For example, the Russia-nexus Pravda network is believed to engage in ‘LLM grooming’ to influence AI models’ responses. As of March 2025, the network has amassed 150 domains, which have spread 207 false claims over an estimated 3.6 million yearly articles. China-linked actors have similarly been observed leveraging AI during influence operations, with one campaign using AI to orchestrate an ‘account farm’ used to create new Facebook and Instagram profiles that amplify false information regarding current events in Japan, Myanmar, and Taiwan. Many actors, including DRAGONBRIDGE, KRYMSKYBRIDGE, and Storm-1516, have further abused AI for their campaigns by exploiting models to create images, videos, and other content designed to support their claims.
AI-driven malware delivery
The global hype surrounding AI has also naturally drawn threat actors to use fake AI tools to deliver their malware. For example, groups such as UNC6032 have been observed using fake AI video generator websites to distribute malware and deploy Python-based infostealers and backdoors like XWORM and FROSTRIFT, while ransomware groups like CyberLock, Lucky_Gh0$t, and Numero leverage installers for popular AI tools to deliver their ransomware payloads. Threat actors often leverage news about new AI tools or updated features to create fake download pages for the supposed tools, indicating that threat actors, just like organizations, are keeping track of the latest developments in the field for maximum effectiveness.
Exploiting AI for malware delivery can also enable full insights into a target’s environment, allowing delivery systems to decide whether to remain dormant, evolve to evade detection, or execute its malware. Furthermore, malware delivery systems could autonomously decide when to launch or self-destruct, choosing to operate at times that will allow it to go unnoticed, thereby enhancing its stealth. One method that can also be used is the slopsquatting technique, which makes coding agents hallucinate non-existent but plausible package names that can be used to deliver malware.
AI-written malware
AI development has further revolutionized the cyber threat landscape, notably via the emergence of AI-written malware, which features greater adaptation characteristics than traditional variants. Research conducted by Rapid7 demonstrates the capabilities of polymorphic malware, which can automatically modify its code to evade detection. Furthermore, tools like Nytheon AI and WormGPT offer real-time capabilities without requiring coding skill, while other tools can provide instructions for ransomware development or phishing kits via chat interfaces that support multiple languages. A recent example includes the LameHug malware family, which CERT-UA identified on July 10th, 2025. LameHug uses the Hugging Face API to interact with the Qwen 2.5 LLM to generate malicious commands in multiple languages that can be executed on Windows systems. Similarly, a Russian-speaking campaign, dubbed ScopeCreep, leveraged AI to develop a Windows malware toolkit capable of privilege escalation, credential harvesting, obfuscation, and Telegram-based C2 communication. Some LLMs, like GPT-4, are even capable of generating malicious code that rewrites itself after execution, making detection via signature-based systems virtually impossible.
Other malicious AI uses
While many threat actors are leveraging AI for content generation, they also utilize ML for reconnaissance and exploit discovery. AI can process large amounts of data at one time to spot patterns or weak spots that could go unnoticed by the human eye, analyzing a target organization’s network footprint, employee profiles, or open ports to prioritize who to attack and when. Nation-state actors have also been observed leveraging AI to identify zero-day vulnerabilities or to evade anomaly-based intrusion detection, with some groups even using AI to search for exposed credentials.
Key takeaways: The AI attack surface
Adversarial AI threats and malicious uses of AI are undoubtedly going to develop as the technology evolves, increasing cybersecurity requirements in securing AI infrastructure while adapting to combat AI-augmented attacks. AI is increasingly being used for defense mechanisms, with many IT teams deploying AI to detect, mitigate, and learn from AI-based cyber threats. Teams are also employing adversarial training, intentionally exposing ML models to adversarial examples during development to build resistance against such manipulation. Meanwhile, AI penetration testing and red teaming are gaining traction, probing AI models for vulnerabilities and weaknesses that may be exploited by an attacker. By staying ‘ahead of the curve’, cybersecurity teams can position themselves to strengthen their defenses against such threats.
If you would like to learn how Silobreaker’s intelligence and analytics can help your organization’s scenario planning for security measures, threats, and market developments involving AI, request a demo here.

