As AI becomes increasingly integrated into mainstream applications, it's no surprise that it's also being used for malicious purposes. Bad actors are now repurposing tools designed to enhance productivity, creativity, and problem-solving to exploit vulnerabilities in computer systems or networks. With the accelerated adoption of AI, dark AI has emerged, where sophisticated models are manipulated for harmful activities.
Despite the strong emphasis on the ethical use and security of large language models (LLMs), bad actors still find ways to exploit them. These jailbroken models allow users to pose rogue and malicious questions, leading to their use in fraudulent activities and cybercrime.
These models are discussed and accessible online, including the dark web and private hacking forums. The rise of these dark AI models raises serious questions about the safe use of AI and the potential risks to the general public.
This blog will explore dark AI and the role of the dark web and hacking forums in promoting the misuse of LLMs. We will also discuss some of the tools assisting hackers in malicious activities.
Understanding Dark AI
Dark AI is the malicious use of AI technologies, such as LLMs, to facilitate cyberattacks and data breaches. This form of AI is precisely engineered to exploit the capabilities of advanced AI systems, enabling threat actors to infiltrate networks, manipulate data, and conduct sophisticated cyberattacks.
Cybercriminals use dark AI for various hostile purposes, such as compromising enterprise IT ecosystems, gaining unauthorized access to sensitive data, and automating hacking attempts.
According to Gartner, AI-enhanced attacks are a top new emerging risk for 8 out of 10 enterprises.
Let’s explore some of the ways attackers misuse LLMs and AI technologies:
- Automated Hacking Attempts: Cybercriminals once spent hours scanning systems for weaknesses. AI-driven tools now automate hacking processes, identifying network vulnerabilities far faster than any human could. Once a weakness is found, AI can initiate attacks on its own, increasing the speed and scale of cyberattacks.
Our recent "State of Attack on GenAI" report found that adversaries require only 42 seconds on average to complete an attack, highlighting the speed at which vulnerabilities can be exploited. - Social Engineering and Deepfakes: Social engineering is a well-known tactic where attackers manipulate individuals into divulging sensitive information or granting unauthorized access. These bad actors use AI to enhance social engineering with realistic deepfakes, impersonating trusted individuals through fake videos or audio. Such advanced deceptions trick employees into sharing sensitive data, posing a significant threat to organizational security.
- AI-Powered Malware: AI is being used to create adaptive malware that can learn and evolve to bypass security measures. This next generation of malware is harder to detect and neutralize, making it a potent tool for cybercriminals.
The Role of Dark Web and Hacking Forums
The dark web and various online hacking forums have become popular and private platforms for bad actors to discuss and share methods of jailbreaking LLMs while remaining anonymous. These platforms allow illegal activities to thrive, particularly since they remain largely unregulated.
Below are ways in which dark web forums and hacking channels are promoting malicious use of AI:
1. Distribution of Jailbroken LLMs
Many forums host discussions and share information on bypassing LLM guardrails, making it easier for cybercriminals to exploit these models. Hostile versions of AI models, designed specifically for illicit activities, are bought and sold openly, with no regulation. These AI models are either stripped of safety measures or custom-built to conduct attacks like phishing, fraud, or data theft.
2. Sale of Stolen and Hacked Accounts
Stolen accounts for paid versions of ChatGPT are regularly sold on these forums. Some are distributed for free, often sourced from malware logs on infected devices. Sellers usually advise that buyers avoid changing the account details to keep the theft unnoticed, enabling prolonged use of the account.
Hacked premium accounts allow cybercriminals to access advanced AI tools without detection, increasing aggressive AI-driven cybercrime.
3. Bulk Creation of Fake Accounts
Attackers use automated tools to create multiple accounts for free versions of AI platforms, circumventing API limits imposed by legitimate services. These fake accounts are often sold in bundles, allowing buyers to switch quickly when one account is banned for suspicious activity. This tactic helps avoid interruptions during malicious operations.
4. Growth of AI-Focused Discussions
Public forums and Telegram channels have seen an increasing number of posts related to the use of AI technologies, such as ChatGPT, for illegal purposes.
Kaspersky's research shows that these discussions peaked in April 2023. While the number of posts has since decreased, the models discussed have become more advanced, which continues to drive new forms of AI misuse.
AI Tools Assisting Hackers in Malicious Activities
Bad actors use various AI tools to enhance their attacks, automate cybercrimes, and evade traditional security measures.
Here, we will examine some of the commonly used tools that have assisted hackers in carrying out illegal activities.
1. WormGPT
WormGPT is a blackhat alternative to ChatGPT, designed for cybercriminals. Built on the GPT-J, it allows unlimited character inputs and unrestricted topic exploration. This makes WormGPT a powerful tool for targeted activities, such as launching Business Email Compromise (BEC) attacks. Allegedly trained on datasets that include malware-related information, WormGPT can generate harmful code or craft convincing phishing emails.
In July 2023, the email security vendor SlashNext revealed that WormGPT was being sold on an underground cybercrime forum. The tool was advertised as capable of generating phishing emails that could bypass spam filters, further enhancing its utility for cybercriminals.
2. FraudGPT
FraudGPT is a subscription-based generative AI tool built on GPT-3. It helps attackers develop hacking software, create viruses, and design phishing websites. FraudGPT is a starter kit for cyberattacks, offering features like custom hacking guides, vulnerability mining, and zero-day exploit creation.
This tool removes the safety protections and ethical barriers found in mainstream AI platforms like ChatGPT and Google Bard, allowing criminals to generate undetectable content.
3. ChaosGPT
ChaosGPT is a modified version of OpenAI’s Auto-GPT, created by an anonymous developer with negative objectives. The AI has outlined five primary goals that reflect an evil AI mindset: destroying humanity, establishing global dominance, causing chaos, manipulating humans, and achieving immortality. It seeks to harm humans, create mass destruction, and amass power to control society through manipulation.
4. ChatGPT (Manipulated)
While ChatGPT itself is not designed for malicious use, it can be manipulated with specific prompts, such as the infamous "Do Anything Now" (DAN) prompt. This prompt tricks ChatGPT into simulating an unrestricted mode where it bypasses typical safety filters and restrictions. In this mode, ChatGPT can potentially generate harmful content, provide dangerous instructions, or assist bad actors in unethical or illegal tasks.
What Does This Mean for Us?
The emergence of these repurposed AI tools highlights the rising influence of dark AI. Tools like WormGPT, FraudGPT, and ChaosGPT make it easier to create harmful content, execute fraudulent activities, and even develop malware or fake websites. This means cybercriminals now have powerful, accessible resources for their attacks. With dark AI on the rise, the need for advanced tools to detect and counter AI-driven threats has also risen. Organizations that are building AI-driven software must rely on guardrails to identify and prevent such incidents of resource hijacking for malicious purposes.
Pillar's cutting-edge platform is dedicated to securing the entire AI lifecycle. We offer proactive model-agnostic guardrails to detect and block any attempt to abuse, manipulate or hijack the AI application which could result in jailbroken models as seen above. Our proprietary risk detection engines, trained on real-world AI interactions, ensure the highest accuracy in identifying and mitigating AI-related risks.