Two cartoon figures fighting each other
© Daniel Pudles

Technology groups have heralded the generative AI boom as an opportunity to boost the effectiveness of workers and to slash costs. But there is also growing concern about the technology being weaponised by hackers to enable devastating cyber attacks.

Powered by large language models (LLMs), generative AI enables computers to generate complex content — such as text, audio, video and music — from simple human inputs, or “prompts”. It exploded into public consciousness late last year with the launch of OpenAI’s ChatGPT, a consumer chatbot that can answer users’ questions and commands.

However, the technology has already captured the imagination of hackers. After ChatGPT went live, the number of posts on dark web forums about how to exploit the tool skyrocketed from 120 in January to 870 in February, according to a report by NordVPN — a 625 per cent rise.

Most LLMs have filters to block the input of harmful human prompts. But one area of concern surrounds what are known as prompt injections — where hackers find ways to bypass these filters, prompting the system to generate hate speech or propaganda, share confidential information, or even write malicious code.

Hundreds of these prompt injections have been reported, including hackers and researchers trying to find out if systems can write computer malware and ransomware.

Mackenzie Jackson, developer advocate at cyber security group GitGuardian, says the malware it is possible to write, so far, is fairly basic — but generative AI “is changing the ability of who can write malware”. This is creating a new generation of what he dubs “AI hackers”, who cannot themselves typically hack, but who know how to wield these off-the-shelf tools.

Eran Shimony, principal security researcher at CyberArk, found ways to make ChatGPT potentially use “these tools to automatically generate new families and types of malware”. Hackers can ask ChatGPT to “replace and mutate every part” of the malware, thereby creating “polymorphic”, or mutating, malware that evades detection systems, he explains.

Shimony says that there is a “cat and mouse” game between hackers and chatbot owners, such as ChatGPT, with the owners seeking to patch up flaws and vulnerabilities. However, he warns that “alternatively, hackers or attackers can download publicly free LLMs on their own machine, and train them on malicious code”, thereby creating more sophisticated, dangerous models that they can deploy themselves. “We think the trend will increase in the future,” Shimony adds.

At present, there are no ways to stop this, according to an August blog post from the National Cyber Security Centre. “Whilst research is ongoing into prompt injection, it may simply be an inherent issue with LLM technology,” the post said. “Research is also ongoing into possible mitigations, and there are some strategies that can make prompt injection more difficult, but as yet there are no sure-fire mitigations.”

In addition, the feedback loops used to develop open-source generative AI technology can potentially be seized upon by cyber criminals. A hacker might look to feed a model with malicious training data, and in turn manipulate what is churned out, known as data poisoning.

Further cyber security risks are created by the ability of generative AI to automate tasks at scale. For example, experts warn it could be used to write and deploy social engineering scams, such as phishing emails or romance fraud. In particular, it could target numerous victims in a more personalised way than was previously possible, scraping information from a person’s social media and online presence, or replicating their writing style.

For now, however, the threats are largely theoretical, says Sandra Joyce, vice-president at Mandiant Intelligence, part of Google Cloud. Hackers are experimenting with the technology and offering services on the dark web to help get around the filters of certain chatbot systems. But no consequential hacks have taken place.

“We haven’t responded to a single incident where AI was used in some differentiating way,” she says. Otherwise, the main activity she has seen is the use of deepfakes in information or disinformation campaigns.

Defending against future AI-powered hacks will require extra vigilance and vetting. But AI is already being used to detect malware on computers, by spotting unusual behaviour and patterns that humans would otherwise miss.

Some are now proposing it be deployed in cyber defence more widely. A recent report by Gartner calls on cyber professionals to “receive the mandate to exploit GenAI opportunities to improve security and risk management, optimise resources [and] defend against emerging attack techniques”.

Looking longer term, researchers are exploring the potential of fully automated cyber defence, whereby tools might detect new malware and instantly block it from spreading or causing harm.

“Our opportunity now is to leapfrog over them [the hackers],” suggests Joyce. “This is the first time we have a technology that has an inherent defender’s advantage.”

But Jackson warns there will be a steep learning curve, nonetheless: “The next five years is going to be a real challenge for security teams. We understand how hackers think but we don’t understand how these malicious programs think.”

Copyright The Financial Times Limited 2024. All rights reserved.
Reuse this content (opens in new window) CommentsJump to comments section

Follow the topics in this article

Comments