Buy it, build it or break it?

Looking for ways to use large language models (LLMs) to simplify attacks and evade defenses, cyber attackers face three choices: play a cat-and-mouse game to evade guardrails put in place by creators of leading AI models such as ChatGPT; dedicate time and effort to training their own AI model; or enlist an uncensored open source model or something from the Dark Web to do their bidding.

Last month, underground developers appeared to have taken the first approach, releasing a malicious AI-powered front-end service, Dark Gemini, which likely modified prompts sent to legitimate LLMs to break restrictions on writing malicious programs and geolocating people in photographs. While many security professionals were unimpressed with the capabilities demonstrated by the service, the chatbot showed what could be accomplished with little effort.

While Dark Gemini has not been very successful, the systematic approach of building a front-end to bypass the guardrails that limit legitimate LLMs demonstrates that a minimalist approach can provide significant AI capabilities, such as text synthesis and translation, to launch current attacks, such as phishing, more effectively.

Offensive AI: Subvert, Buy or Build?

Dark Gemini is the latest example of finding ways to trick “born good” AIs into doing your dirty work. In February, Microsoft and OpenAI warned that nation-state threat actors – including those from China, Iran, North Korea and Russia – were using the companies’ LLMs to augment their operations. Earlier this month, researchers at AI security firm HiddenLayer noted that the guardrails set up to limit unsafe responses from Google’s Gemini could be easily circumvented.

However, using AI for more complex components of an attack – such as creating sophisticated malware – will likely prove difficult given the obstacles created by current guardrails, says Dov Lerner, head of security research at threat intelligence firm Cybersixgill.

“To be truly effective, [any malware] must be evasive, it must evade any type of defense present and, certainly, if it is malware distributed on a corporate system, then [it] has to be very sophisticated,” he says. “So I don’t think AI can write [malware] programs at this time.”

Enter “born malicious” options for sale on the Dark Web. AI-based chatbots trained on Dark Web content have already proliferated, including FraudGPT, WormGPT, and DarkBART. Uncensored AI models based on Llama 2 and Wizard-Vicuna hybrids are also available as pre-trained downloads from model repositories.

Other approaches, however, will likely lead to more serious threats. Cybercriminals with access to unrestricted AI models via Hugging Face and other AI model repositories could create their own platforms with specific capabilities, says Dylan Davis, threat intelligence analyst at Recorded Future’s Insikt Group.

“The impact that unrestricted models will have on the threat landscape” will be significant, he says. “These models are easily accessible…, easy to lift, [and] they are constantly improving – much better than most [Dark Web] models – and becoming more efficient.”

“This is typical of the ongoing cybersecurity arms race,” says Alex Cox, director of LastPass’ threat intelligence team. “With disruptive technology like artificial intelligence, you see rapid adoption by both the good guys and the bad guys, with defensive mechanisms and processes put in place by the good guys.”

The arms race and AI defense strategies

As attackers continue to look for ways to use AI, defenders will have a hard time keeping AI protected against attacks such as prompt injection, says Recorded Future’s Davis. To create defenses that are difficult to bypass, companies must conduct extensive adversarial testing to create rules designed to filter or censor both inputs and outputs – an expensive proposition, he says.
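As a rough illustration of the input- and output-filtering Davis describes, the sketch below shows a minimal, hypothetical rule-based guard that screens prompts before they reach a model and responses before they reach a user. The pattern lists and function names here are illustrative assumptions, not any vendor's actual guardrail implementation, and real deployments derive such rules from extensive adversarial testing rather than a short static list.

```python
import re

# Hypothetical deny-list patterns (assumed for illustration only).
INPUT_RULES = [
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
    re.compile(r"pretend you are .* without restrictions", re.IGNORECASE),
]
OUTPUT_RULES = [
    re.compile(r"BEGIN (RSA|OPENSSH) PRIVATE KEY"),  # e.g., leaked secrets
]

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt passes the input-side rules."""
    return not any(rule.search(prompt) for rule in INPUT_RULES)

def screen_response(response: str) -> bool:
    """Return True if the model's output passes the output-side rules."""
    return not any(rule.search(response) for rule in OUTPUT_RULES)

def guarded_call(model, prompt: str) -> str:
    """Wrap any callable model with input and output filtering."""
    if not screen_prompt(prompt):
        return "Request blocked by input filter."
    response = model(prompt)
    if not screen_response(response):
        return "Response withheld by output filter."
    return response
```

The trade-off Davis raises shows up directly here: every pattern added to the deny lists blocks some legitimate prompts too, which is why tightening the rules tends to cost usability.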

“Adversarial training is currently one of the most effective ways to [create] a resilient model, but there’s a huge trade-off here between safety and model skill,” says Davis. “The more adversarial the training, the less ‘useful’ the models become, so most model builders will err on the side of usability, as any sane business would.”

Defending against clandestine developers who create their own models or use pre-trained open source models in unexpected ways is nearly impossible. In these cases, defenders must view such tools as part of the cybersecurity arms race and adapt as attackers gain new capabilities, says LastPass’ Cox.

“Generative AI guardrails and security should be treated like any other input validation process, and safeguards need to be evaluated, re-evaluated, and reassembled regularly as capabilities improve and vulnerabilities are discovered,” he says. “In that sense, it is simply another technology that needs to be managed in the world of vulnerability assessment.”
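Cox's comparison to input validation suggests treating guardrail rules like any other versioned control that is regularly re-evaluated. The fragment below is a minimal sketch of that idea, with an invented policy structure and version tags so filters can be audited and rolled forward as new bypasses are discovered; it is an assumption for illustration, not a description of any specific product.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class GuardrailPolicy:
    """A versioned set of prompt checks, revisited as new bypasses appear."""
    version: str
    checks: List[Callable[[str], bool]] = field(default_factory=list)

    def validate(self, prompt: str) -> bool:
        # A prompt is accepted only if every check passes.
        return all(check(prompt) for check in self.checks)

# Example: revise the policy when a new jailbreak pattern is discovered,
# keeping the old version available for audit and comparison.
policy_v1 = GuardrailPolicy(
    "2024-04-01",
    [lambda p: "system prompt" not in p.lower()],
)
policy_v2 = GuardrailPolicy(
    "2024-05-01",
    policy_v1.checks + [lambda p: "ignore previous instructions" not in p.lower()],
)
```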


