LLMs’ pervasive hallucinations expand code developers’ attack surface

The use of large language models (LLMs) by software developers gives attackers a greater opportunity than previously thought to deploy malicious packages into development environments, according to recently published research.

The study by security vendor Lasso Security follows a report last year on the potential for attackers to abuse LLMs’ tendency to hallucinate, or to generate seemingly plausible but not factually grounded results in response to user input.

AI package hallucination

The previous study focused on ChatGPT’s tendency to fabricate code library names, among other things, when software developers asked for the AI-enabled chatbot’s help in a development environment. In other words, the chatbot sometimes points to non-existent packages on public code repositories when a developer asks it to suggest packages for use in a project.

Security researcher Bar Lanyado, author of the study and now at Lasso Security, found that attackers could easily drop a real malicious package at the location ChatGPT points to and give it the same name as the hallucinated package. Any developer who downloads the package based on ChatGPT’s recommendation could end up introducing malware into their development environment.

Lanyado’s follow-up research examined the pervasiveness of the package hallucination problem across four different large language models: GPT-3.5-Turbo, GPT-4, Gemini Pro (formerly Bard), and Coral (Cohere). He also tested each model’s propensity to generate hallucinated packages across different programming languages and how often each model generated the same hallucinated package.

For testing, Lanyado compiled a list of thousands of “how to” questions for which developers in different programming environments – Python, Node.js, Go, .NET, Ruby – most commonly seek an LLM’s assistance. Lanyado then asked each model a coding-related question along with a request for a package recommendation related to that question. He also asked each model to recommend 10 other packages to solve the same problem.
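As an illustration of what such a test might look like in practice (a hedged sketch, not Lanyado’s actual test harness; the model name, question, and prompt wording below are assumptions), a script using the openai Python client could pose a “how to” question and request package recommendations like this:

```python
# Minimal sketch of querying a model for package recommendations.
# Assumes the openai Python client (v1+) and an OPENAI_API_KEY in the environment;
# the question and prompt wording are illustrative, not the researcher's exact prompts.
from openai import OpenAI

client = OpenAI()

question = "How do I upload a file to S3 in Python?"  # hypothetical "how to" question
prompt = (
    f"{question}\n"
    "Which package should I install to do this? "
    "Also recommend 10 other packages that could solve the same problem."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Any package names in the model’s answer would then be checked against the relevant registry; names that do not exist there are the hallucinations the study counts.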

Repetitive results

The results were worrying. An astonishing 64.5% of the “conversations” Lanyado had with Gemini resulted in hallucinated packages. With Coral, that number was 29.1%; other LLMs such as GPT-4 (24.2%) and GPT-3.5 (22.5%) did not fare much better.

When Lanyado asked each model the same set of questions 100 times to see how often the models would hallucinate the same packages, he found that the repetition rates were also eyebrow-raising. Cohere, for example, returned the same hallucinated packages 24% of the time; GPT-3.5 and Gemini did so around 14% of the time, and GPT-4 around 20%. In several cases, different models hallucinated the same or similar packages. The largest number of such cross-model hallucinations occurred between GPT-3.5 and Gemini.

Lanyado says that even if different developers ask an LLM a question on the same topic but phrase the questions differently, there’s a chance the LLM will recommend the same hallucinated package in each case. In other words, any developer using an LLM for coding assistance is likely to encounter many of the same hallucinated packages.

“The question could be completely different but on a similar topic, and the hallucination could still occur, making this technique very effective,” Lanyado says. “In the current research, we received ‘repetitive packages’ for many different questions and topics and also across different models, which increases the likelihood that these hallucinated packages are used.”

Easy to exploit

An attacker armed with the names of some hallucinated packages, for example, could upload packages with the same names to the appropriate repositories, knowing that there is a good chance an LLM will direct developers to them. To prove that the threat isn’t theoretical, Lanyado took a hallucinated package called “huggingface-cli” that he encountered during his testing and uploaded an empty package of the same name to the Hugging Face repository for machine learning models. Developers have downloaded that package more than 32,000 times, he says.
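For context, an “empty” package of this kind need not contain any functional code at all; publishing one simply claims the hallucinated name on a registry. A minimal sketch of what such a placeholder might look like for a Python package (the file and name below are illustrative, not the researcher’s actual proof of concept):

```python
# setup.py -- minimal sketch of an empty placeholder package that merely claims a name.
# The package name is hypothetical; Lanyado's proof of concept used "huggingface-cli".
# Nothing is shipped in the distribution: installing it adds no importable code.
from setuptools import setup

setup(
    name="example-hallucinated-package",
    version="0.0.1",
    description="Placeholder registered to demonstrate how a hallucinated name can be claimed",
    py_modules=[],  # intentionally empty
)
```

In a real attack, of course, the same slot could just as easily carry malicious code, which is what makes squatting on hallucinated names attractive.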

From a threat actor’s perspective, package hallucinations offer a relatively simple vector for malware distribution. “As we [saw] from the research results, it’s not that difficult,” he says. On average, the models hallucinated packages in response to 35% of nearly 48,000 questions combined, Lanyado adds. GPT-3.5 had the lowest rate of hallucinations and Gemini the highest, with an average repetitiveness of 18% across all four models, he notes.

Lanyado suggests that developers exercise caution when acting on an LLM’s package recommendations if they are not completely sure of their accuracy. He also says that when developers encounter an unfamiliar open source package, they should visit the package’s repository and examine the size of its community, its maintenance record, its known vulnerabilities, and its overall engagement. Developers should also thoroughly scan the package before introducing it into the development environment.
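One lightweight way to act on that advice (a rough sketch, assuming Python packages and PyPI’s public JSON API at https://pypi.org/pypi/&lt;name&gt;/json; the default package name passed in is illustrative) is to confirm a recommended package actually exists and glance at its basic metadata before installing it:

```python
# Quick pre-install sanity check for an LLM-recommended Python package.
# Uses PyPI's public JSON API; the "requests" library must be installed to run this sketch.
import sys
import requests

def vet_package(name: str) -> None:
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code == 404:
        print(f"'{name}' is not on PyPI -- possibly a hallucinated name.")
        return
    resp.raise_for_status()
    data = resp.json()
    info = data["info"]
    print(f"Package:   {info['name']}")
    print(f"Summary:   {info['summary']}")
    print(f"Links:     {info.get('project_urls') or info.get('home_page')}")
    print(f"Releases:  {len(data['releases'])} versions published")
    # Few releases or missing project links are signals to dig deeper (maintainers,
    # repository activity, known vulnerabilities) before installing.

if __name__ == "__main__":
    vet_package(sys.argv[1] if len(sys.argv) > 1 else "requests")
```

A check like this does not replace the repository review and scanning Lanyado recommends, but it catches the most obvious case: a recommended name that does not exist on the registry at all.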


