Last year, the Open Worldwide Application Security Project (OWASP) published multiple versions of the “OWASP Top 10 For Large Language Models,” reaching a 1.0 document in August and a 1.1 document in October. These documents reflect not only the rapidly evolving nature of large language models, but also the evolving ways in which they can be attacked and defended. In this article, we’ll look at four items in that Top 10 that are most likely to contribute to the accidental disclosure of secrets such as passwords, API keys, and more.
We already know that LLMs can reveal secrets, because it has already happened. In early 2023, GitGuardian reported finding over 10 million secrets in public GitHub commits. GitHub’s AI coding tool, Copilot, was trained on public commits, and in September 2023, researchers at the University of Hong Kong published a paper describing an algorithm that generated 900 prompts designed to trick Copilot into revealing secrets from its training data. When those prompts were used, Copilot revealed over 2,700 valid secrets.
The technique the researchers used is called “prompt injection.” It is item no. 1 in the OWASP Top 10 for LLMs, which describes it as follows:
“This manipulates a large language model (LLM) through crafted inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect injections manipulate inputs from external sources.”
You may be more familiar with the prompt injection trick revealed last year that caused ChatGPT to start spitting out training data if you asked it to repeat certain words forever.
Tip 1: Rotate your secrets
Even if you don’t think you’ve accidentally published secrets to GitHub, plenty of secrets were pushed in an early commit and then buried by a later one, so they aren’t immediately obvious without reviewing your entire commit history, not just the current state of your public repositories.
A tool from GitGuardian, called Has My Secret Leaked, lets you hash a current secret, then submit the first few characters of that hash to see whether there are any matches in the database of secrets found in their GitHub scans. A positive match does not guarantee your secret has leaked, but it indicates a real possibility that it has, so you can investigate further.
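The underlying idea is a k-anonymity-style check: the secret is hashed locally and only a short prefix of the hash ever leaves your machine. The sketch below illustrates that general pattern in Python; the endpoint URL and response shape are hypothetical placeholders, not GitGuardian’s actual API, so treat it as an illustration rather than a working client.

```python
import hashlib
import requests


def check_secret_leaked(secret: str, prefix_len: int = 5) -> bool:
    """Illustration of a k-anonymity-style leak check.

    Only a short prefix of the SHA-256 hash is sent to the service;
    the full secret never leaves this machine. The URL and response
    format below are hypothetical placeholders.
    """
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    prefix, suffix = digest[:prefix_len], digest[prefix_len:]

    # Hypothetical endpoint returning hash suffixes that share our prefix.
    resp = requests.get(f"https://example.com/v1/leaks/{prefix}", timeout=10)
    resp.raise_for_status()

    candidate_suffixes = resp.json().get("matches", [])
    return suffix in candidate_suffixes


if __name__ == "__main__":
    if check_secret_leaked("sk_live_example_do_not_use"):
        print("Possible match found -- rotate this credential and investigate.")
    else:
        print("No match in the leaked-secrets database.")
```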
The caveats to key/password rotation are that you should know where they are used, what might break when they change, and have a plan to mitigate that breakage as new secrets propagate to systems that need them. Once rotated, you need to make sure that older secrets have been disabled.
Attackers can’t use a secret that no longer works, and if your secrets that might be in an LLM have been rotated, they become nothing more than useless high-entropy strings.
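As a rough sketch of what that rotation workflow can look like in code, here is a hedged example assuming AWS Secrets Manager via boto3; the secret name is a placeholder, and the verification and revocation steps depend entirely on where your secret is actually issued.

```python
import secrets as pysecrets  # stdlib, for generating a new random value

import boto3

SECRET_ID = "prod/payments/api-key"  # placeholder name for illustration


def rotate_secret() -> str:
    """Generate a new secret, store it, and return it for propagation.

    A sketch of the rotate-then-disable workflow, assuming AWS Secrets
    Manager; the real procedure depends on where the secret is issued
    (cloud provider, SaaS dashboard, internal CA, etc.).
    """
    client = boto3.client("secretsmanager")

    # 1. Generate a replacement value (or request one from the issuing service).
    new_value = pysecrets.token_urlsafe(32)

    # 2. Store the new version so dependent systems can pick it up.
    client.put_secret_value(SecretId=SECRET_ID, SecretString=new_value)

    # 3. Outside this sketch: redeploy consumers, confirm they work with the
    #    new value, then revoke the old credential at its source so any
    #    leaked copy becomes a useless high-entropy string.
    return new_value


if __name__ == "__main__":
    rotate_secret()
```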
Tip 2: Clean up your data
Item no. 6 in the OWASP Top 10 for LLMs is “Sensitive Information Disclosure”:
“LLMs may inadvertently reveal sensitive data in their responses, leading to unauthorized data access, privacy violations, and security breaches. It is critical to implement data sanitization and strict user policies to mitigate this issue.”
While deliberately crafted prompts can cause LLMs to reveal sensitive data, they can also do so accidentally. The best way to ensure an LLM doesn’t reveal sensitive data is to ensure the LLM never knows it in the first place.
This matters even more when you are training an LLM for use by people who may not always have your best interests at heart, or by people who simply should not have access to certain information. Whether it’s your secrets or your secret sauce, only those who need access should have it… and your LLM is probably not one of those people.
Using open source tools or paid services to scan your training data for secrets BEFORE feeding that data to your LLM will help you remove them. What your LLM doesn’t know, it can’t reveal.
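As a minimal illustration of the idea, a pre-training sanitization pass might look something like the sketch below. The regex patterns are simplified examples only; real scanners such as ggshield or detect-secrets use far larger rule sets plus entropy analysis.

```python
import re
from pathlib import Path

# Simplified example patterns -- a real scanner uses hundreds of detectors
# plus entropy checks, so treat these as illustrative only.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS access key ID shape
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                      # GitHub personal access token shape
    re.compile(r"(?i)(api[_-]?key|password)\s*[:=]\s*\S+"),  # generic key=value assignments
]


def redact_secrets(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def sanitize_training_dir(src: Path, dst: Path) -> None:
    """Write a redacted copy of every text file before it reaches the LLM."""
    for path in src.rglob("*.txt"):
        cleaned = redact_secrets(path.read_text(encoding="utf-8", errors="ignore"))
        out = dst / path.relative_to(src)
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(cleaned, encoding="utf-8")


if __name__ == "__main__":
    sanitize_training_dir(Path("raw_corpus"), Path("clean_corpus"))
```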
Tip 3: Patch regularly and limit privileges
We recently saw a piece about using .env files and environment variables as a way to keep secrets available to your code, but out of your code. But what if your LLM could be asked to disclose those environment variables… or to do something worse?
This scenario combines item no. 2 (“Insecure Output Handling”) with item no. 8 (“Excessive Agency”):
- Insecure Output Handling: This vulnerability occurs when LLM output is accepted without scrutiny, exposing backend systems. Misuse can lead to severe consequences such as XSS, CSRF, SSRF, privilege escalation, or remote code execution.
- Excessive Agency: LLM-based systems may take actions that lead to unintended consequences. The problem stems from excessive functionality, permissions, or autonomy granted to LLM-based systems.
It is difficult to separate these two from each other because each can make the other worse. If an LLM can be tricked into doing something and its operating context has unnecessary privileges, the potential for arbitrary code execution to cause serious harm multiplies.
Every developer has seen the cartoon “Exploits of a Mom,” in which a boy named `Robert'); DROP TABLE Students;--` wipes out a school’s student database. Although an LLM may seem smart, it is really no smarter than a SQL database. And much like your “comedian” brother who convinces your little nephew to repeat swear words to grandma, bad inputs can produce bad results. Both should be sanitized and treated as untrusted.
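If LLM output ever ends up inside a database query, for example, treat it exactly as you would user input: bind it as a parameter instead of splicing it into the SQL string. A minimal sketch in Python using SQLite, with a hard-coded stand-in for whatever string the model returned:

```python
import sqlite3


def lookup_student(conn: sqlite3.Connection, llm_supplied_name: str):
    """Query using LLM output as data, never as SQL.

    The parameter binding below means a value like
    "Robert'); DROP TABLE Students;--" is just a strange name,
    not an executable statement.
    """
    cursor = conn.execute(
        "SELECT id, name FROM Students WHERE name = ?",
        (llm_supplied_name,),  # bound parameter, not string concatenation
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO Students (name) VALUES ('Alice')")

    # Pretend this string came back from an LLM response.
    hostile = "Robert'); DROP TABLE Students;--"
    print(lookup_student(conn, hostile))  # returns [] -- the table is untouched
```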
Additionally, you need to set guardrails around what the LLM or the app can do, keeping the principle of least privilege in mind. Essentially, the apps that use or enable the LLM, and the LLM infrastructure itself, should not have access to any data or functionality they do not absolutely need, so they cannot accidentally put it at the service of a hacker.
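One way to put that into practice is to expose only an explicit allowlist of narrowly scoped tools to the model and reject anything else it asks for. A minimal sketch, assuming your app dispatches model-requested actions by name; the tool names here are hypothetical:

```python
# Minimal sketch of a least-privilege tool dispatcher: the model can only
# invoke functions that have been explicitly allowlisted, each of which
# does one narrow, read-only thing. Tool names are hypothetical.

def get_order_status(order_id: str) -> str:
    # Read-only lookup; no write or delete capability is exposed at all.
    return f"Order {order_id}: shipped"


ALLOWED_TOOLS = {
    "get_order_status": get_order_status,
}


def dispatch(tool_name: str, **kwargs) -> str:
    """Run a model-requested action only if it is on the allowlist."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        # Anything not explicitly granted is denied -- least privilege.
        return f"Refused: '{tool_name}' is not an allowed action."
    return tool(**kwargs)


if __name__ == "__main__":
    print(dispatch("get_order_status", order_id="1234"))
    print(dispatch("drop_all_tables"))  # denied, no matter what the prompt said
```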
AI can still be considered to be in its infancy and, as with any child, should not be given the freedom to wander into any room you haven’t childproofed. LLMs can misunderstand, hallucinate, and be deliberately led astray. When that happens, good locks, good walls, and good filters should help prevent them from accessing or revealing secrets.
In summary
Large language models are an amazing tool. They are poised to revolutionize a number of professions, processes, and industries. But they are far from being a mature technology, and many are adopting them recklessly out of fear of being left behind.
As you would with any child who has developed enough mobility to get into trouble, you need to keep an eye on them and lock any cabinets you don’t want them getting into. Proceed with large language models, but proceed with caution.