Google Open-Sources Magika – AI-Powered File Identification Tool

February 17, 2024 · Pressroom · Artificial Intelligence / Data Protection

Google has announced that it is open-sourcing Magika, an artificial intelligence (AI)-powered tool for identifying file types, to help defenders accurately detect binary and textual file types.

“Magika outperforms conventional file identification methods by providing a 30% overall increase in accuracy and up to 95% higher accuracy on traditionally difficult-to-identify, but potentially problematic, content such as VBA, JavaScript, and PowerShell,” the company said.

The software uses a “custom, highly optimized deep learning model” that enables precise identification of file types within milliseconds. Magika performs inference using the Open Neural Network Exchange (ONNX).
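Magika is distributed as a Python package and a companion command-line tool, so it is straightforward to try locally. The snippet below is a minimal sketch based on the project’s published Python API; the result field names used here (ct_label, score, mime_type) reflect the initial releases and may differ in later versions.

```python
# Minimal sketch of Magika's Python API (pip install magika).
# Result field names (ct_label, score, mime_type) follow the initial
# releases and may have changed in newer versions of the package.
from pathlib import Path

from magika import Magika

magika = Magika()

# Identify a file type from raw bytes; the ONNX model runs locally
# and returns a prediction within milliseconds.
result = magika.identify_bytes(b"function greet() { console.log('hi'); }")
print(result.output.ct_label, result.output.score)  # e.g. "javascript" 0.99

# Identify a file on disk by its content rather than its extension.
result = magika.identify_path(Path("suspicious_sample.ps1"))
print(result.output.ct_label, result.output.mime_type)
```

The same model is also exposed through the magika command-line tool (e.g., magika suspicious_sample.ps1), which makes it easy to drop into file triage scripts.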

Google said it uses Magika internally at scale to help improve user safety by routing files in Gmail, Drive, and Safe Browsing to the appropriate content and security policy scanners.

In November 2023, the tech giant unveiled RETVec (short for Resilient and Efficient Text Vectorizer), a multilingual text processing model for detecting potentially malicious content such as spam and malicious emails in Gmail.
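RETVec is likewise available as open source, packaged as a drop-in Keras layer. As a rough sketch of how it could anchor a small text classifier of the kind described above, assuming the RETVecTokenizer layer and its sequence_length argument as documented in the project’s README (both may change between releases):

```python
# Rough sketch: RETVec as the vectorization layer of a Keras text
# classifier (pip install retvec). The layer name and arguments follow
# the project's README at the time of writing and may change.
import tensorflow as tf
from tensorflow.keras import layers
from retvec.tf import RETVecTokenizer

inputs = layers.Input(shape=(1,), dtype=tf.string, name="text")
# RETVec emits per-token embeddings designed to be resilient to typos
# and character-level adversarial manipulation.
x = RETVecTokenizer(sequence_length=128)(inputs)
x = layers.Bidirectional(layers.LSTM(64))(x)
outputs = layers.Dense(1, activation="sigmoid")(x)  # e.g. spam vs. benign

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```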

Amid an ongoing debate over the risks of the rapidly evolving technology and its abuse by state-backed actors linked to Russia, China, Iran, and North Korea to bolster their hacking efforts, Google said that deploying AI at scale can strengthen digital security and “tilt the balance of cybersecurity from attackers to defenders.”

The company also highlighted the need for a balanced regulatory approach to the use and adoption of AI in order to avoid a future where attackers can innovate but defenders are held back by AI governance choices.

“AI enables security professionals and defenders to scale their work in threat detection, malware analysis, vulnerability detection, vulnerability remediation, and incident response,” noted Phil Venables and Royal Hansen of the tech giant. “AI offers the best opportunity to upend the defender’s dilemma and tilt the scales of cyberspace to give defenders a decisive advantage over attackers.”

Concerns have also been raised about generative AI models’ use of data collected from the web for training purposes, which may also include personal data.

“If you don’t know what your model will be used for, how can you ensure that its downstream use will respect data protection and people’s rights and freedoms?” the U.K. Information Commissioner’s Office (ICO) stressed last month.

Additionally, new research has shown that large language models can function as “sleeper agents” that may appear harmless but can be programmed to engage in deceptive or malicious behavior when specific criteria are met or special instructions are given.

“Such backdoor behavior can be made persistent so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it),” researchers from the startup Anthropic said in the study.

