Microsoft announced several new features in Azure AI Studio that the company says should help developers build generative AI apps that are more robust and resistant to malicious model manipulation and other emerging threats.
In a March 29 blog post, Sarah Bird, Microsoft’s Chief Product Officer for Responsible AI, highlighted growing concerns about threat actors using timely injection attacks making AI systems behave in dangerous and unexpected ways as a primary driver for new tools.
“Organizations too concerned about quality and reliability,” said the bird. “They want to ensure that their AI systems do not generate errors or add unsubstantiated information into application data sources, which can undermine user trust.”
Azure AI Studio is a hosted platform that organizations can use to build custom AI assistants, co-pilots, bots, search tools, and other applications, based on their data. Announced in November 2023, the platform hosts machine learning models from Microsoft as well as models from several other sources, including OpenAI. Meta, Embracing Face and Nvidia. It allows developers to quickly integrate multimodal capabilities and responsible AI capabilities into their models.
Other major players such as Amazon and Google have rushed to the market with similar offerings over the past year to capitalize on the growing interest in artificial intelligence technologies around the world. A recent study commissioned by IBM discovered this 42% of organizations with more than 1,000 employees they are already actively using AI in some capacity and many of them plan to increase and accelerate investment in the technology in the coming years. And not all they were saying it ahead of their use of AI.
Protection from timely engineering
The five new features that Microsoft has added, or will soon add, to Azure AI Studio are: Prompt Shields; earth detection; security system messages; safety assessments; and risk and safety monitoring. The features are designed to address some significant challenges that researchers have recently discovered – and continue to discover on a routine basis – regarding the use of large language models and generative artificial intelligence tools.
Quick shields for example, it is Microsoft’s mitigation for so-called quick indirect attacks and jailbreaks. The feature builds on existing mitigations in Azure AI Studio against jailbreak risk. In timely engineering attacks, adversaries use suggestions that appear harmless and not overtly malicious to try to direct an AI model towards generating malicious and undesirable responses. Timely engineering is among the most dangerous in a growing class of attacks that attempt to do so Jailbroken AI models or cause them to behave in a manner inconsistent with any filters and constraints the developers may have built into them.
Researchers recently demonstrated how adversaries can engage in well-timed engineering attacks to gain generative AI models disseminate your training datato disclose personal information, generate misinformation and potentially harmful content, such as instructions on how to wire a car.
With Prompt Shields, developers can integrate features into their models that help distinguish between valid and potentially unreliable system input; set delimiters to mark the start and end of input text, and use data markup to mark input texts. Prompt Shields is currently available in preview mode in Azure AI Content Safety and will become generally available soon, according to Microsoft.
Mitigations for model hallucinations and harmful content
With ground detection, meanwhile, Microsoft has added a feature to Azure AI Studio that it says can help developers reduce the risk of their AI models “hallucinating.” Model hallucination is a tendency for AI models to generate results that appear plausible but are completely made up and not based on, or grounded in, training data. LLM hallucinations can be extremely problematic if an organization were to treat the outcome as real and act accordingly in some way. In a software development environment, for example, LLM hallucinations could lead developers to potentially introduce vulnerable code into their applications.
What’s new in Azure AI Studio ground detection The capability is fundamentally about helping to detect, more reliably and at a larger scale, potentially ungrounded generative AI outputs. The goal is to provide developers with a way to test their AI models against what Microsoft calls grounding metrics, before implementing the model into the product. The feature also highlights potentially unsubstantiated claims in LLM outputs, so users know to verify the output before using it. Ground detection isn’t available yet, but should be in the near future, according to Microsoft.
The new system message structure offers developers a way to clearly define their model’s capabilities, profile, and limitations in their specific environment. Developers can use the feature to define the format of the output and provide examples of expected behavior, so that it becomes easier for users to detect deviations from expected behavior. It’s another new feature that isn’t available yet but should be soon.
Azure AI Studio was just announced safety assessments ability and his risk and safety monitoring they are both currently available in preview status. Organizations can use the former to assess the vulnerability of their LLM model to jailbreak attacks and unexpected content generation. The risk and security monitoring feature allows developers to detect model inputs that are problematic and could trigger hallucinatory or unexpected content, so they can implement mitigations against them.
“Generative AI can be a force multiplier for every department, company and industry,” said Microsoft’s Bird. “At the same time, fundamental patterns introduce new security challenges that require new mitigations and continuous learning.”