Welcome to the 12th post in my This Week In AI News series. Each week I post 1 article on something I'm working on, or find valuable, for AI developers. Want an easier way to get new posts? Follow this tag: ππ½
Key Resources To Know This Week
Resource | Description |
---|---|
1οΈβ£ Blog Post | Official Announcement of New Tools |
2οΈβ£ Collection | My Responible AI For Developers Collection |
3οΈβ£ Collection | My Azure AI For Developers Collection |
π | Watch the 2-part AI Show Episode on this topic!
Click each thumbnail image to view the full replay video.
We've talked about #ResponsibleAI in two contexts before - Model Debugging for predictive AI apps (MLOps), and AI-Assisted Evaluation for generative AI (LLMOps). Then late last week, I shared this exciting announcement about new Responsible AI tools coming to Azure AI!
In today's post, I want to dig a little deeper into the announcement to learn what the tools do, why they matter, and how developers can get started using them in their generative AI application workflows.
1 | The Azure AI Platform
The Azure AI Studio provides a browser-based UI/UX for exploring the rich capabilities of the Azure AI plaform as shown below. It also supports a code-first experience for developers who prefer working with an SDK or command-line tools.
It streamlines your end-to-end development workflow for generative AI applications - from exploring the model catalog, to building & evaluating your AI application, to deploying and monitoring the application in production. With built-in support for operationalizing Responsible AI, developers can go from evaluating their applications for quality, to configuring them for content safety in production.
2 | New Azure AI Tools for Responsible AI
The recent announcement from the Responsible AI team highlights a number of new tools and capabilities that are available (or coming soon) to Azure AI, to further improve the quality and safety of generative AI application on Azure. This short video gives you a quick preview of how these tools are put to use to create safeguards for generative AI apps on Azure. In the rest of this post, we'll dive briefly into each of these tools to understand what they do, and why it matters.
Watch the 2-part series on The AI Show (links at the top of this post) for more details.
3 | Prompt Shields
The first new capability comes in the form of Prompt Shields that can detect and block prompt injection attacks to safeguard the integrity of your LLM system. These attacks work by tricking the system into harmful or unplanned behaviors, "injecting" unauthorized instructions into the default user prompt at runtime.
In a direct attack (jailbreak) the user is the adversary. The user prompt attempts to get the model to disregard developer-authored system prompts and training in favor of executing potentially harmful instructions. In an indirect attack (cross-domain prompt injection) the adversary is a third-party and the attack occurs via untrusted external data sources that may be embedded in the user prompt, but not authored by user or developer.
Prompt Shields work proactively to detect suspicious inputs in real-time and block them before they reach the LLM. This can use techniques like Spotlighting that transform the input to mitigate these attacks while preserving the semantic content of the user prompt.
π | Learn more in this post
π | Review the main announcement
4 | Groundedness Detection
The second capability involves Groundedness Detection to combat the familiar problem of "Hallucinations". Here, models fabricate a response that may look valid but is not grounded in any real data. Identifying and remediating this is critical to improve trustworthiness of generative AI responses.
Previously developer options included manual checks (not scalable) and chaining requests (to have an LLM evaluate if the previous response was grounded with respect to a reference document) with mixed results. The new tool uses a custom-built fine-tuned language model that detects ungrounded claims more accurately - giving developers multiple options to mitigate the behavior, from pre-deployment testing to post-deployment rewriting of responses.
π | Learn more in this post
π | Review the main announcement
5 | Safety System Messages
The third capability recognizes that prompt engineering is a powerful way to improve the reliability of the generative AI application, along with services like Azure AI Content Safety. Writing effective system prompts (metaprompts) can have a non-trivial impact on the quality of responses - and system messages that can "guide the optimal use of grounding data and overall behavior" are ideal.
With the new system message framework and template recommendations for LLMs, developers now get example templates and recommendations to help them craft more effective messages. For instance, the system message framework describes four concepts (define model capabilities, define model output format, provide examples, provide behavioral guardrails) you can apply in crafting the system message. The screenshot above shows an example of how this is applied in a retail chatbot app.
π | Learn more from the documentation
π | Review the main announcement
6 | Automated Safety Evaluations
The fourth capability recognizes that most developers lack the resources and expertise to conduct rigorous safety evaluations on their generative AI applications - which would involve curating high-quality datasets for testing, and interpreting evaluation results for effective mitigation.
Previously, Azure AI supported pre-built generation quality metrics like groundedness, relevance, coherence and fluency for AI-assisted evaluations. With the new capability, this now includes additional risk and safety metrics like hateful and unfair content, sexual content, violent content, self-harm-related content, and jailbreaks.
To conduct a safety evaluation on your generative AI application, you need a test dataset and some way to simulate adversarial interactions with your application so you can evaluate the resulting responses for the relevant safety metrics. The new Azure AI automated safety evaluations capability streamlines this for you in four steps:
- Start with targeted prompts (created from templates)
- Use AI-assisted simulation (for adversarial interactions)
- Create your test datasets (baseline & adversarial)
- Evaluate the test datasets (for your application) Β
Outcomes can now be used to configure or adapt other elements of the application's risk mitigation system.
π | Learn more in this post
π | Review the main announcement
7 | Risk & Safety Monitoring
The final new capability announced was around Risk & Safety Monitoring in Azure Open AI - adding a new Dashboard capability described as follows:
In addition to the detection/ mitigation on harmful content in near-real time, the risks & safety monitoring help get a better view of how the content filter mitigation works on real customer traffic and provide insights on potentially abusive end-users.
To use the feature, you need an Azure OpenAI resource in a supported region, and a model deployment with a content filter configured. Once setup, simply open the Deployments tab, visit the model deployment page, and select the Risks & Safety tab as shown in this figure from the announcement post below.
The resulting dashboard provides two kinds of insights into content filter effectiveness. The first focuses on Content Detection with visualized insights into metrics like Total blocked request count and block rate, Blocked requests by category, Severity distribution by category and more. The second focuses on Abusive User Detection to highlight how regularly the content filters safeguards are abused by end users and identify the severity and frequency of those occurrences.
π | Learn more in this post
π | Review the main announcement
8 | Get Started Exploring
This was a lot - and it is still just the tip of the iceberg when it comes to actively understanding and applying responsible AI principles in practice. Want to get started exploring this topic furthere? Bookmark and revisit these three core resources:
Resource | Description |
---|---|
1οΈβ£ Blog Post | Official Announcement of New Tools |
2οΈβ£ Collection | My Responible AI For Developers Collection |
3οΈβ£ Collection | My Azure AI For Developers Collection |
Top comments (0)