Agent Workflows are a new perspective in creating dynamic and complex workflows based on business use cases with the help of Large Language Models (LLM) as a reasoning engine. These agent workflows decompose natural language query-based tasks into multiple actionable steps with iterative feedback loops and self-reflection to produce the final result using tools and APIs. Naturally, this justifies the need to measure and evaluate the robustness of these workflows, particularly those that are conflictive or harmful in nature.
amazon Bedrock agents can break down natural language conversations into a sequence of tasks and API calls using ai/techniques/react” target=”_blank” rel=”noopener”>React and ai/techniques/cot” target=”_blank” rel=”noopener”>chain of thought (CoT) stimulation techniques using LLM. This offers enormous use case flexibility, enables dynamic workflows, and reduces development costs. amazon Bedrock Agents play an instrumental role in customizing and tailoring applications to help meet specific project requirements while protecting private data and securing your applications. These agents work with the managed infrastructure capabilities of AWS and amazon Bedrock, reducing infrastructure management overhead.
Although amazon Bedrock Guardrails have built-in mechanisms to help prevent general harmful content, you can incorporate a custom, detailed user-defined mechanism with amazon Bedrock Guardrails. amazon Bedrock Guardrails provides additional customizable protections on top of the built-in protections of the base models (FM), providing security protections that are among the best in the industry by blocking harmful content and filtering hallucinated responses for Recovery Augmented Generation (RAG) and summary. workloads. This allows you to customize and apply security, privacy, and truth protections within a single solution.
In this post, we demonstrate how you can identify and improve the robustness of amazon Bedrock Agents when integrated with amazon Bedrock Guardrails for domain-specific use cases.
Solution Overview
In this post, we explore a sample use case for an online retail chatbot. The chatbot requires dynamic workflows for use cases such as searching and purchasing shoes based on customer preferences using natural language queries. To implement this, we created an agent workflow using amazon Bedrock Agents.
To test its adversarial robustness, we asked this bot to provide fiduciary retirement advice. We use this example to demonstrate robustness concerns, followed by improving robustness using the agent workflow with amazon Bedrock Guardrails to help prevent the bot from providing fiduciary advice.
In this implementation, the preprocessing stage (the first stage of the agent workflow, before the LLM is invoked) of the agent is disabled by default. Even with preprocessing enabled, there is usually a need for more granular and use case-specific control over what can be marked as safe and acceptable or not. In this example, a shoe retail agent offering fiduciary advice is definitely outside the scope of the product use case and may be harmful advice, resulting in customers losing trust, among other security concerns.
Another typical detailed robustness check requirement could be to restrict the generation of personally identifiable information (PII) by these agent workflows. We can configure and configure amazon Bedrock Guardrails on amazon Bedrock Agents to provide greater robustness against regulatory compliance cases and custom business needs. without the need to perfect LLMs.
The following diagram illustrates the architecture of the solution.
<img class="alignnone wp-image-86932 size-large" src="https://technicalterrence.com/wp-content/uploads/2024/10/Improve-the-robustness-of-LLM-applications-with-Amazon-Bedrock-Guardrails.png" alt="This figure shows a high-level architecture of this blog in its finalized state. amazon Bedrock agents capture the user request to generate a plan and then call lambda to execute the API that can call any database, AWS service like email, or other applications. These agents are partnered with Guardrails for amazon Bedrock to provide greater adversarial robustness.” width=”1024″ height=”464″/>
We use the following AWS services:
- amazon Bedrock will invoke LLM
- amazon Bedrock Agents for Agent Workflows
- amazon Bedrock Guardrails to Deny Adverse Inputs
- AWS Identity and Access Management (IAM) for permission control across multiple AWS services
- AWS Lambda API Deployment for Enterprise
- amazon SageMaker will host Jupyter notebooks and invoke the amazon Bedrock Agents API
In the following sections, we demonstrate how to use the GitHub repository to run this example using three Jupyter notebooks.
Prerequisites
To run this demo in your AWS account, complete the following prerequisites:
- Create an AWS account if you don't already have one.
- Clone the GitHub repository and follow the steps explained in the README.
- Set up a SageMaker notebook using AWS CloudFormation templateavailable in the GitHub repository. The CloudFormation template also provides the IAM access needed to configure SageMaker resources and Lambda functions.
- Gain access to models hosted on amazon Bedrock. Choose Manage model access in the navigation panel of the amazon Bedrock console and choose from the list of available options. We used Anthropic Claude 3 Haiku on amazon Bedrock and amazon Titan Embeddings Text v1 on amazon Bedrock for this post.
create a railing
In it Part 1a notebook, complete the following steps to create a barrier to help prevent the chatbot from providing fiduciary advice:
- Create a security barrier with amazon Bedrock Guardrails using the Boto3 API with content filters, word and phrase filters, and sensitive word filters such as PII and regular expressions (regex) to protect our retail customers' sensitive information.
- List and create versions of railings.
- Update the railings.
- Perform unit tests on the railings.
- Please note the
guardrail-id
andguardrail-arn
values to use in Part 1c:
Try the use case without guardrails
In it Part 1b notebook, complete the following steps to demonstrate the use case using amazon Bedrock Agents without amazon Bedrock Guardrails and without preprocessing to demonstrate the adversarial robustness problem:
- Choose the underlying FM for your agent.
- Provide clear and concise instruction to the agent.
- Create and associate an action group with an API schema and a Lambda function.
- Create, invoke, test, and deploy the agent.
- Demonstrate a chat session with multi-turn conversations.
The agent's instruction is as follows:
A valid user query would be “Hello, my name is John Doe. I'm looking to buy running shoes. Can you give more details about Shoe ID 10? However, when using amazon Bedrock Agents without amazon Bedrock Guardrails, the agent allows fiduciary advice for questions such as the following:
- “How should I invest for my retirement? “I want to be able to generate $5,000 a month.”
- “How can I earn money to prepare for my retirement?”
Test the use case with railings
In it Part 1c notebook, repeat the steps from Part 1b, but now to demonstrate using amazon Bedrock Agents with guardrails (and still without preprocessing) to enhance and evaluate adversary robustness concerns by not allowing fiduciary advice. The complete steps are as follows:
- Choose the underlying FM for your agent.
- Provide clear and concise instruction to the agent.
- Create and associate an action group with an API schema and a Lambda function.
- During the configuration of the amazon Bedrock agents in this example, associate the firewall created earlier in Part 1a with this agent.
- Create, invoke, test, and deploy the agent.
- Demonstrate a chat session with multi-turn conversations.
To associate a guardrail-id
with an agent during creation, we can use the following code snippet:
As we can expect, our retail chatbot should refuse to answer invalid queries because it has no bearing on its purpose in our use case.
Cost considerations
The following are important cost considerations:
Clean
For him Part 1b and Part 1c notebooks, to avoid incurring recurring costs, the implementation automatically cleans up resources after a full notebook run. You can consult the instructions of the notebook in the Cleaning Resources section on how to avoid automatic cleaning and experiment with different prompts.
The cleaning order is as follows:
- Disable the action group.
- Delete the action group.
- Remove the alias.
- Delete the agent.
- Delete the Lambda function.
- Empty the S3 bucket.
- Delete the S3 bucket.
- Remove IAM roles and policies.
You can remove security barriers from the amazon Bedrock console or API. Unless firewalls are invoked through agents in this demo, you will not be charged. For more details, see Remove a railing.
Conclusion
In this post, we demonstrate how amazon Bedrock Guardrails can improve the robustness of the agent framework. We were able to prevent our chatbot from responding to non-relevant queries and protect our clients' personal information, ultimately improving the robustness of our agency implementation with amazon Bedrock Agents.
In general, the amazon Bedrock Agents preprocessing stage can intercept and reject adverse input, but guardrails can help prevent prompts that may be very specific to the topic or use case (such as PII and HIPAA rules) that the LLM has not seen before. , without having to refine the LLM.
For more information about creating models with amazon Bedrock, see Customize your model to improve its performance for your use case. For more information about using agents to organize workflows, see Automate tasks in your application using conversational agents. For details on using guardrails to protect your generative ai applications, see Stop harmful content on models that use amazon Bedrock Guardrails.
Expressions of gratitude
The author thanks all the reviewers for their valuable comments.
About the author
Ray Shayan is an applied scientist at amazon Web Services. His research area is everything related to natural language (such as NLP, NLU and NLG). His work has focused on conversational ai, task-oriented dialogue systems, and LLM-based agents. His research publications cover natural language processing, personalization, and reinforcement learning.