In the era of generative AI, agents that simulate human actions and behaviors are emerging as a powerful tool for companies to build production-ready applications. Agents can interact with users, perform tasks, and make decisions in ways that mimic human intelligence. By combining agents with Amazon Titan foundation models (FMs) in the Amazon Bedrock family, customers can develop complex, multimodal applications in which the agent understands and generates natural language or images.
For example, in the fashion retail industry, an assistant powered by agents and multimodal models can provide customers with a personalized and immersive experience. The assistant can engage in natural language conversations and understand the customer's preferences and intentions. It can then use its multimodal capabilities to analyze images of clothing items and make recommendations based on the customer's input. Additionally, the agent can generate visual aids, such as outfit suggestions, improving the overall customer experience.
In this post, we implement a fashion assistant agent using Amazon Bedrock Agents and the Amazon Titan family of models. The fashion assistant provides a personalized, multimodal conversational experience. Among other things, the inpainting and outpainting capabilities of Amazon Titan Image Generator can be used to generate fashion inspirations and edit user photos. Amazon Titan Multimodal Embeddings models can be used to search a database for a style, using either a text prompt or a user-provided reference image to find similar styles. The agent uses Anthropic Claude 3 Sonnet to orchestrate its actions, for example, looking up the current weather so that it can give weather-appropriate clothing recommendations. A simple web UI built with Streamlit lets the user interact with the agent.
The fashion assistant agent can be seamlessly integrated into existing e-commerce platforms or mobile applications, providing customers with a smooth and enjoyable experience. Customers can upload their own images, describe the style they want, or even provide a reference image, and the agent will generate personalized recommendations and visual inspirations.
The code used in this solution is available at the GitHub repository.
Solution overview
The fashion assistant agent uses Amazon Titan models and Amazon Bedrock Agents to provide users with a comprehensive set of style-related functionalities:
- Image-to-image or text-to-image search – This tool allows customers to find products in the catalog that are similar to styles they like, improving their user experience. We use the Titan Multimodal Embeddings model to embed each product image and store the embeddings in Amazon OpenSearch Serverless for later retrieval.
- Text-to-image generation – If the desired style is not available in the database, this tool generates unique, personalized images based on the user's query, allowing the creation of custom styles.
- Weather API connection – By fetching weather information for a location mentioned in the user's prompt, the agent can suggest appropriate styles for the occasion, so that the customer is dressed for the weather.
- Outpainting – Users can upload an image and ask to change the background, allowing them to view their preferred styles in different settings.
- Inpainting – This tool allows users to modify specific clothing items in an uploaded image, such as changing the design or color, while keeping the background intact.
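As an illustration of how the three image tools above might map onto Amazon Titan Image Generator, the following minimal sketch builds one request body per task type. It is not taken from the solution's repository: the helper names, the default image size, and the Region are assumptions made for this example, and the model ID should be checked against the model enabled in your account.

```python
import base64
import json

# Model ID for Amazon Titan Image Generator G1 (verify it matches your account).
IMAGE_MODEL_ID = "amazon.titan-image-generator-v1"


def _encode(image_bytes):
    """Titan expects input images as base64-encoded strings."""
    return base64.b64encode(image_bytes).decode("utf-8")


def text_to_image_body(prompt, width=1024, height=1024):
    """Request body for generating a new image from a text prompt."""
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
            "cfgScale": 8.0,
        },
    }


def inpainting_body(image_bytes, mask_prompt, prompt):
    """Request body that changes only the region described by mask_prompt."""
    return {
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "image": _encode(image_bytes),
            "maskPrompt": mask_prompt,  # e.g. "the jacket"
            "text": prompt,             # what the masked region should become
        },
        "imageGenerationConfig": {"numberOfImages": 1, "cfgScale": 8.0},
    }


def outpainting_body(image_bytes, mask_prompt, prompt):
    """Request body that keeps the masked region and replaces everything else."""
    return {
        "taskType": "OUTPAINTING",
        "outPaintingParams": {
            "image": _encode(image_bytes),
            "maskPrompt": mask_prompt,  # e.g. "the person"
            "text": prompt,             # description of the new background
            "outPaintingMode": "DEFAULT",
        },
        "imageGenerationConfig": {"numberOfImages": 1, "cfgScale": 8.0},
    }


def generate_image(body, region="us-east-1"):
    """Send a request body to Amazon Bedrock and return the first image as bytes."""
    import boto3  # imported here so the builders above work without AWS access

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=IMAGE_MODEL_ID,
        body=json.dumps(body),
        accept="application/json",
        contentType="application/json",
    )
    payload = json.loads(response["body"].read())
    return base64.b64decode(payload["images"][0])
```

For example, `generate_image(outpainting_body(photo, "the person", "a sunny beach"))` would ask the model to keep the person and redraw the background, assuming `photo` holds the raw bytes of the uploaded image.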
The following flowchart illustrates the decision-making process:
The following diagram shows the corresponding architecture:
Prerequisites
To set up the fashion assistant agent, make sure you have the following:
- An active AWS account and an AWS Identity and Access Management (IAM) role with access to Amazon Bedrock, AWS Lambda, and Amazon Simple Storage Service (Amazon S3).
- The required Python libraries, such as Streamlit, installed.
- Anthropic Claude 3 Sonnet, Amazon Titan Image Generator, and Amazon Titan Multimodal Embeddings models enabled in Amazon Bedrock. You can confirm that they are enabled on the Model access page of the Amazon Bedrock console. If these models are enabled, the access status is displayed as Access granted, as shown in the following screenshot.
Before running the notebook provided in the GitHub repository to start building the infrastructure, make sure your AWS account has permission to:
- Create managed IAM roles and policies
- Create and invoke Lambda functions
- Create, read, and write to S3 buckets
- Access and manage Amazon Bedrock agents and models
If you want to enable the image-to-image or text-to-image search capabilities, your AWS account needs these additional permissions:
- Create a security policy, access policy, collection, index, and index mapping in OpenSearch Serverless
- Call BatchGetCollection in OpenSearch Serverless
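To make the search permissions above more concrete, the following sketch shows the kind of index mapping and k-NN query that an OpenSearch Serverless vector collection typically uses, together with the request body for Amazon Titan Multimodal Embeddings. It is an illustration under stated assumptions, not the repository's code: the field names `image_vector` and `image_path`, the helper names, and the Region are hypothetical, and the embedding length of 1,024 is one of the sizes the model supports.

```python
import base64
import json

# Model ID for Amazon Titan Multimodal Embeddings G1 (verify in your account).
EMBED_MODEL_ID = "amazon.titan-embed-image-v1"
EMBED_DIM = 1024                 # must match the dimension in the index mapping
VECTOR_FIELD = "image_vector"    # hypothetical field name


def embedding_body(text=None, image_bytes=None):
    """Request body for embedding a text prompt, an image, or both."""
    if text is None and image_bytes is None:
        raise ValueError("Provide text, image_bytes, or both")
    body = {"embeddingConfig": {"outputEmbeddingLength": EMBED_DIM}}
    if text is not None:
        body["inputText"] = text
    if image_bytes is not None:
        body["inputImage"] = base64.b64encode(image_bytes).decode("utf-8")
    return body


def index_mapping():
    """Index definition with a k-NN vector field plus the image location."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                VECTOR_FIELD: {"type": "knn_vector", "dimension": EMBED_DIM},
                "image_path": {"type": "keyword"},  # hypothetical metadata field
            }
        },
    }


def knn_query(vector, k=3):
    """Query that returns the k stored images closest to the given vector."""
    return {"size": k, "query": {"knn": {VECTOR_FIELD: {"vector": vector, "k": k}}}}


def embed(body, region="us-east-1"):
    """Call Amazon Bedrock and return the embedding as a list of floats."""
    import boto3  # imported here so the builders above work without AWS access

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(modelId=EMBED_MODEL_ID, body=json.dumps(body))
    return json.loads(response["body"].read())["embedding"]
```

Because the same model embeds both text and images into one vector space, a text prompt and a reference image can be searched with the same `knn_query`; only the input to `embedding_body` changes.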
Set up the fashion assistant agent
To set up the fashion assistant agent, follow these steps:
- Clone the GitHub repository using the command
- Complete the prerequisites to grant sufficient permissions
- Follow the deployment steps outlined in the README.md
- (Optional) If you want to use the image_lookup feature, run the code snippets in opensearch_ingest.ipynb, which use Amazon Titan Multimodal Embeddings to embed and store sample images
- Run the Streamlit UI to interact with the agent using the command
By following these steps, you can create a powerful and engaging fashion assistant agent that combines the capabilities of Amazon Titan models with the automation and decision-making capabilities of Amazon Bedrock Agents.
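To show how an agent can hand a task such as the weather lookup to AWS Lambda, here is a minimal sketch of a Lambda handler for an action group defined with an API schema. It is an assumption that the solution wires its tools this way; the `/weather` path, the `location` parameter, and the `get_weather` placeholder are hypothetical names chosen for this example, and a real tool would call an actual weather service.

```python
import json


def get_weather(location):
    """Hypothetical placeholder; a real tool would call a weather API here."""
    return {"location": location, "forecast": "unknown"}


def lambda_handler(event, context):
    """Route an action group request from the agent and wrap the result."""
    # Parameters arrive as a list of {"name", "type", "value"} dictionaries.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    api_path = event.get("apiPath")

    if api_path == "/weather":
        status, result = 200, get_weather(params.get("location", ""))
    else:
        status, result = 404, {"error": f"Unknown path: {api_path}"}

    # The agent expects the action group, path, and method echoed back.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "apiPath": api_path,
            "httpMethod": event.get("httpMethod"),
            "httpStatusCode": status,
            "responseBody": {"application/json": {"body": json.dumps(result)}},
        },
    }
```

The orchestrating model decides when to call the action and fills in the parameters from the conversation; the handler only has to run the tool and return the result in the expected envelope.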
Test the fashion assistant
Once the fashion assistant is set up, you can interact with it through the Streamlit user interface. Follow these steps:
- Navigate to your Streamlit UI, as shown in the following screenshot.
- Upload an image or enter a text prompt describing the desired style, depending on the action you want, for example, image search, image generation, outpainting, or inpainting. The following screenshot shows an example prompt.
- Press Enter to send the prompt to the agent. You can see the agent's chain of thought (CoT) process in the UI, as shown in the following screenshot.
- When the response is ready, it appears in the UI, as shown in the following screenshot. The response may include generated images, similar style recommendations, or modified images based on your request. You can download the generated images directly from the UI or check the image in your S3 bucket.
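The same interaction can also be driven programmatically. The following sketch, which is not the repository's Streamlit code, calls the agent through the Amazon Bedrock agent runtime and separates the final answer from the trace events that carry the agent's reasoning steps. The function names and the Region are assumptions; the agent ID and alias ID come from your own deployment.

```python
import uuid


def collect_agent_events(event_stream):
    """Split a completion stream into the answer text and the trace events."""
    answer_parts, traces = [], []
    for event in event_stream:
        if "chunk" in event:
            # Answer text arrives in one or more byte chunks.
            answer_parts.append(event["chunk"]["bytes"].decode("utf-8"))
        elif "trace" in event:
            # Trace events describe the agent's intermediate steps.
            traces.append(event["trace"])
    return "".join(answer_parts), traces


def ask_agent(agent_id, agent_alias_id, prompt, region="us-east-1"):
    """Send one prompt to the agent and return (answer, traces)."""
    import boto3  # imported here so collect_agent_events works without AWS access

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),  # reuse one ID to keep conversation context
        inputText=prompt,
        enableTrace=True,             # include the reasoning steps in the stream
    )
    return collect_agent_events(response["completion"])
```

Reusing the same `sessionId` across calls keeps the conversation history, which matters when a user refines a style over several turns.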
Clean up
To avoid unnecessary costs, be sure to remove the resources used in this solution. You can do this by running the following command.
Conclusion
The fashion assistant agent, powered by Amazon Titan models and Amazon Bedrock Agents, is an example of how retailers can create innovative applications that improve the customer experience and drive business growth. Using this solution, retailers can gain a competitive advantage by offering personalized style recommendations, visual inspirations, and interactive fashion tips to their customers.
We encourage you to explore the potential of creating more agents like this fashion assistant by checking out the examples available in the aws-samples repository on GitHub (amazon-bedrock-samples/tree/main/agents-for-bedrock).
About the authors
Akarsha Sehwag is a data scientist and machine learning engineer at AWS Professional Services with more than 5 years of experience building machine learning-based solutions. Drawing on her expertise in computer vision and deep learning, she helps customers use machine learning in the AWS Cloud efficiently. With the advent of generative AI, she has worked with numerous customers to identify good use cases and turn them into production-ready solutions.
Yanyan Zhang is a Senior Generative AI Data Scientist at Amazon Web Services, where she has been working on cutting-edge AI/ML technologies as a Generative AI Specialist, helping customers use generative AI to achieve their desired outcomes. Yanyan graduated from Texas A&M University with a PhD in Electrical Engineering. Outside of work, she loves to travel, exercise, and explore new things.
Antonia Wiebeler is a data scientist at the AWS Generative AI Innovation Center, where she enjoys creating proofs of concept for customers. Her passion is exploring how generative AI can solve real-world problems and create value for customers. When she isn't coding, she enjoys running and competing in triathlons.
Alex Newton is a data scientist at the AWS Generative AI Innovation Center, helping customers solve complex problems with generative AI and machine learning. He enjoys applying cutting-edge machine learning solutions to real-world challenges. In his free time, you will find Alex playing in a band or watching live music.
Chris Pecora is a Generative AI Data Scientist at Amazon Web Services. He is passionate about creating innovative products and solutions while focusing on customer-obsessed science. When not running experiments and keeping up with the latest advances in generative AI, he loves spending time with his children.
Maira Ladeira Tanke is a Senior Generative AI Data Scientist at AWS. With a background in machine learning, she has over 10 years of experience designing and building AI applications with customers across industries. As a technical leader, she helps customers accelerate the realization of business value through generative AI solutions on Amazon Bedrock. In her free time, Maira enjoys traveling, playing with her cat, and spending time with her family in a warm place.