Knowledge Bases for Amazon Bedrock now supports custom prompts for the RetrieveAndGenerate API and setting the maximum number of results retrieved.

With Knowledge Bases for Amazon Bedrock, you can securely connect foundation models (FMs) in Amazon Bedrock to your enterprise data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without retraining the FMs.

In this post, we discuss two new Knowledge Bases for Amazon Bedrock features specific to the RetrieveAndGenerate API: configuring the maximum number of results and creating custom prompts with a knowledge base prompt template. You can now choose these as query options alongside the search type.

Overview and benefits of new features

The maximum number of results option gives you control over the number of search results that are retrieved from the vector store and passed to the FM to generate the response. This lets you adjust the amount of background information provided for generation, giving more context for complex questions or less for simpler ones. You can retrieve up to 100 results. This option helps improve the likelihood that relevant context is retrieved, which improves accuracy and reduces hallucination in the generated response.

The custom knowledge base prompt template lets you replace the default prompt template with your own to customize the prompt that is sent to the model for response generation. This lets you customize the tone, output format, and behavior of the FM when it answers a user's question. With this option, you can adjust the terminology to better match your industry or domain (such as healthcare or legal). Additionally, you can add custom instructions and examples tailored to your specific workflows.

In the following sections, we explain how you can use these features with the AWS Management Console or SDK.

Prerequisites

To follow these examples, you need to have an existing knowledge base. For instructions on creating one, see Create a knowledge base.

Configure the maximum number of results using the console

To use the maximum number of results option using the console, complete the following steps:

In the Amazon Bedrock console, choose Knowledge bases in the left navigation pane.
Select the knowledge base you created.
Choose Test knowledge base.
Choose the settings icon.
Sync the data source before you start testing your knowledge base.
Under Settings, for Search type, select a search type based on your use case.

For this post, we use hybrid search because it combines semantic and text search to provide greater accuracy. For more information about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.

Expand Maximum number of source chunks and set your maximum number of results.
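The two console options above (search type and maximum number of results) correspond to fields of the retrieval configuration passed through the SDK. The following is a minimal sketch, not the service's own validation: the helper name build_retrieval_configuration is hypothetical, the lower bound of 1 is an assumption, and the upper bound of 100 comes from the limit stated in this post.

```python
from typing import Any, Dict

MAX_RESULTS_LIMIT = 100  # upper bound stated in this post
VALID_SEARCH_TYPES = {"HYBRID", "SEMANTIC"}


def build_retrieval_configuration(number_of_results: int,
                                  search_type: str = "HYBRID") -> Dict[str, Any]:
    """Return a retrievalConfiguration dict for retrieve_and_generate.

    Hypothetical helper: it checks the values locally before any API call.
    The lower bound of 1 is an assumption made for this sketch.
    """
    if not 1 <= number_of_results <= MAX_RESULTS_LIMIT:
        raise ValueError(
            f"number_of_results must be between 1 and {MAX_RESULTS_LIMIT}"
        )
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"search_type must be one of {sorted(VALID_SEARCH_TYPES)}")
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": number_of_results,
            "overrideSearchType": search_type,
        }
    }
```

The returned dictionary can be supplied as the 'retrievalConfiguration' value inside 'knowledgeBaseConfiguration' when calling the API.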

To demonstrate the value of the new feature, we show examples of how it can increase the accuracy of the generated response. We use Amazon's 10-K document for 2023 as the data source to create the knowledge base. We use the following query to experiment: “In what year did Amazon's annual revenue increase from $245 billion to $434 billion?”

The correct answer to this query is “Amazon's annual revenue increased from $245 billion in 2019 to $434 billion in 2022,” according to the knowledge base documents. We use Claude v2 as the FM to generate the final answer based on the contextual information retrieved from the knowledge base. Claude 3 Sonnet and Claude 3 Haiku are also supported as generation FMs.

We run the query with different configurations to compare retrieval behavior. First, we use the same input query (“In what year did Amazon's annual revenue increase from $245 billion to $434 billion?”) and set the maximum number of results to 5.

As shown in the screenshot below, the response generated was “Sorry, I can't help you with this request.”

Next, we set the maximum number of results to 12 and ask the same question. The generated answer is “Amazon's annual revenue increased from $245 billion in 2019 to $434 billion in 2022.”

As this example shows, whether the correct answer is generated depends on the number of results retrieved. If you want more information about the source attribution behind the final output, choose Show source details to validate the generated response against the knowledge base.

Customize a knowledge base prompt template using the console

You can also replace the default prompt with your own, depending on the use case. To do this in the console, complete the following steps:

Repeat the steps from the previous section to start testing your knowledge base.
Enable Generate responses.
Select the model of your choice for response generation.

We use the Claude v2 model as an example in this post. The Claude 3 Sonnet and Claude 3 Haiku models are also available for generation.

Choose Apply to proceed.

After you choose the model, a new section called Knowledge base prompt template appears under Settings.

Choose Edit to start customizing the prompt.
Adjust the prompt template to customize how you want to use the retrieved results and generate content.

For this post, we provide some examples for creating a “financial advisor AI system” using Amazon financial reports with custom prompts. For best practices on prompt engineering, see Prompt engineering guidelines.

Now we customize the default prompt template in a few different ways and observe the responses.

Let's first try a query with the default prompt. We ask “What was Amazon's revenue in 2019 and 2021?” Below are our results.

From the output, we find that the model generates a free-form response based on the retrieved knowledge. Citations are also listed for reference.

Let's say we want to give additional instructions on how to format the generated response, such as standardizing it as JSON. We can add these instructions as a separate step after retrieving the information, as part of the prompt template:

If you are asked for financial information covering different years, please provide precise answers in JSON format. Use the year as the key and the concise answer as the value. For example: {year:answer}

The final answer has the required structure.
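When the model is instructed to answer in JSON, downstream code usually needs to turn the answer text back into a structure. The following is a minimal sketch under the assumption that the answer contains exactly one JSON object, possibly surrounded by explanatory text. The function name extract_json_answer is hypothetical, and the values in the example string are placeholders, not actual model output.

```python
import json
from typing import Any, Dict


def extract_json_answer(text: str) -> Dict[str, Any]:
    """Parse the span from the first '{' to the last '}' as a JSON object.

    Raises ValueError if the text has no such span or the span is not valid JSON.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in the response text")
    # json.JSONDecodeError is a subclass of ValueError, so callers can catch one type.
    return json.loads(text[start:end + 1])


# Placeholder answer text; a real response would contain the model's values.
example = 'Based on the search results: {"2019": "<answer for 2019>", "2021": "<answer for 2021>"}'
parsed = extract_json_answer(example)
print(sorted(parsed))
```

Because the model may still wrap the object in prose or deviate from strict JSON, the caller should be prepared to handle ValueError.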

When customizing the prompt, you can also change the language of the generated response. In the following example, we tell the model to provide a response in Spanish.

After we remove $output_format_instructions$ from the default prompt, the citation is removed from the generated response.
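Before editing a template, it can help to see which $...$ placeholders it contains and to remove one programmatically. The following is a small sketch using only string operations. The helper names list_placeholders and remove_placeholder are hypothetical, and the shortened template is an illustration rather than the full default prompt.

```python
import re
from typing import List

# Matches placeholders of the form $name$, such as $search_results$.
PLACEHOLDER_PATTERN = re.compile(r"\$(\w+)\$")


def list_placeholders(template: str) -> List[str]:
    """Return placeholder names in the order they appear in the template."""
    return PLACEHOLDER_PATTERN.findall(template)


def remove_placeholder(template: str, name: str) -> str:
    """Return the template with every occurrence of $name$ removed."""
    return template.replace(f"${name}$", "")


# Shortened stand-in for a prompt template (illustrative only).
template = (
    "<context>\n$search_results$\n</context>\n"
    "<question>\n$query$\n</question>\n"
    "$output_format_instructions$\n"
    "Assistant:"
)
trimmed = remove_placeholder(template, "output_format_instructions")
print(list_placeholders(template))
print(list_placeholders(trimmed))
```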

In the following sections, we explain how you can use these features with the SDK.

Set the maximum number of results using the SDK

To change the maximum number of results with the SDK, use the following syntax. For this example, the query is “In what year did Amazon's annual revenue increase from $245 billion to $434 billion?” The correct answer is “Amazon's annual revenue increased from $245 billion in 2019 to $434 billion in 2022.”

import boto3

# Runtime client used to query knowledge bases.
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')

def retrieveAndGenerate(query, kbId, numberOfResults, model_id, region_id):
    # Build the foundation model ARN from the Region and model ID.
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC",  # optional
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

# numberOfResults, model_id, and region_id must be defined before this call.
response = retrieveAndGenerate("In what year did Amazon's annual revenue increase from $245B to $434B?",
                               "<knowledge base id>", numberOfResults, model_id, region_id)['output']['text']

The 'numberOfResults' option under 'retrievalConfiguration' lets you select the number of results you want to retrieve. The output of the RetrieveAndGenerate API includes the generated response, the source attribution, and the retrieved text chunks.

The following are the results for different values of the 'numberOfResults' parameter. First, we set numberOfResults = 5.
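To inspect the source attribution and retrieved text programmatically, you can walk the response dictionary. The following is a sketch that assumes the response contains an 'output' entry with the generated text and a 'citations' list whose items hold 'retrievedReferences' with 'content' and an S3 'location'. The helper name summarize_response is hypothetical, and every field is read defensively in case it is absent.

```python
from typing import Any, Dict, List


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the generated answer and the retrieved chunks with their source URIs.

    The field names reflect the assumed response shape described above.
    """
    answer = response.get("output", {}).get("text", "")
    sources: List[Dict[str, str]] = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            sources.append({
                "uri": ref.get("location", {}).get("s3Location", {}).get("uri", ""),
                "chunk": ref.get("content", {}).get("text", ""),
            })
    return {"answer": answer, "sources": sources}


# Placeholder response used only to illustrate the shape; not real API output.
sample_response = {
    "output": {"text": "<generated answer>"},
    "citations": [
        {
            "retrievedReferences": [
                {
                    "content": {"text": "<retrieved chunk>"},
                    "location": {"type": "S3", "s3Location": {"uri": "s3://<bucket>/<key>"}},
                }
            ]
        }
    ],
}
summary = summarize_response(sample_response)
print(summary["answer"], len(summary["sources"]))
```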

Then we set numberOfResults = 12.
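To make this comparison repeatable, the same query can be run against several values in a loop. The following is a minimal sketch in which the API call is passed in as a function so the loop itself stays independent of AWS. The name compare_result_counts and the ask parameter are hypothetical.

```python
from typing import Callable, Dict, Iterable


def compare_result_counts(ask: Callable[[str, int], str],
                          query: str,
                          counts: Iterable[int]) -> Dict[int, str]:
    """Run the same query once per numberOfResults value and collect the answers."""
    return {n: ask(query, n) for n in counts}


# Example wiring, assuming retrieveAndGenerate, model_id, and region_id exist:
# ask = lambda q, n: retrieveAndGenerate(
#     q, "<knowledge base id>", n, model_id, region_id)['output']['text']
# answers = compare_result_counts(ask, "<your query>", [5, 12])
```

Passing the call in as a parameter also makes it straightforward to substitute a stub when testing without access to a knowledge base.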

Customize the knowledge base prompt template using the SDK

To customize the prompt using the SDK, we use the following query with different prompt templates. For this example, the query is “What was Amazon's revenue in 2019 and 2021?”

The following is the default prompt template:

"""You are a question answering agent. I will provide you with a set of search results and a user's question, your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.
Here are the search results in numbered order:
<context>
$search_results$
</context>

Here is the user's question:
<question>
$query$
</question>

$output_format_instructions$

Assistant:
"""

The following is the custom prompt template:

"""Human: You are a question answering agent. I will provide you with a set of search results and a user's question, your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.

Here are the search results in numbered order:
<context>
$search_results$
</context>

Here is the user's question:
<question>
$query$
</question>

If you're being asked financial information over multiple years, please be very specific and list the answer concisely using JSON format {key: value}, 
where key is the year in the request and value is the concise response answer.
Assistant:
"""

# Assumes bedrock_agent_runtime = boto3.client('bedrock-agent-runtime').
def retrieveAndGenerate(query, kbId, numberOfResults, promptTemplate, model_id, region_id):
    # Build the foundation model ARN from the Region and model ID.
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC",  # optional
                    }
                },
                # Replace the default prompt with the custom template.
                'generationConfiguration': {
                    'promptTemplate': {
                        'textPromptTemplate': promptTemplate
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

# numberOfResults, promptTemplate, model_id, and region_id must be defined before this call.
response = retrieveAndGenerate("What was Amazon's revenue in 2019 and 2021?",
                               "<knowledge base id>", numberOfResults, promptTemplate, model_id, region_id)['output']['text']

With the default prompt template, we get the following response:

If you want to provide additional instructions on the output format of the generated response, such as standardizing the response to a specific format (such as JSON), you can customize the existing prompt by providing more guidance. With our custom prompt template, we get the following response.

The 'promptTemplate' option in 'generationConfiguration' lets you customize the prompt for better control over response generation.
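One way to derive a custom template from an existing one is to insert the extra guidance just before the trailing "Assistant:" line, which is where the custom template in this post places its JSON instructions. The following is a sketch of that idea. The helper names add_instructions and build_generation_configuration are hypothetical, and the short base template is a stand-in for the full prompt.

```python
from typing import Any, Dict


def add_instructions(template: str, instructions: str, marker: str = "Assistant:") -> str:
    """Insert extra instructions just before the last occurrence of marker.

    If the marker is missing, the instructions are appended at the end.
    """
    extra = instructions.strip()
    index = template.rfind(marker)
    if index == -1:
        return template.rstrip() + "\n\n" + extra + "\n"
    return template[:index].rstrip() + "\n\n" + extra + "\n" + template[index:]


def build_generation_configuration(template: str) -> Dict[str, Any]:
    """Wrap a template in the generationConfiguration structure used in this post."""
    return {"promptTemplate": {"textPromptTemplate": template}}


# Shortened stand-in for a prompt template (illustrative only).
base = "<question>\n$query$\n</question>\n\nAssistant:\n"
custom = add_instructions(base, "Please list the answer concisely using JSON format {key: value}.")
config = build_generation_configuration(custom)
print(custom)
```

The resulting dictionary can be passed as the 'generationConfiguration' value alongside 'retrievalConfiguration'.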

Conclusion

In this post, we introduced two new features in Knowledge Bases for Amazon Bedrock: adjusting the maximum number of search results and customizing the default prompt template for the RetrieveAndGenerate API. We demonstrated how to configure these features in the console and through the SDK to improve the performance and accuracy of the generated response. Increasing the maximum number of results provides more complete information, while customizing the prompt template lets you adjust the instructions so that the FM better aligns with specific use cases. These enhancements offer greater flexibility and control, allowing you to deliver customized experiences for RAG-based applications.

For additional resources to get started deploying to your AWS environment, see the following:

About the authors

Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, helping businesses innovate with generative AI. He specializes in generative AI, artificial intelligence, machine learning, and systems design. He is passionate about developing next-generation AI/ML-based solutions to solve complex business problems for various industries, optimizing efficiency and scalability.

Suyin Wang is a solutions architect specializing in AI/ML at AWS. He has an interdisciplinary background in machine learning, financial information services, and economics, along with years of experience building data science and machine learning applications that solved real-world business problems. He enjoys helping clients identify the right business questions and build the right AI/ML solutions. In his free time he loves to sing and cook.

Sherry Ding is a Senior Solutions Architect specializing in artificial intelligence (AI) and machine learning (ML) at Amazon Web Services (AWS). He has extensive experience in machine learning with a PhD in computer science. He primarily works with public sector clients on various AI/ML-related business challenges, helping them accelerate their machine learning journey in the AWS Cloud. When he is not helping customers, he enjoys outdoor activities.

Technical Terrence Team
