Discover AWS cost and usage insights with generative AI powered by Amazon Bedrock

Managing cloud costs and understanding resource usage can be a daunting task, especially for organizations with complex AWS deployments. AWS Cost and Usage Reports (AWS CUR) provide valuable insights into data, but interpreting and querying the raw data can be challenging.

In this post, we explore a solution that uses generative artificial intelligence (ai) to generate a SQL query from a user's natural language question. This solution can simplify the process of querying CUR data stored in an amazon Athena database by generating SQL queries, running the query in Athena, and rendering it in a web portal for easy understanding.

The solution uses amazon Bedrock, a fully managed service that offers a selection of high-performance base models (FMs) from leading ai companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral ai, Stability ai, and amazon through a single API, along with a broad set of capabilities to build generative ai applications with security, privacy, and responsible ai.

Challenges addressed

The following challenges can prevent organizations from effectively analyzing their CUR data, leading to potential inefficiencies, overspending, and missed cost optimization opportunities. We aim to address and simplify these challenges using generative ai with amazon Bedrock.

SQL query complexity – Writing SQL queries to extract information from CUR data can be complex, especially for non-technical users or those who are not familiar with the CUR data structure (unless you are an experienced database administrator).
Data accessibility – To obtain information from structured data in databases, users need to access the databases, which can be a potential threat to overall data protection.
Ease of use – Traditional CUR data analysis methods often lack a user-friendly interface, making it difficult for non-technical users to take advantage of the valuable insights hidden in the data.

Solution Overview

The solution we reviewed is a web application (chatbot) that allows you to ask questions related to AWS costs and usage in natural language. The application generates SQL queries based on user input, runs them against an Athena database containing CUR data, and presents the results in an easy-to-use format. The solution combines the power of generative ai, SQL generation, database queries, and an intuitive web interface to provide a seamless experience for analyzing CUR data.

The solution uses the following AWS services:

The following diagram illustrates the solution architecture.

Figure 1. Solution architecture

The data flow consists of the following steps:

CUR data is stored in amazon S3.
Athena is configured to access and query CUR data stored in amazon S3.
The user interacts with the Streamlit web application and submits a natural language question related to AWS costs and usage.

Figure 2. Shows the Chatbot Dashboard for asking questions

The Streamlit application sends user input to amazon Bedrock and the LangChain application facilitates overall orchestration.
The LangChain code uses LangChain's BedrockChat class to invoke the FM and interact with amazon Bedrock to generate a SQL query based on user input.

Figure 3. Shows the initialization of the SQL string

The generated SQL query is executed against the Athena database using FM on amazon Bedrock, which queries the CUR data stored in amazon S3.
The query results are returned to the LangChain application.

Figure 4. Shows the query generated in the application output logs.

LangChain sends the SQL query and query results to the Streamlit application.
The Streamlit application displays the SQL query and query results to the user in a formatted and user-friendly manner.

Figure 5. Shows the final result presented in the chatbot web application, including the SQL query and query results.

Prerequisites

To configure this solution, you must have the following prerequisites:

Configure the solution

Complete the following steps to configure the solution:

Create an Athena database and table to store your CUR data. Ensure that the necessary permissions and settings are in place for Athena to access CUR data stored in amazon S3.
Configure your compute environment to call amazon Bedrock APIs. Make sure to associate an IAM role with this environment that has IAM policies granting access to amazon Bedrock.
Once your instance is up and running, install the following libraries that are used to work within the environment:

pip install langchain==0.2.0 langchain-experimental==0.0.59 langchain-community==0.2.0 langchain-aws==0.1.4 pyathena==3.8.2 sqlalchemy==2.0.30 streamlit==1.34.0

Use the following code to establish a connection to the Athena database using the langchain library and the pyathena library. Configure the language model to generate SQL queries based on user input using amazon Bedrock. You can save this file as cur_lib.py.

from langchain_experimental.sql import SQLDatabaseChain
from langchain_community.utilities import SQLDatabase
from sqlalchemy import create_engine, URL
from langchain_aws import ChatBedrock as BedrockChat
from pyathena.sqlalchemy.rest import AthenaRestDialect

class CustomAthenaRestDialect(AthenaRestDialect):
    def import_dbapi(self):
        import pyathena
        return pyathena

# DB Variables
connathena = "athena.us-west-2.amazonaws.com"
portathena="443"
schemaathena="mycur"
s3stagingathena="s3://cur-data-test01/athena-query-result/"
wkgrpathena="primary"
connection_string = f"awsathena+rest://@{connathena}:{portathena}/{schemaathena}?s3_staging_dir={s3stagingathena}/&work_group={wkgrpathena}"
url = URL.create("awsathena+rest", query={"s3_staging_dir": s3stagingathena, "work_group": wkgrpathena})
engine_athena = create_engine(url, dialect=CustomAthenaRestDialect(), echo=False)
db = SQLDatabase(engine_athena)

# Setup LLM
model_kwargs = {"temperature": 0, "top_k": 250, "top_p": 1, "stop_sequences": ("\n\nHuman:")}
llm = BedrockChat(model_id="anthropic.claude-3-sonnet-20240229-v1:0", model_kwargs=model_kwargs)

# Create the prompt
QUERY = """
Create a syntactically correct athena query for AWS Cost and Usage report to run on the my_c_u_r table in mycur database based on the question, then look at the results of the query and return the answer as SQLResult like a human
{question}
"""
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

def get_response(user_input):
    question = QUERY.format(question=user_input)
    result = db_chain.invoke(question)
    query = result("result").split("SQLQuery:")(1).strip()
    rows = db.run(query)
    return f"SQLQuery: {query}\nSQLResult: {rows}"

Create a Streamlit web application to provide a user interface for interacting with the LangChain application. Include input fields for users to enter their questions in natural language and display the generated SQL queries and query results. You can name this file cur_app.py.

import streamlit as st
from cur_lib import get_response
import os

st.set_page_config(page_title="AWS Cost and Usage Chatbot", page_icon="chart_with_upwards_trend", layout="centered", initial_sidebar_state="auto",
menu_items={
        'Get Help': 'https://docs.aws.amazon.com/cur/latest/userguide/cur-create.html',
        #'Report a bug':,
        'About': "# The purpose of this app is to help you get better understanding of your AWS Cost and Usage report!"
    })#HTML title
st.title("_:orange(Simplify) CUR data_ :sunglasses:")

def format_result(result):
    parts = result.split("\nSQLResult: ")
    if len(parts) > 1:
        sql_query = parts(0).replace("SQLQuery: ", "")
        sql_result = parts(1).strip("()").split("), (")
        formatted_result = ()
        for row in sql_result:
            formatted_result.append(tuple(item.strip("(),'") for item in row.split(", ")))
        return sql_query, formatted_result
    else:
        return result, ()

def main():
    # Get the current directory
    current_dir = os.path.dirname(os.path.abspath(__file__))
    st.markdown("", unsafe_allow_html=True)
    st.title("AWS Cost and Usage chatbot")
    st.write("Ask a question about your AWS Cost and Usage Report:") 
        
        
        Connect the LangChain application and Streamlit web application by calling the get_response Format and display the SQL query and result in the Streamlit web application. Append the following code with the preceding application code: 
        
        
        # Create a session state variable to store the chat history
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = ()

    user_input = st.text_input("You:", key="user_input")

    if user_input:
        try:
            result = get_response(user_input)
            sql_query, sql_result = format_result(result)
            st.code(sql_query, language="sql")
            if sql_result:
                st.write("SQLResult:")
                st.table(sql_result)
            else:
                st.write(result)
            st.session_state.chat_history.append({"user": user_input, "bot": result})
            st.text_area("Conversation:", value="\n".join((f"You: {chat('user')}\nBot: {chat('bot')}" for chat in st.session_state.chat_history)), height=300)
        except Exception as e:
            st.error(str(e))

    st.markdown("

", unsafe_allow_html=True) if __name__ == "__main__": main()

Deploy the Streamlit application and LangChain application to your hosting environment, such as amazon EC2, or a Lambda function.

Clean

Unless you invoke amazon Bedrock with this solution, you will not incur any charges for it. To avoid ongoing charges for amazon S3 storage to save the CUR reports, you can delete the CUR data and the S3 bucket. If you set up the solution with amazon EC2, be sure to stop or delete the instance when you are finished.

Benefits

This solution offers the following benefits:

Data analysis made easy – You can analyze CUR data using natural language using generative ai, eliminating the need for advanced SQL knowledge
Greater accessibility – The web-based interface allows non-technical users to efficiently access and obtain information from CUR data without requiring database credentials.
Time saving – You can quickly get answers to your cost and usage questions without having to manually write complex SQL queries
Improved visibility – The solution provides visibility into AWS costs and usage, enabling better cost optimization and resource management decisions.

Summary

The AWS CUR chatbot solution uses Anthropic Claude on amazon Bedrock to generate SQL queries, database queries, and an easy-to-use web interface to simplify the analysis of CUR data. By allowing you to ask questions in natural language, the solution removes barriers and enables both technical and non-technical users to gain valuable insights into AWS costs and resource usage. With this solution, organizations can make more informed decisions, optimize their cloud spend, and improve overall resource utilization. We recommend that you do your due diligence when setting this up, especially for production; you can choose other programming languages and frameworks to configure it based on your preferences and needs.

amazon Bedrock enables you to build powerful generative ai applications with ease. Accelerate your process by following the quick start guide at amazon-bedrock-quick-start” target=”_blank” rel=”noopener”>GitHub and use amazon Bedrock knowledge bases to rapidly develop cutting-edge Retrieval Augmented Generation (RAG) solutions or enable generative ai applications to execute multi-step tasks on enterprise systems and data sources using amazon Bedrock agents.

About the author

Anutosh is a Solutions Architect at AWS India. He loves to dig deep into his customers’ use cases to help them navigate their AWS journey. He enjoys building cloud solutions to help customers. He is passionate about migration and modernization, data analytics, resiliency, cybersecurity, and machine learning.

Discover AWS cost and usage insights with generative AI powered by Amazon Bedrock

Technical Terrence Team

This is how I would create a second income of over £20,000 a year

Leave a Reply Cancel reply

Recommended.

Can Bitcoin have $ 97K? -The data from 1-3 month holders reveal a crucial demand of BTC

How to automate AP invoice approval workflow?

Google AI Introduce STUDY: A Socially Aware-Temporally Causal Recommender System for Audiobooks in an Educational Setting

XRP Price Rises on Increased Whale Activity

Ethereum Surges as Cryptocurrencies Turn Bullish, New Altcoin Set for More Gains

Categories

Important Links

Discover AWS cost and usage insights with generative AI powered by Amazon Bedrock

Challenges addressed

Solution Overview

Prerequisites

Configure the solution

Clean

Benefits

Summary

About the author

Related

Technical Terrence Team

This is how I would create a second income of over £20,000 a year

Leave a Reply Cancel reply

Recommended.

Can Bitcoin have $ 97K? -The data from 1-3 month holders reveal a crucial demand of BTC

How to automate AP invoice approval workflow?

Google AI Introduce STUDY: A Socially Aware-Temporally Causal Recommender System for Audiobooks in an Educational Setting

XRP Price Rises on Increased Whale Activity

Ethereum Surges as Cryptocurrencies Turn Bullish, New Altcoin Set for More Gains

Categories

Important Links

Get daily news updates to your inbox!