eSentire is an industry-leading provider of managed detection and response (MDR) services, protecting the users, data, and applications of more than 2,000 organizations worldwide across more than 35 industries. These security services help its customers anticipate, withstand, and recover from sophisticated cyber threats, prevent disruption from malicious attacks, and improve their security posture.
In 2023, eSentire was looking for ways to deliver differentiated customer experiences by continuing to improve the quality of its security investigations and customer communications. To achieve this, eSentire created AI Investigator, a natural language query tool for its customers to access security platform data using AWS generative artificial intelligence (AI) capabilities.
In this post, we share how eSentire built AI Investigator using Amazon SageMaker to deliver private and secure generative AI interactions to their customers.
AI Investigator Benefits
Prior to AI Investigator, customers would engage eSentire Security Operations Center (SOC) analysts to understand and further investigate their asset data and associated threat cases. This was a manual effort for both eSentire customers and analysts, involving asking questions and then searching across multiple tools to gather the data needed to formulate answers.
eSentire's AI Investigator enables users to pose complex queries in natural language, uniting multiple data sources from each customer's own security telemetry and eSentire's asset, vulnerability, and threat data mesh. This helps customers quickly and easily explore their security data and accelerate internal investigations.
Providing AI Investigator internally in the eSentire SOC workbench has also accelerated eSentire's investigation process by improving the scale and efficiency of multi-telemetry investigations. The LLMs augment SOC investigations with knowledge from eSentire's security experts and security data, enabling higher-quality investigation outcomes while also reducing time to investigate. More than 100 SOC analysts are now using AI Investigator models to analyze security data and reach investigation conclusions rapidly.
Solution Overview
eSentire customers expect rigorous security and privacy controls for their sensitive data, which requires an architecture that doesn't share data with third-party large language model (LLM) providers. Therefore, eSentire decided to build its own LLM using the Llama 1 and Llama 2 foundation models. A foundation model (FM) is an LLM that has undergone unsupervised pre-training on a large text corpus. eSentire tried multiple FMs available in AWS for their proof of concept; however, the straightforward access to Meta's Llama 2 FM through Hugging Face in SageMaker for training and inference (and its licensing structure) made Llama 2 an obvious choice.
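For illustration, the following is a minimal sketch of launching such a fine-tuning job through the SageMaker Hugging Face integration. The script name, role, bucket path, and hyperparameters are placeholders, not eSentire's actual configuration:

```python
from sagemaker.huggingface import HuggingFace

# Hypothetical fine-tuning job: scripts/train.py would load a Llama 2 checkpoint
# and run supervised fine-tuning on the labeled data staged in Amazon S3.
estimator = HuggingFace(
    entry_point="train.py",                # placeholder training script
    source_dir="scripts",
    role="arn:aws:iam::<account>:role/<sagemaker-role>",
    instance_type="ml.g5.12xlarge",        # 4x NVIDIA A10G GPUs
    instance_count=1,
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"model_id": "meta-llama/Llama-2-7b-hf", "epochs": 3},
)
estimator.fit({"train": "s3://<bucket>/finetune/train/"})
```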
eSentire has over 2 TB of signal data stored in its Amazon Simple Storage Service (Amazon S3) data lake. eSentire used gigabytes of additional human investigation metadata to perform supervised fine-tuning on Llama 2. This additional step updates the FM by training it with data labeled by security experts (such as question-answer pairs and investigation conclusions).
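A record in such a labeled fine-tuning set might look like the following. The schema is purely hypothetical, since eSentire's expert annotations are proprietary:

```python
import json

# One hypothetical question-answer training example of the kind described above.
record = {
    "question": "Which assets communicated with the flagged IP 203.0.113.7 this week?",
    "context": "<excerpt of the customer's asset and network telemetry>",
    "answer": "Two assets, web-01 and db-02, initiated outbound connections to 203.0.113.7.",
}
print(json.dumps(record))  # such datasets are commonly stored as JSON Lines in S3
```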
eSentire used SageMaker at several levels, ultimately simplifying their end-to-end process:
- They made extensive use of SageMaker notebook instances to spin up GPU instances, giving them the flexibility to swap in high-powered compute when needed. eSentire used CPU instances for data preprocessing and post-inference analysis, and GPU instances for the actual model (LLM) training.
- An additional benefit of SageMaker notebook instances is their streamlined integration with eSentire's AWS environment. Because eSentire has large amounts of data (terabyte scale, over 1 billion total rows of relevant data in preprocessing inputs) stored in AWS (in Amazon S3 and Amazon Relational Database Service (Amazon RDS) for PostgreSQL clusters), SageMaker notebook instances enabled the secure movement of this volume of data directly from the AWS source (Amazon S3 or Amazon RDS) into the SageMaker notebook, with no additional infrastructure needed for data integration (see the data-loading sketch after this list).
- SageMaker real-time inference endpoints provide the infrastructure needed to host their custom, fine-tuned LLMs. This was very useful in combination with SageMaker's integration with Amazon Elastic Container Registry (Amazon ECR), SageMaker endpoint configurations, and SageMaker models, which together provide the complete configuration needed to spin up the LLMs as needed. The full-featured, end-to-end deployment capability provided by SageMaker allowed eSentire to effortlessly and consistently update their model registry as they iterated on their LLMs. All of this was fully automated within the software development lifecycle (SDLC) using Terraform and GitHub, made practical by the SageMaker ecosystem.
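As referenced in the list above, the following sketch shows what pulling preprocessing inputs directly from AWS sources into a notebook instance can look like. Bucket paths, table names, and connection details are hypothetical, and the S3 read assumes the s3fs package is installed:

```python
import pandas as pd
import sqlalchemy

# Read signal data directly from Amazon S3 into the notebook (hypothetical key).
signals = pd.read_parquet("s3://<bucket>/signals/part-0000.parquet")

# Pull labeled investigation metadata from Amazon RDS for PostgreSQL
# (hypothetical endpoint, credentials, and table).
engine = sqlalchemy.create_engine(
    "postgresql+psycopg2://<user>:<password>@<rds-endpoint>:5432/security"
)
labels = pd.read_sql("SELECT case_id, question, answer FROM investigations", engine)
```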
The following diagram illustrates the solution architecture and workflow.
The application interface can be accessed through Amazon API Gateway, using both private and edge gateways. To emulate intricate thought processes akin to those of a human investigator, eSentire designed a system of chained agent actions. This system uses AWS Lambda and Amazon DynamoDB to orchestrate a series of LLM invocations. Each LLM call builds on the previous one, creating a cascade of interactions that together produce high-quality responses. This intricate setup makes sure that the application's backend data sources are seamlessly integrated, thereby providing tailored responses to customer inquiries.
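A minimal sketch of one such chained step might look like the following Lambda handler, assuming a hypothetical DynamoDB table, endpoint name, and prompt format. Each invocation runs one link in the chain, reading the previous step's output from DynamoDB and persisting its own:

```python
import json
import boto3

dynamodb = boto3.resource("dynamodb")
runtime = boto3.client("sagemaker-runtime")
table = dynamodb.Table("agent-state")  # hypothetical state table


def handler(event, context):
    """One step of a chained LLM workflow (hypothetical schema)."""
    session_id = event["session_id"]
    state = table.get_item(Key={"session_id": session_id}).get("Item", {})

    # Each LLM call builds on the output of the previous one.
    prompt = f"Context: {state.get('last_step', '')}\nQuestion: {event['query']}"
    response = runtime.invoke_endpoint(
        EndpointName="ai-investigator-llm",  # hypothetical endpoint name
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    answer = json.loads(response["Body"].read())

    table.put_item(Item={"session_id": session_id, "last_step": json.dumps(answer)})
    return answer
```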
When a SageMaker endpoint is created, it is configured with an S3 URI for the bucket containing the model artifact and with a Docker image stored in Amazon ECR.
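In boto3 terms, those two artifacts come together roughly as follows; all names, the image URI, and the model data path are placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_model(
    ModelName="ai-investigator-llm",
    ExecutionRoleArn="arn:aws:iam::<account>:role/<sagemaker-role>",
    PrimaryContainer={
        # Docker image pulled from Amazon ECR
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/llm-serving:latest",
        # model artifact read from Amazon S3
        "ModelDataUrl": "s3://<bucket>/llama2-finetuned/model.tar.gz",
    },
)
sm.create_endpoint_config(
    EndpointConfigName="ai-investigator-llm-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "ai-investigator-llm",
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
)
sm.create_endpoint(
    EndpointName="ai-investigator-llm",
    EndpointConfigName="ai-investigator-llm-config",
)
```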
For its proof of concept, eSentire selected the NVIDIA A10G Tensor Core GPU available in the ml.g5.2xlarge instance type for its balance of performance and cost. For LLMs with significantly more parameters, which demand greater computational power for both training and inference tasks, eSentire used ml.g5.12xlarge instances equipped with four GPUs. This was necessary because the computational complexity and memory required for LLMs can increase sharply with the number of parameters. eSentire plans to use P4 and P5 instance types to scale its production workloads.
Additionally, a monitoring framework that captures the inputs and outputs of AI Investigator was needed to enable threat-hunting visibility into LLM interactions. To accomplish this, the application integrates with eSentire's open source LLM Gateway project to monitor the interactions between customer queries, backend agent actions, and application responses. This framework enables trust in complex LLM applications by providing a security monitoring layer to detect malicious poisoning and injection attacks, while also providing governance and compliance support by logging user activity. The LLM gateway can also integrate with other LLM services, such as Amazon Bedrock.
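The eSentire LLM Gateway is its own project, but the core monitoring idea can be sketched as a thin wrapper that records every prompt/response pair alongside user identity and latency. The audit schema here is hypothetical:

```python
import json
import time
import uuid
import boto3

runtime = boto3.client("sagemaker-runtime")


def invoke_with_audit(endpoint_name: str, prompt: str, user_id: str) -> dict:
    """Wrap an LLM call with an audit record (hypothetical schema) so prompts
    and responses can later be inspected for injection or poisoning attempts."""
    start = time.time()
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    output = json.loads(response["Body"].read())
    audit = {
        "id": str(uuid.uuid4()),
        "user": user_id,
        "prompt": prompt,
        "response": output,
        "latency_ms": int((time.time() - start) * 1000),
    }
    print(json.dumps(audit))  # lands in CloudWatch Logs when run in Lambda
    return output
```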
Amazon Bedrock lets you customize FMs privately and interactively, without the need to write code. Initially, eSentire's goal was to train custom models using SageMaker. As their strategy evolved, they began exploring a broader range of FMs, evaluating their internally trained models against those provided by Amazon Bedrock. Amazon Bedrock offers a convenient environment for benchmarking and a cost-effective solution for managing workloads due to its serverless operation. This serves eSentire well, especially when customer queries are sporadic, making serverless an economical alternative to persistently running SageMaker instances.
From a security perspective as well, Amazon Bedrock doesn't share users' inputs and model outputs with any model providers. Additionally, eSentire applies its own custom NL2SQL guardrails to its models.
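For benchmarking Bedrock-hosted FMs, the same question can be sent to different models simply by swapping the model ID. The following sketch uses the Bedrock Converse API with an illustrative model ID and prompt (model availability varies by Region):

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# Hypothetical benchmarking call: swap modelId to compare different FMs
# without managing any serving infrastructure.
response = bedrock.converse(
    modelId="meta.llama2-13b-chat-v1",
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize the open threat cases for host web-01."}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```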
Results
The following screenshot shows an example of the eSentire AI Investigator output. As illustrated, a natural language query is posed to the application. The tool is able to correlate multiple datasets and present a response.
<img src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2024/06/12/ai-investigator.png" alt="Example of eSentire AI Investigator output" width="938" height="496"/>
Dustin Hillard, CTO at eSentire, shares: "eSentire customers and analysts ask hundreds of security data exploration questions per month, which typically take hours to complete. AI Investigator is now in an initial deployment to over 100 customers and more than 100 SOC analysts, providing a self-serve immediate response to complex questions about their security data. eSentire LLM models are saving thousands of hours of customer and analyst time."
Conclusion
In this post, we shared how eSentire created AI Investigator, a generative AI solution that provides private and secure self-serve customer interactions. Customers can get near real-time answers to complex questions about their data. AI Investigator has also saved eSentire analysts significant time.
The LLM gateway mentioned in this post is eSentire's own product, and AWS bears no responsibility for it.
If you have any comments or questions, please share them in the comments section.
About the authors
Aishwarya Subramaniam is a Senior Solutions Architect at AWS. She works with commercial customers and AWS Partners to accelerate customers' business outcomes by providing expertise in AWS services and analytics.
Ilya Zenkov is a Senior AI Developer specializing in generative AI at eSentire. He focuses on advancing cybersecurity with expertise in machine learning and data engineering. His experience includes critical roles in developing ML-powered drug discovery and cybersecurity platforms.
Dustin Hillard is responsible for leading product development and technology innovation, systems teams, and corporate IT at eSentire. He has extensive machine learning experience in speech recognition, translation, natural language processing, and advertising, and has published over 30 papers in these fields.