We are pleased to announce the availability of the Jamba-Instruct large language model (LLM) on Amazon Bedrock. Jamba-Instruct is built by AI21 Labs and notably supports a context window of 256,000 tokens, making it especially useful for processing large documents and complex retrieval-augmented generation (RAG) applications.
What is Jamba-Instruct?
Jamba-Instruct is an instruction-tuned version of the base Jamba model, previously open sourced by AI21 Labs, which combines a production-grade model with Structured State Space model (SSM) technology and Transformer architecture. With the SSM approach, Jamba-Instruct achieves the largest context window length in its model size class while delivering the performance of traditional Transformer-based models. These models offer a performance gain over the previous generation of AI21 models, the Jurassic-2 model family. For more information about the hybrid SSM/Transformer architecture, see the Jamba: A Hybrid Transformer-Mamba Language Model white paper.
Get started with Jamba-Instruct
To get started with Jamba-Instruct models on Amazon Bedrock, you first need to get access to the model.
- On the Amazon Bedrock console, choose Model access in the navigation pane.
- Choose Modify model access.
- Select the AI21 Labs models you want to use and choose Next.
- Choose Submit to request model access.
For more information, see Model access.
You can then test the model in the Amazon Bedrock Text or Chat playground.
Example Use Cases for Jamba-Instruct
Jamba-Instruct's long context length is particularly well suited to complex retrieval-augmented generation (RAG) workloads and long-document analysis. For example, it is suitable for detecting contradictions between different documents or analyzing one document in the context of another. The following is an example message for this use case:
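One illustrative prompt of this shape (a hypothetical sketch with placeholder document text, not taken from AI21's documentation) might be:

```
You are an expert document reviewer. Below are two documents. Identify any
statements in Document B that contradict Document A, and quote the
conflicting passages from each document.

Document A:
{document_a_text}

Document B:
{document_b_text}
```

With a 256,000-token context window, both documents can typically be passed in full rather than in manually segmented chunks.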
You can also use Jamba for query extension, a technique in which an original query is transformed into related queries, to optimize RAG applications. For example:
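A query-extension prompt could take roughly the following form (an illustrative sketch, with a placeholder for the original query):

```
You are helping to optimize a RAG pipeline. Given the user query below,
generate five related queries that rephrase or expand it, so that each
variant can be used to retrieve additional relevant passages.

Query: {original_query}
```

Each generated variant can then be issued against the retrieval index, and the combined results passed back to the model for answering.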
You can also use Jamba for standard LLM operations, such as entity abstraction and extraction.
You can find quick guidance for Jamba-Instruct in the AI21 Model Documentation. For more information on Jamba-Instruct, including relevant benchmarks, see Built for the Enterprise: Introducing AI21’s Jamba-Instruct Model.
Programmatic access
You can also access Jamba-Instruct through an API, using Amazon Bedrock and the AWS SDK for Python (Boto3). For installation and configuration instructions, see the Boto3 Quickstart guide. The following is an example code snippet:
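The sketch below assumes the Jamba-Instruct model ID `ai21.jamba-instruct-v1:0` and the chat-style request format documented for AI21 models on Amazon Bedrock; check the Bedrock documentation for the current identifier in your Region, and note that valid AWS credentials and granted model access are required for the call to succeed.

```python
import json


# Model ID for Jamba-Instruct on Amazon Bedrock (verify against the
# Bedrock documentation for your Region).
MODEL_ID = "ai21.jamba-instruct-v1:0"


def build_request_body(prompt: str, max_tokens: int = 1024,
                       temperature: float = 0.7) -> str:
    """Build the chat-style JSON request body used by Jamba-Instruct."""
    return json.dumps(
        {
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
            "temperature": temperature,
        }
    )


def invoke_jamba(prompt: str, region: str = "us-east-1") -> str:
    """Send a prompt to Jamba-Instruct and return the model's reply."""
    # Imported here so the request-building helper above can be used
    # without boto3 installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_request_body(prompt),
    )
    payload = json.loads(response["body"].read())
    # AI21 models on Bedrock return OpenAI-style "choices".
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(invoke_jamba("Summarize the key points of the document below."))
```

The request body uses a `messages` list, so multi-turn conversations can be passed by appending prior user and assistant turns before the final user message.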
Conclusion
AI21 Labs Jamba-Instruct on Amazon Bedrock is ideal for applications that require a long context window (up to 256,000 tokens), such as producing summaries or answering questions based on long documents, avoiding the need to manually segment documents to fit the smaller context windows of other LLMs. The new SSM/Transformer hybrid architecture also provides model performance benefits: it can deliver up to three times more tokens per second for context window lengths exceeding 128,000 tokens, compared to other models of similar size.
AI21 Labs Jamba-Instruct on Amazon Bedrock is available in the US East (N. Virginia) AWS Region and can be accessed on an on-demand consumption model. For more information, see Supported foundation models in Amazon Bedrock. To get started with AI21 Labs Jamba-Instruct on Amazon Bedrock, visit the Amazon Bedrock console.
About the authors
Joshua Broyde, PhD, is a Principal Solutions Architect at AI21 Labs. He works with AI21 clients and partners across the entire generative AI value chain, including enterprise-level generative AI enablement, use of generative AI chains and workflows, complex LLMs for regulated and specialized environments, and use of LLMs at scale.
Fernando Espigares Caballero is a Senior Partner Solutions Architect at AWS. He co-creates solutions with strategic technology partners to deliver value to customers. He has more than 25 years of experience working on IT platforms, data centers, and cloud and internet-related services, and holds multiple industry and AWS certifications. He currently focuses on generative AI to drive innovation and the creation of novel solutions that solve specific customer needs.