The Amazon Elastic Compute Cloud (Amazon EC2) accelerated computing portfolio offers the broadest choice of accelerators to power your artificial intelligence (AI), machine learning (ML), graphics, and high-performance computing (HPC) workloads. We are pleased to announce the expansion of this portfolio with three new instances featuring the latest NVIDIA GPUs: Amazon EC2 P5e instances powered by NVIDIA H200 GPUs, Amazon EC2 G6 instances powered by NVIDIA L4 GPUs, and Amazon EC2 G6e instances powered by NVIDIA L40S GPUs. All three instances will be available in 2024, and we look forward to seeing what you can do with them.
AWS and NVIDIA have collaborated for more than 13 years and have pioneered large-scale, high-performance, and cost-effective GPU-based solutions for developers and enterprises across the spectrum. We’ve combined powerful NVIDIA GPUs with differentiated technologies from AWS, including the AWS Nitro System, 3200 Gbps Elastic Fabric Adapter (EFA) v2 networking, hundreds of GB/s of data throughput with Amazon FSx for Lustre, and exascale computing with Amazon EC2 UltraClusters to deliver the highest-performance infrastructure for AI/ML, graphics, and HPC. Along with other managed services such as Amazon Bedrock, Amazon SageMaker, and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying graphics, HPC, and generative AI applications.
Cost-effective, high-performance GPU-based instances for AI, HPC, and graphics workloads
To power the development, training, and inference of ever-larger large language models (LLMs), EC2 P5e instances will feature NVIDIA’s latest H200 GPUs, which offer 141 GB of HBM3e GPU memory, 1.7 times larger and 1.4 times faster than that of H100 GPUs. This increase in GPU memory, coupled with up to 3200 Gbps of EFA networking enabled by the AWS Nitro System, will allow you to continue building, training, and deploying your cutting-edge models on AWS.
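As a rough sanity check on those multipliers, the ratios can be computed against the commonly cited H100 figures (80 GB of HBM3 and roughly 3.35 TB/s of memory bandwidth, plus a ~4.8 TB/s figure for the H200; these baseline numbers are assumptions from public specifications and are not stated in this post):

```python
# Sanity check of the H200-vs-H100 multipliers quoted above.
# Assumed baseline figures (not from this post): H100 has 80 GB HBM3 at
# ~3.35 TB/s; H200 bandwidth is ~4.8 TB/s per public specifications.
h200_mem_gb, h100_mem_gb = 141, 80
h200_bw_tbs, h100_bw_tbs = 4.8, 3.35

mem_ratio = h200_mem_gb / h100_mem_gb  # ~1.76x, i.e. roughly 1.7 times larger
bw_ratio = h200_bw_tbs / h100_bw_tbs   # ~1.43x, i.e. roughly 1.4 times faster
print(f"memory: {mem_ratio:.2f}x, bandwidth: {bw_ratio:.2f}x")
```

Under these assumed baselines, the ratios come out to about 1.76x and 1.43x, consistent with the rounded figures above.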
EC2 G6e instances, featuring NVIDIA L40S GPUs, are designed to give developers a widely available option for training and inference on publicly available LLMs, as well as to support the growing adoption of small language models (SLMs). They are also optimal for digital twin applications that use NVIDIA Omniverse to develop and simulate 3D tools and applications, and to create virtual worlds and advanced workflows for industrial digitization.
EC2 G6 instances, featuring NVIDIA L4 GPUs, will offer a lower-cost, energy-efficient solution for deploying ML models for natural language processing, language translation, image and video analysis, speech recognition, and personalization, as well as for graphics workloads such as creating and rendering real-time, cinematic-quality graphics and game streaming.
About the Author
Chetan Kapoor is the director of product management for the Amazon EC2 accelerated computing portfolio.