The Intel Gaudi 3 AI Accelerator answers global enterprises’ demand for computing solutions to drive generative AI. Intel presented the Gaudi 3 AI accelerator to compete in a growing chip market that also includes companies such as Nvidia and AMD.
Compared with its predecessor, Intel’s new processor delivers 4x more AI compute for BF16, 1.5x more memory bandwidth, and 2x more networking bandwidth for massive system scale-out. It represents a significant leap in performance and productivity for AI training and inference on popular large language models (LLMs) and multimodal models.
Building on the proven performance and efficiency of the Intel Gaudi 2 AI accelerator, the only MLPerf-benchmarked alternative for LLMs on the market, Intel pairs community-based open software with industry-standard Ethernet networking so customers can scale their systems more flexibly.
“Despite its constant evolution, the artificial intelligence market faces a significant gap in current offerings. Feedback from our customers and the broader market underscores the desire for more options. Companies weigh considerations such as availability, scalability, performance, cost, and power efficiency. Intel Gaudi 3 stands out as the GenAI alternative that presents a compelling combination of price performance, system scalability, and time-to-value advantage,” said Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group.
Why AI accelerator chips matter
Currently, companies in critical sectors such as finance, manufacturing, and healthcare are seeking to rapidly expand access to AI and move generative AI (GenAI) projects from experimental phases to full-scale implementation. To manage this transition, drive innovation, and achieve revenue growth goals, companies need open, cost-effective, and more energy-efficient solutions and products that meet return on investment (ROI) and operational efficiency needs.
“The Intel Gaudi 3 Accelerator will meet these requirements and deliver versatility through community-based open software and industry-standard open Ethernet, helping enterprises flexibly scale their AI systems and applications,” Intel promised.
The Intel Gaudi 3 accelerator, designed for efficient large-scale AI computing, is manufactured on a 5 nanometer (nm) process and offers significant advances over its predecessor. It is designed to activate all of its engines in parallel, including the matrix multiplication engine (MME), the tensor processor cores (TPCs), and the network interface cards (NICs), enabling the fast, efficient, scaled deep learning computation that GenAI requires.
What is Intel’s Gaudi 3 chip like?
Key features include:
- Computing engine dedicated to AI. The Intel Gaudi 3 accelerator was designed especially for high-performance, high-efficiency GenAI computing. Each accelerator features a heterogeneous computing engine composed of 64 custom and programmable AI TPCs and eight MMEs. Each Intel Gaudi 3 MME is capable of performing an impressive 64,000 parallel operations, enabling a high degree of computational efficiency and making it adept at handling complex matrix operations, a type of computation that is fundamental to deep learning algorithms. This design accelerates the speed and efficiency of parallel AI operations and supports multiple data types, including FP8 and BF16.
- Memory increase for LLM capacity requirements. 128 gigabytes (GB) of HBM2e memory capacity, 3.7 terabytes per second (TB/s) of memory bandwidth, and 96 megabytes (MB) of built-in static random access memory (SRAM) provide ample memory for processing large GenAI data sets. This is especially useful for serving large language and multimodal models, resulting in higher workload performance and greater data center cost efficiency (see the back-of-the-envelope sizing sketch after this list).
- Efficient system scaling for enterprise GenAI. Each Intel Gaudi 3 accelerator integrates 24 ports of 200-gigabit Ethernet, providing flexible, open-standard networking. This enables efficient scaling to support large compute clusters and eliminates dependency on proprietary network fabric vendors. The Intel Gaudi 3 accelerator is designed to scale up and scale out efficiently from a single node to thousands, to meet the expansive requirements of GenAI models.
- Open industry software for developer productivity. Intel Gaudi software integrates the PyTorch framework and provides optimized models from the Hugging Face community, the most common AI framework for GenAI developers today. This allows GenAI developers to operate at a high level of abstraction for ease of use and productivity, and to port models between different types of hardware (a minimal usage sketch follows this list).
- Gaudi 3 PCIe. The Gaudi 3 Peripheral Component Interconnect Express (PCIe) add-in card is new to the product line. This form factor, designed to deliver high efficiency with lower power consumption, is ideal for workloads such as fine-tuning, inference, and retrieval-augmented generation (RAG). It is a full-height card rated at 600 watts, with 128 GB of memory capacity and 3.7 TB/s of memory bandwidth.
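To put the 128 GB of HBM2e in context, the back-of-the-envelope sketch below estimates the weight-only memory footprint of the model sizes cited in this article at BF16 and FP8 precision. It is a rough illustration, not a deployment guide: it ignores the KV cache, activations, and framework overhead, and the precision choices are assumptions.

```python
# Rough sizing sketch: model-weight footprint vs. Gaudi 3's 128 GB of HBM2e.
# Weights only; KV cache, activations, and runtime overhead are ignored, so
# real deployments need additional headroom.

HBM_CAPACITY_GB = 128  # per Gaudi 3 accelerator

def weight_footprint_gb(params_in_billions: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights in gigabytes at a given precision."""
    return params_in_billions * bytes_per_param  # 1e9 params * N bytes = N GB per billion

for params in (7, 13, 70, 180):  # Llama 2 7B/13B/70B and Falcon 180B, as cited in this article
    for precision, nbytes in (("BF16", 2), ("FP8", 1)):
        gb = weight_footprint_gb(params, nbytes)
        verdict = "fits on one card" if gb <= HBM_CAPACITY_GB else "needs several cards"
        print(f"{params:>4}B @ {precision}: ~{gb:5.0f} GB of weights -> {verdict}")
```

Even at this level of simplification, the arithmetic shows why the article stresses multi-node scaling: the larger models cited here spill past a single accelerator's memory unless precision is reduced or the model is sharded across cards.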
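The sketch below illustrates the kind of high-level workflow the software point above describes: loading a Hugging Face Transformers model with standard PyTorch code and targeting a Gaudi device, which the Habana PyTorch bridge (habana_frameworks) exposes as "hpu". The checkpoint name, generation settings, and the assumption that the Gaudi software stack is installed are all illustrative; consult Intel's Gaudi documentation for the supported flow.

```python
# Minimal sketch: running a Hugging Face causal LM on a Gaudi device via PyTorch.
# Assumes the Gaudi software stack and the habana_frameworks PyTorch bridge are
# installed; the checkpoint name below is only an example.
import torch
import habana_frameworks.torch.core as htcore  # importing the bridge registers the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # illustrative checkpoint (gated on the Hub)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

device = torch.device("hpu")  # Gaudi accelerators appear to PyTorch as "hpu"
model = model.to(device).eval()

inputs = tokenizer("Generative AI on Gaudi:", return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because the script stays at the level of PyTorch and Transformers, moving it between Gaudi and other hardware is largely a matter of changing the device string, which is the portability point the software bullet above makes.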
Benefits of the Gaudi 3 chip
The Intel Gaudi 3 accelerator will deliver significant performance improvements for training and inference tasks on leading GenAI models. Specifically, the Gaudi 3 accelerator is expected to offer, on average, compared to the Nvidia H100:
- 50% faster time-to-train on the Llama 2 models with 7B and 13B parameters and on GPT-3 with 175B parameters.
- 50% faster inference throughput and 40% greater inference power efficiency on the Llama 7B and 70B and Falcon 180B parameter models, with an even greater inference performance advantage on longer input and output sequences.
- 30% faster inference on the Llama 7B and 70B and Falcon 180B parameter models compared with the Nvidia H200.
Market adoption and availability
The Intel Gaudi 3 Accelerator will be available to OEMs in Q2 2024 in industry-standard universal baseboard (UBB) and open accelerator module (OAM) configurations. Notable OEMs bringing Gaudi 3 to market include Dell Technologies, HPE, Lenovo, and Supermicro.
General availability of the Intel Gaudi 3 Accelerator is planned for Q3 2024 and the Intel Gaudi 3 PCIe Add-in Card for Q4 2024.
The Intel Gaudi 3 Accelerator will also power several cost-effective LLM cloud infrastructures for training and inference, offering price-performance advantages and options to organizations that now include NAVER.
Developers can get started today with access to Intel Gaudi 2-based instances in the Intel Developer Cloud to learn, prototype, test, and run applications and workloads.
Separately, Intel announced that momentum from the Intel Gaudi 3 accelerator will be foundational for Falcon Shores, Intel’s next-generation graphics processing unit (GPU) for AI and high-performance computing (HPC). Falcon Shores will integrate the Intel Gaudi and Intel Xe intellectual property (IP) with a single GPU programming interface built on the Intel oneAPI specification.