This blog was written by a guest author from IDC for publication on micron.com
To date, data center system architectures have adapted to the disruption of artificial intelligence (AI) by increasing data throughput within the core data center. Processing, memory, storage and networking technologies have focused on where the data already is, reactively moving it as fast as possible to where it needs to be.* However, the culmination of AI in the data center will be predicting where data is needed next.
Future data center locations represent a hybrid model based on centralized, core data centers — massive football field-sized warehouses full of servers — and decentralized, edge data centers located strategically close to the populations using the data. Because data that's closer to users can be accessed faster and requires less energy to move, AI will identify patterns, predict where data will be needed, and proactively — not reactively — move data between the core and edge. Predictively and proactively moving data to where it will be consumed will allow enterprises to take full advantage of distributed and heterogeneous computing. Benefits include faster data access and integration for analysis, better allocation of synchronized resources and reduced power costs.
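To make the idea concrete, here is a minimal, purely illustrative sketch (in Python) of the kind of predictive placement decision described above. The dataset names, demand scores and threshold are hypothetical assumptions for illustration, not taken from any IDC model or Micron product.

```python
# Illustrative only: a toy policy that stages datasets toward the edge when
# predicted local demand is high, and returns them to the core otherwise.
# All names and numbers below are hypothetical.

from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    location: str                  # "core" or "edge"
    predicted_edge_demand: float   # 0.0-1.0, e.g., from a trained demand model

EDGE_THRESHOLD = 0.7  # hypothetical cutoff for promoting data to the edge

def plan_placement(datasets: list[Dataset]) -> list[tuple[str, str]]:
    """Return (dataset, action) pairs: proactive moves, not reactive fetches."""
    moves = []
    for d in datasets:
        if d.predicted_edge_demand >= EDGE_THRESHOLD and d.location == "core":
            moves.append((d.name, "stage to edge ahead of demand"))
        elif d.predicted_edge_demand < EDGE_THRESHOLD and d.location == "edge":
            moves.append((d.name, "return to core to free edge capacity"))
    return moves

if __name__ == "__main__":
    catalog = [
        Dataset("video-analytics-frames", "core", 0.85),
        Dataset("quarterly-archive", "edge", 0.10),
    ]
    for name, action in plan_placement(catalog):
        print(f"{name}: {action}")
```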
*Data Point: AI IT infrastructure semiconductor and storage components revenue will grow at a 36% CAGR from 2022 to 2027, reaching over $190 billion by 2027.1
Benefits of new data center architectures
Emerging component- and system-level technologies will allow system designers to re-architect core and edge infrastructure systems to enable AI-predicted and synchronized data processing, storage and movement. These technologies will also enable more application specificity according to where the technology is deployed, whether in the core or at the edge. These technologies include:
- HBM3E — High-bandwidth memory (HBM) is a computer memory interface that stacks DRAM dies and connects them with vertical interconnects called through-silicon vias (TSVs). The stacked structure packs more memory chips into a smaller space than a traditional layout, reducing the distance data needs to travel between memory and processor.
The latest generation of HBM is HBM3E. Using a 1024-bit data path operating at 9.6 gigabits per second (Gb/s) per pin, HBM3E can deliver more than 1.2 terabytes per second of bandwidth (about 1229 GB/s); a quick back-of-the-envelope calculation follows this list. HBM3E allows the 1024-bit-wide data path to be divided into 16 64-bit channels or 32 32-bit channels, expanding the number of memory channels available to data center system designers. By providing higher-performance, higher-capacity dedicated memory, such as for server GPUs, HBM3E can scale to meet the needs of different workloads. HBM3E will be produced by Micron and other manufacturers in 2024 and beyond. High-bandwidth memory is already the most common dedicated memory for AI processing in servers.
- Compute Express Link™ (CXL) — This technology standardizes the protocols between chips with distinct functions, such as microprocessors to memory, microprocessors to accelerators and memories to each other, for the purpose of sharing resources. CXL is built on the PCI Express® (PCIe) physical and electrical interface and includes protocols for input/output (I/O), cache coherency and system memory access. CXL's serial communication and pooling capabilities allow memory to overcome the performance and socket-packaging limitations of conventional DIMM memory when high capacities are needed. Removing those limitations means data center system designers can worry less about memory becoming the bottleneck to their targeted workloads' performance. Originally released as version 1.0 in 2019, CXL entered the market in version 1.1 products in 2022, and version 3.1 was introduced in late 2023. Micron is a member of the CXL Consortium.
- Universal Chiplet Interconnect Express™ (UCIe) — UCIe technology standardizes the interconnect and protocols between silicon dies — called chiplets — within a single package. By enabling technology vendors to mix and match functions within a single package, UCIe supports an interoperable, multivendor ecosystem that can produce chips customized for specific workloads. The UCIe standard was introduced in 2022 by a consortium of technology vendors, including Micron.
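As promised above, here is the back-of-the-envelope HBM3E bandwidth calculation: the figure follows directly from the interface width and the per-pin data rate. The short Python snippet below simply reproduces that arithmetic.

```python
# Back-of-the-envelope HBM3E bandwidth: interface width x per-pin data rate.

interface_width_bits = 1024   # HBM3E data path width
pin_rate_gbps = 9.6           # per-pin data rate in gigabits per second

bandwidth_gbps = interface_width_bits * pin_rate_gbps  # gigabits per second
bandwidth_gBps = bandwidth_gbps / 8                    # gigabytes per second

print(f"{bandwidth_gbps:.0f} Gb/s = {bandwidth_gBps:.0f} GB/s = {bandwidth_gBps / 1000:.2f} TB/s")
# -> 9830 Gb/s = 1229 GB/s = 1.23 TB/s
```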
The synchronized data processing, storage and movement capabilities of HBM3E, CXL and UCIe technologies will allow system architects to adapt server designs to the local needs of the intended workloads. Common pools of memory, storage, compute and networking resources mean that each resource is coherent with every other resource and that each can access the others. The move from fixed building blocks to flexible, on-demand pools of resources shifts data centers from static computing architectures to composable computing architectures. Composable computing is fundamental to AI-predicted data optimization.
The future of the data center
The data center is the focal point for major IT market trends, including more data and data types, diversified workloads, heterogeneous computing, distributed computing and AI. Thus, composable computing is an essential industry response to these trends and one poised to revolutionize system architectures. UCIe, HBM3E and CXL represent fundamental changes to system architecture, and IDC estimates that they will coalesce in mainstream data center servers before the end of this decade.
Composable computing will enable tasks to access the resources they need when they need them, whether that resource is a processor's or accelerator's compute power, a main or dedicated memory's real-time response, or a network's latency-minimizing intrasystem and intersystem communications. For a large language model (LLM), composable infrastructure means dynamically scaling up its processing power, optimizing its resource utilization and accelerating its training. Figure 1 illustrates how composable computing enables AI infrastructure systems to scale how they capture, move, store, organize and use data to meet the needs of specific workloads.
Figure 1: Composable infrastructure means that traditional systems with private resources give way to AI systems that pool and share resources — such as storage I/O — drawing from a single, unified pool (lake) of data. (Source: Micron)
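As a loose illustration of the pooling idea in Figure 1, the sketch below (Python) shows tasks drawing memory from one shared pool instead of from fixed per-server allocations. The class, task names and capacities are hypothetical and only meant to show the shape of composable allocation, not any particular product's interface.

```python
# Illustrative only: a shared memory pool that tasks draw from on demand,
# in the spirit of composable (pooled) rather than fixed (private) resources.

class SharedMemoryPool:
    def __init__(self, total_gb: int):
        self.total_gb = total_gb
        self.allocations: dict[str, int] = {}

    def acquire(self, task: str, gb: int) -> bool:
        """Grant memory to a task if the pool has capacity; otherwise refuse."""
        if self.allocated() + gb > self.total_gb:
            return False
        self.allocations[task] = self.allocations.get(task, 0) + gb
        return True

    def release(self, task: str) -> None:
        """Return a task's memory to the pool for other workloads to reuse."""
        self.allocations.pop(task, None)

    def allocated(self) -> int:
        return sum(self.allocations.values())

pool = SharedMemoryPool(total_gb=1024)   # hypothetical pooled capacity
pool.acquire("llm-training", 768)        # scale up while training
pool.acquire("analytics-batch", 128)
pool.release("llm-training")             # capacity returns to the pool
print(f"In use: {pool.allocated()} GB of {pool.total_gb} GB")
```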
By breaking AI systems down into modular, reusable components, developers can use composability — implemented via technology standards across systems and data centers — to assume their AI models will run in predictable system environments. That assumption, in turn, lets them mix and match pretrained models, algorithms and data pipelines, leading to quicker deployment of AI models adapted to diverse use cases and optimized to predict where data will be needed next.
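To show what mixing and matching modular components can look like in practice, here is a minimal, hypothetical sketch (Python) of composing a data pipeline from interchangeable stages. The stage names are invented for illustration and do not refer to any specific product or framework.

```python
# Illustrative only: composing a pipeline from small, reusable stages.
# Each stage is a plain function, so stages can be swapped per use case.

from typing import Callable, Iterable

Stage = Callable[[Iterable[str]], Iterable[str]]

def clean(records: Iterable[str]) -> Iterable[str]:
    return (r.strip().lower() for r in records)

def deduplicate(records: Iterable[str]) -> Iterable[str]:
    seen = set()
    for r in records:
        if r not in seen:
            seen.add(r)
            yield r

def compose(*stages: Stage) -> Stage:
    """Chain stages into one pipeline; reordering or swapping stages is cheap."""
    def pipeline(records: Iterable[str]) -> Iterable[str]:
        for stage in stages:
            records = stage(records)
        return records
    return pipeline

edge_pipeline = compose(clean, deduplicate)  # one of many possible combinations
print(list(edge_pipeline([" Sensor-A ", "sensor-a", "Sensor-B"])))
# -> ['sensor-a', 'sensor-b']
```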
1 Outlook for AI Semiconductors and Storage Components in IT Infrastructure, IDC # US51851524, February 2024
The opinions expressed in this article are those of the individual contributing author and not Micron Technology, Inc., its subsidiaries, or affiliates. All information is provided “AS-IS,” and neither Micron nor the author makes any representations or warranties with respect to the information provided. Micron products are warranted as provided for in the products when sold, applicable data sheets or specifications. Information, products, and/or specifications are subject to change without notice. Micron and the Micron logo are trademarks or registered trademarks of Micron Technology, Inc. Any names or trademarks of third parties are owned by those parties and any references herein do not imply any endorsement, sponsorship, or affiliation with these parties.