AI Server Architecture: CXL, DDR5, PCIe 6.0 and Ethernet
In response to development trends in AI, data centers, HPC, and other technology applications, server-related technologies are keeping pace to meet high-speed computing and transmission requirements. The development goals are low latency, high reliability, and improved power efficiency. For example, CXL, which has been gaining attention recently, addresses the limited and poorly utilized bandwidth available to server processors (CPUs) for connecting to memory. The CXL protocol is designed for high-speed signal transmission. It establishes cross-chip memory interconnection and shared access, relieving the current performance bottleneck between processors and memory in servers.
Credit: Broadcom
1.CXL:
CXL (Compute Express Link) facilitates high-speed communication between host processors (CPUs) and accelerators (such as GPUs and FPGAs), enabling them to work together efficiently. CXL 3.0/3.1 provides higher bandwidth and transmission speed to handle the massive data transfers that rapid technology development demands. CXL 3.0/3.1 also features memory sharing, which allows individual memory regions within pooled resources to be shared between multiple hosts. This approach reduces data movement, leading to lower latency and better performance. Additionally, memory coherency ensures that all accelerators and processors have a consistent view of the memory pool, further enhancing system efficiency.
There are two main types of products in CXL development: "Memory Expansion Modules" and "Memory Pools." The product actually implemented on the market is the memory expansion module. It consists of a CXL controller chip and DRAM memory chips; the DRAM inside the module connects to the server CPU through the CXL controller chip over the PCIe interface. The CXL memory expansion module is designed for single-server environments. Memory pooling, by contrast, maximizes memory capacity utilization across servers.
Through CXL switch interconnect technology, memory from multiple servers is combined to form a shared memory pool, overcoming traditional single-server limitations to meet future technological demands.
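The utilization argument can be illustrated with a toy calculation. The sketch below is only a simplified model under stated assumptions: the server count, capacities, and demand figures are hypothetical, the function names are made up for this example, and it ignores latency, switch topology, and how much memory a real deployment would actually place in the pool.

```python
def served_without_pool(local_gb, demand_gb):
    """Each server can use only its own memory; demand above that is unmet."""
    return sum(min(cap, need) for cap, need in zip(local_gb, demand_gb))


def served_with_pool(local_gb, demand_gb):
    """All memory forms one shared pool that any server can draw from."""
    return min(sum(local_gb), sum(demand_gb))


# Hypothetical figures: four servers with 256 GB each and uneven demand.
local_gb = [256, 256, 256, 256]
demand_gb = [400, 100, 300, 150]

total = sum(local_gb)
isolated = served_without_pool(local_gb, demand_gb)
pooled = served_with_pool(local_gb, demand_gb)

print(f"isolated: {isolated} GB served, {isolated / total:.0%} of capacity used")
print(f"pooled:   {pooled} GB served, {pooled / total:.0%} of capacity used")
```

With these assumed numbers, the isolated servers satisfy 762 GB of demand while the pooled configuration satisfies all 950 GB from the same 1024 GB of installed memory, because capacity stranded on lightly loaded servers becomes available to the busy ones.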
2.DDR5:
Compared with DDR4, DDR5 has been further optimized to handle hyperscale data and the more complex computation required by new technologies. Improvements include higher operating efficiency, lower power consumption, higher bandwidth, higher density, and larger memory capacity. Notably, DDR5 integrates its own on-die Error-Correcting Code (ECC) functionality, allowing for more effective error detection and correction. This enhances the reliability and stability of systems using DDR5 memory.
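To show the principle behind error correction, here is a minimal sketch of a classic Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. This is an illustration only: it is not the code DDR5 devices use internally (on-die ECC operates on much wider data words), and the function names are made up for this example.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3  # 0 = no error, otherwise 1-based bit position
    if pos:
        c[pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], pos


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)

corrupted = list(codeword)
corrupted[4] ^= 1  # simulate a single-bit fault at position 5

recovered, error_pos = hamming74_decode(corrupted)
print("recovered matches original:", recovered == data)
print("error found at position:", error_pos)
```

The three parity bits form a syndrome that directly names the position of the faulty bit, which is what lets the decoder repair the data instead of merely flagging it.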
As an example of DDR5 design support, we provided a customer with a DDR5 post-layout simulation (post-sim) service. Working from the layout and schematics, we simulated relevant parameters such as return loss and insertion loss, then ran the simulation analysis with the corresponding signal I/O component models.
After the original design was adjusted and re-simulated, the eye diagram results improved and met the design specifications. For those interested in learning more about simulation analysis, please refer to the videos "simulation analysis 101" and "SI/PI Simulation Analysis."
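For readers unfamiliar with the two parameters named above, return loss and insertion loss are commonly derived from a channel's S-parameters: return loss from the reflection coefficient S11 and insertion loss from the transmission coefficient S21. The sketch below uses the standard decibel definitions; the S-parameter values are hypothetical single-frequency numbers and are not taken from the customer project described here.

```python
import math


def return_loss_db(s11):
    """Return loss in dB from the reflection coefficient S11 (real or complex)."""
    return -20 * math.log10(abs(s11))


def insertion_loss_db(s21):
    """Insertion loss in dB from the transmission coefficient S21 (real or complex)."""
    return -20 * math.log10(abs(s21))


# Hypothetical S-parameters at one frequency point.
s11 = 0.1 + 0j  # 10% of the incident voltage wave is reflected
s21 = 0.5 + 0j  # half of the incident voltage wave reaches the far end

rl = return_loss_db(s11)
il = insertion_loss_db(s21)
print(f"return loss:    {rl:.2f} dB")
print(f"insertion loss: {il:.2f} dB")
```

A higher return loss means less energy is reflected back toward the transmitter, and a lower insertion loss means less energy is lost along the channel. In practice these values are evaluated across the full frequency range of interest rather than at a single point.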
3.PCIe 6.0:
PCIe provides the physical layer and the I/O interface for connecting expansion cards and external hardware devices. Via CXL, which runs over that physical layer, accelerators such as GPUs and FPGAs can connect to the system and share memory. The CXL 3.0 specification has been officially released, and its critical feature is the adoption of the PAM4-based PCIe 6.0 physical layer, which doubles the transmission bandwidth.
PCIe 6.0 doubles the data transfer rate from PCIe 5.0's 32 GT/s to 64 GT/s. To achieve this, it employs PAM4 (Pulse Amplitude Modulation with 4 levels) signaling, which carries two bits per symbol. PCIe 6.0 also introduces FLIT (Flow Control Unit) mode, in which data is sent in fixed-size units protected by FEC (Forward Error Correction) and CRC (Cyclic Redundancy Check). FEC corrects errors during data transmission, while CRC detects any errors that FEC misses so that the data can be retransmitted. Together, these two steps reduce the effective bit error rate associated with PAM4 signal transmission.
Another feature unique to PCIe 6.0 is L0p, a low-power state that can be enabled only in FLIT mode. In L0p, some lanes are put into a sleep state while others remain active, allowing flexible power management based on workload. As a result, L0p lowers power consumption, which can in turn help prolong the lifespan of the equipment. Given the rising demand for high performance and low power consumption in AI data centers, PCIe 6.0, with its enhanced capabilities, addresses these needs. The PCI-SIG, recognizing the industry's requirements, has accelerated the development of PCIe standards. As a result, PCI Express has become an indispensable high-speed interface for modern technology products.
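The rate doubling can be checked with simple arithmetic. The sketch below computes raw per-direction link rates before any FLIT, FEC, or CRC overhead. Two points are additions not stated in the text above: PCIe 5.0 uses two-level (NRZ) signaling, and the x16 link width and the 8-lane L0p case are hypothetical examples. The function names are made up for this illustration.

```python
import math


def bits_per_symbol(levels):
    """Bits carried by one symbol with the given number of amplitude levels."""
    return int(math.log2(levels))


def symbol_rate_gbaud(gt_per_s, levels):
    """Symbols per second on one lane, in GBaud."""
    return gt_per_s / bits_per_symbol(levels)


def raw_link_gbps(gt_per_s, active_lanes):
    """Raw one-direction bit rate in Gb/s, before protocol overhead."""
    return gt_per_s * active_lanes


gen5_x16 = raw_link_gbps(32, 16)     # PCIe 5.0, all 16 lanes active
gen6_x16 = raw_link_gbps(64, 16)     # PCIe 6.0, all 16 lanes active
gen6_l0p_x8 = raw_link_gbps(64, 8)   # PCIe 6.0 in L0p with 8 lanes asleep

print(f"PCIe 5.0 x16: {gen5_x16} Gb/s ({gen5_x16 / 8} GB/s)")
print(f"PCIe 6.0 x16: {gen6_x16} Gb/s ({gen6_x16 / 8} GB/s)")
print(f"PCIe 6.0 x16 in L0p, 8 lanes active: {gen6_l0p_x8} Gb/s")

# NRZ has 2 levels (1 bit/symbol); PAM4 has 4 levels (2 bits/symbol).
print("PCIe 5.0 symbol rate:", symbol_rate_gbaud(32, 2), "GBaud")
print("PCIe 6.0 symbol rate:", symbol_rate_gbaud(64, 4), "GBaud")
```

The last two lines show why PAM4 matters: the bit rate doubles while the symbol rate on the wire stays the same, at the cost of smaller voltage margins between levels, which is the reason FEC is needed.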
4.Ethernet:
Besides CXL devices, the network interface cards (NICs) used in servers also attach via the PCIe interface to establish Ethernet connections, enabling remote access and memory sharing over 100G/400G/800G/1.6T Ethernet.
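Because the NIC sits behind PCIe, the host link has to be at least as fast as the Ethernet port it feeds. The rough sizing sketch below compares the Ethernet speeds listed above against raw PCIe link rates. It is a back-of-the-envelope estimate under assumptions: it uses raw rates with no protocol overhead or headroom, considers only the common x1 to x16 link widths, and the function name is made up for this example.

```python
PCIE_GT_PER_LANE = {"PCIe 5.0": 32, "PCIe 6.0": 64}  # GT/s per lane
ETHERNET_GBPS = [100, 400, 800, 1600]                # 100G to 1.6T


def min_lanes(eth_gbps, gt_per_lane, widths=(1, 2, 4, 8, 16)):
    """Smallest link width whose raw rate covers the port, or None if none does."""
    for width in widths:
        if width * gt_per_lane >= eth_gbps:
            return width
    return None


for gen, rate in PCIE_GT_PER_LANE.items():
    for eth in ETHERNET_GBPS:
        lanes = min_lanes(eth, rate)
        verdict = f"x{lanes}" if lanes else "exceeds a single x16 link"
        print(f"{gen} feeding {eth}G Ethernet: {verdict}")
```

Under these assumptions, an 800G port needs a full PCIe 6.0 x16 link, and a 1.6T port is beyond the raw rate of one x16 link at either generation, which shows how Ethernet speeds and PCIe generations have to advance together.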
Technologies such as CXL, PCIe, and DDR are continually advancing to new generations to meet the demands of modern data centers for high transmission speed and efficiency. These efforts allow chipsets, memory, and high-speed interfaces to work symbiotically throughout operation, and they bring significant progress in data transmission speed, memory utilization, storage efficiency, and overall system performance.
These technologies are critical elements for data centers and servers to achieve higher speeds, lower latency, and lower power consumption. iPasslabs provides professional one-stop test services for these technologies, including simulation in the early design stages, validation during the R&D phase, and test fixtures. When the experienced iPasslabs team is involved in the product development process, it can also provide professional advice, maximizing efficiency and effectiveness.
Reference: https://www.ithome.com.tw/tech/153366
Reference: https://www.broadcom.com/products/pcie-switches-retimers/expressfabric
Entrance picture credit: Google