Speakers
Robert Richter (AMD Inc.)
Mr Srinivasulu (Srini) Thanneeru
Terry Bowman (AMD)
Description
Compute Express Link (CXL) is a low-latency, high-bandwidth, cache-coherent interconnect for heterogeneous systems that links a CPU or a device to other accelerators or memory devices. With CXL Type 3 devices, the memory resides on the device but can be used as system memory in the same way as standard memory. This provides a flexible way to assign and manage system memory using memory devices.
Because memory access involves various components, protocols, and subsystems, handling CXL errors is challenging. CXL provides RAS (reliability, availability, and serviceability) features to report and handle errors.
Error handling support has been added to recent kernels, and development is ongoing. This talk gives an update on CXL error handling and an outlook on its further development.