11–13 Dec 2025
Asia/Tokyo timezone

Session

Device and Specific Purpose Memory MC

11 Dec 2025, 10:00

Conveners

  • Dan Williams (Intel)
  • Adam Manzanares (Samsung Electronics)

Description

The Device and Specific Purpose Memory Microconference is proposed as a space to discuss topics that cross MM, Virtualization, and Memory device-driver boundaries. Beyond CXL this includes software methods for device-coherent memory via ZONE_DEVICE, physical memory pooling/sharing, and specific purpose memory application ABIs like device-dax, hugetlbfs, and guest_memfd. Suggested topic areas include, but are not limited to:

  • NUMA vs Specific Purpose Memory challenges
  • Core-MM services vs page allocator isolation
  • CXL use case challenges
  • Hotness Tracking and Migration Offloads
  • ZONE_DEVICE future for Accelerator Memory
  • ZONE_DEVICE future for CXL Memory Expansion
  • PMEM, NVDIMM, and DAX "legacy" challenges
  • Memory hotplug vs Device Memory
  • Memory RAS and repair gaps and challenges
  • Dynamic Capacity Device ABI (sparse memfd?)
  • Confidential Memory challenges
  • DMABUF beyond DRM use cases
  • virtio-mem and virtio-fs vs DAX and CXL challenges
  • Peer-to-peer DMA challenges
  • CXL Memory Pool Management
  • Device Memory testing

Why not the MM uConf for these topics? One observation from the MM track at LSF/MM/BPF is that there is consistently an overflow of Device Memory topics that are of key interest to Memory device-driver developers but lower priority to core MM developers.

Key Attendees:

Rajneesh Bhardwaj
Terry Bowman
Davidlohr Bueso
John Groves
Jason Gunthorpe
David Hildenbrand
John Hubbard
Alistair Popple
Gregory Price

Contingent or unknown travel availability:

Jonathan Cameron
Dave Jiang
David Rientjes
Ira Weiny

Progress made on topics discussed at 2024 Plumbers:

Merged: CXL EDAC support for Memory Repair: http://lore.kernel.org/20250521124749.817-1-shiju.jose@huawei.com
Launched: CXL Management Library: https://github.com/computexpresslink/libcxlmi
Patches Available: FAMFS over FUSE: http://lore.kernel.org/20250703185032.46568-1-john@groves.net
Patches Available: Dynamic Capacity: http://lore.kernel.org/20250413-dcd-type2-upstream-v9-0-1d4911a0b365@intel.com
Patches Available: Type-2 CXL Accelerators: http://lore.kernel.org/20250624141355.269056-1-alejandro.lucero-palau@amd.com

"Device Memory" Background:

"Device Memory" is a catch-all term for the collection of platform
technologies that add memory to a system outside of the typical "System RAM" default pool. Compute Express Link (CXL), a coherent interconnect that allows memory and caching-agent expansion over PCIe electricals, is one such technology. GPU/AI accelerators with hardware coherent memory, or software coherent memory (ZONE_DEVICE::DEVICE_PRIVATE), are another example technology.

In the Memory Management track of the 2025 LSF/MM/BPF Summit it became clear that CXL is one of a class of technologies putting pressure on traditional NUMA memory policy. While solutions like memory-interleave-sysfs and device-dax mitigate some of the issues, there are still lingering concerns about memory of a certain performance class leaking into allocations that assume "default memory pool" performance.

The problem is how to keep Device / Specific Purpose memory contained to its specific consumers while also offering typical core-mm services. Solutions to that problem potentially intersect mechanisms like numactl, hugetlbfs, memfd, and guest_memfd. For example, guest_memfd is a kind of specific-purpose memory allocator.

Presentation materials

  1. Adam Manzanares (Samsung Electronics), Dan Williams (Intel)
    11/12/2025, 10:00

    Offer some quick introductions and welcome to attendees. Convey a few reminders about the rigid timekeeping to fit in all the topics along with the break, and then off we go.

  2. Hannes Reinecke (SUSE Labs)
    11/12/2025, 10:05

    There have been discussions around auto-onlining of CXL memory
    (https://lore.kernel.org/linux-mm/aIcxs2nk3RNWWbD6@localhost.localdomain/)
    but we haven't really made progress there.
    The problem is that we can try to fix up / modify / tweak the algorithm for auto-onlining CXL memory, or we could go in the other direction and not online CXL memory (or any memory in ZONE_MOVABLE) but delegate...

  3. Davidlohr Bueso (Samsung Semiconductor)
    11/12/2025, 10:35

    Host-managed Device Memory, Device-coherent with Back-invalidate support (HDM-DB), is a type of device memory introduced in CXL 3.0. It allows Type 2 and Type 3 devices to manage memory coherence directly. With HDM-DB, the device acts as the final arbiter of coherence for addresses it owns. This mechanism enables devices to implement inclusive snoop filters to track host caching of device...

  4. Bharata Bhasker Rao (AMD)
    11/12/2025, 11:05

    In the Linux kernel, hot page information can potentially be obtained from multiple sources:

    a. PROT_NONE faults (NUMA balancing)
    b. PTE Access bit (LRU scanning)
    c. Hardware provided page hotness info (AMD IBS, CXL HMU)
    

    This information is then used to migrate (promote) pages from slower tiers to the top memory tier for optimal performance.

    Currently, the sources a) and b) above work...
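    A minimal userspace sketch, illustrative only and not the kernel's implementation: combine per-page access counts from sources like a) to c) above and select promotion candidates above a threshold. All page numbers and counts here are invented.

```python
# Illustrative sketch: merge per-page hotness counts from several signal
# sources and flag pages whose combined count crosses a promotion threshold.
from collections import Counter

def promotion_candidates(threshold, *sources):
    """sources: dicts mapping page-frame number -> observed access count."""
    total = Counter()
    for src in sources:
        total.update(src)          # sum counts per page across all sources
    return sorted(pfn for pfn, n in total.items() if n >= threshold)

faults = {0x10: 3, 0x20: 1}        # a. PROT_NONE hint faults
lru    = {0x10: 2, 0x30: 5}        # b. PTE Accessed-bit scans
hw     = {0x20: 4, 0x40: 1}        # c. hardware samples (e.g. IBS, CXL HMU)

print(promotion_candidates(5, faults, lru, hw))   # [16, 32, 48]
```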

  5. Alejandro Lucero (AMD)
    11/12/2025, 12:00

    With CXL Type2 devices comes CXL.cache, which allows CXL-capable devices to read/write Host memory through the system cache-coherency infrastructure. If virtual machines want to take advantage of this functionality, the kernel needs to configure the system properly to prevent arbitrary access from a device to Host memory that is not allocated to the VM controlling that device. While for DMA...

  6. Rajneesh Bhardwaj (AMD)
    11/12/2025, 12:20

    Background and Motivation
    High-bandwidth memory (HBM) has become a critical resource for modern machine-learning and AI workloads, offering orders-of-magnitude improvements in bandwidth and latency compared to traditional DDR DRAM. As HBM adoption grows, whether on GPU accelerators like AMD's MI200 series or NVIDIA's Grace Hopper/Blackwell architectures, platform firmware and...

  7. SeongJae Park
    11/12/2025, 12:40

    Modern systems feature increasingly complex NUMA (Non-Uniform Memory Access) topologies, often with multiple nodes that may or may not be equipped with CPUs, GPUs, or other accelerators. This complexity makes it crucial to migrate memory pages efficiently based on access patterns.

    DAMON, a Linux kernel subsystem, offers effective monitoring of system and workload data access patterns. It...

  8. Sumit Garg
    11/12/2025, 13:00

    Protected memory refers to memory buffers behind a hardware-enforced firewall. It is not accessible to the kernel during normal circumstances, but rather only to certain hardware IPs or to CPUs executing in a higher or differently privileged mode than the kernel itself. The use-cases driving this feature in the TEE subsystem are secure video playback, trusted UI, secure video recording,...

  9. John Groves (Micron)
    11/12/2025, 13:20

    Famfs (the Fabric-Attached Memory File System) formats device memory into a scale-out file system.
    With large memory appliances now in early deployment, accessing multi-terabyte memory objects as files (with POSIX permissions) is proving valuable.

    Famfs is progressing toward upstreaming, while navigating challenges from the recent DAX subsystem refactoring.
    As the first file...
