Description
The Performance and Scalability microconference focuses on enhancing performance and scalability in both the Linux kernel and userspace projects. In fact, one of the purposes of this microconference is for developers from different projects to meet and collaborate – not only kernel developers but also researchers doing more experimental work. After all, for the user to see good performance and scalability, all relevant projects must perform and scale well.
Because performance and scalability are very generic topics, this track is aimed at issues that may not be addressed in other, more specific sessions. The structure will be similar to what was followed in previous years, including topics such as synchronization primitives, bottlenecks in memory management, testing/validation, lockless algorithms and RCU, among others.
Traditionally, all RAM is DRAM. Some DRAM might be closer/faster than
the rest, but a byte of media has about the same cost whether it is close
or far. But with new memory tiers such as High-Bandwidth Memory or
Persistent Memory, there is a choice between fast/expensive and
slow/cheap.
We use the existing reclaim mechanisms for moving cold data out of
fast/expensive tiers. It works...
Large installations require considerable monitoring and control, and the occasional scan of procfs files is often the best tool for the monitoring job at hand. In cases where memory consumption is a concern, /proc/PID/{maps,numa_maps,smaps,smaps_rollup} can be quite helpful.
To your monitoring, anyway.
Unfortunately, some mm-related procfs files need to acquire the dreaded mmap_sem. ...
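As an illustration of the kind of procfs scan described above, here is a minimal Python sketch that parses the "Name: value kB" lines of /proc/PID/smaps_rollup. The function names and the sample text (including its numbers) are made up for this example; only the file path comes from the abstract.

```python
import os


def parse_smaps_rollup(text):
    """Return {field: kB} for lines shaped like 'Rss:   1234 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # The first line of the file is an address-range header, not a
        # counter; it does not match the "<digits> kB" shape and is skipped.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_smaps_rollup(pid="self"):
    """Read and parse /proc/<pid>/smaps_rollup (Linux only)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        return parse_smaps_rollup(f.read())


# Hypothetical sample input; the values are invented for illustration.
SAMPLE = """\
00400000-7ffc0000 ---p 00000000 00:00 0    [rollup]
Rss:                1024 kB
Pss:                 512 kB
Swap:                  0 kB
"""

if __name__ == "__main__":
    print(parse_smaps_rollup(SAMPLE))
    if os.path.exists("/proc/self/smaps_rollup"):
        print(read_smaps_rollup().get("Rss"))
```

Each read of such a file is served by the kernel at the time of the read, so a monitoring loop that calls read_smaps_rollup() frequently on many processes repeats that kernel-side work every time.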
The maple tree is an RCU-safe range-based B-Tree that was designed to fit a
number of Linux kernel use cases. Most recently the maple tree has been sent
upstream as a patch set that replaces the vma rbtree, the vma linked list, and
the vmacache while maintaining the current performance level. This performance
should improve as the RCU aspect of the tree is leveraged to remove...
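To make "range-based" concrete, the sketch below shows the kind of lookup a VMA store has to answer: given an address, find the range that contains it. This is not a maple tree and has none of its RCU properties; it is a plain sorted list searched with Python's bisect module, and the RangeMap class and its method names are invented for this example.

```python
import bisect


class RangeMap:
    """Non-overlapping half-open ranges [start, end) mapped to values."""

    def __init__(self):
        self._starts = []   # sorted range start addresses
        self._entries = []  # (start, end, value), parallel to _starts

    def insert(self, start, end, value):
        if start >= end:
            raise ValueError("empty range")
        i = bisect.bisect_left(self._starts, start)
        # Reject overlap with the predecessor or the successor.
        if i > 0 and self._entries[i - 1][1] > start:
            raise ValueError("overlaps previous range")
        if i < len(self._starts) and self._entries[i][0] < end:
            raise ValueError("overlaps next range")
        self._starts.insert(i, start)
        self._entries.insert(i, (start, end, value))

    def find(self, addr):
        """Return the value whose range contains addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, end, value = self._entries[i]
            if addr < end:
                return value
        return None


if __name__ == "__main__":
    m = RangeMap()
    m.insert(0x1000, 0x3000, "heap")
    m.insert(0x5000, 0x6000, "stack")
    print(m.find(0x2FFF), m.find(0x3000), m.find(0x5800))
```

A single ordered structure like this answers both "which range contains this address" and "what is the next range", which is the role the abstract describes the rbtree, linked list, and vmacache as sharing today.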
It is currently possible to do a fast hypervisor update by preserving virtual machine state in memory during reboot. This approach relies on using emulated PMEM, DAX, and local live migration technologies.
As of today, there are a number of limitations with this approach:
- The interface to preserve VM memory is not very flexible. The size and location of PMEM must be determined prior to...
Preserved-over-kexec memory storage, or PKRAM, provides an API for saving memory pages of the currently executing kernel so that they may be restored after kexec into a new kernel. PKRAM does this flexibly, without requiring the preserved memory to be a fixed-size region set aside a priori.
One use case for PKRAM is preserving guest memory and/or auxiliary supporting
data...
Lock throughput can be increased by handing a lock to a waiter on the
same NUMA node as the lock holder, provided care is taken to avoid
starvation of waiters on other NUMA nodes. This talk will discuss CNA
(compact NUMA-aware lock) as the slow path alternative for the current
implementation of qspinlocks in the kernel.
CNA is a NUMA-aware version of the MCS spin-lock. Spinning threads...
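The handoff policy from the first paragraph of this abstract (prefer a waiter on the lock holder's node, but bound how long other nodes can be bypassed) can be illustrated with a toy single-threaded simulation. This is not the CNA or qspinlock code: the two-queue layout is a simplification, and the handoff_order function and its max_local_handoffs threshold are assumptions made up for this sketch.

```python
from collections import deque


def handoff_order(waiters, first_node, max_local_handoffs=2):
    """Return the order in which waiters receive the lock.

    waiters: list of (name, numa_node) in arrival order.
    first_node: NUMA node of the initial lock holder.
    max_local_handoffs: hypothetical cap on consecutive same-node handoffs.
    """
    main = deque(waiters)
    secondary = deque()  # waiters bypassed because they sit on another node
    order = []
    node = first_node
    local_streak = 0
    while main or secondary:
        nxt = None
        if local_streak < max_local_handoffs:
            # Look for the first waiter on the holder's node.
            skipped = deque()
            while main:
                w = main.popleft()
                if w[1] == node:
                    nxt = w
                    break
                skipped.append(w)
            if nxt is None:
                main = skipped  # nobody local; put the queue back
            else:
                secondary.extend(skipped)
                local_streak += 1
        if nxt is None:
            # No local waiter, or the cap was hit: bypassed waiters go
            # first, in their original order, so they cannot starve.
            main = deque(list(secondary) + list(main))
            secondary = deque()
            nxt = main.popleft()
            local_streak = 0
        order.append(nxt[0])
        node = nxt[1]
    return order


if __name__ == "__main__":
    ws = [("a", 1), ("b", 0), ("c", 1), ("d", 0), ("e", 0)]
    print(handoff_order(ws, first_node=0))
```

With the waiters above and the first holder on node 0, strict FIFO order would move the lock across nodes four times, while this policy does so twice. Setting max_local_handoffs to 0 degenerates to plain FIFO.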