Linux Plumbers Conference 2023

Name: Linux Plumbers Conference 2023
Start: 2023-11-13T09:00:00-05:00
End: 2023-11-15T23:30:00-05:00
13–15 Nov 2023

America/New_York timezone

contact@linuxplumbersconf.org

Topics

272. Welcome!

Sasha Levin, Shuah Khan

13/11/2023, 09:30

Kernel Testing & Dependability MC

217. Welcome message and DL Server

Daniel Bristot de Oliveira (Red Hat, Inc.)

13/11/2023, 09:30

Real-time and Scheduling MC

The DL server is a mechanism that uses a SCHED_DEADLINE entity to schedule an entire scheduler. It can serve multiple purposes. The base case is to schedule the CFS scheduler, so that CFS tasks are not starved by SCHED_FIFO tasks.

The basis of the server was presented by peterz some years ago, but it raised some concerns, for example the priority inversion between CFS and...
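As a rough illustration of the starvation problem this abstract describes, the sketch below models one CPU in discrete ticks, with an always-runnable FIFO-like task competing against a fair (CFS-like) task. The function name, the tick model, and the 50-out-of-1000 reservation are assumptions chosen for illustration; this is not how the kernel implements the server.

```python
def simulate(ticks, runtime=0, period=1):
    """Return how many ticks the fair (CFS-like) task gets to run.

    A FIFO-like task is always runnable and always wins the CPU, except
    while the hypothetical deadline server still has budget left in the
    current period.
    """
    fair_ticks = 0
    budget = 0
    for t in range(ticks):
        if t % period == 0:
            budget = runtime  # replenish the server at each period start
        if budget > 0:
            budget -= 1       # the server spends budget running the fair task
            fair_ticks += 1
        # otherwise the FIFO-like task monopolises this tick
    return fair_ticks


starved = simulate(10_000)                          # no server at all
served = simulate(10_000, runtime=50, period=1000)  # 5% reservation
```

Under these assumptions the fair task gets no CPU time without a server, and a bounded, predictable share once a reservation exists.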

323. Introduction

Palmer Dabbelt (Google)

13/11/2023, 09:30

RISC-V MC

208. Deprecating Stuff

Palmer Dabbelt (Google)

13/11/2023, 09:35

RISC-V MC

Let's talk about what we can deprecate and when.

215. the path to achieve a bug-free build on the mainline

Philip Li

13/11/2023, 09:40

Kernel Testing & Dependability MC

There is a lot of focused testing effort across the Linux kernel community to guarantee the quality of the kernel from build to runtime. Nowadays, not only has the test process moved towards formalization, but test coverage has also increased to discover more issues earlier. On the other hand, some issues still escape to mainline.

In this talk, we will dive into the build...

220. Do nothing fast: How to scale idle CPUs?

Mathieu Desnoyers (EfficiOS Inc.)

13/11/2023, 09:55

Real-time and Scheduling MC

After surprising benchmark results showed that adding a global raw spinlock in the idle loop significantly improves performance of the scheduler-heavy hackbench benchmark on a 192-core AMD EPYC, a month-long investigation set out to understand the root cause of this behavior.

This presentation is meant to walk the audience through the findings and the resulting solution, opening...

174. Run ILP32 on RV64 ISA (RV64ILP32)

Mr Ren Guo

13/11/2023, 09:55

RISC-V MC

The 64ilp32 ABI is not a fresh topic; x86-x32, mips-n32, and arm64-ilp32 have all been around for many years but have yet to see wide usage. Still, running ILP32 on a 64-bit ISA has a magic power that keeps attracting people to try; now it is our turn. The rv64ilp32 patch series has reached its second version, combining u64ilp32 (user) & s64ilp32 (kernel), supporting the...

114. Storing and Outputting Test Information: KUnit Attributes and KTAPv2

Rae Moar

13/11/2023, 10:05

Kernel Testing & Dependability MC

Current kernel testing frameworks save basic test information including test names, results, and even some diagnostic data. But to what extent should frameworks store supplemental test information? This could include test speed, module name, file path, and even parameters for parameterized tests.

Storing this information could greatly improve the kernel developer experience by allowing test...

211. system pressure on CPUs capacity and feedback to scheduler

Vincent Guittot (Linaro)

13/11/2023, 10:15

Real-time and Scheduling MC

How to better reflect, in the scheduler, the pressure that can be applied to the CPUs' compute capacity, in order to improve task placement decisions and load balancing.
This is a follow-up to the talk at OSPM, and a patchset will be published before LPC.

250. RISC-V patchwork CI

Björn Töpel (N/A)

13/11/2023, 10:15

RISC-V MC

The RISC-V kernel has a number of different continuous integration (CI)
instances in the wild. This session covers the "patchwork CI", which
pulls patches from patchwork, and reports build/test results back to
the submitter. We will be presenting how the CI is set up, what builds
are done, and how tests are performed. Further, we will discuss
current limitations, and outline a "patchwork...

115. Testing Drivers with KUnit (Does hardware have to be hard?)

David Gow (Google)

13/11/2023, 10:30

Kernel Testing & Dependability MC

Unit testing common library code is (relatively) easy, but drivers often deal with a lot of global state, both in code and in hardware. New features like static stubbing go some way towards making this easier, but a lot of work still goes into making "fake devices".

There are still many open questions, however:
- Are the existing tools helping? Is there something obviously missing?
- Are...

178. Optimizing Chromium Low-Power Workloads on Intel Notebooks

Len Brown (Intel Corporation), Ricardo Neri (Intel Corporation), Mr Vaibhav Shankar (Intel Corporation)

13/11/2023, 10:35

Real-time and Scheduling MC

At LPC 2022 we hosted an Energy Quality of Service (EQOS) API discussion. The proposed API enables user-space to inform the kernel about something it is expert in: itself. Callers do not require any knowledge of the hardware, unrelated tasks, or the internal workings of the scheduler. The session sparked a lot of follow-on discussions, with the main take-away being “okay, so prototype it and...

200. Proposal of porting Trusted Execution Environment Provisioning (TEEP) Protocol with WorldGuard

Akira Tsukamoto

13/11/2023, 10:40

RISC-V MC

The objective of the TEEP Protocol is to install and update the latest critical software and data, called Trusted Components (TCs) at the IETF, on a target device or server.
In the procedure, the server remotely checks the trustworthiness of the target device, that is, whether it is compromised or not, and only installs and updates the software components if confirmed it is not...

66. How to reduce complexity in Proxy Execution

John Stultz (Google)

13/11/2023, 11:25

Real-time and Scheduling MC

The proxy execution patch series continues to be worked on to stabilize and get it ready for validation for use in products.

But its complexity is high.

I want to have a discussion to gather ideas on how we might break things up into more fine-grained patches that can go upstream iteratively, without making it an epic effort (hello, PREEMPT_RT!), or overwhelming reviewers ("[PATCH 1/628]...

225. Quality in embargoed patches

Sasha Levin

13/11/2023, 11:30

Kernel Testing & Dependability MC

The bar on the quality of code that fixes embargoed issues is pretty low: usually the case is that the code is only tested by the author, and possibly a handful of other folks who are part of working on the fix.

This session is a discussion to help draft a proposal for a testing story that could be presented to HW vendors that wish to publish embargoed code without going through the...

228. SBI Supervisor Software Events

Mr Clément Léger (Rivos Inc)

13/11/2023, 11:30

RISC-V MC

The Supervisor Software Events (SSE) extension provides a
mechanism to inject software events from an SBI implementation
to supervisor software such that it preempts all other traps and
interrupts. This brings interesting challenges for the SBI implementation (OpenSBI, KVM RISC-V, etc.) and supervisor software (Linux).

62. Adaptive userspace spinlocks with rseq

André Almeida (Igalia), Mathieu Desnoyers (EfficiOS Inc.)

13/11/2023, 11:45

Real-time and Scheduling MC

Implementing efficient spinlocks in userspace is not possible yet in Linux, even after years of different approaches and proposed solutions. The main gap is the lack of an ABI providing an easy and low-overhead way to check whether the current lock holder is running.

In this session, we are going to present the problem, and to propose a solution for it using the restartable...
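To make the gap concrete, here is a minimal sketch of the decision an adaptive lock waiter would like to make on each iteration of its wait loop. The names `next_action`, `holder_running`, and `max_spins` are hypothetical and do not correspond to any existing rseq field or API; `holder_running` stands in for exactly the information the abstract says userspace cannot cheaply obtain today.

```python
from enum import Enum


class Action(Enum):
    ACQUIRE = "acquire"  # lock is free: take it
    SPIN = "spin"        # holder is on a CPU: it should release soon
    BLOCK = "block"      # holder is preempted: spinning would waste cycles


def next_action(locked, holder_running, spins, max_spins=100):
    """Decide what a waiter should do on one iteration of its wait loop.

    `holder_running` is an assumed input: whether the current lock holder
    is executing on some CPU right now.
    """
    if not locked:
        return Action.ACQUIRE
    if holder_running and spins < max_spins:
        return Action.SPIN
    return Action.BLOCK
```

The spin bound is a design choice in this sketch: even when the holder is running, a waiter eventually falls back to blocking rather than spinning indefinitely.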

198. Perf feature improvements in RISC-V

Atish Patra (Rivos)

13/11/2023, 11:55

RISC-V MC

Until now, the RISC-V Linux kernel has had only basic perf support, with counter overflow and stat. This has its own limitations, and multiple perf-related ISA extensions are being drafted to address these concerns. We would like to discuss a few of the existing challenges and new issues related to the implementation of the new ISA extensions. For example, counter event mapping, event encoding, host + guest usage...

194. Detecting failed device probes

Laura Nao, Nicolas Prado (Collabora)

13/11/2023, 12:00

Kernel Testing & Dependability MC

Regressions that cause a device to no longer be probed by a driver can have a
big impact on the platform's functionality, and despite being relatively common
there isn't currently any generic way to detect them.

By enabling the community to catch device probe regressions in a way that
doesn't require additional work for every new platform, and that can catch
issues from config changes...

235. CPU Isolation state of the art

Frederic Weisbecker (Suse)

13/11/2023, 12:05

Real-time and Scheduling MC

Here's a tour of what has been done on the CPU isolation front
this year and what still needs to be achieved. Topics will include:

Memcg cache drain
Vmstat
Disable per-CPU buffer_head cache
IPI deferrals
cpusets v2 improvements
Osnoise tracer
Need for a nohz_full cpuset interface?
Sysidle (energy optimization)

169. RISC-V Vector: Current Status and Next?

Tao Chiu

13/11/2023, 12:10

RISC-V MC

In this talk we are going to briefly share the status of Vector extension support and focus our discussion on the use of Vector in kernel mode. We will do so by reviewing other architectures' approaches and asking whether there is anything we can carry over or improve for RISC-V.

Most architectures provide a SIMD instruction set to improve the throughput of some operations. However, the use of SIMD instructions...

197. Unifying and improving test regression reporting and tracking

Gustavo Padovan (Collabora), Ricardo Cañuelo

13/11/2023, 12:25

Kernel Testing & Dependability MC

The current CI systems for the kernel offer basic and low-level
regression detection and handling capabilities based on test results, but they do that in their own specific way. We wonder if we can find more common ways of tackling the problem through post-processing the data provided by the different CI systems. We could then extract additional "hidden" information, look at failure trends,...

76. Improving CPU Isolation with per-cpu spinlocks: performance cost and analysis

Mr Leonardo Bras Soares Passos (Red Hat)

13/11/2023, 12:25

Real-time and Scheduling MC

What do we want?
- Better CPU isolation, in order to run time-sensitive tasks without interruption

What is (one of the things) preventing this?
- queue_work_on(isolated_cpu)

While working on those, an interesting parallel programming strategy was noticed:
- Use per-cpu structures with local_lock; when a remote CPU needs any action performed, use queue_work_on(target_cpu).
- Works...
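The trade-off can be sketched with a small userspace model. Everything here is hypothetical: `PerCpu`, `queue_drain_on`, `run_pending_work`, and `drain_remotely` are illustrative names, a `threading.Lock` stands in for a per-CPU lock, and a deque stands in for the target CPU's workqueue. The second variant, in which the remote CPU takes the per-CPU lock itself, is inferred from the session title rather than stated in the abstract.

```python
import threading
from collections import deque


class PerCpu:
    """Toy model of one CPU's private state."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for a per-CPU lock
        self.cache = [1, 2, 3]        # some per-CPU cached objects
        self.pending_work = deque()   # stands in for this CPU's workqueue
        self.interruptions = 0        # times this CPU had to stop its task


def drain(cpu):
    """Empty the per-CPU cache and return how many items were dropped."""
    with cpu.lock:
        dropped = len(cpu.cache)
        cpu.cache.clear()
        return dropped


def queue_drain_on(target):
    """Variant 1: ask the target CPU to drain itself later,
    in the spirit of queue_work_on(target_cpu)."""
    target.pending_work.append(drain)


def run_pending_work(cpu):
    """Runs on the target CPU: each queued item interrupts its workload."""
    while cpu.pending_work:
        cpu.interruptions += 1
        cpu.pending_work.popleft()(cpu)


def drain_remotely(target):
    """Variant 2 (assumed): the requesting CPU takes the target's lock and
    does the work itself, so the target is never interrupted."""
    return drain(target)
```

In this model the first variant costs the target CPU one interruption per request, while the second moves that cost to lock contention on the per-CPU lock.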

271. Control Flow Integrity on RISCV

Deepak Gupta

13/11/2023, 12:25

RISC-V MC

Memory safety issues impact program safety and integrity. One implication of such issues is the subversion of the programmer-intended control flow of the program, and thus a violation of the program's control flow integrity. There have been various software (and hardware) mechanisms with which one can enforce the control flow integrity of a program. One such mechanism is using hardware assisted shadow...

136. Q&A about PREEMPT_RT

Thomas Gleixner

13/11/2023, 12:40

Real-time and Scheduling MC

Thomas will be open to people's questions about PREEMPT_RT and other topics.

269. RISC-V irqbypass with KVM

Andrew Jones (Ventana Micro Systems)

13/11/2023, 12:45

RISC-V MC

KVM and VFIO provide an architecture-neutral irqbypass framework, but
its enablement requires an implementation of an architecture-specific
function, kvm_arch_irq_bypass_add_producer(). The RISC-V AIA and IOMMU
specifications provide novel support for guest interrupt delivery (most
notably MRIFs), which must be considered for RISC-V KVM's irqbypass
implementation. We have an initial...

183. Improve Xeon IRQ throughput with posted interrupt

Jacob Pan

15/11/2023, 09:30

VFIO/IOMMU/PCI MC

Server SoCs today offer more PCIe lanes as well as the ability to stack more IO devices on a single port. Out of the box, devices such as high-speed NVMe drives can generate a significant number of interrupts at high frequencies. Due to microarchitecture choice and PCIe strong ordering requirements, limited IRQ throughput on Intel Xeon has also become a limiting factor for DMA throughput. IOPS...

119. PCI Endpoint Subsystem Open Items Discussion

Manivannan Sadhasivam

15/11/2023, 10:00

VFIO/IOMMU/PCI MC

The PCI Endpoint subsystem allows the Linux kernel to run on PCI endpoint devices, thereby establishing communication with the PCI host for data transfer. There are 3 open items to discuss for the PCI Endpoint subsystem:

The heart of the PCI Endpoint subsystem is the Endpoint Function (EPF) driver that describes the Physical and Virtual functions inside the Endpoint device. So far 3 EPF...

203. Non-discoverable devices in PCI devices

Lizhi Hou (AMD), Rob Herring (Arm)

15/11/2023, 10:30

VFIO/IOMMU/PCI MC

Modern PCI devices can expose a whole slew of hardware behind a single PCI "device". While the PCI device itself is discoverable, everything behind it (via BARs) is not. These devices aren't fixed in what downstream devices are exposed nor their configuration. There's already a solution for discovering devices and their configuration which is Devicetree. There's also already a mechanism to...

256. IOMMU overhead optimizations and observability

Pasha Tatashin, Yu Zhao (Google)

15/11/2023, 11:30

VFIO/IOMMU/PCI MC

IOMMU overhead memory, which is primarily page table memory, is allocated directly from the buddy allocator, and is not charged or accounted for. Also, there is no easy way to debug IOMMU translations as there are no user interfaces that allow walking through IOMMU page tables. Below are the proposals to solve the problems.

Add an observability for IOMMU page table memory into...

110. iommufd discussion

Mr Jason Gunthorpe (NVIDIA Networking), Kevin Tian (Intel)

15/11/2023, 12:15

VFIO/IOMMU/PCI MC

Open discussion on iommufd topics that have not been settled on the mailing list prior to the conference:

IOMMU based dirty tracking
IOMMU nested translation
IOMMU userspace command queue
Unique driver features
iommufd support of SVA/PRI/PASID
ARM interrupt handling in VMs
Driver enablement for iommufd features

26. Android Microconference

The Android Micro Conference brings the upstream community and Android systems developers together to discuss issues and changes to the Android platform and their dependencies and interactions with the Linux kernel, allowing for collaboration on solutions for upstream.

Since last year's conference, there has been quite...

25. Build Systems Microconference

In the Linux ecosystem there are many ways to build all the software used to put together a running system. Whether it’s building all the binary packages for a binary Linux distribution, using a source-based distribution, or building an embedded system from scratch, there are a lot of shared challenges which each system solves in its own way.

This microconference is a way to get people...

31. Compute eXpress Link (CXL)

Mr Gregory Price (MemVerge Inc)

Compute Express Link is a cache coherent fabric that in recent years has been gaining momentum in the industry. CXL 3.0 launched just before Plumbers 2022 (where very early discussions were had), bringing new challenges such as dynamic capacity devices and large scale fabrics, two features that bring significant challenges to Linux. There has also been controversy and confusion in the Linux...

22. Confidential Computing Microconference

The Confidential Computing microconferences in the past years brought together developers working on secure execution features in hypervisors, firmware, the Linux kernel, and low-level user space, up to container runtimes. A broad range of topics was discussed, ranging from enablement for hardware features up to generic attestation workflows.

Over the last year there was progress on the development...

12. Containers and checkpoint/restore

The usual containers and checkpoint/restore micro-conference.

We will be discussing recent advancements in container technologies with some of the usual candidates being:

CGroupV2 feature parity with CGroupV1
Emulation of various files and system calls through FUSE and/or Seccomp
Dealing with the eBPF-ification of the world
Making user namespaces more accessible
VFS...

33. Internet of Things MC

The IoT Microconference is a forum for developers to discuss all things IoT. Topics include tools, telemetry, device drivers, and protocols in not only the Linux kernel but also Real-Time Operating Systems such as Zephyr.

Since last year, there have been a number of new technical topics with significant updates.

Opportunities in IoT and Edge computing with the Linux /dev/accel API
-...

8. Kernel Testing & Dependability MC

The Linux Plumbers 2023 Kernel Testing & Dependability track focuses on advancing the current state of testing of the Linux Kernel and its related infrastructure. The main purpose is to improve software quality and dependability for applications that require predictability and trust. We aim to create connections between folks working on similar projects, and help individual projects make...

37. KVM

James Gowans (Amazon EC2)

KVM (Kernel-based Virtual Machine) enables the use of hardware features to improve the efficiency, performance, and security of virtual machines created and managed by userspace. KVM was originally developed to host and accelerate "full" virtual machines running a traditional kernel and operating system, but has long since expanded to cover a wide array of use cases, e.g. hosting real time...

14. Linux Kernel Debugging Microconference

When things go wrong, we need to debug the kernel. There are about as many ways to do that as you can imagine: printk, kdb/kgdb over serial, tracing, attaching debuggers to /proc/kcore, and post-mortem debugging using core dumps, just to name a few. Frequently, tools and approaches used by userspace debuggers aren't enough for the requirements of the kernel, so special tools are created to...

15. Live Patching Microconference

The Live Patching microconference at Linux Plumbers 2023 aims to gather stakeholders and interested parties to discuss proposed features and outstanding issues in live patching.

Live patching is a critical tool for maintaining system uptime and security by enabling fixes to be applied to running systems without the need for a reboot. The development of the infrastructure is an ongoing...

36. Power Management and Thermal Control Micro-conference

The Power Management and Thermal Control Microconference focuses on power management and thermal control infrastructure, CPU and device power-management mechanisms, and thermal control methods. In particular, we are interested in improving the thermal control infrastructure in the kernel to cover more use cases and utilizing energy-saving opportunities offered by modern hardware in new...

16. Real-time and Scheduling MC

The real-time and scheduling micro-conference joins these two intrinsically connected communities to discuss the next steps together.

Over the past decade, many parts of PREEMPT_RT have been included in the official Linux codebase. Examples include real-time mutexes, high-resolution timers, lockdep, ftrace, RCU_PREEMPT, threaded interrupt handlers, and more. The number of patches that need...

226. newidle load balance optimization on high core count system

Chen Yu

Real-time and Scheduling MC

Before a CPU becomes idle, it kicks off idle load balance to pull tasks from other run queues to utilize the CPU and prevent it from idling. However, this search has a potential scalability problem when the number of CPUs and sched groups in the sched domain increases.

Idle load balance potentially traverses all sched domains and calculates the statistics one by one. The time cost on idle...

18. RISC-V MC

We'd like to propose another edition of the RISC-V microconference for Plumbers 2023. Broadly speaking, anything related to both Linux and RISC-V is on topic, but discussions tend to involve the following categories:

How to support new RISC-V ISA features in Linux, both for the standards and for vendor-specific extensions.
Discussions related to RISC-V based SOCs, which frequently...

24. Rust

Rust is a systems programming language that is making great strides in becoming the next big one in the domain.

Rust for Linux is the project adding support for the Rust language to the Linux kernel. Rust has a key property that makes it very interesting as the second language in the kernel: it guarantees no undefined behavior takes place (as long as unsafe...

35. Tracing Microconference

The Linux kernel has grown in complexity over the years. Complete understanding of how it works via code inspection has become virtually impossible. Today, tracing is used to follow the kernel as it performs its complex tasks. Tracing is used today for much more than simply debugging. Its framework has become the way for other parts of the Linux kernel to enhance and even make possible new...

29. VFIO/IOMMU/PCI MC

The PCI interconnect specification, the devices that implement it, and the system IOMMUs that provide memory and access control to them are nowadays a de-facto standard for connecting high-speed components, incorporating more and more features such as:

Address Translation Service (ATS)/Page Request Interface (PRI)
Single-root I/O Virtualization (SR-IOV)/Process Address...
