Description
The real-time community around Linux has been responsible for important changes in the kernel over the last few decades. Preemptive mode, high-resolution timers, threaded IRQs, sleeping locks, tracing, deadline scheduling, and formal tracing analysis are integral parts of the kernel rooted in real-time efforts, mostly from the PREEMPT_RT patch set. The real-time and low-latency properties of Linux have enabled a series of modern use cases, such as low-latency network communication with NFV and the use of Linux in safety-critical systems.
This MC is the space for the community to discuss the advances of Linux in real-time and low-latency features. Topics include, but are not limited to:
- Remaining pieces for the PREEMPT_RT merge
- Advances in the fully preemptive mode
- CPU isolation (mainly about how to make it dynamic)
- Tools for PREEMPT_RT and low-latency analysis
- Tools for detecting non-optimal uses of PREEMPT_RT
- Improvements to locks that are not protected against priority inversion
- General improvements for locking
- General improvements for scheduling
- Other RT operating systems that run in parallel with Linux, and their integration with Linux
- Real-time virtualization
Examples of topics discussed by the community in recent years that made progress through the RT MC:
- timerlat/osnoise tracers and RTLA
- DL server for starvation avoidance
- Proxy execution (still under discussion)
- Tracing improvements - for example, to trace IPIs
Join us to discuss the future of real-time and low-latency Linux.
Ensuring temporal correctness of real-time systems is challenging.
The level of difficulty is determined by the complexity of hardware, software, and their interaction.
Real-time analysis on modern complex hardware platforms with modern complex software ecosystems, such as the Linux kernel with its userland, is hard or almost impossible with traditional methods like formal verification or...
Some kernel code implements a parallel programming strategy that grabs local_locks() for most of the work, and then uses schedule_work_on(cpu) when some rare remote operations are needed. This is quite efficient for throughput, since it keeps cachelines mostly local and avoids locks in non-RT kernels, paying the price only when a remote CPU has to be touched.
On the other hand, that's quite bad...
While working to reduce latency in KVM guests, we have seen many missed deadlines caused by RCU core invocation: often the guest exits only for a timer interrupt to invoke rcu_core() on the host and cause a task switch.
While looking to improve that, we noticed that no RCU lock is held in guest context, so it is possible to report a quiescent state on guest exit,...
CPU isolation allows us to shield a subset of CPUs from a lot of kernel interference, but not all of it. Activity on the housekeeping CPUs can and does trigger IPIs that can still end up targeting isolated CPUs. The main culprits here are static key updates and vunmap() with the resulting flush_tlb_kernel_range().
As discussed in previous editions, since these IPIs are only relevant to the...
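As a rough illustration of the isolated/housekeeping split mentioned above, the following sketch reads the CPU lists that the kernel exposes in sysfs and derives the housekeeping set. It assumes a Linux system where `/sys/devices/system/cpu/isolated` reflects the boot-time `isolcpus=` set; the helper names `parse_cpulist` and `read_cpulist` are hypothetical and only handle the simple `a-b,c` list format.

```python
from pathlib import Path

# Standard sysfs directory for CPU topology files on Linux.
CPU_SYSFS = Path("/sys/devices/system/cpu")


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "2-5,8" into a sorted list of ints.

    Hypothetical helper: only plain numbers and simple ranges are handled.
    """
    cpus = set()
    for chunk in text.strip().split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # an empty file means "no CPUs"
        if "-" in chunk:
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


def read_cpulist(name, base=CPU_SYSFS):
    """Read and parse one cpulist file; return [] if it cannot be read."""
    try:
        return parse_cpulist((base / name).read_text())
    except OSError:
        return []


if __name__ == "__main__":
    isolated = read_cpulist("isolated")
    online = read_cpulist("online")
    # Housekeeping CPUs are assumed here to be the online CPUs that are not isolated.
    housekeeping = [cpu for cpu in online if cpu not in isolated]
    print("isolated:    ", isolated)
    print("housekeeping:", housekeeping)
```

This only reports the static configuration; it does not observe the IPIs themselves.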
FIFO tasks may starve non-RT tasks, which is mitigated by RT throttling.
Deadline servers have been introduced and are still under development as an alternative to mitigate and avoid starvation of non-RT tasks.
There is, however, the chance that some other FIFO tasks will be starved and that could lead to system deadlock.
I would like to open the discussion about the possibility...
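For readers unfamiliar with RT throttling, the sketch below computes the share of CPU time it leaves to non-RT tasks from the `sched_rt_runtime_us` and `sched_rt_period_us` sysctls. It assumes a kernel that exposes both files under `/proc/sys/kernel`; the function names `non_rt_share` and `read_rt_throttling` are hypothetical, and the sketch says nothing about how deadline servers are configured.

```python
from pathlib import Path

# Standard procfs location of the RT throttling sysctls.
PROC_KERNEL = Path("/proc/sys/kernel")


def non_rt_share(runtime_us, period_us):
    """Fraction of each period that RT throttling leaves for non-RT tasks.

    A negative runtime (the sysctl value -1) disables throttling, so
    nothing is reserved for non-RT tasks.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        return 0.0
    return max(0.0, (period_us - runtime_us) / period_us)


def read_rt_throttling(base=PROC_KERNEL):
    """Return (runtime_us, period_us) as currently configured."""
    runtime = int((base / "sched_rt_runtime_us").read_text())
    period = int((base / "sched_rt_period_us").read_text())
    return runtime, period


if __name__ == "__main__":
    try:
        runtime, period = read_rt_throttling()
    except (OSError, ValueError) as exc:
        print(f"could not read RT throttling sysctls: {exc}")
    else:
        share = non_rt_share(runtime, period)
        print(f"runtime={runtime}us period={period}us non-RT share={share:.1%}")
```

With the common defaults of 950000 and 1000000 microseconds, the reserved share is 5% of each period.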