Description
The Containers and Checkpoint/Restore micro-conference focuses on both userspace- and kernel-related work. The micro-conference targets the wider container ecosystem, ideally with participants from all major container runtimes as well as init system developers.
The micro-conference will discuss recent advancements in container technologies, with some of the usual candidates being:
- VFS API improvements (new system calls, idmap, ...)
- CGroupV2 feature parity with CGroupV1 and migration path
- Dealing with the eBPF-ification of the world
- Mediating and intercepting complex system calls
- Making user namespaces more accessible
- Verifying the integrity of containers
On the checkpoint/restore front, some of the potential topics include:
- Making CRIU work with modern Linux distributions
- Handling GPUs
- Restoring FUSE daemons
- Dealing with restartable sequences
A variety of other container and checkpoint/restore topics will quite likely be added as things evolve between now and the event.
Past editions of this micro-conference have been the source of many developments in the Linux kernel, including:
- PIDfds
- VFS idmap (and adding it to a slew of filesystems)
- FUSE in user namespaces
- Unprivileged overlayfs
- Time namespace
- A variety of CRIU features and checkpoint/restore kernel interfaces, the latest among them being:
  - Unprivileged checkpoint/restore
  - Support for rseq(2) checkpointing
- IMA/TPM attestation work
Unsolved CRIU problems.
1) Restoring complex process trees.
Processes cannot enter a pre-existing session (sid); sessions can
only be inherited. (The same holds for process groups (pgid) in nested pid namespaces.)
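To make the constraint concrete, here is a minimal sketch (not CRIU code) using only the standard POSIX calls exposed by Python's `os` module. The helper name `session_ids` is hypothetical. It shows that a process can create a new session with `setsid()` and that a forked descendant inherits it. No call exists that lets an unrelated process join that session afterwards.

```python
import os

def session_ids():
    """Return (parent_sid, child_sid, grandchild_sid) observed across two forks."""
    parent_sid = os.getsid(0)
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: create a brand-new session and become its leader.
        os.close(r)
        os.setsid()
        child_sid = os.getsid(0)
        gpid = os.fork()
        if gpid == 0:
            # Grandchild: makes no session call, it simply inherits the sid.
            os.write(w, f"{child_sid} {os.getsid(0)}".encode())
            os._exit(0)
        os.waitpid(gpid, 0)
        os._exit(0)
    # Parent: read the report until both descendants have closed the pipe.
    os.close(w)
    data = b""
    while chunk := os.read(r, 64):
        data += chunk
    os.close(r)
    os.waitpid(pid, 0)
    child_sid, grandchild_sid = map(int, data.decode().split())
    return parent_sid, child_sid, grandchild_sid

if __name__ == "__main__":
    parent, child, grandchild = session_ids()
    print("parent sid:", parent)
    print("child sid (after setsid):", child)
    print("grandchild sid (inherited):", grandchild)
```

Because the session can only be handed down through `fork()`, a restorer has to recreate processes in an order that lets each one inherit the right session from an ancestor, which is what makes complex trees hard to rebuild.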
Probable solution 1 - CABA:
The idea was to save as much of the...
Container checkpointing has recently been enabled in orchestration platforms like Kubernetes, where the smallest deployable unit is a Pod (a group of containers). However, these platforms are often used to deploy distributed applications running across multiple nodes, which presents a new challenge: How to create consistent global checkpoints of distributed applications running in multiple...
This talk is about the problem of integrating the concept of an "isolated" ([1], [2], [3], [4]) user namespace with the cgroup-v2 delegation model.
The biggest challenge here is that cgroup delegation is based on cgroupfs inode ownership, and the cgroupfs superblock is shared between all containers, which makes it impossible to deal with cgroupfs the way one deals with any other containerized filesystem like...
PuzzleFS is a container filesystem designed to address the limitations of the existing OCI format. The main goals of the project are reduced duplication, reproducible image builds, direct mounting support and memory safety guarantees, some inspired by the OCIv2 brainstorm document.
Reduced...
- New machines with 512+ hardware threads (and thus logical CPUs) bring
  interesting challenges for user-space per-CPU data structures due to
  their large memory use.
- The RSEQ per-memory-map concurrency IDs (upstreamed in Linux v6.3)
  allow indexing user-space memory based on indexes derived from the
  number of concurrently running threads,
- I plan to apply the same concept to...