We are pleased to announce that the Containers and Checkpoint/Restore Microconference has been accepted into the 2021 Linux Plumbers Conference! The Containers and Checkpoint/Restore microconference brings together kernel developers, runtime maintainers, and developers working on container- and sandboxing-related technologies in general to discuss current problems and agree on new features.
Last year’s meetup resulted in:
- Overlayfs becoming mountable inside unprivileged containers
- The idmapped mount patchset, which adds the ability to dynamically alter filesystem permissions for unprivileged containers (a problem that a decade of mailing list discussions had never solved)
- The merge of a new checkpoint/restore capability
- Hardening of procfs mount points
- Better integration of systemd-udevd with unprivileged containers
This year’s edition of the Containers and Checkpoint/Restore microconference will focus on a variety of topics that are in need of discussion. The list of ideas is constantly evolving and, as past experience has shown, we expect even more topics to pop up during the coming months. Here is an excerpt:
- How best to use CAP_CHECKPOINT_RESTORE in CRIU to make it possible to run checkpoint/restore as non-root.
- Extending the idmapped mount feature to unprivileged containers, i.e. agreeing on a sane and safe delegation mechanism with clean semantics.
- Porting more filesystems to support idmapped mounts.
- Making it possible for unprivileged containers and unprivileged users in general to install fanotify subtree watches.
- Discussing and agreeing on a concept of delegated mounts, i.e. the ability for a privileged process to create a mount context that can be handed off to a less privileged process, which can then interact with it safely.
- Fixing outstanding problems in the seccomp notifier to handle syscall preemption cleanly. A patchset for this is already out, but we need a more thorough understanding of the problem and its proposed solution.
- With more container engines and orchestrators supporting checkpoint/restore, the idea has come up to provide an optional interface through which applications can be notified that they are about to be checkpointed. One possible example is a JVM that could perform cleanups that do not need to be part of a checkpoint.
- Discussing an extension of the seccomp API that would ideally make it possible to attach a seccomp filter to another task, i.e. the inverse of the current caller-based seccomp sandboxing model, enabling full supervisor-based sandboxing.
- Integration of the new Landlock LSM into container runtimes.
- Although checkpoint/restore can handle cgroupv1 correctly, cgroupv2 support is very limited, and there is a need to figure out what is still missing for v2 to be supported just as well as v1.
- Isolated user namespaces (each with the full 32-bit uid/gid range) and an easier way for users to create and manage them.
- Figuring out what is missing at the checkpoint/restore level, and maybe at the container runtime level, to support optimal checkpoint/restore integration at the orchestration level. The pod concept of Kubernetes in particular introduces new challenges that have not been part of checkpoint/restore before (containers sharing namespaces, for example).
Come join us and participate in the discussion of what holds “The Cloud” together.
We hope to see you there!