Description
The track will be composed of talks, 30 minutes in length (including Q&A discussion). Topics will be advanced Linux networking and/or BPF related.
This year's Networking and BPF track technical committee is comprised of: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, Alexei Starovoitov, Daniel Borkmann (chair), and Andrii Nakryiko.
-
Alexei Starovoitov (Meta)12/09/2022, 10:00
BPF programs can be written in C, Rust, Assembly, and even in Python. The majority of the programs are in C. The subset of C usable to write BPF programs was never strictly defined. It started very strict. Loops were not allowed, global variables were not available, etc. As the BPF ecosystem grew, the subset of the C language became bigger. But then something interesting happened. The C language itself...
Go to contribution page -
Benjamin Tissoires (Red Hat)12/09/2022, 10:30
HID (Human Interface Device) is an old protocol which handles input devices. It is supposed to be standard and to allow devices to work without the need for a driver. Unfortunately, it is not standard, merely "standard".
The HID subsystem has roughly 80 drivers, half of them are fixing only one tiny bit, either in the protocol of the device or in the key mapping for...
Go to contribution page -
Barret Rhoden (Google)12/09/2022, 11:00
Ghost is a kernel scheduling class that allows userspace and eBPF programs, called the "agent", to control the scheduler.
Following up on last year's LPC talk, I'll cover:
- How BPF works in Ghost
- An agent that runs completely in BPF: no userspace scheduling required!
- Implementation details of "Biff": a bpf-hello-world example scheduler.
- Future work, including CFS-in-BPF, as...
Go to contribution page -
Yuchung Cheng (Google)12/09/2022, 12:00
For better or worse, TCP remains the main transport of many hyperscale data-center networks. Optimizing TCP has been a hot topic in both academic research and industry R&D. However, individual research papers often focus on solving a specific problem (e.g. congestion control for data-center incast) and the industry solutions are often not public or generically applicable. Since Linux TCP...
Go to contribution page -
David Ahern12/09/2022, 12:30
Ethernet networking speeds continue to increase - 100G is common today for both NICs and switches, 200G has been available for years, 400G is the current cutting edge with 800G on the horizon. As the speed of the physical layer increases, how does S/W scale - specifically, the Linux IP/TCP stack? Is it possible to leverage the increasing line-rate speeds for a single flow? Consider a few...
Go to contribution page -
Martin Lau (Meta)12/09/2022, 13:00
BPF has grown rapidly. In the networking stack, a BPF program can do much more than a few years ago. It could be overwhelming to figure out which bpf hook should be used, what is available at a particular layer and why. This talk will go through some of the bpf hooks in the networking stack with use cases in Meta. The talk will also get to some common questions/confusions that the users...
Go to contribution page -
KP Singh (Google)12/09/2022, 15:00
Signing BPF programs has been a long ongoing discussion and there has been some more concrete work and discussions since the BPF office hours talk in June.
There was a BoF session at the Linux security summit in Austin between BPF folks (KP and Florent) and IMA developers (Mimi, Stefan and Elaine) to agree on a solution to have IMA use BPF signatures.
The BPF position is to provide...
Go to contribution page -
Jinghao Jia (University of Illinois Urbana-Champaign), Prof. Tianyin Xu (University of Illinois at Urbana-Champaign)12/09/2022, 15:30
Seccomp, the widely used system-call security module in Linux, is among the few that still exposes classic BPF (cBPF) as the programming interface, instead of the modern eBPF. Due to the limited programmability of cBPF, today's Seccomp filters mostly implement static allow-deny lists. The only way to implement advanced policies is to delegate them to user space (e.g., Seccomp Notify); however,...
Go to contribution page -
Jiri Olsa (Isovalent)12/09/2022, 16:00
There's an ongoing effort to speed up attaching of multiple probes, which resulted in the new bpf 'kprobe_multi' link interface. This allows fast attachment of many kprobes (thousands) and is now supported, for example, in bpftrace.
A similar interface is also being developed for trampolines, but it's a bit more bumpy road than for kprobes for various reasons.
I'll briefly sum up multi kprobe...
Go to contribution page -
Vaishali Thakkar, Javier Honduvilla Coto12/09/2022, 17:00
One of the important jobs of system-wide profilers is to capture stack traces without requiring recompilation or redeployment of profiled applications. This becomes difficult when the profiler has to deal with the binaries compiled in different languages. Heavy lifting for the stack unwinding is done by the kernel if frame pointers are present or if the binary has ORC - in kernel debug...
Go to contribution page -
Dmitrii Dolgov (Red Hat)12/09/2022, 17:30
Having full visibility throughout the system you build is a well-established best practice. Usually one knows which metrics to collect, how and what to profile or instrument to understand why the system exhibits this level of performance. All of this becomes more challenging as soon as an eBPF layer is included.
In this talk Dmitrii will shed some light on those bits of your service that...
Go to contribution page -
Jakub Kicinski (Meta)13/09/2022, 10:00
Netlink is a TLV based protocol we invented and use in networking for most of our uAPI needs. It supports seamless extensibility, feature discovery and has been hardened over the years to prevent users from falling into uAPI extensibility gotchas.
Nevertheless, netlink remains very rarely used outside of networking. It's considered arcane and too verbose (requires defining operations,...
Go to contribution page -
Daniel Borkmann (Isovalent)13/09/2022, 10:30
Since the early days of eBPF, Cilium's core building block for its datapath is tc BPF. With more adopters of eBPF in the Kubernetes landscape, there is growing risk from a user perspective that Pods orchestrating tc BPF programs might step on each other, leading to hard to debug problems.
We dive into a recently experienced incident, followed by our proposal of a revamped tc ingress/egress...
Go to contribution page -
Anton Protopopov (Isovalent)13/09/2022, 11:00
There is a growing need for online packet classification in BPF-based networking solutions. In particular, in Cilium we have two use cases: the PCAP recorder for the standalone XDP load balancer [1] and the k8s network policies. The PCAP recorder implementation suffers from slow and dangerous updates due to runtime recompilation, and both use cases require specifying port ranges in rules,...
Go to contribution page -
Jakub Sitnicki (Cloudflare), Marek Majkowski (Cloudflare)13/09/2022, 12:00
When establishing connections, a client needs a source IP address. For better or worse, network and service operators often assign traits to client IP addresses such as a reputation score, geolocation or traffic category, e.g. mobile, residential, server. These traits influence the way a service responds.
Transparent Web proxies, or VPN services, obfuscate true client IPs. To ensure a good...
Go to contribution page -
Stanislav Fomichev (Google)13/09/2022, 12:30
Google's container management system runs different workloads on the same host. To effectively manage networking resources, the kernel has to apply different networking policies to different containers.
Historically, most of the networking resource control happened inside proprietary Google networking cgroup. That cgroup is an interesting cross between upstream net_cls and net_prio, has a...
Go to contribution page -
Dave Thaler (Microsoft)13/09/2022, 13:00
At LSF/MM/BPF, the topic was raised of better documenting eBPF and producing "standards"-like documentation, especially since runtimes other than Linux now support eBPF.
This presentation will summarize the current state of the eBPF Foundation effort on these lines, how it is organized, and invite discussion and feedback on this topic.
Go to contribution page -
Toke Høiland-Jørgensen (Red Hat)13/09/2022, 15:00
Packet forwarding is an important use case for XDP; however, XDP currently offers no mechanism to delay, queue or schedule packets. This limits the practical uses for XDP-based forwarding to those where the capacity of input and output links always match each other (i.e., no rate transitions or many-to-one forwarding). It also prevents an XDP-based router from doing any kind of traffic shaping...
Go to contribution page -
Jesper Dangaard Brouer (Red Hat)13/09/2022, 15:30
The idea for XDP-hints, which is XDP gaining access to HW offload hints, dates back to [Nov 2017][1]. We believe the main reason XDP-hints work has stalled is that upstream we couldn't get consensus on the layout of the XDP metadata. BTF was not ready at that time.
We believe the flexibility of BTF can resolve the layout issues, especially since BTF has evolved to include support for...
Go to contribution page -
Saeed Mahameed (Nvidia), Mark Bloch (Nvidia)13/09/2022, 16:00
For a long time now the industry has been building programmable processors into devices to run firmware code. This is a long standing design approach going back decades at this point. In some devices the firmware is effectively a fixed function and has little in the way of RAS features or configurability. However, a growing trend is to push significant complexity into these devices...
Go to contribution page -
Aditi Ghag (Isovalent)13/09/2022, 17:00
Socket termination for policy enforcement and load-balancing
Cloud-native environments see a lot of churn where containers can come and go. We have compelling use cases like eBPF enabled policy enforcements and socket load-balancing, where we need an effective way to identify and terminate sockets with active as well as idle connections so that they can reconnect when the remote...
Go to contribution page -
Matthieu Baerts (Tessares)13/09/2022, 17:30
Multipath TCP (MPTCP) was initially supported in v5.6 of the Linux kernel. In subsequent releases, the MPTCP development community has steadily expanded from the initial baseline feature set to now support a broad range of MPTCP features on the wire and through the socket and generic Netlink APIs.
With core MPTCP functionality established, our next goal is to make MPTCP more extensible and...
Go to contribution page -
Brian Vazquez (Google)13/09/2022, 18:00
As platforms grow in CPU count (200+ CPUs), using per-CPU data structures is becoming more and more expensive. Copying the percpu data from the bpf hashtab map to userspace buffers can take up to 22 us per entry on a platform with 256 cores.
This talk presents a detailed measurement study of the cost of percpu hashtab traversal, covering various methods and systems with core counts.
We will...
Go to contribution page -
Quentin Monnet (Isovalent)14/09/2022, 10:00
So we have [the BPF CI][1], managed by Meta. It picks up patches from Patchwork, turns them into Pull Requests on GitHub, and through the GitHub Actions CI/CD framework runs the selftests with these patches on dedicated runners.
Thanks to this architecture, it is relatively easy to create Pull Requests and run the CI on another Linux repository on GitHub. However, the CI is being worked on...
Go to contribution page -
Joe Stringer (Isovalent)14/09/2022, 10:30
Currently, the BPF API for the LRU map type does not give any indication about when insertions into the map result in the eviction of another entry. This session is to discuss use cases when it would be useful to measure LRU eviction in order to provide insight into load and to tweak control plane behaviour. With this insight we can look at a proposal to make this possible through the BPF API.
Go to contribution page -
Lorenz Bauer14/09/2022, 11:00
While working on github.com/cloudflare/tubular we discovered that it's possible for a program with CAP_BPF to circumvent file permissions of BPF map fds, effectively making it impossible to enforce read-only access. In our case, a process exporting metrics from maps can't be prevented from also being able to modify those maps.
I will outline how permissions, map flags like BPF_F_RDONLY and...
Go to contribution page -
Alan Maguire (Oracle)14/09/2022, 12:00
BPF Compile Once - Run Everywhere (CO-RE) is a massive help when writing BPF programs that work across kernel versions, especially in the observability space, where we are often at the mercy of internal kernel changes in data structures and the like. However when writing BPF tracing programs, a major pain point is compiler optimizations which often mean the function - though not inlined - is...
Go to contribution page -
Prof. Theophilus Benson (Brown University), Dr Palanivel Kodeswaran (IBM Research), Dr Sayandeep Sen (IBM Research)14/09/2022, 12:30
Case for OPENED for eBPF NF Development
The recent past has seen the emergence of eBPF in building high performance networking use cases such as load balancing, K8s CNI, DDoS protection, traffic shaping etc. However, unlike traditional software datapath technologies, eBPF code development exhibits enormous heterogeneity in terms of choice of kernel hook points, data sharing mechanisms as...
Go to contribution page -
Alexandra Winter (IBM)14/09/2022, 13:00
Originally defined for the switchdev model, learning_sync is now used to create a Linux bridge that provides a network interface that merges two very different fabrics.
In this talk you will learn
- About the motivation for our usecase.
- How we converged a vanilla network segment with a high performance fabric, that connects only subsets of the segment.
- Why we chose the Linux...
Go to contribution page -
Nikita Baksalyar
eBPF gives us an extraordinary amount of power, allowing us to attach custom programs to many subsystems in the kernel. We use it to build lots of helpful observability and security tools, but there is a problem: eBPF is hard and nonintuitive for developers who are not familiar with low-level programming concepts.
In this talk, we will discuss a novel approach to writing programs for the eBPF...
Go to contribution page