Description
The track will be composed of talks, 40 minutes in length (including Q&A discussion). Topics will be advanced Linux networking and/or BPF related.
This year's Networking and BPF track technical committee consists of David S. Miller, Jakub Kicinski, Eric Dumazet, Alexei Starovoitov, Daniel Borkmann, and Andrii Nakryiko.
Short intro/welcome session to the Networking and BPF track.
eBPF has been used extensively in performance profiling and monitoring. In this talk, I will describe a set of eBPF applications that help monitor and enhance CPU scheduling performance. These applications include:
- Profiling scheduling latencies. I will talk about an application of eBPF to collect scheduling latency stats.
- Profiling resource efficiency. For background, I will first...
This talk presents our recent work, available in the v5.14 kernel, which improves the SO_REUSEPORT functionality.
The SO_REUSEPORT option was introduced in v3.9. Before that, only one socket was allowed to listen() on any given TCP port. The traditional technique for a high-performance server is to have a single process that accept()s and distributes connections to other...
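As a rough illustration of the option this abstract refers to, the sketch below opens two TCP listeners on the same address and port by setting SO_REUSEPORT on each before bind(). It assumes a Linux host (v3.9 or later) and uses Python's standard socket module; the helper name reuseport_listener is hypothetical, and the sketch does not show the v5.14 improvements themselves.

```python
import socket


def reuseport_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Return a TCP listener that may share (host, port) with other
    listeners that also set SO_REUSEPORT (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set on every sharing socket before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    first = reuseport_listener(0)        # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    second = reuseport_listener(port)    # second listener on the same port
    print("both listeners bound to port", port)
    first.close()
    second.close()
```

With both listeners open, the kernel spreads incoming connections across them, so each worker process can accept() on its own socket instead of relying on a single dispatching process.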
XDP is designed for maximum performance, which is why certain driver use cases are not supported (e.g. Jumbo frames or TSO/LRO). The single-buffer-per-packet design defines a simple and fast memory model and allows eBPF Direct Access (DA) to packet data. Both of them are essential for performance. However, it is high time we closed the gap with the networking stack and enabled non-linear frame...
BPF programs are critical system components performing core networking functionality, system audit logging, tracing, and runtime security enforcement, to name a few. Charged with such crucial tasks, how do we audit the BPF subsystem itself to ensure system bugs are noticed and malicious attackers cannot subtly manipulate these components, inject new programs, or quietly run their own BPF...
Rust is becoming an increasingly popular choice as a systems programming language. In fact, it's been the #1 most loved language on Stack Overflow for the last 6 years. Aside from being fast, type safe and memory safe, its tooling is excellent which yields high developer productivity. It has been used to write embedded systems software, it is central to the WebAssembly ecosystem, and it is...
The DSA subsystem was originally built around Marvell devices, but has since
been extended to cover a wide variety of hardware with even wider views of
their management model. This presentation discusses the changes in DSA that
took place in recent years for this wide variety of switches to offer more
services, and in a more uniform way, to the larger network stack.
Summarized, these...
This talk will review the goals and requirements for a BPF memory model and look at more recent work on deriving memory-model litmus tests from example BPF programs. These examples will cover ordering within and among BPF programs, but also ordering with the kernel code that BPF programs can interact with.
This talk covers the use of the flow label in modern network environments and the operational effect of TCP hash 'rethinking' on negative routing events.
Since the cilium/ebpf pure Golang library was last presented at LPC 2019, a lot has changed. eBPF is now seemingly on everyone's radar, the eBPF Foundation is a thing, and more people are using and writing Go-based tools and services than ever. What does this mean for the library and the ecosystem around it? Who uses it, who's been contributing, and which use cases does the library enable...
With the rapid adoption of Cilium as the BPF-based datapath for Kubernetes as
well as integration into popular devops tooling such as kind [0] which allows
for running local Kubernetes clusters using Docker container 'nodes', we see
more advanced use (and corner) cases which have not yet been tackled from a
BPF and networking angle. Therefore, in this slot, we discuss various...
Motivation
Iptables has become synonymous with firewalling in the Linux world. Although
nftables is supposed to replace iptables, iptables will exist for
decades more because of its popularity and ubiquity.
With the growing use of BPF technology and its benefits, there is a
temptation to apply the technology to firewalling.
Problem...
Prior to LWT (Lightweight Tunnels) and modern eBPF, the only way to send encapsulated packets to multiple destinations was to create multiple tunnel devices, which didn’t scale well when thousands of different destinations were needed.
In the past Google solved this problem by introducing custom patches on top of the ip gre device to allow sockets to provide the destination...
This talk highlights a few rough edges in the overall BPF user experience that we have observed while building services with BPF at Cloudflare. We will showcase a set of problems, analyze their cause, and present possible workarounds. The goal of the talk is to share collected know-how with other users, and trigger discussions on potential improvements.
Collected cases fall into two...
In this talk, we describe important challenges in L4 and L7 load balancing for the consistent routing of packets across hosts, as well as across sockets within a host, once a packet is received in the XDP-based L4LB. We then describe how we leverage recent additions to BPF programs to address those challenges.
Typically some form of Consistent Hashing is used to pick an end host for...
The BPF verifier is an integral part of the BPF ecosystem, aiming to
prevent unsafe BPF programs from executing in the kernel. Due to its
complexity, the verifier is susceptible to bugs that can allow malicious
BPF programs through. A number of bugs have been found in the BPF
verifier, some of which have led to CVEs ([1], [2], [3]). These bugs are severe,
since the verifier is on the...
We present Pixie’s protocol tracer, which uses eBPF to provide instant observability into application messaging without requiring code instrumentation. Pixie’s protocol tracer uses eBPF kprobes on networking-related system calls to capture communication data, which it then parses into protocol messages. The messages are inserted into structured data tables that are easily queried by...
As eBPF becomes more popular and mainstream, one of the challenges of making it accessible to more users is how to distribute eBPF-powered applications. Unlike simpler applications, which involve shipping a binary or a container image, with eBPF we usually need to compile the program for the target kernel. This is a hurdle to adoption for both users and vendors. The CO-RE (Compile Once - Run...
This talk will present K2, an optimizing compiler that uses program synthesis to automatically produce safe, compact, and more performant BPF bytecode. K2 compresses BPF bytecode by 6–26%, improves throughput by 0–4.75%, and reduces average latency by 1.36–55.03%, across benchmarks from Cilium, Facebook Katran, hXDP, and the Linux kernel. We designed several domain-specific techniques to make...
We’ll discuss some recent and ongoing work we’ve been doing to audit Google’s Linux systems with eBPF. We’ll look at a case study of the problems we’ve solved for logging process lifecycles, and then look at the challenges we’re facing to make these systems as reliable and maintainable as possible. The topics we’ll cover include:
- A brief overview of the BPF LSM
- Why and how we ended up...
Although an IPv6-only environment is ideal, migration from an IPv4 environment is gradual and will present situations where an IPv6 client needs ongoing connectivity to an IPv4-only server. Such a communication path will need to use one of the existing IPv6 to IPv4 transition mechanisms (such as NAT or a dual IPv4 + IPv6 stack).
We will demonstrate a novel approach to this...
In Linux, the IPv4 code generally uses IPTOS_TOS_MASK (0x1e) when
handling the TOS (Type of Service) of IPv4. This mask follows the
definition of RFC 1349:
   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 |                       |     |
|   PRECEDENCE    |          TOS          | MBZ |
| ...
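To make the RFC 1349 layout concrete, the following minimal sketch splits a TOS byte into the three fields shown in the diagram. IPTOS_TOS_MASK (0x1e) comes from the abstract and IPTOS_PREC_MASK (0xe0) is the matching Linux constant for the precedence bits; MBZ_MASK and split_tos are hypothetical names introduced only for this illustration.

```python
IPTOS_PREC_MASK = 0xE0  # bits 0-2 in RFC numbering (the three most significant bits)
IPTOS_TOS_MASK = 0x1E   # bits 3-6: the TOS field
MBZ_MASK = 0x01         # bit 7: "must be zero" (hypothetical constant name)


def split_tos(tos_byte: int) -> dict:
    """Split an IPv4 TOS byte into its RFC 1349 fields (hypothetical helper)."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return {
        "precedence": (tos_byte & IPTOS_PREC_MASK) >> 5,
        "tos": (tos_byte & IPTOS_TOS_MASK) >> 1,
        "mbz": tos_byte & MBZ_MASK,
    }


if __name__ == "__main__":
    # 0x10 is the classic "minimize delay" TOS value (binary 1000 in the TOS field).
    print(split_tos(0x10))
```

Masking with IPTOS_TOS_MASK alone discards the precedence bits, which is the behavior the abstract attributes to the IPv4 code.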