Description
The track will be composed of talks, 45 minutes in length (including Q&A discussion). Topics will be advanced Linux networking and/or BPF related.
This year's Networking and BPF track technical committee is comprised of: David S. Miller, Daniel Borkmann, Alexei Starovoitov, Jakub Sitnicki, Paolo Abeni, Jakub Kicinski, Michal Kubecek, and Sabrina Dubroca.
We will present traceloop, a tool for tracing system calls in cgroups or in containers using in-kernel Berkeley Packet Filter (BPF) programs.
Many people use the “strace” tool to synchronously trace system calls using ptrace. Traceloop similarly traces system calls but with low overhead (no context switches) and asynchronously in the background, using BPF and tracing per cgroup. We will...
The 32-bit "mark" associated with the skb has served as a metadata exchange format for Linux networking subsystems since the beginning of the century. Over that time, the interpretation and reuse of the field has grown to encapsulate a wide range of networking use cases, expanding to touch everything from iptables, tc, xfrm, openvswitch, sockets, routing, to eBPF. In recent years, more than a...
We would like to present results of an estimation of tail calls costs between eBPF programs. This was carried out for two kernel versions, 5.4 and 5.5. The latter introduces an optimization to remove the retpoline mitigating spectre flaws, in certain conditions. The numbers come from 2 benchmarks, executed over our eBPF software stack. The first one uses the in-kernel testing...
In the proposed talk I would like to discuss the opportunity to create a core for XDP program offloading from a guest to a host. The main goal here is to increase packet processing speed.
There was an attempt to merge offloading for virtio-net but the work is in progress.
After addition XDP processing to the xen-netfront driver the similar Xen task has to be solved as well.
vmxnet3...
The d_path is eBPF tracing helper, that returns string with
full path for given 'struct path' object and was requested
long time ago by many people.
Along the way of implementing it, other features had to be
added to the verifier:
-
compile time BTF IDs resolving
This allows using of kernel objects BTF IDs without resolving
them in runtime and saves few cycles on...
This introduces a working proof-of-concept alternative to RDMA, implementing a zero-copy DMA transfer between the NIC and GPU, while still performing the protocol processing on the host CPU. A normal NIC/host memory implementation is also presented.
By offloading most of the data transfer from the CPU, while not needing to reimplement the protocol stack, this should provide a balance...
As UDP does not have flood attack protections such as SYN cookies, we developed a novel fair-share ratelimiter in unprivileged BPF, designed for a UDP reverse proxy, that is capable of applying rate limits to specific traffic streams while minimizing the impact on others. To achieve this, we base our work on [Hierarchical Heavy Hitters][1], which proposes a method to group packets on source...
The BPF LSM or Kernel Runtime Security Instrumentation (KRSI) aims to provide an extensible LSM by allowing privileged users to attach eBPF programs to security hooks to dynamically implement MAC and Audit Policies.
KRSI was introduced in LSS-US 2019 and has since then had multiple interesting updates and triggered some meaningful discussions. The talk provides an update on:
- Progress...
At last year's LPC I presented a proposal for how to attach multiple XDP programs to a single interface and have them run in sequence. In this presentation I will follow up on that, and present the current status and next steps on this feature.
Briefly, the solution we ended up with was a bit different from what I envisioned at the last LPC: We now rely on the new 'freplace' functionality...
In this talk we introduce Per Thread Queues (PTQ). PTQ is a type of network packet steering that allows application threads to be assigned dedicated network queues for both transmit and receive. This facility provides highly granular traffic isolation between applications and can also help facilitate high performance when combined with other techniques such as busy polling. PTQ extends both...
Today we have a few dozens of Qdisc’s available in Linux kernel, offering various algorithms to schedule network packets. You can change the parameters of each Qdisc, but you can not change the core algorithm of a given Qdisc. A programmable Qdisc offers a way to customize your own scheduling algorithms without writing a Qdisc kernel module from scratch. With eBPF emerges across the Linux...
Linux has a new 'lockdown' security mode where changes to the running kernel
requires verification with a cryptographic signature and restrictions to
accesses to kernel memory that may leak to userspace.
Lockdown's 'integrity' mode requires just the signature, while in
'confidentiality' mode in addition to requiring a signature the system can't
leak confidential information to...
With the incredible pace of containerisation in enterprises, the combination of Linux and Kubernetes as an orchestration base layer is often considered as the "cloud OS". In this talk we provide a deep dive on Kubernetes's service abstraction and related to it the path of getting external network traffic into one's cluster.
With this understanding in mind, we then discuss issues and...
Android Networking - update for 2020:
- what are our pain points wrt. kernel & networking in general,
- progress on upstreaming Android Common Kernel networking code,
- and the unknown depths of non-common vendor changes,
- how we're using bpf,
- how it's working,
- what's not working,
- how it's better then writing kernel code,
- why it's so much worse,
- etc...
Right-sizing BPF maps is hard. By allocating for a worse case scenario we build large maps consuming large chunks of memory for a corner case that may never occur. Alternatively, we may try to allocate for the normal case choosing to ignore or fail in the corner cases. But, for programs running across many different workloads and system parameters its difficult to even decide what a normal...
In this talk we will present Magic Transit, Cloudflare's layer 3 DDoS protection service, as a case study in building a network product from the standard linux networking stack. Linux provided us with flexibility and isolation that allowed us to stand up this product and on-board more than fifty customers within a year of conceptualization. Cloudflare runs all of our services on every server...
This talk will present our ongoing efforts of using formal verification
to eliminate bugs in BPF JITs in the Linux kernel. Formal verification
rules out classes of bugs by mechanically proving that an implementation
adheres to an abstract specification of its desired behavior.
We have used our automated verification framework, Serval, to find 30+
new bugs in JITs for the x86-32, x86-64,...
This talk will discuss some recent works that extend the TCP stack with BPF: TCP header option, TCP Congestion Control (CC), and socket local storage.
Hopefully the talk can end with getting ideas/desires on which part of the stack can practically be realized in BPF.
OVS has two major datapaths: 1) the Linux kernel datapath, which shipped with Linux distributions and 2) the userspace datapath, which usually coupled with DPDK library as packet I/O interface, and called OVS-DPDK. Recent OVS also supports two offload mechanisms: the TC-flower for the kernel datapath, and the DPDK rte_flow for the userspace datapath. The tc-flower API with kernel datapath...