Description
The aim of this project is to accelerate network packet processing with AMDGPU in the Linux kernel without the need for user-space libraries.
Until now, the Linux kernel has not used the GPU directly: GPU compute has relied on userspace frameworks such as CUDA or ROCm. However, AMD has published the necessary data and source code for the GPU, which allows the Linux kernel to use AMDGPU on its own, without any userspace component.
The Linux networking subsystem includes many features, such as protocol handlers, netfilter, OVS, tunnelling, crypto, TC and XDP. This project will enable these features to be offloaded to AMDGPU. XDP is the first use case, so this session will focus on XDP.
The most important aspect of this project is the transfer of packets between NIC/host memory and the GPU. To transfer packets efficiently, Device Memory TCP and the GPU's copy engine should be used. This session will therefore briefly explain DMA-BUF, P2PDMA, Device Memory TCP and AMDGPU internals, and show how the project uses each of them.
JIT-compiling BPF instructions into GPU binary code requires an understanding of the GPU instruction set and some GPU internals, so this session will cover these as well.
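
As a rough illustration of the first step any BPF JIT has to perform, the sketch below decodes the fixed 8-byte eBPF instruction layout and classifies each instruction by its opcode class. It is not this project's code: `struct insn`, `insn_class`, `insn_dst`, `class_name`, `count_class` and `xdp_pass_prog` are hypothetical names chosen for the example, and the struct is a local mirror of the kernel's `struct bpf_insn` rather than the real UAPI header. Selecting and emitting GPU instructions for each class is deliberately left out.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the eBPF instruction encoding (hypothetical name).
 * The register byte holds dst in the low nibble and src in the high nibble. */
struct insn {
    uint8_t code;  /* opcode; the low 3 bits are the instruction class */
    uint8_t regs;  /* dst_reg (bits 0-3), src_reg (bits 4-7) */
    int16_t off;   /* signed offset for jumps and memory accesses */
    int32_t imm;   /* signed immediate */
};

static_assert(sizeof(struct insn) == 8, "eBPF instructions are 8 bytes");

/* eBPF instruction classes, as encoded in the low 3 bits of the opcode. */
enum {
    CLS_LD = 0, CLS_LDX = 1, CLS_ST = 2, CLS_STX = 3,
    CLS_ALU = 4, CLS_JMP = 5, CLS_JMP32 = 6, CLS_ALU64 = 7
};

unsigned insn_class(const struct insn *i) { return i->code & 0x07; }
unsigned insn_dst(const struct insn *i)   { return i->regs & 0x0f; }
unsigned insn_src(const struct insn *i)   { return i->regs >> 4; }

/* Human-readable class name, useful when dumping a program before codegen. */
const char *class_name(unsigned cls)
{
    static const char *const names[8] = {
        "LD", "LDX", "ST", "STX", "ALU", "JMP", "JMP32", "ALU64"
    };
    return names[cls & 0x07];
}

/* Count the instructions of one class; a JIT pre-pass might use this kind of
 * scan to size its output buffer or to reject unsupported programs early. */
size_t count_class(const struct insn *prog, size_t n, unsigned cls)
{
    size_t hits = 0;
    for (size_t k = 0; k < n; k++)
        if (insn_class(&prog[k]) == cls)
            hits++;
    return hits;
}

/* Minimal XDP program: "r0 = 2; exit", i.e. return XDP_PASS (value 2). */
const struct insn xdp_pass_prog[] = {
    { 0xb7, 0x00, 0, 2 },  /* BPF_ALU64 | BPF_MOV | BPF_K: r0 = 2 */
    { 0x95, 0x00, 0, 0 },  /* BPF_JMP | BPF_EXIT */
};
```

Calling `insn_class` on the two instructions of `xdp_pass_prog` yields `CLS_ALU64` and `CLS_JMP`. A JIT would use that class, together with the operation bits in the rest of the opcode, to choose which target instructions to emit.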