As platforms grow in CPU count (200+ CPUs), per-CPU data structures are becoming increasingly expensive to use. Copying per-CPU data from a BPF hashtab map to userspace buffers can take up to 22 us per entry on a platform with 256 cores.
This talk presents a detailed measurement study of the cost of per-CPU hashtab traversal, covering various traversal methods and systems with different core counts.
We will discuss how the current implementation of this data structure makes it hard to amortize cache misses, and solicit proposals for possible enhancements.
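
To give a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. It relies on the documented behaviour that a userspace lookup on a per-CPU BPF map returns one value slot per possible CPU, each slot rounded up to 8 bytes. The 64-byte cache line size, the 10,000-entry map, and all function names are assumptions made for illustration only; the 22 us figure is the worst case quoted above, not a prediction for any particular system.

```python
import math

CACHE_LINE_BYTES = 64  # assumption: typical line size, not taken from the abstract


def round_up(n: int, align: int = 8) -> int:
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def percpu_value_bytes(value_size: int, possible_cpus: int) -> int:
    """Userspace buffer size for one entry of a per-CPU map:
    one 8-byte-aligned slot per possible CPU."""
    return round_up(value_size) * possible_cpus


def min_cache_lines_per_entry(value_size: int, possible_cpus: int) -> int:
    """Lower bound on cache lines read per entry, assuming each CPU's
    slot lives in its own per-CPU area and so shares no line with others."""
    slot = round_up(value_size)
    return possible_cpus * math.ceil(slot / CACHE_LINE_BYTES)


def traversal_time_us(entries: int, per_entry_us: float) -> float:
    """Total copy time if every entry costs per_entry_us."""
    return entries * per_entry_us


if __name__ == "__main__":
    cpus = 256          # core count mentioned in the abstract
    value_size = 8      # hypothetical: a single u64 counter per CPU
    entries = 10_000    # hypothetical map size

    print("bytes copied per entry:", percpu_value_bytes(value_size, cpus))
    print("min cache lines per entry:", min_cache_lines_per_entry(value_size, cpus))
    print("worst-case traversal (us):", traversal_time_us(entries, 22))
```

Under these assumptions, even a small counter touches at least one cache line per CPU for every entry, which is why the per-entry cost scales with core count rather than with the amount of useful data.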