At MongoDB, we implemented an eBPF tool to collect and display a complete time-series view of information about all threads whether they are on- or off-CPU. This allows us to inspect where the database server spends its time, both in userspace and in kernel. Its minimal overhead allows to deploy it in production.
This can be an effective method to collect diagnostic information in the field and surface a specific workload which is bound by a syscall. It would be interesting to hear what solution other vendors use to profile in production.
|I agree to abide by the anti-harassment policy||Yes|