Description
To support multiple Linux kernel versions in hyperscale data centers, it is important to aggregate signals (console output, crash dumps, etc.) across millions of servers. One of the key pieces of information in this massive dataset is the kernel version running on each host.
At Meta, we use netconsole to analyze console outputs from millions of servers. Recent work [1] added the kernel version to each netconsole message. This allows us to answer questions like “which kernels get error messages like XXX?”.
However, once livepatches are in use, the kernel version alone is not enough to distinguish systems running Kernel X from systems running Kernel X with livepatch Y.
To better understand the impact of a livepatch on thousands of hosts, e.g. during a livepatch rollout, we use a hack [2] that appends a suffix to the kernel version in the netconsole record. This approach, however, is not suitable for upstream. Specifically, it cannot handle systems with multiple livepatches attached at the same time.
In this livepatch MC, we would like to discuss different options for including livepatch information in netconsole (and possibly in other data sources as well). The following are a few preliminary proposals:
- Expand struct klp_patch to contain a tag, and send it with netconsole
- Send struct klp_patch->mod->name in netconsole
- Append the information to struct new_utsname
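To make the multiple-livepatch case concrete, here is a minimal userspace sketch of how a version string could carry more than one livepatch name. It is not kernel code and not the approach used in [2]; the function name format_release_with_livepatches, the '+' and ',' separators, and the example patch names are all assumptions made for illustration. It also shows one constraint relevant to the third proposal: the release field of struct new_utsname is a fixed 65-byte buffer, so a list of names can overflow it and the caller needs a way to detect that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only, not a kernel API).
 *
 * Writes "<release>+<patch0>,<patch1>,..." into buf. With zero patches
 * the output is just "<release>".
 *
 * Returns the length of the resulting string, or -1 if it does not fit
 * in len bytes (including the terminating NUL).
 */
static int format_release_with_livepatches(char *buf, size_t len,
                                           const char *release,
                                           const char *const *patches,
                                           size_t npatches)
{
    assert(buf != NULL && len > 0);

    int n = snprintf(buf, len, "%s", release);
    if (n < 0 || (size_t)n >= len)
        return -1;

    size_t off = (size_t)n;
    for (size_t i = 0; i < npatches; i++) {
        /* '+' introduces the livepatch list, ',' separates entries. */
        n = snprintf(buf + off, len - off, "%c%s",
                     i == 0 ? '+' : ',', patches[i]);
        if (n < 0 || (size_t)n >= len - off)
            return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```

Called with release "6.4.0" and the two made-up patch names "klp_fix_a" and "klp_fix_b", this produces "6.4.0+klp_fix_a,klp_fix_b". A dedicated netconsole field, as in the first two proposals, would avoid the fixed-size limit, at the cost of a message format change.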
[1] https://lore.kernel.org/lkml/20230707132911.2033870-1-leitao@debian.org/
[2] https://pastebin.com/tcJTkNP2