The kernel ships its own implementations of common C library routines (memcpy, memcmp, strcmp, etc.). Because the kernel already has rich infrastructure for handling architecture- and platform-specific features, such as code patching and static calls, there is an opportunity to speed up these routines on certain platforms. The goal of this discussion is to identify which routines would benefit from optimized implementations, which ISA extensions should be in focus (e.g. only stateless ones, which would rule out vector instructions?), and what a reasonable grouping of target platforms could look like (e.g. a generic C implementation, one implementation per profile, plus one for platforms that additionally offer fast unaligned memory accesses).