# Safety in process CPU execution state

Ben Dooks Codethink Ltd

Dr Jens Petersohn Elektrobit Automotive GmbH





Pre Release



#### **About Ben**

- Ben is a senior engineer and long-time Linux kernel contributor at Codethink.
- Codethink is an ethical, independent, and versatile software services company, expert in the use of Open Source technologies for systems software engineering.
  - More info at https://www.codethink.co.uk/



# **About Jens**

- Jens Petersohn has been active in a variety of industries, spending the last 12 years in the automotive industry. He has been at Elektrobit for two years and was at Continental AG for the 10 years before that. Earlier, Jens worked at Silicon Graphics, Inc. in the Cray Supercomputer Division and helped port Linux to the Intel IA-64 processor family.
- At Elektrobit, Jens is responsible for ADAS and HAD products and has supported the development of EB corbos Linux for automotive applications for the last year.
- Elektrobit (EB) is an award-winning and visionary global supplier of embedded and connected software products and services for the automotive industry.
- A leader in automotive software with over 30 years serving the industry, EB's software powers over one billion devices in more than 100 million vehicles and offers flexible, innovative solutions for car infrastructure software, connectivity & security, automated driving and related tools, and user experience.
- EB is a wholly owned subsidiary of Continental AG.

V1, 24-Aug-2020 © Elektrobit Automotive GmbH / Codethink Ltd. CC-BY 4.0



#### Introduction

- What are we protecting and why
- The system
  - Linux with IEC 61508 SIL-2 mixed criticality
- The code flow
- Possible faults and mitigations
- A review of our mitigation



#### The CPU state

- Concentrating on per-core state
- Directly accessible registers
  - Integer
  - Floating point
  - Accelerators (MMX, SSE, AVX, etc)
- Indirect
  - Core control registers (debug, interrupt, etc)



#### X86 64-bit core registers

[Figure: table of x86-64 registers, showing the vector registers ZMM0 to ZMM31 with their YMM/XMM aliases, the x87 stack ST(0) to ST(7) with the MM0 to MM7 aliases and the FPU control, status and tag words, the general-purpose registers with their 8-, 16- and 32-bit sub-registers, the segment registers, GDTR/IDTR/LDTR/TR, RIP, RFLAGS, MXCSR, the control registers CR0 to CR15 and the debug registers DR0 to DR15.]

Source: https://en.wikipedia.org/wiki/X86#/media/File:Table_of_x86_Registers_svg.svg




## How the code flows

- CPU executes instructions
- Intentional diversions
  - System calls
- External events
  - Interrupts
  - Exceptions (sync or async)
  - Signals (software events)
  - Architecture specific events
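As a rough user-space illustration of two of the diversions listed above, the sketch below takes one intentional diversion (a raw `getpid` system call) and one software event (a signal delivered with `raise()`). The function name `demo_diversions` and the choice of `SIGUSR1` are assumptions made for this example; they do not come from the presentation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t signal_seen = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    signal_seen = 1;
}

/*
 * Hypothetical demo: take one intentional diversion (a system call) and one
 * software event (a signal). Returns 0 if both happened, -1 otherwise.
 */
int demo_diversions(long *pid_out)
{
    struct sigaction sa;

    assert(pid_out != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Intentional diversion: enter the kernel through a system call. */
    *pid_out = syscall(SYS_getpid);

    /* Software event: in a single-threaded program the handler runs
     * before raise() returns. */
    signal_seen = 0;
    if (raise(SIGUSR1) != 0)
        return -1;

    return signal_seen ? 0 : -1;
}
```

Each of these paths leaves and re-enters the interrupted code, which is where saved register state could be disturbed.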





#### **Faults and errors**

- Not a complete list
- Mitigations
- Avoidance
- Useful Linux Kernel features



#### **Hardware faults**

| Failure              | Mitigations                                          |
|----------------------|------------------------------------------------------|
| Multiple or no entry | Verify actions post call<br>Sequence numbers         |
| Partial entry        | KPTI<br>SMAP, SMEP<br>Memory permissions<br>Watchdog |
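One possible reading of the "sequence numbers" mitigation is sketched below: the callee bumps a counter on every entry, and the caller checks afterwards that the counter advanced exactly once. All names here (`entry_guard`, `guarded_entry`, `check_entry`) are hypothetical and are not taken from the presented code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum entry_fault { ENTRY_OK, ENTRY_MISSING, ENTRY_MULTIPLE };

/* Hypothetical per-entry-point counter. */
struct entry_guard {
    uint64_t seq;
};

/* Called once at the top of the guarded routine. */
void guarded_entry(struct entry_guard *g)
{
    assert(g != NULL);
    g->seq++;
}

/* Post-call verification: the counter must have advanced by exactly one.
 * Unsigned subtraction keeps the check valid across counter wrap-around. */
enum entry_fault check_entry(uint64_t seq_before, uint64_t seq_after)
{
    uint64_t delta = seq_after - seq_before;

    if (delta == 1)
        return ENTRY_OK;
    return delta == 0 ? ENTRY_MISSING : ENTRY_MULTIPLE;
}
```

The caller samples `seq` before the call and passes both values to `check_entry()` afterwards. This only detects the fault; what to do about it is a separate decision.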



#### **Software faults**

- Data corruption
  - Whole other topic
- Incorrect task switching
  - Kernel saves essential state on entry
  - Only swaps everything on re-schedule
- Bad kernel code
  - Non-integer (FP/SIMD) register use in kernel code requires notifying the kernel first (on x86, kernel_fpu_begin()/kernel_fpu_end())



## **Mitigation strategies**

- Task isolation
  - Kernel threads still run
  - Interrupts and other events cannot be blocked
  - TL;DR: you can reduce interruptions but not stop them
- Kernel checking
  - Kernel sanitisers
  - Rewrite in a safe language



# **Codethink mitigation**

- Kernel code to detect errors
  - Using shadow state
  - ~2000 lines of C
- Wraps syscall and other entry points
  - Save state on entry
  - Compare on exit
- Detection not correction
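The save-on-entry, compare-on-exit idea can be sketched in plain C as follows. This is a user-space model only, not the kernel code described above: the `cpu_shadow` layout, the function names and the `ignore_mask` parameter (for registers an entry point is expected to change, such as a return-value register) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_NUM_GPRS 16   /* assumption: 16 integer registers, as on x86-64 */
#define SHADOW_MATCH    (-1)

/* Hypothetical snapshot of the integer state of one task. */
struct cpu_shadow {
    uint64_t gpr[SHADOW_NUM_GPRS];
    uint64_t flags;
};

/* On entry: copy the live state into the shadow copy. */
void shadow_save(struct cpu_shadow *shadow, const struct cpu_shadow *live)
{
    assert(shadow != NULL && live != NULL);
    *shadow = *live;
}

/*
 * On exit: compare live state against the shadow copy.
 * Bit i of ignore_mask skips gpr[i]. Returns SHADOW_MATCH if nothing
 * differs, the index of the first differing register, or SHADOW_NUM_GPRS
 * if only the flags differ. Detection only: nothing is repaired.
 */
int shadow_compare(const struct cpu_shadow *shadow,
                   const struct cpu_shadow *live,
                   uint32_t ignore_mask)
{
    assert(shadow != NULL && live != NULL);

    for (int i = 0; i < SHADOW_NUM_GPRS; i++) {
        if (ignore_mask & (UINT32_C(1) << i))
            continue;
        if (shadow->gpr[i] != live->gpr[i])
            return i;
    }
    if (shadow->flags != live->flags)
        return SHADOW_NUM_GPRS;
    return SHADOW_MATCH;
}
```

Returning the index of the mismatching register rather than a plain boolean keeps the report useful for diagnosis while staying within "detection not correction".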



# **Our mitigation issues**

- Significant overhead to kernel access
  - 170% slower for integer
  - 460% slower for fp/mmx/sse
  - Tested with getpid() call
- Does not cover 100% of the kernel code
  - entry\_64.S is not covered
- Upstream acceptability
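A getpid()-based measurement of this kind could look roughly like the sketch below. It is not the benchmark used for the figures above; the function names are hypothetical, and `percent_slower()` assumes that "N% slower" means the relative increase in time per call, which the slides do not spell out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per getpid system call. */
double getpid_ns_per_call(long iterations)
{
    struct timespec start, end;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        (void)syscall(SYS_getpid);  /* raw syscall, independent of any libc caching */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}

/* Assumed definition: relative increase in time per call, in percent. */
double percent_slower(double baseline_ns, double mitigated_ns)
{
    assert(baseline_ns > 0.0);
    return 100.0 * (mitigated_ns - baseline_ns) / baseline_ns;
}
```

Running `getpid_ns_per_call()` on a kernel with and without the mitigation and feeding both results to `percent_slower()` would give a comparable figure. Under the assumed definition, 170% slower corresponds to each call taking 2.7 times as long.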



### **Testing issues**

- Time
  - Kernel oops requires reboot
  - Number of test combinations
- Virtual vs Real
  - QEMU has issues with things like segment registers
  - Sometimes it just crashes with little explanation
- How to induce actual CPU hardware/microcode faults?
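Real hardware or microcode faults remain an open question here. As a software-only stand-in, a test harness can corrupt a saved copy of state one bit at a time and check that a comparison catches every flip. The sketch below assumes a flat byte buffer and a `memcmp`-based detector; both are illustrative choices, and it also shows how quickly the number of test cases grows with state size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flip one bit in buf to emulate a single-bit fault. */
void inject_bit_flip(uint8_t *buf, size_t len, size_t bit_index)
{
    assert(buf != NULL && bit_index < len * 8);
    buf[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}

/*
 * Try every single-bit fault in a copy of `reference` and count how many
 * the detector (here a plain memcmp) notices. `scratch` must hold `len`
 * bytes and equals `reference` again when the function returns.
 */
size_t count_detected_flips(const uint8_t *reference, uint8_t *scratch,
                            size_t len)
{
    size_t detected = 0;

    assert(reference != NULL && scratch != NULL);
    memcpy(scratch, reference, len);

    for (size_t bit = 0; bit < len * 8; bit++) {
        inject_bit_flip(scratch, len, bit);
        if (memcmp(scratch, reference, len) != 0)
            detected++;
        inject_bit_flip(scratch, len, bit);  /* undo the injected fault */
    }
    return detected;
}
```

Even this trivial case needs `len * 8` test runs for single-bit faults alone, before any multi-bit or timing-dependent combinations are considered.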



# Conclusions

- Mitigations can impact performance
- Covering 100% of core failures is difficult or impossible
- Testing can be time consuming
- Going forward:
  - More user-space mitigations?
  - Partial task isolation?
- Any other suggestions?



# **Presentation copyright**

- Creative Commons Attribution 4.0 International License
  - https://creativecommons.org/licenses/by/4.0/



Elektrobit Automotive GmbH Am Wolfsmantel 46 D-91058 Erlangen / Germany

www.elektrobit.com

Codethink Ltd 3rd Floor Dale House 35 Dale Street Manchester, M1 2HF, UK

www.codethink.co.uk