

# Linux Plumbers Conference

Richmond, Virginia | November 13-15, 2023

# Taming the Incoherent Cache Issue in Confidential VMs

Jacky Li <jackyli@google.com> Mingwei Zhang <mizhang@google.com>



• Problem Statement
- Incoherent cache lines
- Performance degradation

• Solution: Selective Cache Flushing

• MMU Notifier
- Introduction
- Filtering the reason

### Incoherent cache lines

• **C-bit**: marks whether a memory page is encrypted.

• Memory Management (kernel) recognizes the C-bit

• Cache (hardware) doesn't know about C-bit

| Cache Tag | Set Index | Offset |
|-----------|-----------|--------|
| 0 0 0     | 010       | 00010  |
| 1 0 0     | 010       | 00010  |
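To make the tag/index split concrete, here is a minimal sketch of how a cache might decompose a physical address. The field widths come from the bit patterns on the slide (set index `010`, offset `00010`). The C-bit position `C_BIT_POS = 47` and the helper names are assumptions for illustration only; on real hardware the position is reported by CPUID and differs between CPU models.

```python
OFFSET_BITS = 5      # the slide shows a 5-bit offset field (00010)
INDEX_BITS = 3       # the slide shows a 3-bit set index (010)
C_BIT_POS = 47       # ASSUMPTION: position of the C-bit in the physical address


def split_address(paddr):
    """Return (tag, set_index, offset) the way this toy cache sees an address."""
    offset = paddr & ((1 << OFFSET_BITS) - 1)
    set_index = (paddr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = paddr >> (OFFSET_BITS + INDEX_BITS)
    return tag, set_index, offset


def with_c_bit(paddr):
    """Same memory location, addressed as encrypted (C-bit set)."""
    return paddr | (1 << C_BIT_POS)


plain = 0b010_00010              # set index 010, offset 00010, tag 0
encrypted = with_c_bit(plain)

tag0, idx0, off0 = split_address(plain)
tag1, idx1, off1 = split_address(encrypted)
```

Both addresses select the same set and the same offset, but their tags differ, so a cache that treats the C-bit as an ordinary tag bit can hold two lines for one memory location.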





# Incoherent cache lines

Figure: T0: Cache Line 1, T1: Cache Line 2

• CVM releases the page -> non-CVM gets the same page

• 2 conflicting cache lines => Data Corruption

• Solution (2017): flush cache [1]
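The corruption scenario can be replayed with a toy write-back cache. This is a sketch under stated assumptions, not a model of real AMD hardware: the class `IncoherentCache`, the helper `reuse_page`, and the string payloads are all hypothetical, and encryption itself is not modelled. The only property it keeps is that lines are keyed by (C-bit, address), so the encrypted and unencrypted views of one page never see each other.

```python
class IncoherentCache:
    """Toy write-back cache whose tags include the C-bit (hypothetical model)."""

    def __init__(self):
        self.memory = {}   # physical address -> value in DRAM
        self.lines = {}    # (c_bit, address) -> dirty value held in the cache

    def write(self, c_bit, addr, value):
        # The write stays in the cache until the line is written back.
        self.lines[(c_bit, addr)] = value

    def write_back(self, c_bit, addr):
        # Eviction: a dirty line overwrites DRAM, whatever is stored there now.
        if (c_bit, addr) in self.lines:
            self.memory[addr] = self.lines.pop((c_bit, addr))

    def flush_page(self, addr):
        # Write back and invalidate both aliases of this address.
        for c_bit in (1, 0):
            self.write_back(c_bit, addr)


def reuse_page(flush_on_release):
    """Replay T0/T1 and return what ends up in DRAM for the page."""
    cache = IncoherentCache()
    page = 0x1000
    cache.write(1, page, "guest ciphertext")   # T0: the CVM dirties the page
    if flush_on_release:
        cache.flush_page(page)                 # flush when the page leaves the CVM
    cache.write(0, page, "host data")          # T1: the new owner dirties it
    cache.write_back(0, page)                  # the host line reaches DRAM first
    cache.write_back(1, page)                  # a stale CVM line is evicted later
    return cache.memory[page]
```

In this model `reuse_page(False)` leaves the stale guest line in DRAM, which is the data corruption described above, while `reuse_page(True)` leaves the new owner's data intact.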




### Performance degradation

• SME\_COHERENT (2020): CPU cache recognized the C-bit, so no need to flush. [2]
- **CPU => CPU**

• Vulnerability CVE-2022-0171.
- CPU => DMA devices
- Solution: cache flush in mmu notifier when the page leaves CVM [3]
- Perf impact [4]





### Solution: Selective Cache Flushing

• We are trying to...

• We wish...

• Non-trivial changes on KVM.

• Non-trivial changes on MM.

**Note: this problem will only affect SEV/SEV-ES** 

• Cache flush only when the VM deallocates memory?

- KVM MMU does not manage memory...
- We have to do it at MMU\_NOTIFIER

• Cache flush in **SMALLER** granularity?

### **MMU Notifiers: reasons**



• KVM MMU represents guest VM

• Host MMU represents the process

• MMU notifier invalidation does contain a reason parameter which is unused currently.



#### MMU Notifier: filtering the reason
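A rough sketch of what filtering on the reason could look like, written as a Python model rather than kernel code. The event names mirror `enum mmu_notifier_event` in the Linux kernel, with illustrative numeric values. Everything else is an assumption: the helpers `needs_cache_flush` and `count_flushes` are hypothetical, and the choice of which events can skip the flush (`NO_FLUSH_EVENTS`) is illustrative, not the authors' final list.

```python
from enum import Enum


class MmuNotifierEvent(Enum):
    # Names follow enum mmu_notifier_event in include/linux/mmu_notifier.h.
    MMU_NOTIFY_UNMAP = 0
    MMU_NOTIFY_CLEAR = 1
    MMU_NOTIFY_PROTECTION_VMA = 2
    MMU_NOTIFY_PROTECTION_PAGE = 3
    MMU_NOTIFY_SOFT_DIRTY = 4
    MMU_NOTIFY_RELEASE = 5
    MMU_NOTIFY_MIGRATE = 6
    MMU_NOTIFY_EXCLUSIVE = 7


# ASSUMPTION: events after which the page still belongs to the guest,
# so no stale encrypted cache line can reach a new owner.
NO_FLUSH_EVENTS = frozenset({
    MmuNotifierEvent.MMU_NOTIFY_PROTECTION_VMA,
    MmuNotifierEvent.MMU_NOTIFY_SOFT_DIRTY,
})


def needs_cache_flush(event, guest_is_sev):
    """Decide whether an invalidation should flush the cache (hypothetical helper)."""
    if not guest_is_sev:
        return False                        # the issue only affects SEV/SEV-ES guests
    return event not in NO_FLUSH_EVENTS     # flush by default, skip only known-safe events


def count_flushes(events, guest_is_sev=True):
    """Number of cache flushes a stream of notifier events would trigger."""
    return sum(needs_cache_flush(e, guest_is_sev) for e in events)


# Hypothetical event stream: three protection changes followed by one unmap.
stream = [MmuNotifierEvent.MMU_NOTIFY_PROTECTION_VMA] * 3 + [MmuNotifierEvent.MMU_NOTIFY_UNMAP]
flushes = count_flushes(stream)
```

The filter is written as a skip list so that any event not explicitly known to be safe still flushes. With the sample stream, only the unmap triggers a flush, instead of one flush per invalidation.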




• Flushing the cache selectively in mmu\_notifier is the most cost-effective approach, with minimal changes to KVM

• We have had discussions with AMD about addressing this issue in future HW



# Thank You! Q&A

Jacky Li <jackyli@google.com> Mingwei Zhang <mizhang@google.com>



#### **Appendix:**

[1] 89c505809052 ("KVM: SVM: Add support for KVM\_SEV\_LAUNCH\_UPDATE\_DATA command")

[2] e1ebb2b49048 ("KVM: SVM: Don't flush cache if hardware enforces cache coherency across encryption domains")

[3] 683412ccf612 ("KVM: SEV: add cache flush to solve SEV cache incoherency issues")

[4] https://lore.kernel.org/kvm/YzJFvWPb1syXcVQm@google.com/T/#mb79712b3d141cabb166b504984f6058b01e30c63









# **MMU Notifiers: reasons**



• KVM MMU represents guest VM

• Host MMU represents the process

• MMU notifier is the memory reclaim interface between KVM and host MM

Figure: Host OS



### VM\_PAGEFLUSH: limited functionality

• MSR\_AMD64\_VM\_PAGE\_FLUSH (0xc001011e)

- CPUID level 0x8000001f (EAX), bit 2
- X86\_FEATURE\_VM\_PAGE\_FLUSH
- Available on AMD EPYC v1 and later
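As a small illustration of the feature check above, the sketch below tests bit 2 of an EAX value returned by CPUID leaf 0x8000001f. Executing CPUID needs native code, so the function takes the register value as input; the function name `has_vm_page_flush` and the sample EAX values are hypothetical.

```python
CPUID_LEAF = 0x8000001F                 # CPUID level named on the slide
VM_PAGE_FLUSH_BIT = 2                   # EAX bit named on the slide
MSR_AMD64_VM_PAGE_FLUSH = 0xC001011E    # MSR index named on the slide


def has_vm_page_flush(eax):
    """True if the given EAX value of CPUID leaf 0x8000001f has bit 2 set."""
    return bool(eax & (1 << VM_PAGE_FLUSH_BIT))


# Hypothetical EAX values, for illustration only.
supported = has_vm_page_flush(0b1111)     # bit 2 set
unsupported = has_vm_page_flush(0b1011)   # bit 2 clear
```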

• VM\_PAGEFLUSH MSR does not work on user addresses

- Even if we disable SMAP (EFLAGS.AC)
- AMD APM was updated on this at the end of 2022