<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86, branch v6.8</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2024-03-10T16:27:39+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-10T16:27:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=137e0ec05aeb657472d2f84dbb3081016160334b'/>
<id>137e0ec05aeb657472d2f84dbb3081016160334b</id>
<content type='text'>
Pull kvm fixes from Paolo Bonzini:
 "KVM GUEST_MEMFD fixes for 6.8:

   - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY
     to avoid creating an inconsistent ABI (KVM_MEM_GUEST_MEMFD is not
     writable from userspace, so there would be no way to write to a
     read-only guest_memfd).

   - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
     clear that such VMs are purely for development and testing.

   - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long-term
     plan is to support confidential VMs with deterministic private
     memory (SNP and TDX) only in the TDP MMU.

   - Fix a bug in a GUEST_MEMFD dirty logging test that caused false
     passes.

  x86 fixes:

   - Fix missing marking of a guest page as dirty when emulating an
     atomic access.

   - Check for mmu_notifier invalidation events before faulting in the
     pfn, and before acquiring mmu_lock, to avoid unnecessary work and
     lock contention on preemptible kernels (including
     CONFIG_PREEMPT_DYNAMIC in non-preemptible mode).

   - Disable AMD DebugSwap by default; it breaks VMSA signing and will
     be re-enabled with a better VM creation API in 6.10.

   - Do the cache flush of converted pages in svm_register_enc_region()
     before dropping kvm-&gt;lock, to avoid a race with unregistering of
     the same region and the consequent use-after-free issue"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  SEV: disable SEV-ES DebugSwap by default
  KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing
  KVM: SVM: Flush pages under kvm-&gt;lock to fix UAF in svm_register_enc_region()
  KVM: selftests: Add a testcase to verify GUEST_MEMFD and READONLY are exclusive
  KVM: selftests: Create GUEST_MEMFD for relevant invalid flags testcases
  KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMU
  KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIP
  KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY
  KVM: x86: Mark target gfn of emulated atomic instruction as dirty
</content>
</entry>
<entry>
<title>SEV: disable SEV-ES DebugSwap by default</title>
<updated>2024-03-09T16:42:25+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2024-03-09T16:24:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5abf6dceb066f2b02b225fd561440c98a8062681'/>
<id>5abf6dceb066f2b02b225fd561440c98a8062681</id>
<content type='text'>
The DebugSwap feature of SEV-ES provides a way for confidential guests to use
data breakpoints.  However, because the status of the DebugSwap feature is
recorded in the VMSA, enabling it by default invalidates the attestation
signatures.  In 6.10 we will introduce a new API to create SEV VMs that
will allow enabling DebugSwap based on what the user tells KVM to do.
At the same time, we will change the legacy KVM_SEV_ES_INIT API to never
enable DebugSwap.

For compatibility with kernels that pre-date the introduction of DebugSwap,
as well as with those where KVM_SEV_ES_INIT will never enable it, do not enable
the feature by default.  For now, anyone who wants to use it can set the
sev_es_debug_swap_enabled module parameter, though doing so will trigger a
warning.
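
As a hedged sketch (names taken from this message; the exact kvm-amd
code will differ), the opt-in amounts to an off-by-default module
parameter plus a warning when it is set:

    #include &lt;linux/module.h&gt;

    /* Off by default: enabling DebugSwap changes the VMSA contents and
     * therefore invalidates attestation signatures. */
    static bool sev_es_debug_swap_enabled;
    module_param(sev_es_debug_swap_enabled, bool, 0444);

    static void sev_es_init_features(void)  /* illustrative hook */
    {
            if (sev_es_debug_swap_enabled)
                    pr_warn_once("DebugSwap enabled; attestation may fail\n");
    }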

Fixes: d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2024-03-09T16:42:17+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2024-03-09T16:20:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=39fee313fdd489480eb2c453535ab73672c39a27'/>
<id>39fee313fdd489480eb2c453535ab73672c39a27</id>
<content type='text'>
KVM GUEST_MEMFD fixes for 6.8:

 - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to
   avoid creating ABI that KVM can't sanely support (a sketch of the
   check follows this list).

 - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
   clear that such VMs are purely a development and testing vehicle, and
   come with zero guarantees.

 - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long-term plan
   is to support confidential VMs with deterministic private memory (SNP
   and TDX) only in the TDP MMU.

 - Fix a bug in a GUEST_MEMFD negative test that resulted in false passes
   when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
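
A minimal sketch of the exclusivity check from the first item above
(function name is hypothetical; the real check lives in KVM's memslot
flag validation):

    static bool kvm_memslot_flags_valid(u32 flags)  /* hypothetical */
    {
            /* A guest_memfd slot is not writable from userspace, so a
             * read-only guest_memfd slot could never be populated. */
            if ((flags &amp; KVM_MEM_GUEST_MEMFD) &amp;&amp; (flags &amp; KVM_MEM_READONLY))
                    return false;
            return true;
    }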
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-fixes-6.8-2' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2024-03-09T16:42:06+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2024-03-09T16:18:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1b6c146df555bcfb9ad55dbb745ee364a9a0159f'/>
<id>1b6c146df555bcfb9ad55dbb745ee364a9a0159f</id>
<content type='text'>
KVM x86 fixes for 6.8, round 2:

 - When emulating an atomic access, mark the gfn as dirty in the memslot
   to fix a bug where KVM could fail to mark the slot as dirty during live
   migration, ultimately resulting in guest data corruption due to a dirty
   page not being re-copied from the source to the target.

 - Check for mmu_notifier invalidation events before faulting in the pfn,
   and before acquiring mmu_lock, to avoid unnecessary work and lock
   contention.  Contending mmu_lock is especially problematic on preemptible
   kernels, as KVM may yield mmu_lock in response to the contention; the
   yielding vCPUs make it difficult for the task that triggered the
   invalidation to make forward progress, which severely degrades overall
   performance.  (An illustrative sketch of the early check follows the
   link below.)

   Note that, due to another kernel bug, this fix isn't limited to
   preemptible kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y
   will yield contended rwlocks and spinlocks.

   https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com
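
An illustrative sketch of the early check (the helper name is an
assumption, not the real KVM function):

    /* Snapshot the invalidation sequence count, then bail before the
     * expensive pfn fault and before taking mmu_lock if an invalidation
     * covering this gfn is in flight. */
    fault-&gt;mmu_seq = kvm-&gt;mmu_invalidate_seq;
    smp_rmb();
    if (mmu_invalidate_in_progress(kvm, fault-&gt;gfn))  /* assumed helper */
            return RET_PF_RETRY;  /* retry instead of contending mmu_lock */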
</content>
</entry>
<entry>
<title>Merge tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux</title>
<updated>2024-03-05T20:38:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-05T20:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1c46d04a0dae3fcb8d6505de0091499b626cdb35'/>
<id>1c46d04a0dae3fcb8d6505de0091499b626cdb35</id>
<content type='text'>
Pull hyperv fixes from Wei Liu:

 - Multiple fixes, cleanups, and documentation updates for Hyper-V core
   code and drivers

* tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: make hv_bus const
  x86/hyperv: Allow 15-bit APIC IDs for VTL platforms
  x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()
  x86/mm: Regularize set_memory_p() parameters and make non-static
  x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback
  Documentation: hyperv: Add overview of PCI pass-thru device support
  Drivers: hv: vmbus: Update indentation in create_gpadl_header()
  Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header()
  fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem()
  Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory
  hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
</content>
</entry>
<entry>
<title>Merge tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2024-03-03T17:43:03+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-03T17:43:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=73d35f83354c2b54a0bf52fef94b40f483680b91'/>
<id>73d35f83354c2b54a0bf52fef94b40f483680b91</id>
<content type='text'>
Pull x86 fixes from Borislav Petkov:

 - Do not reserve SETUP_RNG_SEED setup data in the e820 map as it should
   be used by kexec only

 - Detect the MKTME feature earlier in the boot process so that the
   physical address size supported by the CPU is corrected and the MTRR
   masks are programmed properly, letting TDX systems boot without
   disable_mtrr_cleanup on the cmdline

 - Make sure the different address sizes supported by the CPU are read
   out as early as possible

* tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/e820: Don't reserve SETUP_RNG_SEED in e820
  x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers
  x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
</content>
</entry>
<entry>
<title>x86/e820: Don't reserve SETUP_RNG_SEED in e820</title>
<updated>2024-03-01T18:27:20+00:00</updated>
<author>
<name>Jiri Bohac</name>
<email>jbohac@suse.cz</email>
</author>
<published>2024-01-31T00:04:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7fd817c906503b6813ea3b41f5fdf4192449a707'/>
<id>7fd817c906503b6813ea3b41f5fdf4192449a707</id>
<content type='text'>
SETUP_RNG_SEED in setup_data is supplied by kexec and should
not be reserved in the e820 map.

Doing so reserves 16 bytes of RAM when booting with kexec.
(16 bytes because data-&gt;len is zeroed by parse_setup_data so only
sizeof(setup_data) is reserved.)

When kexec is used repeatedly, each boot adds two entries to the
kexec-provided e820 map as the 16-byte range splits a larger
range of usable memory. Eventually all of the 128 available entries
get used up. The next split will result in losing usable memory
as the new entries cannot be added to the e820 map.
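
A hedged sketch of the fix's shape in the reservation loop (simplified;
not the exact diff):

    static void __init e820__reserve_setup_data(void)
    {
            struct setup_data *data;
            u64 pa = boot_params.hdr.setup_data;

            while (pa) {
                    data = early_memremap(pa, sizeof(*data));
                    /* SETUP_RNG_SEED is consumed and re-created by kexec;
                     * reserving it splits a usable range on every boot. */
                    if (data-&gt;type != SETUP_RNG_SEED)
                            e820__range_update(pa, sizeof(*data) + data-&gt;len,
                                               E820_TYPE_RAM,
                                               E820_TYPE_RESERVED_KERN);
                    pa = data-&gt;next;
                    early_memunmap(data, sizeof(*data));
            }
    }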

Fixes: 68b8e9713c8e ("x86/setup: Use rng seeds from setup_data")
Signed-off-by: Jiri Bohac &lt;jbohac@suse.cz&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: &lt;stable@kernel.org&gt;
Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
</content>
</entry>
<entry>
<title>x86/hyperv: Allow 15-bit APIC IDs for VTL platforms</title>
<updated>2024-03-01T08:36:22+00:00</updated>
<author>
<name>Saurabh Sengar</name>
<email>ssengar@linux.microsoft.com</email>
</author>
<published>2024-01-15T17:57:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0d63e4c0ebc2b5c329babde44fd61d3f08db814d'/>
<id>0d63e4c0ebc2b5c329babde44fd61d3f08db814d</id>
<content type='text'>
The current method for signaling the compatibility of a Hyper-V host
with MSIs featuring 15-bit APIC IDs relies on a synthetic cpuid leaf.
However, for higher VTLs, this leaf is not reported, due to the absence
of an IO-APIC.

As an alternative, assume that when running at a high VTL, the host
supports 15-bit APIC IDs. This assumption is safe, as Hyper-V does not
employ any architectural MSIs at higher VTLs.

This unblocks startup of VTL2 environments with more than 256 CPUs.
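
A sketch of the described fallback (all names here are illustrative,
not the actual Hyper-V code):

    static bool hv_msi_ext_dest_id(void)  /* illustrative */
    {
            /* Higher VTLs have no IO-APIC, so the synthetic cpuid leaf
             * is absent; assume 15-bit APIC ID support there instead. */
            if (hv_running_at_high_vtl())           /* assumed helper */
                    return true;
            return hv_cpuid_reports_ext_dest_id();  /* assumed helper */
    }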

Signed-off-by: Saurabh Sengar &lt;ssengar@linux.microsoft.com&gt;
Reviewed-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Link: https://lore.kernel.org/r/1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com&gt;
</content>
</entry>
<entry>
<title>x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()</title>
<updated>2024-03-01T08:31:42+00:00</updated>
<author>
<name>Michael Kelley</name>
<email>mhklinux@outlook.com</email>
</author>
<published>2024-01-16T02:20:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0f34d11234868dc979730a905717c15067a7d205'/>
<id>0f34d11234868dc979730a905717c15067a7d205</id>
<content type='text'>
In a CoCo VM, when transitioning memory from encrypted to decrypted, or
vice versa, the caller of set_memory_encrypted() or set_memory_decrypted()
is responsible for ensuring the memory isn't in use and isn't referenced
while the transition is in progress.  The transition has multiple steps,
and the memory is in an inconsistent state until all steps are complete.
A reference while the state is inconsistent could result in an exception
that can't be cleanly fixed up.

However, the kernel load_unaligned_zeropad() mechanism could cause a stray
reference that can't be prevented by the caller of set_memory_encrypted()
or set_memory_decrypted(), so there's specific code to handle this case.
But a CoCo VM running on Hyper-V may be configured to run with a paravisor,
with the #VC or #VE exception routed to the paravisor. There's no
architectural way to forward the exceptions back to the guest kernel, and
in such a case, the load_unaligned_zeropad() specific code doesn't work.

To avoid this problem, mark pages as "not present" while a transition
is in progress. If load_unaligned_zeropad() causes a stray reference, a
normal page fault is generated instead of #VC or #VE, and the
page-fault-based fixup handlers for load_unaligned_zeropad() resolve the
reference. When the encrypted/decrypted transition is complete, mark the
pages as "present" again.

Signed-off-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Reviewed-by: Kuppuswamy Sathyanarayanan &lt;sathyanarayanan.kuppuswamy@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20240116022008.1023398-4-mhklinux@outlook.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240116022008.1023398-4-mhklinux@outlook.com&gt;
</content>
</entry>
<entry>
<title>x86/mm: Regularize set_memory_p() parameters and make non-static</title>
<updated>2024-03-01T08:31:41+00:00</updated>
<author>
<name>Michael Kelley</name>
<email>mhklinux@outlook.com</email>
</author>
<published>2024-01-16T02:20:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=030ad7af94371f1faeecfc12dda296d8b5a17ef8'/>
<id>030ad7af94371f1faeecfc12dda296d8b5a17ef8</id>
<content type='text'>
set_memory_p() is currently static.  It has parameters that don't
match set_memory_p() under arch/powerpc and that aren't congruent
with the other set_memory_* functions. There's no good reason for
the difference.

Fix this by making the parameters consistent, and update the one
existing call site.  Make the function non-static and add it to
include/asm/set_memory.h so that it is completely parallel to
set_memory_np() and is usable in other modules.

No functional change.
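
For reference, a sketch of the regularized prototype, parallel to its
sibling (header path as given above):

    /* include/asm/set_memory.h */
    int set_memory_p(unsigned long addr, int numpages);
    int set_memory_np(unsigned long addr, int numpages);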

Signed-off-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20240116022008.1023398-3-mhklinux@outlook.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240116022008.1023398-3-mhklinux@outlook.com&gt;
</content>
</entry>
</feed>
