<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/kexec_handover.c, branch linux-rolling-lts</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>kho: skip memoryless NUMA nodes when reserving scratch areas</title>
<updated>2026-03-04T12:21:20+00:00</updated>
<author>
<name>Evangelos Petrongonas</name>
<email>epetron@amazon.de</email>
</author>
<published>2026-01-20T17:59:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f86582890bd7a72a75293e4abbf84c17bca4e754'/>
<id>f86582890bd7a72a75293e4abbf84c17bca4e754</id>
<content type='text'>
[ Upstream commit 427b2535f51342de3156babc6bdc3f3b7dd2c707 ]

kho_reserve_scratch() iterates over all online NUMA nodes to allocate
per-node scratch memory.  On systems with memoryless NUMA nodes (nodes
that have CPUs but no memory), memblock_alloc_range_nid() fails because
there is no memory available on that node.  This causes KHO initialization
to fail and kho_enable to be set to false.

Some ARM64 systems have NUMA topologies where certain nodes contain only
CPUs without any associated memory.  These configurations are valid and
should not prevent KHO from functioning.

Fix this by only counting nodes that have memory (N_MEMORY state) and skip
memoryless nodes in the per-node scratch allocation loop.

Link: https://lkml.kernel.org/r/20260120175913.34368-1-epetron@amazon.de
Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers").
Signed-off-by: Evangelos Petrongonas &lt;epetron@amazon.de&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 427b2535f51342de3156babc6bdc3f3b7dd2c707 ]

kho_reserve_scratch() iterates over all online NUMA nodes to allocate
per-node scratch memory.  On systems with memoryless NUMA nodes (nodes
that have CPUs but no memory), memblock_alloc_range_nid() fails because
there is no memory available on that node.  This causes KHO initialization
to fail and kho_enable to be set to false.

Some ARM64 systems have NUMA topologies where certain nodes contain only
CPUs without any associated memory.  These configurations are valid and
should not prevent KHO from functioning.

Fix this by only counting nodes that have memory (N_MEMORY state) and skip
memoryless nodes in the per-node scratch allocation loop.

Link: https://lkml.kernel.org/r/20260120175913.34368-1-epetron@amazon.de
Fixes: 3dc92c311498 ("kexec: add Kexec HandOver (KHO) generation helpers").
Signed-off-by: Evangelos Petrongonas &lt;epetron@amazon.de&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: warn and exit when unpreserved page wasn't preserved</title>
<updated>2025-11-10T05:19:47+00:00</updated>
<author>
<name>Pratyush Yadav</name>
<email>pratyush@kernel.org</email>
</author>
<published>2025-11-03T18:02:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b05addf6f0596edb1f82ab4059438c7ef2d2686d'/>
<id>b05addf6f0596edb1f82ab4059438c7ef2d2686d</id>
<content type='text'>
Calling __kho_unpreserve() on a pair of (pfn, end_pfn) that wasn't
preserved is a bug.  Currently, if that is done, the physxa or bits can be
NULL.  This results in a soft lockup since a NULL physxa or bits results
in redoing the loop without ever making any progress.

Return when physxa or bits are not found, but WARN first to loudly
indicate invalid behaviour.

Link: https://lkml.kernel.org/r/20251103180235.71409-3-pratyush@kernel.org
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Calling __kho_unpreserve() on a pair of (pfn, end_pfn) that wasn't
preserved is a bug.  Currently, if that is done, the physxa or bits can be
NULL.  This results in a soft lockup since a NULL physxa or bits results
in redoing the loop without ever making any progress.

Return when physxa or bits are not found, but WARN first to loudly
indicate invalid behaviour.

Link: https://lkml.kernel.org/r/20251103180235.71409-3-pratyush@kernel.org
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: fix unpreservation of higher-order vmalloc preservations</title>
<updated>2025-11-10T05:19:47+00:00</updated>
<author>
<name>Pratyush Yadav</name>
<email>pratyush@kernel.org</email>
</author>
<published>2025-11-03T18:02:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7ecd2e439d1272ac02d798b0033a426e3b00dff5'/>
<id>7ecd2e439d1272ac02d798b0033a426e3b00dff5</id>
<content type='text'>
kho_vmalloc_unpreserve_chunk() calls __kho_unpreserve() with end_pfn as
pfn + 1.  This happens to work for 0-order pages, but leaks higher order
pages.

For example, say order 2 pages back the allocation.  During preservation,
they get preserved in the order 2 bitmaps, but
kho_vmalloc_unpreserve_chunk() would try to unpreserve them from the order
0 bitmaps, which should not have these bits set anyway, leaving the order
2 bitmaps untouched.  This results in the pages being carried over to the
next kernel.  Nothing will free those pages in the next boot, leaking
them.

Fix this by taking the order into account when calculating the end PFN for
__kho_unpreserve().

Link: https://lkml.kernel.org/r/20251103180235.71409-2-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kho_vmalloc_unpreserve_chunk() calls __kho_unpreserve() with end_pfn as
pfn + 1.  This happens to work for 0-order pages, but leaks higher order
pages.

For example, say order 2 pages back the allocation.  During preservation,
they get preserved in the order 2 bitmaps, but
kho_vmalloc_unpreserve_chunk() would try to unpreserve them from the order
0 bitmaps, which should not have these bits set anyway, leaving the order
2 bitmaps untouched.  This results in the pages being carried over to the
next kernel.  Nothing will free those pages in the next boot, leaking
them.

Fix this by taking the order into account when calculating the end PFN for
__kho_unpreserve().

Link: https://lkml.kernel.org/r/20251103180235.71409-2-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: fix out-of-bounds access of vmalloc chunk</title>
<updated>2025-11-10T05:19:47+00:00</updated>
<author>
<name>Pratyush Yadav</name>
<email>pratyush@kernel.org</email>
</author>
<published>2025-11-03T11:01:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b07092d09e54e49b85379a9c60f82d54a881514'/>
<id>0b07092d09e54e49b85379a9c60f82d54a881514</id>
<content type='text'>
The list of pages in a vmalloc chunk is NULL-terminated.  So when looping
through the pages in a vmalloc chunk, both kho_restore_vmalloc() and
kho_vmalloc_unpreserve_chunk() rightly make sure to stop when encountering
a NULL page.  But when the chunk is full, the loops do not stop and go
past the bounds of chunk-&gt;phys, resulting in out-of-bounds memory access,
and possibly the restoration or unpreservation of an invalid page.

Fix this by making sure the processing of chunk stops at the end of the
array.

Link: https://lkml.kernel.org/r/20251103110159.8399-1-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The list of pages in a vmalloc chunk is NULL-terminated.  So when looping
through the pages in a vmalloc chunk, both kho_restore_vmalloc() and
kho_vmalloc_unpreserve_chunk() rightly make sure to stop when encountering
a NULL page.  But when the chunk is full, the loops do not stop and go
past the bounds of chunk-&gt;phys, resulting in out-of-bounds memory access,
and possibly the restoration or unpreservation of an invalid page.

Fix this by making sure the processing of chunk stops at the end of the
array.

Link: https://lkml.kernel.org/r/20251103110159.8399-1-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: allocate metadata directly from the buddy allocator</title>
<updated>2025-11-10T05:19:42+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2025-10-21T00:08:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fa759cd75bce5489eed34596daa53f721849a86f'/>
<id>fa759cd75bce5489eed34596daa53f721849a86f</id>
<content type='text'>
KHO allocates metadata for its preserved memory map using the slab
allocator via kzalloc().  This metadata is temporary and is used by the
next kernel during early boot to find preserved memory.

A problem arises when KFENCE is enabled.  kzalloc() calls can be randomly
intercepted by kfence_alloc(), which services the allocation from a
dedicated KFENCE memory pool.  This pool is allocated early in boot via
memblock.

When booting via KHO, the memblock allocator is restricted to a "scratch
area", forcing the KFENCE pool to be allocated within it.  This creates a
conflict, as the scratch area is expected to be ephemeral and
overwriteable by a subsequent kexec.  If KHO metadata is placed in this
KFENCE pool, it leads to memory corruption when the next kernel is loaded.

To fix this, modify KHO to allocate its metadata directly from the buddy
allocator instead of slab.

Link: https://lkml.kernel.org/r/20251021000852.2924827-4-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: David Matlack &lt;dmatlack@google.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
KHO allocates metadata for its preserved memory map using the slab
allocator via kzalloc().  This metadata is temporary and is used by the
next kernel during early boot to find preserved memory.

A problem arises when KFENCE is enabled.  kzalloc() calls can be randomly
intercepted by kfence_alloc(), which services the allocation from a
dedicated KFENCE memory pool.  This pool is allocated early in boot via
memblock.

When booting via KHO, the memblock allocator is restricted to a "scratch
area", forcing the KFENCE pool to be allocated within it.  This creates a
conflict, as the scratch area is expected to be ephemeral and
overwriteable by a subsequent kexec.  If KHO metadata is placed in this
KFENCE pool, it leads to memory corruption when the next kernel is loaded.

To fix this, modify KHO to allocate its metadata directly from the buddy
allocator instead of slab.

Link: https://lkml.kernel.org/r/20251021000852.2924827-4-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: David Matlack &lt;dmatlack@google.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: increase metadata bitmap size to PAGE_SIZE</title>
<updated>2025-11-10T05:19:41+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2025-10-21T00:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2fff99f92dae9c0eaf0d75de3def70ec68dad92'/>
<id>a2fff99f92dae9c0eaf0d75de3def70ec68dad92</id>
<content type='text'>
KHO memory preservation metadata is preserved in 512 byte chunks which
requires their allocation from slab allocator.  Slabs are not safe to be
used with KHO because of kfence, and because partial slabs may lead leaks
to the next kernel.  Change the size to be PAGE_SIZE.

The kfence specifically may cause memory corruption, where it randomly
provides slab objects that can be within the scratch area.  The reason for
that is that kfence allocates its objects prior to KHO scratch is marked
as CMA region.

While this change could potentially increase metadata overhead on systems
with sparsely preserved memory, this is being mitigated by ongoing work to
reduce sparseness during preservation via 1G guest pages.  Furthermore,
this change aligns with future work on a stateless KHO, which will also
use page-sized bitmaps for its radix tree metadata.

Link: https://lkml.kernel.org/r/20251021000852.2924827-3-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Matlack &lt;dmatlack@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
KHO memory preservation metadata is preserved in 512 byte chunks which
requires their allocation from slab allocator.  Slabs are not safe to be
used with KHO because of kfence, and because partial slabs may lead leaks
to the next kernel.  Change the size to be PAGE_SIZE.

The kfence specifically may cause memory corruption, where it randomly
provides slab objects that can be within the scratch area.  The reason for
that is that kfence allocates its objects prior to KHO scratch is marked
as CMA region.

While this change could potentially increase metadata overhead on systems
with sparsely preserved memory, this is being mitigated by ongoing work to
reduce sparseness during preservation via 1G guest pages.  Furthermore,
this change aligns with future work on a stateless KHO, which will also
use page-sized bitmaps for its radix tree metadata.

Link: https://lkml.kernel.org/r/20251021000852.2924827-3-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Matlack &lt;dmatlack@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: warn and fail on metadata or preserved memory in scratch area</title>
<updated>2025-11-10T05:19:41+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2025-10-21T00:08:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e38f65d317df1fd2dcafe614d9c537475ecf9992'/>
<id>e38f65d317df1fd2dcafe614d9c537475ecf9992</id>
<content type='text'>
Patch series "KHO: kfence + KHO memory corruption fix", v3.

This series fixes a memory corruption bug in KHO that occurs when KFENCE
is enabled.

The root cause is that KHO metadata, allocated via kzalloc(), can be
randomly serviced by kfence_alloc().  When a kernel boots via KHO, the
early memblock allocator is restricted to a "scratch area".  This forces
the KFENCE pool to be allocated within this scratch area, creating a
conflict.  If KHO metadata is subsequently placed in this pool, it gets
corrupted during the next kexec operation.

Google is using KHO and have had obscure crashes due to this memory
corruption, with stacks all over the place.  I would prefer this fix to be
properly backported to stable so we can also automatically consume it once
we switch to the upstream KHO.

Patch 1/3 introduces a debug-only feature (CONFIG_KEXEC_HANDOVER_DEBUG)
that adds checks to detect and fail any operation that attempts to place
KHO metadata or preserved memory within the scratch area.  This serves as
a validation and diagnostic tool to confirm the problem without affecting
production builds.

Patch 2/3 Increases bitmap to PAGE_SIZE, so buddy allocator can be used.

Patch 3/3 Provides the fix by modifying KHO to allocate its metadata
directly from the buddy allocator instead of slab.  This bypasses the
KFENCE interception entirely.


This patch (of 3):

It is invalid for KHO metadata or preserved memory regions to be located
within the KHO scratch area, as this area is overwritten when the next
kernel is loaded, and used early in boot by the next kernel.  This can
lead to memory corruption.

Add checks to kho_preserve_* and KHO's internal metadata allocators
(xa_load_or_alloc, new_chunk) to verify that the physical address of the
memory does not overlap with any defined scratch region.  If an overlap is
detected, the operation will fail and a WARN_ON is triggered.  To avoid
performance overhead in production kernels, these checks are enabled only
when CONFIG_KEXEC_HANDOVER_DEBUG is selected.

[rppt@kernel.org: fix KEXEC_HANDOVER_DEBUG Kconfig dependency]
  Link: https://lkml.kernel.org/r/aQHUyyFtiNZhx8jo@kernel.org
[pasha.tatashin@soleen.com: build fix]
  Link: https://lkml.kernel.org/r/CA+CK2bBnorfsTymKtv4rKvqGBHs=y=MjEMMRg_tE-RME6n-zUw@mail.gmail.com
Link: https://lkml.kernel.org/r/20251021000852.2924827-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20251021000852.2924827-2-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Mike Rapoport &lt;rppt@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Matlack &lt;dmatlack@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "KHO: kfence + KHO memory corruption fix", v3.

This series fixes a memory corruption bug in KHO that occurs when KFENCE
is enabled.

The root cause is that KHO metadata, allocated via kzalloc(), can be
randomly serviced by kfence_alloc().  When a kernel boots via KHO, the
early memblock allocator is restricted to a "scratch area".  This forces
the KFENCE pool to be allocated within this scratch area, creating a
conflict.  If KHO metadata is subsequently placed in this pool, it gets
corrupted during the next kexec operation.

Google is using KHO and have had obscure crashes due to this memory
corruption, with stacks all over the place.  I would prefer this fix to be
properly backported to stable so we can also automatically consume it once
we switch to the upstream KHO.

Patch 1/3 introduces a debug-only feature (CONFIG_KEXEC_HANDOVER_DEBUG)
that adds checks to detect and fail any operation that attempts to place
KHO metadata or preserved memory within the scratch area.  This serves as
a validation and diagnostic tool to confirm the problem without affecting
production builds.

Patch 2/3 Increases bitmap to PAGE_SIZE, so buddy allocator can be used.

Patch 3/3 Provides the fix by modifying KHO to allocate its metadata
directly from the buddy allocator instead of slab.  This bypasses the
KFENCE interception entirely.


This patch (of 3):

It is invalid for KHO metadata or preserved memory regions to be located
within the KHO scratch area, as this area is overwritten when the next
kernel is loaded, and used early in boot by the next kernel.  This can
lead to memory corruption.

Add checks to kho_preserve_* and KHO's internal metadata allocators
(xa_load_or_alloc, new_chunk) to verify that the physical address of the
memory does not overlap with any defined scratch region.  If an overlap is
detected, the operation will fail and a WARN_ON is triggered.  To avoid
performance overhead in production kernels, these checks are enabled only
when CONFIG_KEXEC_HANDOVER_DEBUG is selected.

[rppt@kernel.org: fix KEXEC_HANDOVER_DEBUG Kconfig dependency]
  Link: https://lkml.kernel.org/r/aQHUyyFtiNZhx8jo@kernel.org
[pasha.tatashin@soleen.com: build fix]
  Link: https://lkml.kernel.org/r/CA+CK2bBnorfsTymKtv4rKvqGBHs=y=MjEMMRg_tE-RME6n-zUw@mail.gmail.com
Link: https://lkml.kernel.org/r/20251021000852.2924827-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20251021000852.2924827-2-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Mike Rapoport &lt;rppt@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Matlack &lt;dmatlack@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: add support for preserving vmalloc allocations</title>
<updated>2025-10-07T20:48:55+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-09-21T05:44:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a667300bd53f272a3055238bcefe108f88836270'/>
<id>a667300bd53f272a3055238bcefe108f88836270</id>
<content type='text'>
A vmalloc allocation is preserved using binary structure similar to global
KHO memory tracker.  It's a linked list of pages where each page is an
array of physical address of pages in vmalloc area.

kho_preserve_vmalloc() hands out the physical address of the head page to
the caller.  This address is used as the argument to kho_vmalloc_restore()
to restore the mapping in the vmalloc address space and populate it with
the preserved pages.

[pasha.tatashin@soleen.com: free chunks using free_page() not kfree()]
  Link: https://lkml.kernel.org/r/mafs0a52idbeg.fsf@kernel.org
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20250921054458.4043761-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A vmalloc allocation is preserved using binary structure similar to global
KHO memory tracker.  It's a linked list of pages where each page is an
array of physical address of pages in vmalloc area.

kho_preserve_vmalloc() hands out the physical address of the head page to
the caller.  This address is used as the argument to kho_vmalloc_restore()
to restore the mapping in the vmalloc address space and populate it with
the preserved pages.

[pasha.tatashin@soleen.com: free chunks using free_page() not kfree()]
  Link: https://lkml.kernel.org/r/mafs0a52idbeg.fsf@kernel.org
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20250921054458.4043761-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: replace kho_preserve_phys() with kho_preserve_pages()</title>
<updated>2025-10-07T20:48:55+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-09-21T05:44:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8375b76517cb52bac0903071feedc218c45d74d2'/>
<id>8375b76517cb52bac0903071feedc218c45d74d2</id>
<content type='text'>
to make it clear that KHO operates on pages rather than on a random
physical address.

The kho_preserve_pages() will be also used in upcoming support for vmalloc
preservation.

Link: https://lkml.kernel.org/r/20250921054458.4043761-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
to make it clear that KHO operates on pages rather than on a random
physical address.

The kho_preserve_pages() will be also used in upcoming support for vmalloc
preservation.

Link: https://lkml.kernel.org/r/20250921054458.4043761-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kho: check if kho is finalized in __kho_preserve_order()</title>
<updated>2025-10-07T20:48:55+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-09-21T05:44:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=469661d0d3a55a7ba1e7cb847c26baf78cace086'/>
<id>469661d0d3a55a7ba1e7cb847c26baf78cace086</id>
<content type='text'>
Patch series "kho: add support for preserving vmalloc allocations", v5.

Following the discussion about preservation of memfd with LUO [1] these
patches add support for preserving vmalloc allocations.

Any KHO uses case presumes that there's a data structure that lists
physical addresses of preserved folios (and potentially some additional
metadata).  Allowing vmalloc preservations with KHO allows scalable
preservation of such data structures.

For instance, instead of allocating array describing preserved folios in
the fdt, memfd preservation can use vmalloc:

        preserved_folios = vmalloc_array(nr_folios, sizeof(*preserved_folios));
        memfd_luo_preserve_folios(preserved_folios, folios, nr_folios);
        kho_preserve_vmalloc(preserved_folios, &amp;folios_info);


This patch (of 4):

Instead of checking if kho is finalized in each caller of
__kho_preserve_order(), do it in the core function itself.

Link: https://lkml.kernel.org/r/20250921054458.4043761-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20250921054458.4043761-2-rppt@kernel.org
Link: https://lore.kernel.org/all/20250807014442.3829950-30-pasha.tatashin@soleen.com [1]
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "kho: add support for preserving vmalloc allocations", v5.

Following the discussion about preservation of memfd with LUO [1] these
patches add support for preserving vmalloc allocations.

Any KHO uses case presumes that there's a data structure that lists
physical addresses of preserved folios (and potentially some additional
metadata).  Allowing vmalloc preservations with KHO allows scalable
preservation of such data structures.

For instance, instead of allocating array describing preserved folios in
the fdt, memfd preservation can use vmalloc:

        preserved_folios = vmalloc_array(nr_folios, sizeof(*preserved_folios));
        memfd_luo_preserve_folios(preserved_folios, folios, nr_folios);
        kho_preserve_vmalloc(preserved_folios, &amp;folios_info);


This patch (of 4):

Instead of checking if kho is finalized in each caller of
__kho_preserve_order(), do it in the core function itself.

Link: https://lkml.kernel.org/r/20250921054458.4043761-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20250921054458.4043761-2-rppt@kernel.org
Link: https://lore.kernel.org/all/20250807014442.3829950-30-pasha.tatashin@soleen.com [1]
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
