<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/tools/testing/memblock, branch master</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'liveupdate-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux</title>
<updated>2026-06-21T16:46:14+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-21T16:46:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d639d9fa162aadec1ae9980c4dcf6e50bd2f8290'/>
<id>d639d9fa162aadec1ae9980c4dcf6e50bd2f8290</id>
<content type='text'>
Pull liveupdate updates from Mike Rapoport:
 "Kexec Handover (KHO):

   - make memory preservation compatible with deferred initialization
     of the memory map

  Live Update Orchestrator (LUO):

   - add LIVEUPDATE_SESSION_GET_NAME ioctl and parameter verification
     for LIVEUPDATE_IOCTL_CREATE_SESSION ioctl

   - documentation updates for liveupdate=on command line option,
     systemd support and the current compatibility status

   - remove the fixed limits on the number of files that can be
     preserved within a single session, and the total number of
     sessions managed by the LUO

  Misc fixes:

   - reference count incoming File-Lifecycle-Bound (FLB) data so
     it cannot be freed while a subsystem is still using it

   - fixes for a TOCTOU race in luo_session_retrieve(), a use-
     after-free in the file finish and unpreserve paths, concurrent
     session mutations during reboot and serialization on
     preserve_context kexec

   - make sure ioctls for incoming LUO sessions are blocked for
     outgoing sessions and vice versa

   - make sure KHO scratch size is always aligned by
     CMA_MIN_ALIGNMENT_BYTES

   - fix memblock tests build issue introduced by KHO changes"

* tag 'liveupdate-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux: (36 commits)
  liveupdate: Document that retrieve failure is permanent
  docs: memfd_preservation: fix rendering of ABI documentation
  selftests/liveupdate: Add stress-files kexec test
  selftests/liveupdate: Add stress-sessions kexec test
  selftests/liveupdate: Test session and file limit removal
  liveupdate: Remove limit on the number of files per session
  liveupdate: Remove limit on the number of sessions
  liveupdate: defer session block allocation and physical address setting
  kho: add support for linked-block serialization
  liveupdate: Extract luo_session_deserialize_one helper
  liveupdate: Extract luo_file_deserialize_one helper
  liveupdate: register luo_ser as KHO subtree
  liveupdate: centralize state management into struct luo_ser
  liveupdate: avoid mixing cleanup guards with goto in luo_session_retrieve_fd
  liveupdate: change file_set-&gt;count type to u64 for type safety
  liveupdate: Remove unused ser field from struct luo_session
  liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish()
  liveupdate: block session mutations during reboot
  liveupdate: fix TOCTOU race in luo_session_retrieve()
  liveupdate: skip serialization for context-preserving kexec
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull liveupdate updates from Mike Rapoport:
 "Kexec Handover (KHO):

   - make memory preservation compatible with deferred initialization
     of the memory map

  Live Update Orchestrator (LUO):

   - add LIVEUPDATE_SESSION_GET_NAME ioctl and parameter verification
     for LIVEUPDATE_IOCTL_CREATE_SESSION ioctl

   - documentation updates for liveupdate=on command line option,
     systemd support and the current compatibility status

   - remove the fixed limits on the number of files that can be
     preserved within a single session, and the total number of
     sessions managed by the LUO

  Misc fixes:

   - reference count incoming File-Lifecycle-Bound (FLB) data so
     it cannot be freed while a subsystem is still using it

   - fixes for a TOCTOU race in luo_session_retrieve(), a use-
     after-free in the file finish and unpreserve paths, concurrent
     session mutations during reboot and serialization on
     preserve_context kexec

   - make sure ioctls for incoming LUO sessions are blocked for
     outgoing sessions and vice versa

   - make sure KHO scratch size is always aligned by
     CMA_MIN_ALIGNMENT_BYTES

   - fix memblock tests build issue introduced by KHO changes"

* tag 'liveupdate-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux: (36 commits)
  liveupdate: Document that retrieve failure is permanent
  docs: memfd_preservation: fix rendering of ABI documentation
  selftests/liveupdate: Add stress-files kexec test
  selftests/liveupdate: Add stress-sessions kexec test
  selftests/liveupdate: Test session and file limit removal
  liveupdate: Remove limit on the number of files per session
  liveupdate: Remove limit on the number of sessions
  liveupdate: defer session block allocation and physical address setting
  kho: add support for linked-block serialization
  liveupdate: Extract luo_session_deserialize_one helper
  liveupdate: Extract luo_file_deserialize_one helper
  liveupdate: register luo_ser as KHO subtree
  liveupdate: centralize state management into struct luo_ser
  liveupdate: avoid mixing cleanup guards with goto in luo_session_retrieve_fd
  liveupdate: change file_set-&gt;count type to u64 for type safety
  liveupdate: Remove unused ser field from struct luo_session
  liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish()
  liveupdate: block session mutations during reboot
  liveupdate: fix TOCTOU race in luo_session_retrieve()
  liveupdate: skip serialization for context-preserving kexec
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/testing/memblock: fix stale NUMA reservation tests</title>
<updated>2026-06-02T05:34:03+00:00</updated>
<author>
<name>Priyanshu Kumar</name>
<email>priyanshukumarpu@gmail.com</email>
</author>
<published>2026-04-15T12:27:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9588076fb21992a5d14efeb99134ebf034c7b84f'/>
<id>9588076fb21992a5d14efeb99134ebf034c7b84f</id>
<content type='text'>
memblock allocations now reserve memory with MEMBLOCK_RSRV_KERN and,
on NUMA configurations, record the requested node on the reserved
region. Several memblock simulator NUMA tests still expected merges
that only worked before those reservation semantics changed, so the
suite aborted even though the allocator behavior was correct.

Update the NUMA merge expectations in the memblock_alloc_try_nid()
and memblock_alloc_exact_nid_raw() tests to match the current reserved
region metadata rules. For cases that should still merge, create the
pre-existing reservation with matching nid and MEMBLOCK_RSRV_KERN
metadata. Also strengthen the memblock_alloc_node() coverage by
checking the newly created reserved region directly instead of
re-reading the source memory node descriptor.

Finally, drop the stale README/TODO notes that still claimed
memblock_alloc_node() could not be tested.

The memblock simulator passes again with NUMA enabled after these
updates.

Signed-off-by: Priyanshu Kumar &lt;priyanshukumarpu@gmail.com&gt;
Link: https://patch.msgid.link/20260415122731.1768912-1-priyanshukumarpu@gmail.com
[rppt: dropped unrelated changes]
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
memblock allocations now reserve memory with MEMBLOCK_RSRV_KERN and,
on NUMA configurations, record the requested node on the reserved
region. Several memblock simulator NUMA tests still expected merges
that only worked before those reservation semantics changed, so the
suite aborted even though the allocator behavior was correct.

Update the NUMA merge expectations in the memblock_alloc_try_nid()
and memblock_alloc_exact_nid_raw() tests to match the current reserved
region metadata rules. For cases that should still merge, create the
pre-existing reservation with matching nid and MEMBLOCK_RSRV_KERN
metadata. Also strengthen the memblock_alloc_node() coverage by
checking the newly created reserved region directly instead of
re-reading the source memory node descriptor.

Finally, drop the stale README/TODO notes that still claimed
memblock_alloc_node() could not be tested.

The memblock simulator passes again with NUMA enabled after these
updates.

Signed-off-by: Priyanshu Kumar &lt;priyanshukumarpu@gmail.com&gt;
Link: https://patch.msgid.link/20260415122731.1768912-1-priyanshukumarpu@gmail.com
[rppt: dropped unrelated changes]
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock tests: define MIGRATE_CMA</title>
<updated>2026-05-31T23:31:38+00:00</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-05-04T10:27:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=051a224c4933e58a0592c5528e89831099c65d6b'/>
<id>051a224c4933e58a0592c5528e89831099c65d6b</id>
<content type='text'>
kho_scratch_migratetype(), defined in include/linux/memblock.h uses enum
migratetype. This breaks build for memblock tests with:

./linux/memblock.h:634:73: error: parameter 2 (‘mt’) has incomplete type
  634 |                                                        enum migratetype mt)

Fix it by defining enum migratetype and MIGRATE_CMA. As is the case with
the other headers in tools/testing/memblock, do not bring in the whole
thing, only what is needed.

Reported-by: Mike Rapoport &lt;rppt@kernel.org&gt;
Closes: https://lore.kernel.org/linux-mm/afcdDm4aAJvNaQqH@kernel.org/
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Link: https://patch.msgid.link/20260504102742.3833159-1-pratyush@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kho_scratch_migratetype(), defined in include/linux/memblock.h uses enum
migratetype. This breaks build for memblock tests with:

./linux/memblock.h:634:73: error: parameter 2 (‘mt’) has incomplete type
  634 |                                                        enum migratetype mt)

Fix it by defining enum migratetype and MIGRATE_CMA. As is the case with
the other headers in tools/testing/memblock, do not bring in the whole
thing, only what is needed.

Reported-by: Mike Rapoport &lt;rppt@kernel.org&gt;
Closes: https://lore.kernel.org/linux-mm/afcdDm4aAJvNaQqH@kernel.org/
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Link: https://patch.msgid.link/20260504102742.3833159-1-pratyush@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: move free_reserved_area() to mm/memblock.c</title>
<updated>2026-04-01T08:19:46+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2026-03-23T07:48:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0510bdab538e2af07a67bc58a0c6c4547b83f8d5'/>
<id>0510bdab538e2af07a67bc58a0c6c4547b83f8d5</id>
<content type='text'>
free_reserved_area() is related to memblock as it frees reserved memory
back to the buddy allocator, similar to what memblock_free_late() does.

Move free_reserved_area() to mm/memblock.c to prepare for further
consolidation of the functions that free reserved memory.

No functional changes.

Link: https://patch.msgid.link/20260323074836.3653702-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
free_reserved_area() is related to memblock as it frees reserved memory
back to the buddy allocator, similar to what memblock_free_late() does.

Move free_reserved_area() to mm/memblock.c to prepare for further
consolidation of the functions that free reserved memory.

No functional changes.

Link: https://patch.msgid.link/20260323074836.3653702-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock: move reserve_bootmem_range() to memblock.c and make it static</title>
<updated>2026-04-01T08:19:45+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2026-03-23T07:20:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8b7b85384fad6e21e8a28628e7ebacb5a6329de4'/>
<id>8b7b85384fad6e21e8a28628e7ebacb5a6329de4</id>
<content type='text'>
reserve_bootmem_region() is only called from
memmap_init_reserved_pages() and it was in mm/mm_init.c because of its
dependecies on static init_deferred_page().

Since init_deferred_page() is not static anymore, move
reserve_bootmem_region(), rename it to memmap_init_reserved_range() and
make it static.

Update the comment describing it to better reflect what the function
does and drop bogus comment about reserved pages in free_bootmem_page().

Update memblock test stubs to reflect the core changes.

Reviewed-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Reviewed-by: David Hildenbrand (Arm) &lt;david@kernel.org&gt;
Link: https://patch.msgid.link/20260323072042.3651061-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
reserve_bootmem_region() is only called from
memmap_init_reserved_pages() and it was in mm/mm_init.c because of its
dependecies on static init_deferred_page().

Since init_deferred_page() is not static anymore, move
reserve_bootmem_region(), rename it to memmap_init_reserved_range() and
make it static.

Update the comment describing it to better reflect what the function
does and drop bogus comment about reserved pages in free_bootmem_page().

Update memblock test stubs to reflect the core changes.

Reviewed-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Reviewed-by: David Hildenbrand (Arm) &lt;david@kernel.org&gt;
Link: https://patch.msgid.link/20260323072042.3651061-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock: Add reserve_mem debugfs info</title>
<updated>2026-04-01T08:19:43+00:00</updated>
<author>
<name>Guilherme G. Piccoli</name>
<email>gpiccoli@igalia.com</email>
</author>
<published>2026-03-24T01:22:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0709682cdb4ac77e3f78ea9c10d7f74b41a12518'/>
<id>0709682cdb4ac77e3f78ea9c10d7f74b41a12518</id>
<content type='text'>
When using the "reserve_mem" parameter, users aim at having an
area that (hopefully) persists across boots, so pstore infrastructure
(like ramoops module) can make use of that to save oops/ftrace logs,
for example.

There is no easy way to determine if this kernel parameter is properly
set though; the kernel doesn't show information about this memory in
memblock debugfs, neither in /proc/iomem nor dmesg. This is a relevant
information for tools like kdumpst[0], to determine if it's reliable
to use the reserved area as ramoops persistent storage; checking only
/proc/cmdline is not sufficient as it doesn't tell if the reservation
effectively succeeded or not.

Add here a new file under memblock debugfs showing properly set memory
reservations, with name and size as passed to "reserve_mem". Notice that
if no "reserve_mem=" is passed on command-line or if the reservation
attempts fail, the file is not created.

[0] https://aur.archlinux.org/packages/kdumpst

Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt;
Signed-off-by: Guilherme G. Piccoli &lt;gpiccoli@igalia.com&gt;
Link: https://patch.msgid.link/20260324012839.1991765-2-gpiccoli@igalia.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When using the "reserve_mem" parameter, users aim at having an
area that (hopefully) persists across boots, so pstore infrastructure
(like ramoops module) can make use of that to save oops/ftrace logs,
for example.

There is no easy way to determine if this kernel parameter is properly
set though; the kernel doesn't show information about this memory in
memblock debugfs, neither in /proc/iomem nor dmesg. This is a relevant
information for tools like kdumpst[0], to determine if it's reliable
to use the reserved area as ramoops persistent storage; checking only
/proc/cmdline is not sufficient as it doesn't tell if the reservation
effectively succeeded or not.

Add here a new file under memblock debugfs showing properly set memory
reservations, with name and size as passed to "reserve_mem". Notice that
if no "reserve_mem=" is passed on command-line or if the reservation
attempts fail, the file is not created.

[0] https://aur.archlinux.org/packages/kdumpst

Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt;
Signed-off-by: Guilherme G. Piccoli &lt;gpiccoli@igalia.com&gt;
Link: https://patch.msgid.link/20260324012839.1991765-2-gpiccoli@igalia.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock: drop redundant 'struct page *' argument from memblock_free_pages()</title>
<updated>2026-01-09T09:53:51+00:00</updated>
<author>
<name>Shengming Hu</name>
<email>hu.shengming@zte.com.cn</email>
</author>
<published>2025-12-29T13:52:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=58e3e5265484a1bf39569903630a45a924621aaa'/>
<id>58e3e5265484a1bf39569903630a45a924621aaa</id>
<content type='text'>
memblock_free_pages() currently takes both a struct page * and the
corresponding PFN. The page pointer is always derived from the PFN at
call sites (pfn_to_page(pfn)), making the parameter redundant and also
allowing accidental mismatches between the two arguments.

Simplify the interface by removing the struct page * argument and
deriving the page locally from the PFN, after the deferred struct page
initialization check. This keeps the behavior unchanged while making
the helper harder to misuse.

Signed-off-by: Shengming Hu &lt;hu.shengming@zte.com.cn&gt;
Reviewed-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Link: https://patch.msgid.link/tencent_F741CE6ECC49EE099736685E60C0DBD4A209@qq.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
memblock_free_pages() currently takes both a struct page * and the
corresponding PFN. The page pointer is always derived from the PFN at
call sites (pfn_to_page(pfn)), making the parameter redundant and also
allowing accidental mismatches between the two arguments.

Simplify the interface by removing the struct page * argument and
deriving the page locally from the PFN, after the deferred struct page
initialization check. This keeps the behavior unchanged while making
the helper harder to misuse.

Signed-off-by: Shengming Hu &lt;hu.shengming@zte.com.cn&gt;
Reviewed-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Link: https://patch.msgid.link/tencent_F741CE6ECC49EE099736685E60C0DBD4A209@qq.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock: add MEMBLOCK_RSRV_KERN flag</title>
<updated>2025-05-13T06:50:38+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-05-09T07:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4c78cc596bb8d39532f059e0198eeabf370c50f5'/>
<id>4c78cc596bb8d39532f059e0198eeabf370c50f5</id>
<content type='text'>
Patch series "kexec: introduce Kexec HandOver (KHO)", v8.

Kexec today considers itself purely a boot loader: When we enter the new
kernel, any state the previous kernel left behind is irrelevant and the
new kernel reinitializes the system.

However, there are use cases where this mode of operation is not what we
actually want.  In virtualization hosts for example, we want to use kexec
to update the host kernel while virtual machine memory stays untouched. 
When we add device assignment to the mix, we also need to ensure that
IOMMU and VFIO states are untouched.  If we add PCIe peer to peer DMA, we
need to do the same for the PCI subsystem.  If we want to kexec while an
SEV-SNP enabled virtual machine is running, we need to preserve the VM
context pages and physical memory.  See "pkernfs: Persisting guest memory
and kernel/device state safely across kexec" Linux Plumbers Conference
2023 presentation for details:

  https://lpc.events/event/17/contributions/1485/

To start us on the journey to support all the use cases above, this patch
implements basic infrastructure to allow hand over of kernel state across
kexec (Kexec HandOver, aka KHO).  As a really simple example target, we
use memblock's reserve_mem.

With this patchset applied, memory that was reserved using "reserve_mem"
command line options remains intact after kexec and it is guaranteed to
reside at the same physical address.

== Alternatives ==

There are alternative approaches to (parts of) the problems above:

  * Memory Pools [1] - preallocated persistent memory region + allocator
  * PRMEM [2] - resizable persistent memory regions with fixed metadata
                pointer on the kernel command line + allocator
  * Pkernfs [3] - preallocated file system for in-kernel data with fixed
                  address location on the kernel command line
  * PKRAM [4] - handover of user space pages using a fixed metadata page
                specified via command line

All of the approaches above fundamentally have the same problem: They
require the administrator to explicitly carve out a physical memory
location because they have no mechanism outside of the kernel command line
to pass data (including memory reservations) between kexec'ing kernels.

KHO provides that base foundation.  We will determine later whether we
still need any of the approaches above for fast bulk memory handover of
for example IOMMU page tables.  But IMHO they would all be users of KHO,
with KHO providing the foundational primitive to pass metadata and bulk
memory reservations as well as provide easy versioning for data.

== Overview ==

We introduce a metadata file that the kernels pass between each other. 
How they pass it is architecture specific.  The file's format is a
Flattened Device Tree (fdt) which has a generator and parser already
included in Linux.  KHO is enabled in the kernel command line by `kho=on`.
When the root user enables KHO through
/sys/kernel/debug/kho/out/finalize, the kernel invokes callbacks to every
KHO users to register preserved memory regions, which contain drivers'
states.

When the actual kexec happens, the fdt is part of the image set that we
boot into.  In addition, we keep "scratch regions" available for kexec:
physically contiguous memory regions that are guaranteed to not have any
memory that KHO would preserve.  The new kernel bootstraps itself using
the scratch regions and sets all handed over memory as in use.  When
drivers initialize that support KHO, they introspect the fdt, restore
preserved memory regions, and retrieve their states stored in the
preserved memory.

== Limitations ==

Currently KHO is only implemented for file based kexec.  The kernel
interfaces in the patch set are already in place to support user space
kexec as well, but it is still not implemented it yet inside kexec tools.

== How to Use ==

To use the code, please boot the kernel with the "kho=on" command line
parameter.  KHO will automatically create scratch regions.  If you want to
set the scratch size explicitly you can use "kho_scratch=" command line
parameter.  For instance, "kho_scratch=16M,512M,256M" will reserve a 16
MiB low memory scratch area, a 512 MiB global scratch region, and 256 MiB
per NUMA node scratch regions on boot.

Make sure to have a reserved memory range requested with reserv_mem
command line option, for example, "reserve_mem=64m:4k:n1".

Then before you invoke file based "kexec -l", finalize KHO FDT:

  # echo 1 &gt; /sys/kernel/debug/kho/out/finalize

You can preview the generated FDT using `dtc`,

  # dtc /sys/kernel/debug/kho/out/fdt
  # dtc /sys/kernel/debug/kho/out/sub_fdts/memblock

`dtc` is available on ubuntu by `sudo apt-get install device-tree-compiler`.

Now kexec into the new kernel,

  # kexec -l Image --initrd=initrd -s
  # kexec -e

(The order of KHO finalization and "kexec -l" does not matter.)

The new kernel will boot up and contain the previous kernel's reserve_mem
contents at the same physical address as the first kernel.

You can also review the FDT passed from the old kernel,

  # dtc /sys/kernel/debug/kho/in/fdt
  # dtc /sys/kernel/debug/kho/in/sub_fdts/memblock


This patch (of 17):

To denote areas that were reserved for kernel use either directly with
memblock_reserve_kern() or via memblock allocations.

Link: https://lore.kernel.org/lkml/20250424083258.2228122-1-changyuanl@google.com/
Link: https://lore.kernel.org/lkml/aAeaJ2iqkrv_ffhT@kernel.org/
Link: https://lore.kernel.org/lkml/35c58191-f774-40cf-8d66-d1e2aaf11a62@intel.com/
Link: https://lore.kernel.org/lkml/20250424093302.3894961-1-arnd@kernel.org/
Link: https://lkml.kernel.org/r/20250509074635.3187114-1-changyuanl@google.com
Link: https://lkml.kernel.org/r/20250509074635.3187114-2-changyuanl@google.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Co-developed-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Signed-off-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Ben Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Gowans &lt;jgowans@amazon.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Cc: Marc Rutland &lt;mark.rutland@arm.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pratyush Yadav &lt;ptyadav@amazon.de&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Stanislav Kinsburskii &lt;skinsburskii@linux.microsoft.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Lendacky &lt;thomas.lendacky@amd.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "kexec: introduce Kexec HandOver (KHO)", v8.

Kexec today considers itself purely a boot loader: When we enter the new
kernel, any state the previous kernel left behind is irrelevant and the
new kernel reinitializes the system.

However, there are use cases where this mode of operation is not what we
actually want.  In virtualization hosts for example, we want to use kexec
to update the host kernel while virtual machine memory stays untouched. 
When we add device assignment to the mix, we also need to ensure that
IOMMU and VFIO states are untouched.  If we add PCIe peer to peer DMA, we
need to do the same for the PCI subsystem.  If we want to kexec while an
SEV-SNP enabled virtual machine is running, we need to preserve the VM
context pages and physical memory.  See "pkernfs: Persisting guest memory
and kernel/device state safely across kexec" Linux Plumbers Conference
2023 presentation for details:

  https://lpc.events/event/17/contributions/1485/

To start us on the journey to support all the use cases above, this patch
implements basic infrastructure to allow hand over of kernel state across
kexec (Kexec HandOver, aka KHO).  As a really simple example target, we
use memblock's reserve_mem.

With this patchset applied, memory that was reserved using "reserve_mem"
command line options remains intact after kexec and it is guaranteed to
reside at the same physical address.

== Alternatives ==

There are alternative approaches to (parts of) the problems above:

  * Memory Pools [1] - preallocated persistent memory region + allocator
  * PRMEM [2] - resizable persistent memory regions with fixed metadata
                pointer on the kernel command line + allocator
  * Pkernfs [3] - preallocated file system for in-kernel data with fixed
                  address location on the kernel command line
  * PKRAM [4] - handover of user space pages using a fixed metadata page
                specified via command line

All of the approaches above fundamentally have the same problem: They
require the administrator to explicitly carve out a physical memory
location because they have no mechanism outside of the kernel command line
to pass data (including memory reservations) between kexec'ing kernels.

KHO provides that base foundation.  We will determine later whether we
still need any of the approaches above for fast bulk memory handover of
for example IOMMU page tables.  But IMHO they would all be users of KHO,
with KHO providing the foundational primitive to pass metadata and bulk
memory reservations as well as provide easy versioning for data.

== Overview ==

We introduce a metadata file that the kernels pass between each other. 
How they pass it is architecture specific.  The file's format is a
Flattened Device Tree (fdt) which has a generator and parser already
included in Linux.  KHO is enabled in the kernel command line by `kho=on`.
When the root user enables KHO through
/sys/kernel/debug/kho/out/finalize, the kernel invokes callbacks to every
KHO users to register preserved memory regions, which contain drivers'
states.

When the actual kexec happens, the fdt is part of the image set that we
boot into.  In addition, we keep "scratch regions" available for kexec:
physically contiguous memory regions that are guaranteed to not have any
memory that KHO would preserve.  The new kernel bootstraps itself using
the scratch regions and sets all handed over memory as in use.  When
drivers initialize that support KHO, they introspect the fdt, restore
preserved memory regions, and retrieve their states stored in the
preserved memory.

== Limitations ==

Currently KHO is only implemented for file based kexec.  The kernel
interfaces in the patch set are already in place to support user space
kexec as well, but it is still not implemented it yet inside kexec tools.

== How to Use ==

To use the code, please boot the kernel with the "kho=on" command line
parameter.  KHO will automatically create scratch regions.  If you want to
set the scratch size explicitly you can use "kho_scratch=" command line
parameter.  For instance, "kho_scratch=16M,512M,256M" will reserve a 16
MiB low memory scratch area, a 512 MiB global scratch region, and 256 MiB
per NUMA node scratch regions on boot.

Make sure to have a reserved memory range requested with reserv_mem
command line option, for example, "reserve_mem=64m:4k:n1".

Then before you invoke file based "kexec -l", finalize KHO FDT:

  # echo 1 &gt; /sys/kernel/debug/kho/out/finalize

You can preview the generated FDT using `dtc`,

  # dtc /sys/kernel/debug/kho/out/fdt
  # dtc /sys/kernel/debug/kho/out/sub_fdts/memblock

`dtc` is available on ubuntu by `sudo apt-get install device-tree-compiler`.

Now kexec into the new kernel,

  # kexec -l Image --initrd=initrd -s
  # kexec -e

(The order of KHO finalization and "kexec -l" does not matter.)

The new kernel will boot up and contain the previous kernel's reserve_mem
contents at the same physical address as the first kernel.

You can also review the FDT passed from the old kernel,

  # dtc /sys/kernel/debug/kho/in/fdt
  # dtc /sys/kernel/debug/kho/in/sub_fdts/memblock


This patch (of 17):

To denote areas that were reserved for kernel use either directly with
memblock_reserve_kern() or via memblock allocations.

Link: https://lore.kernel.org/lkml/20250424083258.2228122-1-changyuanl@google.com/
Link: https://lore.kernel.org/lkml/aAeaJ2iqkrv_ffhT@kernel.org/
Link: https://lore.kernel.org/lkml/35c58191-f774-40cf-8d66-d1e2aaf11a62@intel.com/
Link: https://lore.kernel.org/lkml/20250424093302.3894961-1-arnd@kernel.org/
Link: https://lkml.kernel.org/r/20250509074635.3187114-1-changyuanl@google.com
Link: https://lkml.kernel.org/r/20250509074635.3187114-2-changyuanl@google.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Co-developed-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Signed-off-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Ben Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Gowans &lt;jgowans@amazon.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Cc: Marc Rutland &lt;mark.rutland@arm.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pratyush Yadav &lt;ptyadav@amazon.de&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Stanislav Kinsburskii &lt;skinsburskii@linux.microsoft.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Lendacky &lt;thomas.lendacky@amd.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock tests: add test for memblock_set_node</title>
<updated>2025-04-07T06:28:01+00:00</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@gmail.com</email>
</author>
<published>2025-03-18T07:19:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3b394dff15e14550a26b133fc7b556b5b526f6a5'/>
<id>3b394dff15e14550a26b133fc7b556b5b526f6a5</id>
<content type='text'>
Add a test to check memblock_set_node() behavior.

And create a corner case in which the memblock.reserved array is doubled
during memblock_set_node(). And finally make sure all regions in
memblock.reserved are with valid node id.

Signed-off-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
CC: Mike Rapoport &lt;rppt@kernel.org&gt;
CC: Yajun Deng &lt;yajun.deng@linux.dev&gt;
Link: https://lore.kernel.org/r/20250318071948.23854-4-richard.weiyang@gmail.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a test to check memblock_set_node() behavior.

And create a corner case in which the memblock.reserved array is doubled
during memblock_set_node(). And finally make sure all regions in
memblock.reserved are with valid node id.

Signed-off-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
CC: Mike Rapoport &lt;rppt@kernel.org&gt;
CC: Yajun Deng &lt;yajun.deng@linux.dev&gt;
Link: https://lore.kernel.org/r/20250318071948.23854-4-richard.weiyang@gmail.com
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memblock tests: Fix mutex related build error</title>
<updated>2025-04-07T06:28:01+00:00</updated>
<author>
<name>Masami Hiramatsu (Google)</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2025-04-07T01:43:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ed471e1984939a500eea179bc16e1c2aadf00db5'/>
<id>ed471e1984939a500eea179bc16e1c2aadf00db5</id>
<content type='text'>
Fix mutex and free_reserved_area() related build errors which have
been introduced by commit 74e2498ccf7b ("mm/memblock: Add reserved
memory release function").

Fixes: 74e2498ccf7b ("mm/memblock: Add reserved memory release function")
Reported-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Closes: https://lore.kernel.org/all/20250405023018.g2ae52nrz2757b3n@master/
Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Link: https://lore.kernel.org/r/174399023133.47537.7375975856054461445.stgit@devnote2
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix mutex and free_reserved_area() related build errors which have
been introduced by commit 74e2498ccf7b ("mm/memblock: Add reserved
memory release function").

Fixes: 74e2498ccf7b ("mm/memblock: Add reserved memory release function")
Reported-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Closes: https://lore.kernel.org/all/20250405023018.g2ae52nrz2757b3n@master/
Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Link: https://lore.kernel.org/r/174399023133.47537.7375975856054461445.stgit@devnote2
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
