<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/x86/kernel/e820.c, branch linux-rolling-stable</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible</title>
<updated>2025-11-12T23:29:38+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2025-11-12T17:39:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6276c67f2bc4aeaf350a7cf889c33c38b3330ea9'/>
<id>6276c67f2bc4aeaf350a7cf889c33c38b3330ea9</id>
<content type='text'>
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(),
and use the helper macro to export symbols for KVM throughout x86 if and
only if KVM will build one or more modules, and only for those modules.

To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be
built (because no vendor modules are selected), let arch code #define
EXPORT_SYMBOL_FOR_KVM to suppress/override the exports.

Note, the set of symbols to restrict to KVM was generated by manual search
and audit; any "misses" are due to human error, not some grand plan.

Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Tested-by: Kai Huang &lt;kai.huang@intel.com&gt;
Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(),
and use the helper macro to export symbols for KVM throughout x86 if and
only if KVM will build one or more modules, and only for those modules.

To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be
built (because no vendor modules are selected), let arch code #define
EXPORT_SYMBOL_FOR_KVM to suppress/override the exports.

Note, the set of symbols to restrict to KVM was generated by manual search
and audit; any "misses" are due to human error, not some grand plan.

Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Tested-by: Kai Huang &lt;kai.huang@intel.com&gt;
Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/e820: temporarily enable KHO scratch for memory below 1M</title>
<updated>2025-05-13T06:50:41+00:00</updated>
<author>
<name>Alexander Graf</name>
<email>graf@amazon.com</email>
</author>
<published>2025-05-09T07:46:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2daf83e10378ff4ef61f75da710cac9b84e3eaa'/>
<id>a2daf83e10378ff4ef61f75da710cac9b84e3eaa</id>
<content type='text'>
KHO kernels are special and use only scratch memory for memblock
allocations, but memory below 1M is ignored by kernel after early boot and
cannot be naturally marked as scratch.

To allow allocation of the real-mode trampoline and a few (if any) other
very early allocations from below 1M forcibly mark the memory below 1M as
scratch.

After real mode trampoline is allocated, clear that scratch marking.

Link: https://lkml.kernel.org/r/20250509074635.3187114-13-changyuanl@google.com
Signed-off-by: Alexander Graf &lt;graf@amazon.com&gt;
Co-developed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Co-developed-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Signed-off-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Ben Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Gowans &lt;jgowans@amazon.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Cc: Marc Rutland &lt;mark.rutland@arm.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pratyush Yadav &lt;ptyadav@amazon.de&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Stanislav Kinsburskii &lt;skinsburskii@linux.microsoft.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Lendacky &lt;thomas.lendacky@amd.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
KHO kernels are special and use only scratch memory for memblock
allocations, but memory below 1M is ignored by kernel after early boot and
cannot be naturally marked as scratch.

To allow allocation of the real-mode trampoline and a few (if any) other
very early allocations from below 1M forcibly mark the memory below 1M as
scratch.

After real mode trampoline is allocated, clear that scratch marking.

Link: https://lkml.kernel.org/r/20250509074635.3187114-13-changyuanl@google.com
Signed-off-by: Alexander Graf &lt;graf@amazon.com&gt;
Co-developed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Co-developed-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Signed-off-by: Changyuan Lyu &lt;changyuanl@google.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Ben Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Gowans &lt;jgowans@amazon.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Cc: Marc Rutland &lt;mark.rutland@arm.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pratyush Yadav &lt;ptyadav@amazon.de&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Stanislav Kinsburskii &lt;skinsburskii@linux.microsoft.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Lendacky &lt;thomas.lendacky@amd.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/e820: Discard high memory that can't be addressed by 32-bit systems</title>
<updated>2025-04-19T14:48:18+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-04-13T08:08:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=83b2d345e1786fdab96fc2b52942eebde125e7cd'/>
<id>83b2d345e1786fdab96fc2b52942eebde125e7cd</id>
<content type='text'>
Dave Hansen reports the following crash on a 32-bit system with
CONFIG_HIGHMEM=y and CONFIG_X86_PAE=y:

  &gt; 0xf75fe000 is the mem_map[] entry for the first page &gt;4GB. It
  &gt; obviously wasn't allocated, thus the oops.

  BUG: unable to handle page fault for address: f75fe000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  *pdpt = 0000000002da2001 *pde = 000000000300c067 *pte = 0000000000000000
  Oops: Oops: 0002 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-00288-ge618ee89561b-dirty #311 PREEMPT(undef)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
  EIP: __free_pages_core+0x3c/0x74
  ...
  Call Trace:
   memblock_free_pages+0x11/0x2c
   memblock_free_all+0x2ce/0x3a0
   mm_core_init+0xf5/0x320
   start_kernel+0x296/0x79c
   i386_start_kernel+0xad/0xb0
   startup_32_smp+0x151/0x154

The mem_map[] is allocated up to the end of ZONE_HIGHMEM which is defined
by max_pfn.

The bug was introduced by this recent commit:

  6faea3422e3b ("arch, mm: streamline HIGHMEM freeing")

Previously, freeing of high memory was also clamped to the end of
ZONE_HIGHMEM but after this change, memblock_free_all() tries to
free memory above the of ZONE_HIGHMEM as well and that causes
access to mem_map[] entries beyond the end of the memory map.

To fix this, discard the memory after max_pfn from memblock on
32-bit systems so that core MM would be aware only of actually
usable memory.

Fixes: 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing")
Reported-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Tested-by: Arnd Bergmann &lt;arnd@kernel.org&gt;
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andy Shevchenko &lt;andy@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Davide Ciminaghi &lt;ciminaghi@gnudd.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20250413080858.743221-1-rppt@kernel.org # discussion and submission
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dave Hansen reports the following crash on a 32-bit system with
CONFIG_HIGHMEM=y and CONFIG_X86_PAE=y:

  &gt; 0xf75fe000 is the mem_map[] entry for the first page &gt;4GB. It
  &gt; obviously wasn't allocated, thus the oops.

  BUG: unable to handle page fault for address: f75fe000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  *pdpt = 0000000002da2001 *pde = 000000000300c067 *pte = 0000000000000000
  Oops: Oops: 0002 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-00288-ge618ee89561b-dirty #311 PREEMPT(undef)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
  EIP: __free_pages_core+0x3c/0x74
  ...
  Call Trace:
   memblock_free_pages+0x11/0x2c
   memblock_free_all+0x2ce/0x3a0
   mm_core_init+0xf5/0x320
   start_kernel+0x296/0x79c
   i386_start_kernel+0xad/0xb0
   startup_32_smp+0x151/0x154

The mem_map[] is allocated up to the end of ZONE_HIGHMEM which is defined
by max_pfn.

The bug was introduced by this recent commit:

  6faea3422e3b ("arch, mm: streamline HIGHMEM freeing")

Previously, freeing of high memory was also clamped to the end of
ZONE_HIGHMEM but after this change, memblock_free_all() tries to
free memory above the of ZONE_HIGHMEM as well and that causes
access to mem_map[] entries beyond the end of the memory map.

To fix this, discard the memory after max_pfn from memblock on
32-bit systems so that core MM would be aware only of actually
usable memory.

Fixes: 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing")
Reported-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Tested-by: Arnd Bergmann &lt;arnd@kernel.org&gt;
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andy Shevchenko &lt;andy@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Davide Ciminaghi &lt;ciminaghi@gnudd.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20250413080858.743221-1-rppt@kernel.org # discussion and submission
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/e820: Fix handling of subpage regions when calculating nosave ranges in e820__register_nosave_regions()</title>
<updated>2025-04-07T17:20:08+00:00</updated>
<author>
<name>Myrrh Periwinkle</name>
<email>myrrhperiwinkle@qtmlabs.xyz</email>
</author>
<published>2025-04-06T04:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f2f29da9f0d4367f6ff35e0d9d021257bb53e273'/>
<id>f2f29da9f0d4367f6ff35e0d9d021257bb53e273</id>
<content type='text'>
While debugging kexec/hibernation hangs and crashes, it turned out that
the current implementation of e820__register_nosave_regions() suffers from
multiple serious issues:

 - The end of last region is tracked by PFN, causing it to find holes
   that aren't there if two consecutive subpage regions are present

 - The nosave PFN ranges derived from holes are rounded out (instead of
   rounded in) which makes it inconsistent with how explicitly reserved
   regions are handled

Fix this by:

 - Treating reserved regions as if they were holes, to ensure consistent
   handling (rounding out nosave PFN ranges is more correct as the
   kernel does not use partial pages)

 - Tracking the end of the last RAM region by address instead of pages
   to detect holes more precisely

These bugs appear to have been introduced about ~18 years ago with the very
first version of e820_mark_nosave_regions(), and its flawed assumptions were
carried forward uninterrupted through various waves of rewrites and renames.

[ mingo: Added Git archeology details, for kicks and giggles. ]

Fixes: e8eff5ac294e ("[PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64")
Reported-by: Roberto Ricci &lt;io@r-ricci.it&gt;
Tested-by: Roberto Ricci &lt;io@r-ricci.it&gt;
Signed-off-by: Myrrh Periwinkle &lt;myrrhperiwinkle@qtmlabs.xyz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Cc: Len Brown &lt;len.brown@intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250406-fix-e820-nosave-v3-1-f3787bc1ee1d@qtmlabs.xyz
Closes: https://lore.kernel.org/all/Z4WFjBVHpndct7br@desktop0a/
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While debugging kexec/hibernation hangs and crashes, it turned out that
the current implementation of e820__register_nosave_regions() suffers from
multiple serious issues:

 - The end of last region is tracked by PFN, causing it to find holes
   that aren't there if two consecutive subpage regions are present

 - The nosave PFN ranges derived from holes are rounded out (instead of
   rounded in) which makes it inconsistent with how explicitly reserved
   regions are handled

Fix this by:

 - Treating reserved regions as if they were holes, to ensure consistent
   handling (rounding out nosave PFN ranges is more correct as the
   kernel does not use partial pages)

 - Tracking the end of the last RAM region by address instead of pages
   to detect holes more precisely

These bugs appear to have been introduced about ~18 years ago with the very
first version of e820_mark_nosave_regions(), and its flawed assumptions were
carried forward uninterrupted through various waves of rewrites and renames.

[ mingo: Added Git archeology details, for kicks and giggles. ]

Fixes: e8eff5ac294e ("[PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64")
Reported-by: Roberto Ricci &lt;io@r-ricci.it&gt;
Tested-by: Roberto Ricci &lt;io@r-ricci.it&gt;
Signed-off-by: Myrrh Periwinkle &lt;myrrhperiwinkle@qtmlabs.xyz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Cc: Len Brown &lt;len.brown@intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250406-fix-e820-nosave-v3-1-f3787bc1ee1d@qtmlabs.xyz
Closes: https://lore.kernel.org/all/Z4WFjBVHpndct7br@desktop0a/
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/kexec: Export e820_table_kexec[] to sysfs</title>
<updated>2025-02-22T11:44:45+00:00</updated>
<author>
<name>Dave Young</name>
<email>dyoung@redhat.com</email>
</author>
<published>2025-01-28T13:32:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2498e5c453b3d8d054d77751487cd593332f8c2'/>
<id>a2498e5c453b3d8d054d77751487cd593332f8c2</id>
<content type='text'>
Previously the e820_table_kexec[] was exported to sysfs since kexec-tools uses
the memmap entries to prepare the e820 table for the new kernel.

The following commit, ~8 years ago, introduced e820_table_firmware[] and changed
the behavior to export the firmware table instead:

  12df216c61c8 ("x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table")

Originally the kexec_file_load and kexec_load syscalls both used e820_table_kexec[].
Since the sysfs exported entries are from e820_table_firmware[] people
now need to tune both tables for kexec.

Restore the old behavior so the kexec_load and kexec_file_load syscalls work with
only one table update.  The e820_table_firmware[] is used by hibernation kernel
code and it works without the sysfs exporting. Also remove the SEV
e820_table_firmware[] updating code.

Also update the code comments here and drop the comments about setup_data
reservation since it is not needed any more after this change was made
a year ago:

  fc7f27cda843 ("x86/kexec: Do not update E820 kexec table for setup_data")

[ mingo: Tidy up the changelog and comments. ]

Signed-off-by: Dave Young &lt;dyoung@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: Joerg Roedel &lt;jroedel@suse.de&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Link: https://lore.kernel.org/r/Z5jcb1GKhLvH8kDc@darkstar.users.ipa.redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously the e820_table_kexec[] was exported to sysfs since kexec-tools uses
the memmap entries to prepare the e820 table for the new kernel.

The following commit, ~8 years ago, introduced e820_table_firmware[] and changed
the behavior to export the firmware table instead:

  12df216c61c8 ("x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table")

Originally the kexec_file_load and kexec_load syscalls both used e820_table_kexec[].
Since the sysfs exported entries are from e820_table_firmware[] people
now need to tune both tables for kexec.

Restore the old behavior so the kexec_load and kexec_file_load syscalls work with
only one table update.  The e820_table_firmware[] is used by hibernation kernel
code and it works without the sysfs exporting. Also remove the SEV
e820_table_firmware[] updating code.

Also update the code comments here and drop the comments about setup_data
reservation since it is not needed any more after this change was made
a year ago:

  fc7f27cda843 ("x86/kexec: Do not update E820 kexec table for setup_data")

[ mingo: Tidy up the changelog and comments. ]

Signed-off-by: Dave Young &lt;dyoung@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Cc: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: Joerg Roedel &lt;jroedel@suse.de&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Link: https://lore.kernel.org/r/Z5jcb1GKhLvH8kDc@darkstar.users.ipa.redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code</title>
<updated>2025-02-21T15:05:00+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-02-14T09:06:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=efe659ac014647eb048d62475e55cce42d8951d7'/>
<id>efe659ac014647eb048d62475e55cce42d8951d7</id>
<content type='text'>
E820_TYPE_RESERVED_KERN is a relict from the ancient history that was used
to early reserve setup_data, see:

  28bb22379513 ("x86: move reserve_setup_data to setup.c")

Nowadays setup_data is anyway reserved in memblock and there is no point in
carrying E820_TYPE_RESERVED_KERN that behaves exactly like E820_TYPE_RAM
but only complicates the code.

A bonus for removing E820_TYPE_RESERVED_KERN is a small but measurable
speedup of 20 microseconds in init_mem_mappings() on a VM with 32GB or RAM.

Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20250214090651.3331663-5-rppt@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
E820_TYPE_RESERVED_KERN is a relict from the ancient history that was used
to early reserve setup_data, see:

  28bb22379513 ("x86: move reserve_setup_data to setup.c")

Nowadays setup_data is anyway reserved in memblock and there is no point in
carrying E820_TYPE_RESERVED_KERN that behaves exactly like E820_TYPE_RAM
but only complicates the code.

A bonus for removing E820_TYPE_RESERVED_KERN is a small but measurable
speedup of 20 microseconds in init_mem_mappings() on a VM with 32GB or RAM.

Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20250214090651.3331663-5-rppt@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/boot: Move setting of memblock parameters to e820__memblock_setup()</title>
<updated>2025-02-21T15:05:00+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2025-02-14T09:06:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2d6bff31399b2443aba230cd20f120acb2d4eeb1'/>
<id>2d6bff31399b2443aba230cd20f120acb2d4eeb1</id>
<content type='text'>
Changing memblock parameters, namely bottom_up and allocation upper
limit does not have any effect before memblock initialization in
e820__memblock_setup().

Move the calls to memblock_set_bottom_up() and memblock_set_current_limit()
to e820__memblock_setup() to group all the memblock initial setup and make
setup_arch() more readable.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20250214090651.3331663-2-rppt@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Changing memblock parameters, namely bottom_up and allocation upper
limit does not have any effect before memblock initialization in
e820__memblock_setup().

Move the calls to memblock_set_bottom_up() and memblock_set_current_limit()
to e820__memblock_setup() to group all the memblock initial setup and make
setup_arch() more readable.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20250214090651.3331663-2-rppt@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memblock: add memblock_alloc_or_panic interface</title>
<updated>2025-01-26T04:22:38+00:00</updated>
<author>
<name>Guo Weikang</name>
<email>guoweikang.kernel@gmail.com</email>
</author>
<published>2025-01-02T07:25:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c6f239796b55dbc4225a6fca9f96232092b9df83'/>
<id>c6f239796b55dbc4225a6fca9f96232092b9df83</id>
<content type='text'>
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory.  In most cases, when memory allocation fails, an
immediate panic is required.  To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`.  This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.

[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
  Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
  Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang &lt;guoweikang.kernel@gmail.com&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;	[m68k]
Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;	[s390]
Acked-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory.  In most cases, when memory allocation fails, an
immediate panic is required.  To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`.  This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.

[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
  Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
  Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang &lt;guoweikang.kernel@gmail.com&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;	[m68k]
Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;	[s390]
Acked-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges</title>
<updated>2024-06-17T15:46:08+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2024-06-14T09:58:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=06fa48d85b09b3e67afeda220bc19f7102b53beb'/>
<id>06fa48d85b09b3e67afeda220bc19f7102b53beb</id>
<content type='text'>
e820__end_of_ram_pfn() is used to calculate max_pfn which, among other things,
guides where direct mapping ends. Any memory above max_pfn is not going to be
present in the direct mapping.

e820__end_of_ram_pfn() finds the end of the RAM based on the highest
E820_TYPE_RAM range. But it doesn't includes E820_TYPE_ACPI ranges into
calculation.

Despite the name, E820_TYPE_ACPI covers not only ACPI data, but also EFI tables
and might be required by kernel to function properly.

Usually the problem is hidden because there is some E820_TYPE_RAM memory above
E820_TYPE_ACPI. But crashkernel only presents pre-allocated crash memory as
E820_TYPE_RAM on boot. If the pre-allocated range is small, it can fit under the
last E820_TYPE_ACPI range.

Modify e820__end_of_ram_pfn() and e820__end_of_low_ram_pfn() to cover
E820_TYPE_ACPI memory.

The problem was discovered during debugging kexec for TDX guest. TDX guest uses
E820_TYPE_ACPI to store the unaccepted memory bitmap and pass it between the
kernels on kexec.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Tested-by: Tao Liu &lt;ltao@redhat.com&gt;
Link: https://lore.kernel.org/r/20240614095904.1345461-13-kirill.shutemov@linux.intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
e820__end_of_ram_pfn() is used to calculate max_pfn which, among other things,
guides where direct mapping ends. Any memory above max_pfn is not going to be
present in the direct mapping.

e820__end_of_ram_pfn() finds the end of the RAM based on the highest
E820_TYPE_RAM range. But it doesn't includes E820_TYPE_ACPI ranges into
calculation.

Despite the name, E820_TYPE_ACPI covers not only ACPI data, but also EFI tables
and might be required by kernel to function properly.

Usually the problem is hidden because there is some E820_TYPE_RAM memory above
E820_TYPE_ACPI. But crashkernel only presents pre-allocated crash memory as
E820_TYPE_RAM on boot. If the pre-allocated range is small, it can fit under the
last E820_TYPE_ACPI range.

Modify e820__end_of_ram_pfn() and e820__end_of_low_ram_pfn() to cover
E820_TYPE_ACPI memory.

The problem was discovered during debugging kexec for TDX guest. TDX guest uses
E820_TYPE_ACPI to store the unaccepted memory bitmap and pass it between the
kernels on kexec.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Tested-by: Tao Liu &lt;ltao@redhat.com&gt;
Link: https://lore.kernel.org/r/20240614095904.1345461-13-kirill.shutemov@linux.intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/e820: Add a new e820 table update helper</title>
<updated>2024-04-29T09:15:31+00:00</updated>
<author>
<name>Ashish Kalra</name>
<email>ashish.kalra@amd.com</email>
</author>
<published>2024-04-26T00:41:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d6d85ac15cce4dcf02cf8c96cb970562be6a3529'/>
<id>d6d85ac15cce4dcf02cf8c96cb970562be6a3529</id>
<content type='text'>
Add a new API helper e820__range_update_table() with which to update an
arbitrary e820 table. Move all current users of
e820__range_update_kexec() to this new helper.

  [ bp: Massage. ]

Signed-off-by: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/b726af213ad55053f8a7a1e793b01bb3f1ca9dd5.1714090302.git.ashish.kalra@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new API helper e820__range_update_table() with which to update an
arbitrary e820 table. Move all current users of
e820__range_update_kexec() to this new helper.

  [ bp: Massage. ]

Signed-off-by: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/r/b726af213ad55053f8a7a1e793b01bb3f1ca9dd5.1714090302.git.ashish.kalra@amd.com
</pre>
</div>
</content>
</entry>
</feed>
