<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel, branch v6.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux</title>
<updated>2024-09-09T16:31:55+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-09T16:31:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fb92a1ffc121e65ffed13c6bfe01c190487d791e'/>
<id>fb92a1ffc121e65ffed13c6bfe01c190487d791e</id>
<content type='text'>
Pull hyperv fixes from Wei Liu:

 - Add a documentation overview of Confidential Computing VM support
   (Michael Kelley)

 - Use lapic timer in a TDX VM without paravisor (Dexuan Cui)

 - Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
   (Michael Kelley)

 - Fix a kexec crash due to VP assist page corruption (Anirudh
   Rayabharam)

 - Python3 compatibility fix for lsvmbus (Anthony Nandaa)

 - Misc fixes (Rachel Menge, Roman Kisel, zhang jiao, Hongbo Li)

* tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv: vmbus: Constify struct kobj_type and struct attribute_group
  tools: hv: rm .*.cmd when make clean
  x86/hyperv: fix kexec crash due to VP assist page corruption
  Drivers: hv: vmbus: Fix the misplaced function description
  tools: hv: lsvmbus: change shebang to use python3
  x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
  Documentation: hyperv: Add overview of Confidential Computing VM support
  clocksource: hyper-v: Use lapic timer in a TDX VM without paravisor
  Drivers: hv: Remove deprecated hv_fcopy declarations
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull hyperv fixes from Wei Liu:

 - Add a documentation overview of Confidential Computing VM support
   (Michael Kelley)

 - Use lapic timer in a TDX VM without paravisor (Dexuan Cui)

 - Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
   (Michael Kelley)

 - Fix a kexec crash due to VP assist page corruption (Anirudh
   Rayabharam)

 - Python3 compatibility fix for lsvmbus (Anthony Nandaa)

 - Misc fixes (Rachel Menge, Roman Kisel, zhang jiao, Hongbo Li)

* tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv: vmbus: Constify struct kobj_type and struct attribute_group
  tools: hv: rm .*.cmd when make clean
  x86/hyperv: fix kexec crash due to VP assist page corruption
  Drivers: hv: vmbus: Fix the misplaced function description
  tools: hv: lsvmbus: change shebang to use python3
  x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
  Documentation: hyperv: Add overview of Confidential Computing VM support
  clocksource: hyper-v: Use lapic timer in a TDX VM without paravisor
  Drivers: hv: Remove deprecated hv_fcopy declarations
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/hyperv: fix kexec crash due to VP assist page corruption</title>
<updated>2024-09-05T07:21:37+00:00</updated>
<author>
<name>Anirudh Rayabharam (Microsoft)</name>
<email>anirudh@anirudhrb.com</email>
</author>
<published>2024-08-28T11:21:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b9af6418279c4cf73ca073f8ea024992b38be8ab'/>
<id>b9af6418279c4cf73ca073f8ea024992b38be8ab</id>
<content type='text'>
commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when
CPUs go online/offline") introduces a new cpuhp state for hyperv
initialization.

cpuhp_setup_state() returns the state number if state is
CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states.
For the hyperv case, since a new cpuhp state was introduced it would
return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call
is conditioned upon "hyperv_init_cpuhp &gt; 0". This will never be true and
so hv_cpu_die() won't be called on all CPUs. This means the VP assist page
won't be reset. When the kexec kernel tries to setup the VP assist page
again, the hypervisor corrupts the memory region of the old VP assist page
causing a panic in case the kexec kernel is using that memory elsewhere.
This was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix kexec
panic/hang issues").

Get rid of hyperv_init_cpuhp entirely since we are no longer using a
dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with
cpuhp_remove_state().

Cc: stable@vger.kernel.org
Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline")
Signed-off-by: Anirudh Rayabharam (Microsoft) &lt;anirudh@anirudhrb.com&gt;
Reviewed-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Reviewed-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Link: https://lore.kernel.org/r/20240828112158.3538342-1-anirudh@anirudhrb.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240828112158.3538342-1-anirudh@anirudhrb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when
CPUs go online/offline") introduces a new cpuhp state for hyperv
initialization.

cpuhp_setup_state() returns the state number if state is
CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states.
For the hyperv case, since a new cpuhp state was introduced it would
return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call
is conditioned upon "hyperv_init_cpuhp &gt; 0". This will never be true and
so hv_cpu_die() won't be called on all CPUs. This means the VP assist page
won't be reset. When the kexec kernel tries to setup the VP assist page
again, the hypervisor corrupts the memory region of the old VP assist page
causing a panic in case the kexec kernel is using that memory elsewhere.
This was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix kexec
panic/hang issues").

Get rid of hyperv_init_cpuhp entirely since we are no longer using a
dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with
cpuhp_remove_state().

Cc: stable@vger.kernel.org
Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline")
Signed-off-by: Anirudh Rayabharam (Microsoft) &lt;anirudh@anirudhrb.com&gt;
Reviewed-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Reviewed-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Link: https://lore.kernel.org/r/20240828112158.3538342-1-anirudh@anirudhrb.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240828112158.3538342-1-anirudh@anirudhrb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Fix arch_mbm_* array overrun on SNC</title>
<updated>2024-08-28T09:13:08+00:00</updated>
<author>
<name>Peter Newman</name>
<email>peternewman@google.com</email>
</author>
<published>2024-08-22T19:02:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a547a5880cba6f287179135381f1b484b251be31'/>
<id>a547a5880cba6f287179135381f1b484b251be31</id>
<content type='text'>
When using resctrl on systems with Sub-NUMA Clustering enabled, monitoring
groups may be allocated RMID values which would overrun the
arch_mbm_{local,total} arrays.

This is due to inconsistencies in whether the SNC-adjusted num_rmid value or
the unadjusted value in resctrl_arch_system_num_rmid_idx() is used. The
num_rmid value for the L3 resource is currently:

  resctrl_arch_system_num_rmid_idx() / snc_nodes_per_l3_cache

As a simple fix, make resctrl_arch_system_num_rmid_idx() return the
SNC-adjusted, L3 num_rmid value on x86.

Fixes: e13db55b5a0d ("x86/resctrl: Introduce snc_nodes_per_l3_cache")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Link: https://lore.kernel.org/r/20240822190212.1848788-1-peternewman@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When using resctrl on systems with Sub-NUMA Clustering enabled, monitoring
groups may be allocated RMID values which would overrun the
arch_mbm_{local,total} arrays.

This is due to inconsistencies in whether the SNC-adjusted num_rmid value or
the unadjusted value in resctrl_arch_system_num_rmid_idx() is used. The
num_rmid value for the L3 resource is currently:

  resctrl_arch_system_num_rmid_idx() / snc_nodes_per_l3_cache

As a simple fix, make resctrl_arch_system_num_rmid_idx() return the
SNC-adjusted, L3 num_rmid value on x86.

Fixes: e13db55b5a0d ("x86/resctrl: Introduce snc_nodes_per_l3_cache")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Link: https://lore.kernel.org/r/20240822190212.1848788-1-peternewman@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/fpu: Avoid writing LBR bit to IA32_XSS unless supported</title>
<updated>2024-08-14T17:40:20+00:00</updated>
<author>
<name>Mitchell Levy</name>
<email>levymitchell0@gmail.com</email>
</author>
<published>2024-08-12T20:44:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2848ff28d180bd63a95da8e5dcbcdd76c1beeb7b'/>
<id>2848ff28d180bd63a95da8e5dcbcdd76c1beeb7b</id>
<content type='text'>
There are two distinct CPU features related to the use of XSAVES and LBR:
whether LBR is itself supported and whether XSAVES supports LBR. The LBR
subsystem correctly checks both in intel_pmu_arch_lbr_init(), but the
XSTATE subsystem does not.

The LBR bit is only removed from xfeatures_mask_independent when LBR is not
supported by the CPU, but there is no validation of XSTATE support.

If XSAVES does not support LBR the write to IA32_XSS causes a #GP fault,
leaving the state of IA32_XSS unchanged, i.e. zero. The fault is handled
with a warning and the boot continues.

Consequently the next XRSTORS which tries to restore supervisor state fails
with #GP because the RFBM has zero for all supervisor features, which does
not match the XCOMP_BV field.

As XFEATURE_MASK_FPSTATE includes supervisor features setting up the FPU
causes a #GP, which ends up in fpu_reset_from_exception_fixup(). That fails
due to the same problem resulting in recursive #GPs until the kernel runs
out of stack space and double faults.

Prevent this by storing the supported independent features in
fpu_kernel_cfg during XSTATE initialization and use that cached value for
retrieving the independent feature bits to be written into IA32_XSS.

[ tglx: Massaged change log ]

Fixes: f0dccc9da4c0 ("x86/fpu/xstate: Support dynamic supervisor feature for LBR")
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Mitchell Levy &lt;levymitchell0@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240812-xsave-lbr-fix-v3-1-95bac1bf62f4@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are two distinct CPU features related to the use of XSAVES and LBR:
whether LBR is itself supported and whether XSAVES supports LBR. The LBR
subsystem correctly checks both in intel_pmu_arch_lbr_init(), but the
XSTATE subsystem does not.

The LBR bit is only removed from xfeatures_mask_independent when LBR is not
supported by the CPU, but there is no validation of XSTATE support.

If XSAVES does not support LBR the write to IA32_XSS causes a #GP fault,
leaving the state of IA32_XSS unchanged, i.e. zero. The fault is handled
with a warning and the boot continues.

Consequently the next XRSTORS which tries to restore supervisor state fails
with #GP because the RFBM has zero for all supervisor features, which does
not match the XCOMP_BV field.

As XFEATURE_MASK_FPSTATE includes supervisor features setting up the FPU
causes a #GP, which ends up in fpu_reset_from_exception_fixup(). That fails
due to the same problem resulting in recursive #GPs until the kernel runs
out of stack space and double faults.

Prevent this by storing the supported independent features in
fpu_kernel_cfg during XSTATE initialization and use that cached value for
retrieving the independent feature bits to be written into IA32_XSS.

[ tglx: Massaged change log ]

Fixes: f0dccc9da4c0 ("x86/fpu/xstate: Support dynamic supervisor feature for LBR")
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Mitchell Levy &lt;levymitchell0@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240812-xsave-lbr-fix-v3-1-95bac1bf62f4@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/apic: Make x2apic_disable() work correctly</title>
<updated>2024-08-13T13:15:19+00:00</updated>
<author>
<name>Yuntao Wang</name>
<email>yuntao.wang@linux.dev</email>
</author>
<published>2024-08-13T01:48:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0ecc5be200c84e67114f3640064ba2bae3ba2f5a'/>
<id>0ecc5be200c84e67114f3640064ba2bae3ba2f5a</id>
<content type='text'>
x2apic_disable() clears x2apic_state and x2apic_mode unconditionally, even
when the state is X2APIC_ON_LOCKED, which prevents the kernel to disable
it thereby creating inconsistent state.

Due to the early state check for X2APIC_ON, the code path which warns about
a locked X2APIC cannot be reached.

Test for state &lt; X2APIC_ON instead and move the clearing of the state and
mode variables to the place which actually disables X2APIC.

[ tglx: Massaged change log. Added Fixes tag. Moved clearing so it's at the
  	right place for back ports ]

Fixes: a57e456a7b28 ("x86/apic: Fix fallout from x2apic cleanup")
Signed-off-by: Yuntao Wang &lt;yuntao.wang@linux.dev&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240813014827.895381-1-yuntao.wang@linux.dev

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
x2apic_disable() clears x2apic_state and x2apic_mode unconditionally, even
when the state is X2APIC_ON_LOCKED, which prevents the kernel to disable
it thereby creating inconsistent state.

Due to the early state check for X2APIC_ON, the code path which warns about
a locked X2APIC cannot be reached.

Test for state &lt; X2APIC_ON instead and move the clearing of the state and
mode variables to the place which actually disables X2APIC.

[ tglx: Massaged change log. Added Fixes tag. Moved clearing so it's at the
  	right place for back ports ]

Fixes: a57e456a7b28 ("x86/apic: Fix fallout from x2apic cleanup")
Signed-off-by: Yuntao Wang &lt;yuntao.wang@linux.dev&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240813014827.895381-1-yuntao.wang@linux.dev

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mtrr: Check if fixed MTRRs exist before saving them</title>
<updated>2024-08-08T15:03:12+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2024-08-08T00:02:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=919f18f961c03d6694aa726c514184f2311a4614'/>
<id>919f18f961c03d6694aa726c514184f2311a4614</id>
<content type='text'>
MTRRs have an obsolete fixed variant for fine grained caching control
of the 640K-1MB region that uses separate MSRs. This fixed variant has
a separate capability bit in the MTRR capability MSR.

So far all x86 CPUs which support MTRR have this separate bit set, so it
went unnoticed that mtrr_save_state() does not check the capability bit
before accessing the fixed MTRR MSRs.

Though on a CPU that does not support the fixed MTRR capability this
results in a #GP.  The #GP itself is harmless because the RDMSR fault is
handled gracefully, but results in a WARN_ON().

Add the missing capability check to prevent this.

Fixes: 2b1f6278d77c ("[PATCH] x86: Save the MTRRs of the BSP before booting an AP")
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240808000244.946864-1-ak@linux.intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MTRRs have an obsolete fixed variant for fine grained caching control
of the 640K-1MB region that uses separate MSRs. This fixed variant has
a separate capability bit in the MTRR capability MSR.

So far all x86 CPUs which support MTRR have this separate bit set, so it
went unnoticed that mtrr_save_state() does not check the capability bit
before accessing the fixed MTRR MSRs.

Though on a CPU that does not support the fixed MTRR capability this
results in a #GP.  The #GP itself is harmless because the RDMSR fault is
handled gracefully, but results in a WARN_ON().

Add the missing capability check to prevent this.

Fixes: 2b1f6278d77c ("[PATCH] x86: Save the MTRRs of the BSP before booting an AP")
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240808000244.946864-1-ak@linux.intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/paravirt: Fix incorrect virt spinlock setting on bare metal</title>
<updated>2024-08-07T18:04:38+00:00</updated>
<author>
<name>Chen Yu</name>
<email>yu.c.chen@intel.com</email>
</author>
<published>2024-08-06T11:22:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e639222a51196c69c70b49b67098ce2f9919ed08'/>
<id>e639222a51196c69c70b49b67098ce2f9919ed08</id>
<content type='text'>
The kernel can change spinlock behavior when running as a guest. But this
guest-friendly behavior causes performance problems on bare metal.

The kernel uses a static key to switch between the two modes.

In theory, the static key is enabled by default (run in guest mode) and
should be disabled for bare metal (and in some guests that want native
behavior or paravirt spinlock).

A performance drop is reported when running encode/decode workload and
BenchSEE cache sub-workload.

Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS is
disabled the virt_spin_lock_key is incorrectly set to true on bare
metal. The qspinlock degenerates to test-and-set spinlock, which decreases
the performance on bare metal.

Set the default value of virt_spin_lock_key to false. If booting in a VM,
enable this key. Later during the VM initialization, if other
high-efficient spinlock is preferred (e.g. paravirt-spinlock), or the user
wants the native qspinlock (via nopvspin boot commandline), the
virt_spin_lock_key is disabled accordingly.

This results in the following decision matrix:

X86_FEATURE_HYPERVISOR         Y    Y       Y     N
CONFIG_PARAVIRT_SPINLOCKS      Y    Y       N     Y/N
PV spinlock                    Y    N       N     Y/N

virt_spin_lock_key             N    Y/N     Y     N

Fixes: ce0a1b608bfc ("x86/paravirt: Silence unused native_pv_lock_init() function warning")
Reported-by: Prem Nath Dey &lt;prem.nath.dey@intel.com&gt;
Reported-by: Xiaoping Zhou &lt;xiaoping.zhou@intel.com&gt;
Suggested-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Suggested-by: Qiuxu Zhuo &lt;qiuxu.zhuo@intel.com&gt;
Suggested-by: Nikolay Borisov &lt;nik.borisov@suse.com&gt;
Signed-off-by: Chen Yu &lt;yu.c.chen@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Nikolay Borisov &lt;nik.borisov@suse.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240806112207.29792-1-yu.c.chen@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel can change spinlock behavior when running as a guest. But this
guest-friendly behavior causes performance problems on bare metal.

The kernel uses a static key to switch between the two modes.

In theory, the static key is enabled by default (run in guest mode) and
should be disabled for bare metal (and in some guests that want native
behavior or paravirt spinlock).

A performance drop is reported when running encode/decode workload and
BenchSEE cache sub-workload.

Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS is
disabled the virt_spin_lock_key is incorrectly set to true on bare
metal. The qspinlock degenerates to test-and-set spinlock, which decreases
the performance on bare metal.

Set the default value of virt_spin_lock_key to false. If booting in a VM,
enable this key. Later during the VM initialization, if other
high-efficient spinlock is preferred (e.g. paravirt-spinlock), or the user
wants the native qspinlock (via nopvspin boot commandline), the
virt_spin_lock_key is disabled accordingly.

This results in the following decision matrix:

X86_FEATURE_HYPERVISOR         Y    Y       Y     N
CONFIG_PARAVIRT_SPINLOCKS      Y    Y       N     Y/N
PV spinlock                    Y    N       N     Y/N

virt_spin_lock_key             N    Y/N     Y     N

Fixes: ce0a1b608bfc ("x86/paravirt: Silence unused native_pv_lock_init() function warning")
Reported-by: Prem Nath Dey &lt;prem.nath.dey@intel.com&gt;
Reported-by: Xiaoping Zhou &lt;xiaoping.zhou@intel.com&gt;
Suggested-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Suggested-by: Qiuxu Zhuo &lt;qiuxu.zhuo@intel.com&gt;
Suggested-by: Nikolay Borisov &lt;nik.borisov@suse.com&gt;
Signed-off-by: Chen Yu &lt;yu.c.chen@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Nikolay Borisov &lt;nik.borisov@suse.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240806112207.29792-1-yu.c.chen@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/acpi: Remove __ro_after_init from acpi_mp_wake_mailbox</title>
<updated>2024-08-07T18:04:38+00:00</updated>
<author>
<name>Zhiquan Li</name>
<email>zhiquan1.li@intel.com</email>
</author>
<published>2024-08-05T10:35:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab84ba647f2c94ac4d0c3fc6951c49f08aa1fcf7'/>
<id>ab84ba647f2c94ac4d0c3fc6951c49f08aa1fcf7</id>
<content type='text'>
On a platform using the "Multiprocessor Wakeup Structure"[1] to startup
secondary CPUs the control processor needs to memremap() the physical
address of the MP Wakeup Structure mailbox to the variable
acpi_mp_wake_mailbox, which holds the virtual address of mailbox.

To wake up the AP the control processor writes the APIC ID of AP, the
wakeup vector and the ACPI_MP_WAKE_COMMAND_WAKEUP command into the mailbox.

Current implementation doesn't consider the case which restricts boot time
CPU bringup to 1 with the kernel parameter "maxcpus=1" and brings other
CPUs online later from user space as it sets acpi_mp_wake_mailbox to
read-only after init.  So when the first AP is tried to brought online
after init, the attempt to update the variable results in a kernel panic.

The memremap() call that initializes the variable cannot be moved into
acpi_parse_mp_wake() because memremap() is not functional at that point in
the boot process. Also as the APs might never be brought up, keep the
memremap() call in acpi_wakeup_cpu() so that the operation only takes place
when needed.

Fixes: 24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init")
Signed-off-by: Zhiquan Li &lt;zhiquan1.li@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Link: https://lore.kernel.org/all/20240805103531.1230635-1-zhiquan1.li@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On a platform using the "Multiprocessor Wakeup Structure"[1] to startup
secondary CPUs the control processor needs to memremap() the physical
address of the MP Wakeup Structure mailbox to the variable
acpi_mp_wake_mailbox, which holds the virtual address of mailbox.

To wake up the AP the control processor writes the APIC ID of AP, the
wakeup vector and the ACPI_MP_WAKE_COMMAND_WAKEUP command into the mailbox.

Current implementation doesn't consider the case which restricts boot time
CPU bringup to 1 with the kernel parameter "maxcpus=1" and brings other
CPUs online later from user space as it sets acpi_mp_wake_mailbox to
read-only after init.  So when the first AP is tried to brought online
after init, the attempt to update the variable results in a kernel panic.

The memremap() call that initializes the variable cannot be moved into
acpi_parse_mp_wake() because memremap() is not functional at that point in
the boot process. Also as the APs might never be brought up, keep the
memremap() call in acpi_wakeup_cpu() so that the operation only takes place
when needed.

Fixes: 24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init")
Signed-off-by: Zhiquan Li &lt;zhiquan1.li@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Link: https://lore.kernel.org/all/20240805103531.1230635-1-zhiquan1.li@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency</title>
<updated>2024-08-02T23:47:30+00:00</updated>
<author>
<name>Michael Kelley</name>
<email>mhklinux@outlook.com</email>
</author>
<published>2024-06-06T02:55:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8fcc514809de41153b43ccbe1a0cdf7f72b78e7e'/>
<id>8fcc514809de41153b43ccbe1a0cdf7f72b78e7e</id>
<content type='text'>
A Linux guest on Hyper-V gets the TSC frequency from a synthetic MSR, if
available. In this case, set X86_FEATURE_TSC_KNOWN_FREQ so that Linux
doesn't unnecessarily do refined TSC calibration when setting up the TSC
clocksource.

With this change, a message such as this is no longer output during boot
when the TSC is used as the clocksource:

[    1.115141] tsc: Refined TSC clocksource calibration: 2918.408 MHz

Furthermore, the guest and host will have exactly the same view of the
TSC frequency, which is important for features such as the TSC deadline
timer that are emulated by the Hyper-V host.

Signed-off-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Reviewed-by: Roman Kisel &lt;romank@linux.microsoft.com&gt;
Link: https://lore.kernel.org/r/20240606025559.1631-1-mhklinux@outlook.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240606025559.1631-1-mhklinux@outlook.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A Linux guest on Hyper-V gets the TSC frequency from a synthetic MSR, if
available. In this case, set X86_FEATURE_TSC_KNOWN_FREQ so that Linux
doesn't unnecessarily do refined TSC calibration when setting up the TSC
clocksource.

With this change, a message such as this is no longer output during boot
when the TSC is used as the clocksource:

[    1.115141] tsc: Refined TSC clocksource calibration: 2918.408 MHz

Furthermore, the guest and host will have exactly the same view of the
TSC frequency, which is important for features such as the TSC deadline
timer that are emulated by the Hyper-V host.

Signed-off-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Reviewed-by: Roman Kisel &lt;romank@linux.microsoft.com&gt;
Link: https://lore.kernel.org/r/20240606025559.1631-1-mhklinux@outlook.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240606025559.1631-1-mhklinux@outlook.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/setup: Parse the builtin command line before merging</title>
<updated>2024-07-31T19:46:35+00:00</updated>
<author>
<name>Borislav Petkov (AMD)</name>
<email>bp@alien8.de</email>
</author>
<published>2024-07-30T14:15:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bf514327c324bc8af64f359b341cc9b189c096fd'/>
<id>bf514327c324bc8af64f359b341cc9b189c096fd</id>
<content type='text'>
Commit in Fixes was added as a catch-all for cases where the cmdline is
parsed before being merged with the builtin one.

And promptly one issue appeared, see Link below. The microcode loader
really needs to parse it that early, but the merging happens later.

Reshuffling the early boot nightmare^W code to handle that properly would
be a painful exercise for another day so do the chicken thing and parse the
builtin cmdline too before it has been merged.

Fixes: 0c40b1c7a897 ("x86/setup: Warn when option parsing is done too early")
Reported-by: Mike Lothian &lt;mike@fireburn.co.uk&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/20240730152108.GAZqkE5Dfi9AuKllRw@fat_crate.local
Link: https://lore.kernel.org/r/20240722152330.GCZp55ck8E_FT4kPnC@fat_crate.local
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit in Fixes was added as a catch-all for cases where the cmdline is
parsed before being merged with the builtin one.

And promptly one issue appeared, see Link below. The microcode loader
really needs to parse it that early, but the merging happens later.

Reshuffling the early boot nightmare^W code to handle that properly would
be a painful exercise for another day so do the chicken thing and parse the
builtin cmdline too before it has been merged.

Fixes: 0c40b1c7a897 ("x86/setup: Warn when option parsing is done too early")
Reported-by: Mike Lothian &lt;mike@fireburn.co.uk&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/20240730152108.GAZqkE5Dfi9AuKllRw@fat_crate.local
Link: https://lore.kernel.org/r/20240722152330.GCZp55ck8E_FT4kPnC@fat_crate.local
</pre>
</div>
</content>
</entry>
</feed>
