<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/cpu/cpu.h, branch v5.17</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86/cpu: Fix migration safety with X86_BUG_NULL_SEL</title>
<updated>2021-10-21T18:49:16+00:00</updated>
<author>
<name>Jane Malalane</name>
<email>jane.malalane@citrix.com</email>
</author>
<published>2021-10-21T10:47:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=415de44076640483648d6c0f6d645a9ee61328ad'/>
<id>415de44076640483648d6c0f6d645a9ee61328ad</id>
<content type='text'>
Currently, Linux probes for X86_BUG_NULL_SEL unconditionally which
makes it unsafe to migrate in a virtualised environment as the
properties across the migration pool might differ.

To be specific, the case which goes wrong is:

1. Zen1 (or earlier) and Zen2 (or later) in a migration pool
2. Linux boots on Zen2, probes and finds the absence of X86_BUG_NULL_SEL
3. Linux is then migrated to Zen1

Linux is now running on a X86_BUG_NULL_SEL-impacted CPU while believing
that the bug is fixed.

The only way to address the problem is to fully trust the "no longer
affected" CPUID bit when virtualised, because in the above case it would
be clear deliberately to indicate the fact "you might migrate to
somewhere which has this behaviour".

Zen3 adds the NullSelectorClearsBase CPUID bit to indicate that loading
a NULL segment selector zeroes the base and limit fields, as well as
just attributes. Zen2 also has this behaviour but doesn't have the NSCB
bit.

 [ bp: Minor touchups. ]

Signed-off-by: Jane Malalane &lt;jane.malalane@citrix.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
CC: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20211021104744.24126-1-jane.malalane@citrix.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, Linux probes for X86_BUG_NULL_SEL unconditionally which
makes it unsafe to migrate in a virtualised environment as the
properties across the migration pool might differ.

To be specific, the case which goes wrong is:

1. Zen1 (or earlier) and Zen2 (or later) in a migration pool
2. Linux boots on Zen2, probes and finds the absence of X86_BUG_NULL_SEL
3. Linux is then migrated to Zen1

Linux is now running on a X86_BUG_NULL_SEL-impacted CPU while believing
that the bug is fixed.

The only way to address the problem is to fully trust the "no longer
affected" CPUID bit when virtualised, because in the above case it would
be clear deliberately to indicate the fact "you might migrate to
somewhere which has this behaviour".

Zen3 adds the NullSelectorClearsBase CPUID bit to indicate that loading
a NULL segment selector zeroes the base and limit fields, as well as
just attributes. Zen2 also has this behaviour but doesn't have the NSCB
bit.

 [ bp: Minor touchups. ]

Signed-off-by: Jane Malalane &lt;jane.malalane@citrix.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
CC: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20211021104744.24126-1-jane.malalane@citrix.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/tsx: Clear CPUID bits when TSX always force aborts</title>
<updated>2021-06-15T15:46:48+00:00</updated>
<author>
<name>Pawan Gupta</name>
<email>pawan.kumar.gupta@linux.intel.com</email>
</author>
<published>2021-06-14T21:14:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=293649307ef9abcd4f83f6dac4d4400dfd97c936'/>
<id>293649307ef9abcd4f83f6dac4d4400dfd97c936</id>
<content type='text'>
As a result of TSX deprecation, some processors always abort TSX
transactions by default after a microcode update.

When TSX feature cannot be used it is better to hide it. Clear CPUID.RTM
and CPUID.HLE bits when TSX transactions always abort.

 [ bp: Massage commit message and comments. ]

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
Link: https://lkml.kernel.org/r/5209b3d72ffe5bd3cafdcc803f5b883f785329c3.1623704845.git-series.pawan.kumar.gupta@linux.intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As a result of TSX deprecation, some processors always abort TSX
transactions by default after a microcode update.

When TSX feature cannot be used it is better to hide it. Clear CPUID.RTM
and CPUID.HLE bits when TSX transactions always abort.

 [ bp: Massage commit message and comments. ]

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
Link: https://lkml.kernel.org/r/5209b3d72ffe5bd3cafdcc803f5b883f785329c3.1623704845.git-series.pawan.kumar.gupta@linux.intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Convert macro and uses of __section(foo) to __section("foo")</title>
<updated>2020-10-25T21:51:49+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2020-10-22T02:36:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33def8498fdde180023444b08e12b72a9efed41d'/>
<id>33def8498fdde180023444b08e12b72a9efed41d</id>
<content type='text'>
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@gooogle.com&gt;
Reviewed-by: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@gooogle.com&gt;
Reviewed-by: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup</title>
<updated>2020-06-15T12:18:37+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2020-06-08T17:41:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5d5103595e9e53048bb7e70ee2673c897ab38300'/>
<id>5d5103595e9e53048bb7e70ee2673c897ab38300</id>
<content type='text'>
Reinitialize IA32_FEAT_CTL on the BSP during wakeup to handle the case
where firmware doesn't initialize or save/restore across S3.  This fixes
a bug where IA32_FEAT_CTL is left uninitialized and results in VMXON
taking a #GP due to VMX not being fully enabled, i.e. breaks KVM.

Use init_ia32_feat_ctl() to "restore" IA32_FEAT_CTL as it already deals
with the case where the MSR is locked, and because APs already redo
init_ia32_feat_ctl() during suspend by virtue of the SMP boot flow being
used to reinitialize APs upon wakeup.  Do the call in the early wakeup
flow to avoid dependencies in the syscore_ops chain, e.g. simply adding
a resume hook is not guaranteed to work, as KVM does VMXON in its own
resume hook, kvm_resume(), when KVM has active guests.

Fixes: 21bd3467a58e ("KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR")
Reported-by: Brad Campbell &lt;lists2009@fnarfbargle.com&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Reviewed-by: Maxim Levitsky &lt;mlevitsk@redhat.com&gt;
Tested-by: Brad Campbell &lt;lists2009@fnarfbargle.com&gt;
Cc: stable@vger.kernel.org # v5.6
Link: https://lkml.kernel.org/r/20200608174134.11157-1-sean.j.christopherson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reinitialize IA32_FEAT_CTL on the BSP during wakeup to handle the case
where firmware doesn't initialize or save/restore across S3.  This fixes
a bug where IA32_FEAT_CTL is left uninitialized and results in VMXON
taking a #GP due to VMX not being fully enabled, i.e. breaks KVM.

Use init_ia32_feat_ctl() to "restore" IA32_FEAT_CTL as it already deals
with the case where the MSR is locked, and because APs already redo
init_ia32_feat_ctl() during suspend by virtue of the SMP boot flow being
used to reinitialize APs upon wakeup.  Do the call in the early wakeup
flow to avoid dependencies in the syscore_ops chain, e.g. simply adding
a resume hook is not guaranteed to work, as KVM does VMXON in its own
resume hook, kvm_resume(), when KVM has active guests.

Fixes: 21bd3467a58e ("KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR")
Reported-by: Brad Campbell &lt;lists2009@fnarfbargle.com&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Reviewed-by: Maxim Levitsky &lt;mlevitsk@redhat.com&gt;
Tested-by: Brad Campbell &lt;lists2009@fnarfbargle.com&gt;
Cc: stable@vger.kernel.org # v5.6
Link: https://lkml.kernel.org/r/20200608174134.11157-1-sean.j.christopherson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation</title>
<updated>2020-04-20T10:19:22+00:00</updated>
<author>
<name>Mark Gross</name>
<email>mgross@linux.intel.com</email>
</author>
<published>2020-04-16T15:54:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7e5b3c267d256822407a22fdce6afdf9cd13f9fb'/>
<id>7e5b3c267d256822407a22fdce6afdf9cd13f9fb</id>
<content type='text'>
SRBDS is an MDS-like speculative side channel that can leak bits from the
random number generator (RNG) across cores and threads. New microcode
serializes the processor access during the execution of RDRAND and
RDSEED. This ensures that the shared buffer is overwritten before it is
released for reuse.

While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.

The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions. Among other
effects this will reduce throughput from /dev/urandom.

* Enable administrator to configure the mitigation off when desired using
  either mitigations=off or srbds=off.

* Export vulnerability status via sysfs

* Rename file-scoped macros to apply for non-whitelist table initializations.

 [ bp: Massage,
   - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
   - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
   - flip check in cpu_set_bug_bits() to save an indentation level,
   - reflow comments.
   jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
   tglx: Dropped the fused off magic for now
 ]

Signed-off-by: Mark Gross &lt;mgross@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SRBDS is an MDS-like speculative side channel that can leak bits from the
random number generator (RNG) across cores and threads. New microcode
serializes the processor access during the execution of RDRAND and
RDSEED. This ensures that the shared buffer is overwritten before it is
released for reuse.

While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.

The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions. Among other
effects this will reduce throughput from /dev/urandom.

* Enable administrator to configure the mitigation off when desired using
  either mitigations=off or srbds=off.

* Export vulnerability status via sysfs

* Rename file-scoped macros to apply for non-whitelist table initializations.

 [ bp: Massage,
   - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
   - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
   - flip check in cpu_set_bug_bits() to save an indentation level,
   - reflow comments.
   jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
   tglx: Dropped the fused off magic for now
 ]

Signed-off-by: Mark Gross &lt;mgross@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/intel: Initialize IA32_FEAT_CTL MSR at boot</title>
<updated>2020-01-13T16:45:45+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-12-21T04:44:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1db2a6e1e29ff994443a9eef7cf3d26104c777a7'/>
<id>1db2a6e1e29ff994443a9eef7cf3d26104c777a7</id>
<content type='text'>
Opportunistically initialize IA32_FEAT_CTL to enable VMX when the MSR is
left unlocked by BIOS.  Configuring feature control at boot time paves
the way for similar enabling of other features, e.g. Software Guard
Extensions (SGX).

Temporarily leave equivalent KVM code in place in order to avoid
introducing a regression on Centaur and Zhaoxin CPUs, e.g. removing
KVM's code would leave the MSR unlocked on those CPUs and would break
existing functionality if people are loading kvm_intel on Centaur and/or
Zhaoxin.  Defer enablement of the boot-time configuration on Centaur and
Zhaoxin to future patches to aid bisection.

Note, Local Machine Check Exceptions (LMCE) are also supported by the
kernel and enabled via feature control, but the kernel currently uses
LMCE if and only if the feature is explicitly enabled by BIOS.  Keep
the current behavior to avoid introducing bugs, future patches can opt
in to opportunistic enabling if it's deemed desirable to do so.

Always lock IA32_FEAT_CTL if it exists, even if the CPU doesn't support
VMX, so that other existing and future kernel code that queries the MSR
can assume it's locked.

Start from a clean slate when constructing the value to write to
IA32_FEAT_CTL, i.e. ignore whatever value BIOS left in the MSR so as not
to enable random features or fault on the WRMSR.

Suggested-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20191221044513.21680-5-sean.j.christopherson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Opportunistically initialize IA32_FEAT_CTL to enable VMX when the MSR is
left unlocked by BIOS.  Configuring feature control at boot time paves
the way for similar enabling of other features, e.g. Software Guard
Extensions (SGX).

Temporarily leave equivalent KVM code in place in order to avoid
introducing a regression on Centaur and Zhaoxin CPUs, e.g. removing
KVM's code would leave the MSR unlocked on those CPUs and would break
existing functionality if people are loading kvm_intel on Centaur and/or
Zhaoxin.  Defer enablement of the boot-time configuration on Centaur and
Zhaoxin to future patches to aid bisection.

Note, Local Machine Check Exceptions (LMCE) are also supported by the
kernel and enabled via feature control, but the kernel currently uses
LMCE if and only if the feature is explicitly enabled by BIOS.  Keep
the current behavior to avoid introducing bugs, future patches can opt
in to opportunistic enabling if it's deemed desirable to do so.

Always lock IA32_FEAT_CTL if it exists, even if the CPU doesn't support
VMX, so that other existing and future kernel code that queries the MSR
can assume it's locked.

Start from a clean slate when constructing the value to write to
IA32_FEAT_CTL, i.e. ignore whatever value BIOS left in the MSR so as not
to enable random features or fault on the WRMSR.

Suggested-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20191221044513.21680-5-sean.j.christopherson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default</title>
<updated>2019-10-28T07:36:58+00:00</updated>
<author>
<name>Pawan Gupta</name>
<email>pawan.kumar.gupta@linux.intel.com</email>
</author>
<published>2019-10-23T09:01:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=95c5824f75f3ba4c9e8e5a4b1a623c95390ac266'/>
<id>95c5824f75f3ba4c9e8e5a4b1a623c95390ac266</id>
<content type='text'>
Add a kernel cmdline parameter "tsx" to control the Transactional
Synchronization Extensions (TSX) feature. On CPUs that support TSX
control, use "tsx=on|off" to enable or disable TSX. Not specifying this
option is equivalent to "tsx=off". This is because on certain processors
TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation
unit because TSX is a CPU feature while the TSX async abort control
machinery will go to cpu/bugs.c.

 [ bp: - Massage, shorten and clear the arg buffer.
       - Clarifications of the tsx= possible options - Josh.
       - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a kernel cmdline parameter "tsx" to control the Transactional
Synchronization Extensions (TSX) feature. On CPUs that support TSX
control, use "tsx=on|off" to enable or disable TSX. Not specifying this
option is equivalent to "tsx=off". This is because on certain processors
TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation
unit because TSX is a CPU feature while the TSX async abort control
machinery will go to cpu/bugs.c.

 [ bp: - Massage, shorten and clear the arg buffer.
       - Clarifications of the tsx= possible options - Josh.
       - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/cpu: Add a helper function x86_read_arch_cap_msr()</title>
<updated>2019-10-28T07:36:58+00:00</updated>
<author>
<name>Pawan Gupta</name>
<email>pawan.kumar.gupta@linux.intel.com</email>
</author>
<published>2019-10-23T08:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=286836a70433fb64131d2590f4bf512097c255e1'/>
<id>286836a70433fb64131d2590f4bf512097c255e1</id>
<content type='text'>
Add a helper function to read the IA32_ARCH_CAPABILITIES MSR.

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
Reviewed-by: Mark Gross &lt;mgross@linux.intel.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a helper function to read the IA32_ARCH_CAPABILITIES MSR.

Signed-off-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Neelima Krishnan &lt;neelima.krishnan@intel.com&gt;
Reviewed-by: Mark Gross &lt;mgross@linux.intel.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>PM / arch: x86: Rework the MSR_IA32_ENERGY_PERF_BIAS handling</title>
<updated>2019-04-07T20:33:19+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2019-03-21T22:18:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5861381d486601430cccf64849bd0a226154bc0d'/>
<id>5861381d486601430cccf64849bd0a226154bc0d</id>
<content type='text'>
The current handling of MSR_IA32_ENERGY_PERF_BIAS in the kernel is
problematic, because it may cause changes made by user space to that
MSR (with the help of the x86_energy_perf_policy tool, for example)
to be lost every time a CPU goes offline and then back online as well
as during system-wide power management transitions into sleep states
and back into the working state.

The first problem is that if the current EPB value for a CPU going
online is 0 ('performance'), the kernel will change it to 6 ('normal')
regardless of whether or not this is the first bring-up of that CPU.
That also happens during system-wide resume from sleep states
(including, but not limited to, hibernation).  However, the EPB may
have been adjusted by user space this way and the kernel should not
blindly override that setting.

The second problem is that if the platform firmware resets the EPB
values for any CPUs during system-wide resume from a sleep state,
the kernel will not restore their previous EPB values that may
have been set by user space before the preceding system-wide
suspend transition.  Again, that behavior may at least be confusing
from the user space perspective.

In order to address these issues, rework the handling of
MSR_IA32_ENERGY_PERF_BIAS so that the EPB value is saved on CPU
offline and restored on CPU online as well as (for the boot CPU)
during the syscore stages of system-wide suspend and resume
transitions, respectively.

However, retain the policy by which the EPB is set to 6 ('normal')
on the first bring-up of each CPU if its initial value is 0, based
on the observation that 0 may mean 'not initialized' just as well as
'performance' in that case.

While at it, move the MSR_IA32_ENERGY_PERF_BIAS handling code into
a separate file and document it in Documentation/admin-guide.

Fixes: abe48b108247 (x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS)
Fixes: b51ef52df71c (x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume)
Reported-by: Thomas Renninger &lt;trenn@suse.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Acked-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current handling of MSR_IA32_ENERGY_PERF_BIAS in the kernel is
problematic, because it may cause changes made by user space to that
MSR (with the help of the x86_energy_perf_policy tool, for example)
to be lost every time a CPU goes offline and then back online as well
as during system-wide power management transitions into sleep states
and back into the working state.

The first problem is that if the current EPB value for a CPU going
online is 0 ('performance'), the kernel will change it to 6 ('normal')
regardless of whether or not this is the first bring-up of that CPU.
That also happens during system-wide resume from sleep states
(including, but not limited to, hibernation).  However, the EPB may
have been adjusted by user space this way and the kernel should not
blindly override that setting.

The second problem is that if the platform firmware resets the EPB
values for any CPUs during system-wide resume from a sleep state,
the kernel will not restore their previous EPB values that may
have been set by user space before the preceding system-wide
suspend transition.  Again, that behavior may at least be confusing
from the user space perspective.

In order to address these issues, rework the handling of
MSR_IA32_ENERGY_PERF_BIAS so that the EPB value is saved on CPU
offline and restored on CPU online as well as (for the boot CPU)
during the syscore stages of system-wide suspend and resume
transitions, respectively.

However, retain the policy by which the EPB is set to 6 ('normal')
on the first bring-up of each CPU if its initial value is 0, based
on the observation that 0 may mean 'not initialized' just as well as
'performance' in that case.

While at it, move the MSR_IA32_ENERGY_PERF_BIAS handling code into
a separate file and document it in Documentation/admin-guide.

Fixes: abe48b108247 (x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS)
Fixes: b51ef52df71c (x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume)
Reported-by: Thomas Renninger &lt;trenn@suse.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Acked-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/cpufeatures: Remove get_scattered_cpuid_leaf()</title>
<updated>2018-11-05T19:54:20+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2018-11-05T18:57:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=43500e6f294d175602606c77bfb0d8cd4ea88b4f'/>
<id>43500e6f294d175602606c77bfb0d8cd4ea88b4f</id>
<content type='text'>
get_scattered_cpuid_leaf() was added[1] to help KVM rebuild hardware-
defined leafs that are rearranged by Linux to avoid bloating the
x86_capability array. Eventually, the last consumer of the function was
removed[2], but the function itself was kept, perhaps even intentionally
as a form of documentation.

Remove get_scattered_cpuid_leaf() as it is currently not used by KVM.
Furthermore, simply rebuilding the "real" leaf does not resolve all of
KVM's woes when it comes to exposing a scattered CPUID feature, i.e.
keeping the function as documentation may be counter-productive in some
scenarios, e.g. when KVM needs to do more than simply expose the leaf.

[1] 47bdf3378d62 ("x86/cpuid: Provide get_scattered_cpuid_leaf()")
[2] b7b27aa011a1 ("KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX")

Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
CC: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
CC: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
CC: Thomas Gleixner &lt;tglx@linutronix.de&gt;
CC: x86-ml &lt;x86@kernel.org&gt;
Link: http://lkml.kernel.org/r/20181105185725.18679-1-sean.j.christopherson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
get_scattered_cpuid_leaf() was added[1] to help KVM rebuild hardware-
defined leafs that are rearranged by Linux to avoid bloating the
x86_capability array. Eventually, the last consumer of the function was
removed[2], but the function itself was kept, perhaps even intentionally
as a form of documentation.

Remove get_scattered_cpuid_leaf() as it is currently not used by KVM.
Furthermore, simply rebuilding the "real" leaf does not resolve all of
KVM's woes when it comes to exposing a scattered CPUID feature, i.e.
keeping the function as documentation may be counter-productive in some
scenarios, e.g. when KVM needs to do more than simply expose the leaf.

[1] 47bdf3378d62 ("x86/cpuid: Provide get_scattered_cpuid_leaf()")
[2] b7b27aa011a1 ("KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX")

Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
CC: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
CC: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
CC: Thomas Gleixner &lt;tglx@linutronix.de&gt;
CC: x86-ml &lt;x86@kernel.org&gt;
Link: http://lkml.kernel.org/r/20181105185725.18679-1-sean.j.christopherson@intel.com
</pre>
</div>
</content>
</entry>
</feed>
