<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/lib, branch v5.14</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'x86_sev_for_v5.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-06-28T18:29:12+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-06-28T18:29:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d04f7de0a5134de13420e72ae62a26f05d312c06'/>
<id>d04f7de0a5134de13420e72ae62a26f05d312c06</id>
<content type='text'>
Pull x86 SEV updates from Borislav Petkov:

 - Differentiate the type of exception the #VC handler raises depending
   on code executed in the guest and handle the case where failure to
   get the RIP would result in a #GP, as it should, instead of in a #PF

 - Disable interrupts while the per-CPU GHCB is held

 - Split the #VC handler depending on where the #VC exception has
   happened and therefore provide for precise context tracking like the
   rest of the exception handlers deal with noinstr regions now

 - Add defines for the GHCB version 2 protocol so that further shared
   development with KVM can happen without merge conflicts

 - The usual small cleanups

* tag 'x86_sev_for_v5.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev: Use "SEV: " prefix for messages from sev.c
  x86/sev: Add defines for GHCB version 2 MSR protocol requests
  x86/sev: Split up runtime #VC handler for correct state tracking
  x86/sev: Make sure IRQs are disabled while GHCB is active
  x86/sev: Propagate #GP if getting linear instruction address failed
  x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]()
  x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip()
  x86/sev: Fix error message in runtime #VC handler
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 SEV updates from Borislav Petkov:

 - Differentiate the type of exception the #VC handler raises depending
   on code executed in the guest and handle the case where failure to
   get the RIP would result in a #GP, as it should, instead of in a #PF

 - Disable interrupts while the per-CPU GHCB is held

 - Split the #VC handler depending on where the #VC exception has
   happened and therefore provide for precise context tracking like the
   rest of the exception handlers deal with noinstr regions now

 - Add defines for the GHCB version 2 protocol so that further shared
   development with KVM can happen without merge conflicts

 - The usual small cleanups

* tag 'x86_sev_for_v5.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev: Use "SEV: " prefix for messages from sev.c
  x86/sev: Add defines for GHCB version 2 MSR protocol requests
  x86/sev: Split up runtime #VC handler for correct state tracking
  x86/sev: Make sure IRQs are disabled while GHCB is active
  x86/sev: Propagate #GP if getting linear instruction address failed
  x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]()
  x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip()
  x86/sev: Fix error message in runtime #VC handler
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool/x86: Ignore __x86_indirect_alt_* symbols</title>
<updated>2021-06-21T15:26:57+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-06-21T14:13:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=31197d3a0f1caeb60fb01f6755e28347e4f44037'/>
<id>31197d3a0f1caeb60fb01f6755e28347e4f44037</id>
<content type='text'>
Because the __x86_indirect_alt* symbols are just that, objtool will
try and validate them as regular symbols, instead of the alternative
replacements that they are.

This goes sideways for FRAME_POINTER=y builds; which generate a fair
amount of warnings.

Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/YNCgxwLBiK9wclYJ@hirez.programming.kicks-ass.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because the __x86_indirect_alt* symbols are just that, objtool will
try and validate them as regular symbols, instead of the alternative
replacements that they are.

This goes sideways for FRAME_POINTER=y builds; which generate a fair
amount of warnings.

Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/YNCgxwLBiK9wclYJ@hirez.programming.kicks-ass.net
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]()</title>
<updated>2021-06-15T09:39:30+00:00</updated>
<author>
<name>Joerg Roedel</name>
<email>jroedel@suse.de</email>
</author>
<published>2021-06-14T13:53:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4aaa7eacd7cc7c10f269c7f2a01d044b375bed8e'/>
<id>4aaa7eacd7cc7c10f269c7f2a01d044b375bed8e</id>
<content type='text'>
The error reporting from the insn_fetch_from_user*() functions is not
very verbose. Extend it to include information on whether the linear
RIP could not be calculated or whether the memory access faulted.

This will be used in the SEV-ES code to propagate the correct
exception depending on what went wrong during instruction fetch.

 [ bp: Massage comments. ]

Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210614135327.9921-6-joro@8bytes.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The error reporting from the insn_fetch_from_user*() functions is not
very verbose. Extend it to include information on whether the linear
RIP could not be calculated or whether the memory access faulted.

This will be used in the SEV-ES code to propagate the correct
exception depending on what went wrong during instruction fetch.

 [ bp: Massage comments. ]

Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210614135327.9921-6-joro@8bytes.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip()</title>
<updated>2021-06-15T09:24:21+00:00</updated>
<author>
<name>Joerg Roedel</name>
<email>jroedel@suse.de</email>
</author>
<published>2021-05-19T13:52:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f2df15639e44d23bf82a86a03092472c7278cd39'/>
<id>f2df15639e44d23bf82a86a03092472c7278cd39</id>
<content type='text'>
In theory, 0 is a valid value for the instruction pointer so don't use
it as the error return value from insn_get_effective_ip().

Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210614135327.9921-5-joro@8bytes.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In theory, 0 is a valid value for the instruction pointer so don't use
it as the error return value from insn_get_effective_ip().

Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210614135327.9921-5-joro@8bytes.org
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-28T00:45:09+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-28T00:45:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6536676c7fe3f572ba55842e59c3c71c01e7fb3'/>
<id>c6536676c7fe3f572ba55842e59c3c71c01e7fb3</id>
<content type='text'>
Pull x86 updates from Borislav Petkov:

 - Turn the stack canary into a normal __percpu variable on 32-bit which
   gets rid of the LAZY_GS stuff and a lot of code.

 - Add an insn_decode() API which all users of the instruction decoder
   should preferrably use. Its goal is to keep the details of the
   instruction decoder away from its users and simplify and streamline
   how one decodes insns in the kernel. Convert its users to it.

 - kprobes improvements and fixes

 - Set the maximum DIE per package variable on Hygon

 - Rip out the dynamic NOP selection and simplify all the machinery
   around selecting NOPs. Use the simplified NOPs in objtool now too.

 - Add Xeon Sapphire Rapids to list of CPUs that support PPIN

 - Simplify the retpolines by folding the entire thing into an
   alternative now that objtool can handle alternatives with stack ops.
   Then, have objtool rewrite the call to the retpoline with the
   alternative which then will get patched at boot time.

 - Document Intel uarch per models in intel-family.h

 - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
   exception on Intel.

* tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  x86, sched: Treat Intel SNC topology as default, COD as exception
  x86/cpu: Comment Skylake server stepping too
  x86/cpu: Resort and comment Intel models
  objtool/x86: Rewrite retpoline thunk calls
  objtool: Skip magical retpoline .altinstr_replacement
  objtool: Cache instruction relocs
  objtool: Keep track of retpoline call sites
  objtool: Add elf_create_undef_symbol()
  objtool: Extract elf_symbol_add()
  objtool: Extract elf_strtab_concat()
  objtool: Create reloc sections implicitly
  objtool: Add elf_create_reloc() helper
  objtool: Rework the elf_rebuild_reloc_section() logic
  objtool: Fix static_call list generation
  objtool: Handle per arch retpoline naming
  objtool: Correctly handle retpoline thunk calls
  x86/retpoline: Simplify retpolines
  x86/alternatives: Optimize optimize_nops()
  x86: Add insn_decode_kernel()
  x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 updates from Borislav Petkov:

 - Turn the stack canary into a normal __percpu variable on 32-bit which
   gets rid of the LAZY_GS stuff and a lot of code.

 - Add an insn_decode() API which all users of the instruction decoder
   should preferrably use. Its goal is to keep the details of the
   instruction decoder away from its users and simplify and streamline
   how one decodes insns in the kernel. Convert its users to it.

 - kprobes improvements and fixes

 - Set the maximum DIE per package variable on Hygon

 - Rip out the dynamic NOP selection and simplify all the machinery
   around selecting NOPs. Use the simplified NOPs in objtool now too.

 - Add Xeon Sapphire Rapids to list of CPUs that support PPIN

 - Simplify the retpolines by folding the entire thing into an
   alternative now that objtool can handle alternatives with stack ops.
   Then, have objtool rewrite the call to the retpoline with the
   alternative which then will get patched at boot time.

 - Document Intel uarch per models in intel-family.h

 - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
   exception on Intel.

* tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  x86, sched: Treat Intel SNC topology as default, COD as exception
  x86/cpu: Comment Skylake server stepping too
  x86/cpu: Resort and comment Intel models
  objtool/x86: Rewrite retpoline thunk calls
  objtool: Skip magical retpoline .altinstr_replacement
  objtool: Cache instruction relocs
  objtool: Keep track of retpoline call sites
  objtool: Add elf_create_undef_symbol()
  objtool: Extract elf_symbol_add()
  objtool: Extract elf_strtab_concat()
  objtool: Create reloc sections implicitly
  objtool: Add elf_create_reloc() helper
  objtool: Rework the elf_rebuild_reloc_section() logic
  objtool: Fix static_call list generation
  objtool: Handle per arch retpoline naming
  objtool: Correctly handle retpoline thunk calls
  x86/retpoline: Simplify retpolines
  x86/alternatives: Optimize optimize_nops()
  x86: Add insn_decode_kernel()
  x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-26T16:25:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-26T16:25:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ea5bc7b977fc7cd2be4065ef41824adc976c807f'/>
<id>ea5bc7b977fc7cd2be4065ef41824adc976c807f</id>
<content type='text'>
Pull misc x86 cleanups from Borislav Petkov:
 "Trivial cleanups and fixes all over the place"

* tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Remove me from IDE/ATAPI section
  x86/pat: Do not compile stubbed functions when X86_PAT is off
  x86/asm: Ensure asm/proto.h can be included stand-alone
  x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files
  x86/msr: Make locally used functions static
  x86/cacheinfo: Remove unneeded dead-store initialization
  x86/process/64: Move cpu_current_top_of_stack out of TSS
  tools/turbostat: Unmark non-kernel-doc comment
  x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
  x86/fpu/math-emu: Fix function cast warning
  x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes
  x86: Fix various typos in comments, take #2
  x86: Remove unusual Unicode characters from comments
  x86/kaslr: Return boolean values from a function returning bool
  x86: Fix various typos in comments
  x86/setup: Remove unused RESERVE_BRK_ARRAY()
  stacktrace: Move documentation for arch_stack_walk_reliable() to header
  x86: Remove duplicate TSC DEADLINE MSR definitions
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull misc x86 cleanups from Borislav Petkov:
 "Trivial cleanups and fixes all over the place"

* tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Remove me from IDE/ATAPI section
  x86/pat: Do not compile stubbed functions when X86_PAT is off
  x86/asm: Ensure asm/proto.h can be included stand-alone
  x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files
  x86/msr: Make locally used functions static
  x86/cacheinfo: Remove unneeded dead-store initialization
  x86/process/64: Move cpu_current_top_of_stack out of TSS
  tools/turbostat: Unmark non-kernel-doc comment
  x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
  x86/fpu/math-emu: Fix function cast warning
  x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes
  x86: Fix various typos in comments, take #2
  x86: Remove unusual Unicode characters from comments
  x86/kaslr: Return boolean values from a function returning bool
  x86: Fix various typos in comments
  x86/setup: Remove unused RESERVE_BRK_ARRAY()
  stacktrace: Move documentation for arch_stack_walk_reliable() to header
  x86: Remove duplicate TSC DEADLINE MSR definitions
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-26T16:01:29+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-26T16:01:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2c5ce2dba26afb39d426d9c06fd1c8e5057936d7'/>
<id>2c5ce2dba26afb39d426d9c06fd1c8e5057936d7</id>
<content type='text'>
Pull x86 alternatives/paravirt updates from Borislav Petkov:
 "First big cleanup to the paravirt infra to use alternatives and thus
  eliminate custom code patching.

  For that, the alternatives infrastructure is extended to accomodate
  paravirt's needs and, as a result, a lot of paravirt patching code
  goes away, leading to a sizeable cleanup and simplification.

  Work by Juergen Gross"

* tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/paravirt: Have only one paravirt patch function
  x86/paravirt: Switch functions with custom code to ALTERNATIVE
  x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs
  x86/paravirt: Switch iret pvops to ALTERNATIVE
  x86/paravirt: Simplify paravirt macros
  x86/paravirt: Remove no longer needed 32-bit pvops cruft
  x86/paravirt: Add new features for paravirt patching
  x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has()
  x86/alternative: Support ALTERNATIVE_TERNARY
  x86/alternative: Support not-feature
  x86/paravirt: Switch time pvops functions to use static_call()
  static_call: Add function to query current function
  static_call: Move struct static_call_key definition to static_call_types.h
  x86/alternative: Merge include files
  x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 alternatives/paravirt updates from Borislav Petkov:
 "First big cleanup to the paravirt infra to use alternatives and thus
  eliminate custom code patching.

  For that, the alternatives infrastructure is extended to accomodate
  paravirt's needs and, as a result, a lot of paravirt patching code
  goes away, leading to a sizeable cleanup and simplification.

  Work by Juergen Gross"

* tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/paravirt: Have only one paravirt patch function
  x86/paravirt: Switch functions with custom code to ALTERNATIVE
  x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs
  x86/paravirt: Switch iret pvops to ALTERNATIVE
  x86/paravirt: Simplify paravirt macros
  x86/paravirt: Remove no longer needed 32-bit pvops cruft
  x86/paravirt: Add new features for paravirt patching
  x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has()
  x86/alternative: Support ALTERNATIVE_TERNARY
  x86/alternative: Support not-feature
  x86/paravirt: Switch time pvops functions to use static_call()
  static_call: Add function to query current function
  static_call: Move struct static_call_key definition to static_call_types.h
  x86/alternative: Merge include files
  x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/msr: Make locally used functions static</title>
<updated>2021-04-08T09:57:40+00:00</updated>
<author>
<name>Zhao Xuehui</name>
<email>zhaoxuehui1@huawei.com</email>
</author>
<published>2021-04-08T09:52:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3e7bbe15ed84e3baa7dfab3aebed3a06fd39b806'/>
<id>3e7bbe15ed84e3baa7dfab3aebed3a06fd39b806</id>
<content type='text'>
The functions msr_read() and msr_write() are not used outside of msr.c,
make them static.

 [ bp: Massage commit message. ]

Signed-off-by: Zhao Xuehui &lt;zhaoxuehui1@huawei.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210408095218.152264-1-zhaoxuehui1@huawei.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The functions msr_read() and msr_write() are not used outside of msr.c,
make them static.

 [ bp: Massage commit message. ]

Signed-off-by: Zhao Xuehui &lt;zhaoxuehui1@huawei.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210408095218.152264-1-zhaoxuehui1@huawei.com
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool/x86: Rewrite retpoline thunk calls</title>
<updated>2021-04-02T10:47:28+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-03-26T15:12:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9bc0bb50727c8ac69fbb33fb937431cf3518ff37'/>
<id>9bc0bb50727c8ac69fbb33fb937431cf3518ff37</id>
<content type='text'>
When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinst_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also avoids objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinst_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also avoids objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/retpoline: Simplify retpolines</title>
<updated>2021-04-02T10:42:04+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-03-26T15:12:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=119251855f9adf9421cb5eb409933092141ab2c7'/>
<id>119251855f9adf9421cb5eb409933092141ab2c7</id>
<content type='text'>
Due to:

  c9c324dc22aa ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 &lt;__x86_indirect_thunk_rax&gt;:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 &lt;__x86_retpoline_rax&gt;:
   5:   e8 07 00 00 00          callq  11 &lt;__x86_retpoline_rax+0xc&gt;
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a &lt;__x86_retpoline_rax+0x5&gt;
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 &lt;__x86_indirect_thunk_rax&gt;:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c &lt;.altinstr_replacement+0xc&gt;
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 &lt;.altinstr_replacement+0x5&gt;
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Due to:

  c9c324dc22aa ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 &lt;__x86_indirect_thunk_rax&gt;:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 &lt;__x86_retpoline_rax&gt;:
   5:   e8 07 00 00 00          callq  11 &lt;__x86_retpoline_rax+0xc&gt;
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a &lt;__x86_retpoline_rax+0x5&gt;
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 &lt;__x86_indirect_thunk_rax&gt;:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c &lt;.altinstr_replacement+0xc&gt;
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 &lt;.altinstr_replacement+0x5&gt;
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
</pre>
</div>
</content>
</entry>
</feed>
