linux-stable.git/arch/x86/include/asm, branch linux-3.16.y

x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation

2020-06-11T18:05:57+00:00

commit 7e5b3c267d256822407a22fdce6afdf9cd13f9fb upstream.

SRBDS is an MDS-like speculative side channel that can leak bits from the
random number generator (RNG) across cores and threads. New microcode
serializes the processor access during the execution of RDRAND and
RDSEED. This ensures that the shared buffer is overwritten before it is
released for reuse.

While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.

The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions. Among other
effects this will reduce throughput from /dev/urandom.

* Enable administrator to configure the mitigation off when desired using
  either mitigations=off or srbds=off.

* Export vulnerability status via sysfs

* Rename file-scoped macros to apply for non-whitelist table initializations.

 [ bp: Massage,
   - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
   - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
   - flip check in cpu_set_bug_bits() to save an indentation level,
   - reflow comments.
   jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
   tglx: Dropped the fused off magic for now
 ]

Signed-off-by: Mark Gross 
Signed-off-by: Borislav Petkov 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Tony Luck 
Reviewed-by: Pawan Gupta 
Reviewed-by: Josh Poimboeuf 
Tested-by: Neelima Krishnan 
[bwh: Backported to 3.16:
 - CPU feature words and bugs are numbered differently
 - Adjust filename for ]
Signed-off-by: Ben Hutchings

x86/cpu: Add a steppings field to struct x86_cpu_id

2020-06-11T18:05:56+00:00

commit e9d7144597b10ff13ff2264c059f7d4a7fbc89ac upstream.

Intel uses the same family/model for several CPUs. Sometimes the
stepping must be checked to tell them apart.

On x86 there can be at most 16 steppings. Add a steppings bitmask to
x86_cpu_id and a X86_MATCH_VENDOR_FAMILY_MODEL_STEPPING_FEATURE macro
and support for matching against family/model/stepping.

 [ bp: Massage.
   tglx: Lightweight variant for backporting ]

Signed-off-by: Mark Gross 
Signed-off-by: Borislav Petkov 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Tony Luck 
Reviewed-by: Josh Poimboeuf 
Signed-off-by: Ben Hutchings

x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping

2020-06-11T18:05:55+00:00

commit b399151cb48db30ad1e0e93dd40d68c6d007b637 upstream.

x86_mask is a confusing name which is hard to associate with the
processor's stepping.

Additionally, correct an indent issue in lib/cpu.c.

Signed-off-by: Jia Zhang 
[ Updated it to more recent kernels. ]
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: bp@alien8.de
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman 
[bwh: Backported to 3.16:
 - Drop changes in arch/x86/lib/cpu.c
 - Adjust filenames, context]
Signed-off-by: Ben Hutchings

x86/pti/efi: broken conversion from efi to kernel page table

2020-04-28T18:02:37+00:00

In entry_64.S we have code like this:

    /* Unconditionally use kernel CR3 for do_nmi() */
    /* %rax is saved above, so OK to clobber here */
    ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER
    /* If PCID enabled, NOFLUSH now and NOFLUSH on return */
    ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
    pushq   %rax
    /* mask off "user" bit of pgd address and 12 PCID bits: */
    andq    $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
    movq    %rax, %cr3
2:

    /* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
    call    do_nmi

With this instruction:
    andq    $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax

We unconditionally switch from whatever our CR3 was to kernel page table.
But, in arch/x86/platform/efi/efi_64.c We temporarily set a different page
table, that does not have the kernel page table with 0x1000 offset from it.

Look in efi_thunk() and efi_thunk_set_virtual_address_map().

So, while CR3 points to the other page table, we get an NMI interrupt,
and clear 0x1000 from CR3, resulting in a bogus CR3 if the 0x1000 bit was
set.

The efi page table comes from realmode/rm/trampoline_64.S:

arch/x86/realmode/rm/trampoline_64.S

141 .bss
142 .balign PAGE_SIZE
143 GLOBAL(trampoline_pgd) .space PAGE_SIZE

Notice: alignment is PAGE_SIZE, so after applying KAISER_SHADOW_PGD_OFFSET
which equal to PAGE_SIZE, we can get a different page table.

But, even if we fix alignment, here the trampoline binary is later copied
into dynamically allocated memory in reserve_real_mode(), so we need to
fix that place as well.

Fixes: f9a1666f97b3 ("KAISER: Kernel Address Isolation")
Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Signed-off-by: Greg Kroah-Hartman 
[bwh: Adjust the Fixes field for 3.16]
Signed-off-by: Ben Hutchings

locking/x86: Remove the unused atomic_inc_short() methd

2020-01-11T02:05:01+00:00

commit 31b35f6b4d5285a311e10753f4eb17304326b211 upstream.

It is completely unused and implemented only on x86.
Remove it.

Suggested-by: Mark Rutland 
Signed-off-by: Dmitry Vyukov 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Andrey Ryabinin 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20170526172900.91058-1-dvyukov@google.com
Signed-off-by: Ingo Molnar 
[bwh: Backported to 3.16 because this function is broken after
 "x86/atomic: Fix smp_mb__{before,after}_atomic()":
 - Adjust context]
Signed-off-by: Ben Hutchings

locking,x86: Kill atomic_or_long()

2020-01-11T02:05:00+00:00

commit f6b4ecee0eb7bfa66ae8d5652105ed4da53209a3 upstream.

There are no users, kill it.

Signed-off-by: Peter Zijlstra 
Cc: Jesse Brandeburg 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Link: http://lkml.kernel.org/r/20140508135851.768177189@infradead.org
Signed-off-by: Ingo Molnar 
[bwh: Backported to 3.16 because this function is broken after
 "x86/atomic: Fix smp_mb__{before,after}_atomic()"]
Signed-off-by: Ben Hutchings

x86/atomic: Fix smp_mb__{before,after}_atomic()

2020-01-11T02:05:00+00:00

commit 69d927bba39517d0980462efc051875b7f4db185 upstream.

Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
fail for:

	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;

Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:

	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;

Which the CPU is then allowed to re-order (under TSO rules) like:

	atomic_inc(u);
	r0 = *y;
	*x = 1;

And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.

NOTE: atomic_{or,and,xor} and the bitops already had the compiler
barrier.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Signed-off-by: Jari Ruusu 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Ben Hutchings

x86/apic: Drop logical_smp_processor_id() inline

2019-11-22T15:57:22+00:00

commit 8f1561680f42a5491b371b513f1ab8197f31fd62 upstream.

The logical_smp_processor_id() inline which is only called in
setup_local_APIC() on x86_32 systems has no real value.

Drop it and directly use GET_APIC_LOGICAL_ID() at the call site and use a
more suitable variable name for readability

Signed-off-by: Dou Liyang 
Signed-off-by: Thomas Gleixner 
Cc: andy.shevchenko@gmail.com
Cc: bhe@redhat.com
Cc: ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20180301055930.2396-4-douly.fnst@cn.fujitsu.com
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings

ptrace,x86: Make user_64bit_mode() available to 32-bit builds

2019-11-22T15:57:20+00:00

commit e27c310af5c05cf876d9cad006928076c27f54d4 upstream.

In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.

This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.

Suggested-by: Borislav Petkov 
Signed-off-by: Ricardo Neri 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Borislav Petkov 
Cc: "Michael S. Tsirkin" 
Cc: Peter Zijlstra 
Cc: Dave Hansen 
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter 
Cc: Paul Gortmaker 
Cc: Huang Rui 
Cc: Qiaowei Ren 
Cc: Shuah Khan 
Cc: Kees Cook 
Cc: Jonathan Corbet 
Cc: Jiri Slaby 
Cc: Dmitry Vyukov 
Cc: "Ravi V. Shankar" 
Cc: Chris Metcalf 
Cc: Brian Gerst 
Cc: Arnaldo Carvalho de Melo 
Cc: Andy Lutomirski 
Cc: Colin Ian King 
Cc: Chen Yucong 
Cc: Adam Buchbinder 
Cc: Vlastimil Babka 
Cc: Lorenzo Stoakes 
Cc: Masami Hiramatsu 
Cc: Paolo Bonzini 
Cc: Andrew Morton 
Cc: Thomas Garnier 
Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com
[bwh: Backported to 3.16 as dependency of "uprobes/x86: Fix detection of
 32-bit user mode":
 - Adjust context]
Signed-off-by: Ben Hutchings

x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386

2019-11-22T15:57:19+00:00

commit b63f20a778c88b6a04458ed6ffc69da953d3a109 upstream.

Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
avoid clobbering flags.

KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.

  adcb_al_dl:
     0x000339f8 <+0>:   adc    %dl,%al
     0x000339fa <+2>:   ret

A major motiviation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC.  Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path.  Sans the nops...

  asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
     0x0003595a <+58>:  mov    0xc0(%ebx),%eax
     0x00035960 <+64>:  mov    0x60(%ebx),%edx
     0x00035963 <+67>:  mov    0x90(%ebx),%ecx
     0x00035969 <+73>:  push   %edi
     0x0003596a <+74>:  popf
     0x0003596b <+75>:  call   *%esi
     0x000359a0 <+128>: pushf
     0x000359a1 <+129>: pop    %edi
     0x000359a2 <+130>: mov    %eax,0xc0(%ebx)
     0x000359b1 <+145>: mov    %edx,0x60(%ebx)

  ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
     0x000359a8 <+136>: mov    -0x10(%ebp),%eax
     0x000359ab <+139>: and    $0x8d5,%edi
     0x000359b4 <+148>: and    $0xfffff72a,%eax
     0x000359b9 <+153>: or     %eax,%edi
     0x000359bd <+157>: mov    %edi,0x4(%ebx)

For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.

Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.

Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f347a ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
Signed-off-by: Ben Hutchings