linux.git/arch/x86/kernel/cpu, branch v5.5

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2020-01-18T21:02:12+00:00

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - a resctrl fix for uninitialized objects found by debugobjects

   - a resctrl memory leak fix

   - fix the unintended re-enabling of the of SME and SEV CPU flags if
     memory encryption was disabled at bootup via the MSR space"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
  x86/resctrl: Fix potential memory leak
  x86/resctrl: Fix an imbalance in domain_remove_cpu()

x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained

2020-01-16T19:23:20+00:00

If the SME and SEV features are present via CPUID, but memory encryption
support is not enabled (MSR 0xC001_0010[23]), the feature flags are cleared
using clear_cpu_cap(). However, if get_cpu_cap() is later called, these
feature flags will be reset back to present, which is not desired.

Change from using clear_cpu_cap() to setup_clear_cpu_cap() so that the
clearing of the flags is maintained.

Signed-off-by: Tom Lendacky 
Signed-off-by: Borislav Petkov 
Cc:  # 4.16.x-
Link: https://lkml.kernel.org/r/226de90a703c3c0be5a49565047905ac4e94e8f3.1579125915.git.thomas.lendacky@amd.com

x86/mce/therm_throt: Do not access uninitialized therm_work

2020-01-15T10:31:33+00:00

It is relatively easy to trigger the following boot splat on an Ice Lake
client platform. The call stack is like:

  kernel BUG at kernel/timer/timer.c:1152!

  Call Trace:
  __queue_delayed_work
  queue_delayed_work_on
  therm_throt_process
  intel_thermal_interrupt
  ...

The reason is that a CPU's thermal interrupt is enabled prior to
executing its hotplug onlining callback which will initialize the
throttling workqueues.

Such a race can lead to therm_throt_process() accessing an uninitialized
therm_work, leading to the above BUG at a very early bootup stage.

Therefore, unmask the thermal interrupt vector only after having setup
the workqueues completely.

 [ bp: Heavily massage commit message and correct comment formatting. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Chuansheng Liu 
Signed-off-by: Borislav Petkov 
Acked-by: Tony Luck 
Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com

x86/resctrl: Fix potential memory leak

2020-01-02T17:26:27+00:00

set_cache_qos_cfg() is leaking memory when the given level is not
RDT_RESOURCE_L3 or RDT_RESOURCE_L2. At the moment, this function is
called with only valid levels but move the allocation after the valid
level checks in order to make it more robust and future proof.

 [ bp: Massage commit message. ]

Fixes: 99adde9b370de ("x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG")
Signed-off-by: Shakeel Butt 
Signed-off-by: Borislav Petkov 
Cc: Fenghua Yu 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Reinette Chatre 
Cc: Thomas Gleixner 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20200102165844.133133-1-shakeelb@google.com

x86/resctrl: Fix an imbalance in domain_remove_cpu()

2019-12-30T18:25:59+00:00

A system that supports resource monitoring may have multiple resources
while not all of these resources are capable of monitoring. Monitoring
related state is initialized only for resources that are capable of
monitoring and correspondingly this state should subsequently only be
removed from these resources that are capable of monitoring.

domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
all resources, even those that never initialized d->mbm_over because
they are not capable of monitoring. Hence, it triggers a debugobjects
warning when offlining CPUs because those timer debugobjects are never
initialized:

  ODEBUG: assert_init not available (active state 0) object type:
  timer_list hint: 0x0
  WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
  debug_print_object
  Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS I40 05/23/2018
  RIP: 0010:debug_print_object
  Call Trace:
  debug_object_assert_init
  del_timer
  try_to_grab_pending
  cancel_delayed_work
  resctrl_offline_cpu
  cpuhp_invoke_callback
  cpuhp_thread_fun
  smpboot_thread_fn
  kthread
  ret_from_fork

Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Qian Cai 
Signed-off-by: Borislav Petkov 
Acked-by: Reinette Chatre 
Cc: Fenghua Yu 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: 
Cc: Thomas Gleixner 
Cc: tj@kernel.org
Cc: Tony Luck 
Cc: Vikas Shivappa 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20191211033042.2188-1-cai@lca.pw

x86/mce: Fix possibly incorrect severity calculation on AMD

2019-12-17T08:39:53+00:00

The function mce_severity_amd_smca() requires m->bank to be initialized
for correct operation. Fix the one case, where mce_severity() is called
without doing so.

Fixes: 6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr 
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: 
Cc: Thomas Gleixner 
Cc: x86-ml 
Cc: Yazen Ghannam 
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de

x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]

2019-12-17T08:39:53+00:00

Each logical CPU in Scalable MCA systems controls a unique set of MCA
banks in the system. These banks are not shared between CPUs. The bank
types and ordering will be the same across CPUs on currently available
systems.

However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.

In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.

For example:

Full system:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          UMC
	 2    |         CS          |          CS

System with hardware disabled:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          RAZ
	 2    |         CS          |          CS

For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.

This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.

This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.

For example:

	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         RAZ         |          UMC
	 2    |         CS          |          CS

Extend the smca_banks[] entry check to return if the entry is a
non-reserved type. Otherwise, continue so that CPUs that encounter a
known bank type can update smca_banks[].

Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com

x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()

2019-12-17T08:39:33+00:00

... because interrupts are disabled that early and sending IPIs can
deadlock:

  BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
  no locks held by swapper/1/0.
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [] copy_process+0x8b9/0x1ca0
  softirqs last  enabled at (0): [] copy_process+0x8b9/0x1ca0
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  Preemption disabled at:
  [] start_secondary+0x3b/0x190
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  Call Trace:
   dump_stack
   ___might_sleep.cold.92
   wait_for_completion
   ? generic_exec_single
   rdmsr_safe_on_cpu
   ? wrmsr_on_cpus
   mce_amd_feature_init
   mcheck_cpu_init
   identify_cpu
   identify_secondary_cpu
   smp_store_cpu_info
   start_secondary
   secondary_startup_64

The function smca_configure() is called only on the current CPU anyway,
therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid
the IPI.

 [ bp: Update commit message. ]

Signed-off-by: Konstantin Khlebnikov 
Signed-off-by: Borislav Petkov 
Reviewed-by: Yazen Ghannam 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2019-12-02T03:05:07+00:00

Pull x86 fixes from Ingo Molnar:
 "Various fixes:

   - Fix the PAT performance regression that downgraded write-combining
     device memory regions to uncached.

   - There's been a number of bugs in 32-bit double fault handling -
     hopefully all fixed now.

   - Fix an LDT crash

   - Fix an FPU over-optimization that broke with GCC9 code
     optimizations.

   - Misc cleanups"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/pat: Fix off-by-one bugs in interval tree search
  x86/ioperm: Save an indentation level in tss_update_io_bitmap()
  x86/fpu: Don't cache access to fpu_fpregs_owner_ctx
  x86/entry/32: Remove unused 'restore_all_notrace' local label
  x86/ptrace: Document FSBASE and GSBASE ABI oddities
  x86/ptrace: Remove set_segment_reg() implementations for current
  x86/traps: die() instead of panicking on a double fault
  x86/doublefault/32: Rewrite the x86_32 #DF handler and unify with 64-bit
  x86/doublefault/32: Move #DF stack and TSS to cpu_entry_area
  x86/doublefault/32: Rename doublefault.c to doublefault_32.c
  x86/traps: Disentangle the 32-bit and 64-bit doublefault code
  lkdtm: Add a DOUBLE_FAULT crash type on x86
  selftests/x86/single_step_syscall: Check SYSENTER directly
  x86/mm/32: Sync only to VMALLOC_END in vmalloc_sync_all()

x86/mce/therm_throt: Mask out read-only and reserved MSR bits

2019-11-29T08:17:52+00:00

While writing to MSR IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid
writing 1 to read only and reserved fields because updating some fields
generates exception.

 [ bp: Vertically align for better readability. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Dominik Brodowski 
Tested-by: Dominik Brodowski 
Signed-off-by: Srinivas Pandruvada 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com