<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/cpu/mce, branch v5.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86/mce/therm_throt: Do not access uninitialized therm_work</title>
<updated>2020-01-15T10:31:33+00:00</updated>
<author>
<name>Chuansheng Liu</name>
<email>chuansheng.liu@intel.com</email>
</author>
<published>2020-01-07T00:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=978370956d2046b19313659ce65ed12d5b996626'/>
<id>978370956d2046b19313659ce65ed12d5b996626</id>
<content type='text'>
It is relatively easy to trigger the following boot splat on an Ice Lake
client platform. The call stack is like:

  kernel BUG at kernel/timer/timer.c:1152!

  Call Trace:
  __queue_delayed_work
  queue_delayed_work_on
  therm_throt_process
  intel_thermal_interrupt
  ...

The reason is that a CPU's thermal interrupt is enabled prior to
executing its hotplug onlining callback which will initialize the
throttling workqueues.

Such a race can lead to therm_throt_process() accessing an uninitialized
therm_work, leading to the above BUG at a very early bootup stage.

Therefore, unmask the thermal interrupt vector only after having setup
the workqueues completely.

 [ bp: Heavily massage commit message and correct comment formatting. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Chuansheng Liu &lt;chuansheng.liu@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is relatively easy to trigger the following boot splat on an Ice Lake
client platform. The call stack is like:

  kernel BUG at kernel/timer/timer.c:1152!

  Call Trace:
  __queue_delayed_work
  queue_delayed_work_on
  therm_throt_process
  intel_thermal_interrupt
  ...

The reason is that a CPU's thermal interrupt is enabled prior to
executing its hotplug onlining callback which will initialize the
throttling workqueues.

Such a race can lead to therm_throt_process() accessing an uninitialized
therm_work, leading to the above BUG at a very early bootup stage.

Therefore, unmask the thermal interrupt vector only after having setup
the workqueues completely.

 [ bp: Heavily massage commit message and correct comment formatting. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Chuansheng Liu &lt;chuansheng.liu@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Fix possibly incorrect severity calculation on AMD</title>
<updated>2019-12-17T08:39:53+00:00</updated>
<author>
<name>Jan H. Schönherr</name>
<email>jschoenh@amazon.de</email>
</author>
<published>2019-12-10T00:07:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a3a57ddad061acc90bef39635caf2b2330ce8f21'/>
<id>a3a57ddad061acc90bef39635caf2b2330ce8f21</id>
<content type='text'>
The function mce_severity_amd_smca() requires m-&gt;bank to be initialized
for correct operation. Fix the one case, where mce_severity() is called
without doing so.

Fixes: 6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr &lt;jschoenh@amazon.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Cc: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function mce_severity_amd_smca() requires m-&gt;bank to be initialized
for correct operation. Fix the one case, where mce_severity() is called
without doing so.

Fixes: 6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr &lt;jschoenh@amazon.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Cc: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]</title>
<updated>2019-12-17T08:39:53+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2019-11-21T14:15:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=966af20929ac24360ba3fac5533eb2ab003747da'/>
<id>966af20929ac24360ba3fac5533eb2ab003747da</id>
<content type='text'>
Each logical CPU in Scalable MCA systems controls a unique set of MCA
banks in the system. These banks are not shared between CPUs. The bank
types and ordering will be the same across CPUs on currently available
systems.

However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.

In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.

For example:

Full system:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          UMC
	 2    |         CS          |          CS

System with hardware disabled:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          RAZ
	 2    |         CS          |          CS

For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.

This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.

This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.

For example:

	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         RAZ         |          UMC
	 2    |         CS          |          CS

Extend the smca_banks[] entry check to return if the entry is a
non-reserved type. Otherwise, continue so that CPUs that encounter a
known bank type can update smca_banks[].

Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each logical CPU in Scalable MCA systems controls a unique set of MCA
banks in the system. These banks are not shared between CPUs. The bank
types and ordering will be the same across CPUs on currently available
systems.

However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.

In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.

For example:

Full system:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          UMC
	 2    |         CS          |          CS

System with hardware disabled:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          RAZ
	 2    |         CS          |          CS

For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.

This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.

This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.

For example:

	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         RAZ         |          UMC
	 2    |         CS          |          CS

Extend the smca_banks[] entry check to return if the entry is a
non-reserved type. Otherwise, continue so that CPUs that encounter a
known bank type can update smca_banks[].

Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()</title>
<updated>2019-12-17T08:39:33+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2019-10-31T13:04:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=246ff09f89e54fdf740a8d496176c86743db3ec7'/>
<id>246ff09f89e54fdf740a8d496176c86743db3ec7</id>
<content type='text'>
... because interrupts are disabled that early and sending IPIs can
deadlock:

  BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
  no locks held by swapper/1/0.
  irq event stamp: 0
  hardirqs last  enabled at (0): [&lt;0000000000000000&gt;] 0x0
  hardirqs last disabled at (0): [&lt;ffffffff8106dda9&gt;] copy_process+0x8b9/0x1ca0
  softirqs last  enabled at (0): [&lt;ffffffff8106dda9&gt;] copy_process+0x8b9/0x1ca0
  softirqs last disabled at (0): [&lt;0000000000000000&gt;] 0x0
  Preemption disabled at:
  [&lt;ffffffff8104703b&gt;] start_secondary+0x3b/0x190
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  Call Trace:
   dump_stack
   ___might_sleep.cold.92
   wait_for_completion
   ? generic_exec_single
   rdmsr_safe_on_cpu
   ? wrmsr_on_cpus
   mce_amd_feature_init
   mcheck_cpu_init
   identify_cpu
   identify_secondary_cpu
   smp_store_cpu_info
   start_secondary
   secondary_startup_64

The function smca_configure() is called only on the current CPU anyway,
therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid
the IPI.

 [ bp: Update commit message. ]

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... because interrupts are disabled that early and sending IPIs can
deadlock:

  BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
  no locks held by swapper/1/0.
  irq event stamp: 0
  hardirqs last  enabled at (0): [&lt;0000000000000000&gt;] 0x0
  hardirqs last disabled at (0): [&lt;ffffffff8106dda9&gt;] copy_process+0x8b9/0x1ca0
  softirqs last  enabled at (0): [&lt;ffffffff8106dda9&gt;] copy_process+0x8b9/0x1ca0
  softirqs last disabled at (0): [&lt;0000000000000000&gt;] 0x0
  Preemption disabled at:
  [&lt;ffffffff8104703b&gt;] start_secondary+0x3b/0x190
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  Call Trace:
   dump_stack
   ___might_sleep.cold.92
   wait_for_completion
   ? generic_exec_single
   rdmsr_safe_on_cpu
   ? wrmsr_on_cpus
   mce_amd_feature_init
   mcheck_cpu_init
   identify_cpu
   identify_secondary_cpu
   smp_store_cpu_info
   start_secondary
   secondary_startup_64

The function smca_configure() is called only on the current CPU anyway,
therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid
the IPI.

 [ bp: Update commit message. ]

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce/therm_throt: Mask out read-only and reserved MSR bits</title>
<updated>2019-11-29T08:17:52+00:00</updated>
<author>
<name>Srinivas Pandruvada</name>
<email>srinivas.pandruvada@linux.intel.com</email>
</author>
<published>2019-11-28T15:08:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5a43b87b3c62ad149ba6e9d0d3e5c0e5da02a5ca'/>
<id>5a43b87b3c62ad149ba6e9d0d3e5c0e5da02a5ca</id>
<content type='text'>
While writing to MSR IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid
writing 1 to read only and reserved fields because updating some fields
generates exception.

 [ bp: Vertically align for better readability. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Tested-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Signed-off-by: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While writing to MSR IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid
writing 1 to read only and reserved fields because updating some fields
generates exception.

 [ bp: Vertically align for better readability. ]

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Tested-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Signed-off-by: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce/therm_throt: Optimize notifications of thermal throttle</title>
<updated>2019-11-12T14:56:04+00:00</updated>
<author>
<name>Srinivas Pandruvada</name>
<email>srinivas.pandruvada@linux.intel.com</email>
</author>
<published>2019-11-11T21:43:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f6656208f04e5b3804054008eba4bf7170f4c841'/>
<id>f6656208f04e5b3804054008eba4bf7170f4c841</id>
<content type='text'>
Some modern systems have very tight thermal tolerances. Because of this
they may cross thermal thresholds when running normal workloads (even
during boot). The CPU hardware will react by limiting power/frequency
and using duty cycles to bring the temperature back into normal range.

Thus users may see a "critical" message about the "temperature above
threshold" which is soon followed by "temperature/speed normal". These
messages are rate-limited, but still may repeat every few minutes.

This issue became worse starting with the Ivy Bridge generation of
CPUs because they include a TCC activation offset in the MSR
IA32_TEMPERATURE_TARGET. OEMs use this to provide alerts long before
critical temperatures are reached.

A test run on a laptop with Intel 8th Gen i5 core for two hours with a
workload resulted in 20K+ thermal interrupts per CPU for core level and
another 20K+ interrupts at package level. The kernel logs were full of
throttling messages.

The real value of these threshold interrupts, is to debug problems with
the external cooling solutions and performance issues due to excessive
throttling.

So the solution here is the following:

  - In the current thermal_throttle folder, show:
    - the maximum time for one throttling event and,
    - the total amount of time the system was in throttling state.

  - Do not log short excursions.

  - Log only when, in spite of thermal throttling, the temperature is rising.
  On the high threshold interrupt trigger a delayed workqueue that
  monitors the threshold violation log bit (THERM_STATUS_PROCHOT_LOG). When
  the log bit is set, this workqueue callback calculates three point moving
  average and logs a warning message when the temperature trend is rising.

  When this log bit is clear and temperature is below threshold
  temperature, then the workqueue callback logs a "Normal" message. Once a
  high threshold event is logged, the logging is rate-limited.

With this patch on the same test laptop, no warnings are printed in the logs
as the max time the processor could bring the temperature under control is
only 280 ms.

This implementation is done with the inputs from Alan Cox and Tony Luck.

 [ bp: Touchups. ]

Signed-off-by: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: bberg@redhat.com
Cc: ckellner@redhat.com
Cc: hdegoede@redhat.com
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191111214312.81365-1-srinivas.pandruvada@linux.intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some modern systems have very tight thermal tolerances. Because of this
they may cross thermal thresholds when running normal workloads (even
during boot). The CPU hardware will react by limiting power/frequency
and using duty cycles to bring the temperature back into normal range.

Thus users may see a "critical" message about the "temperature above
threshold" which is soon followed by "temperature/speed normal". These
messages are rate-limited, but still may repeat every few minutes.

This issue became worse starting with the Ivy Bridge generation of
CPUs because they include a TCC activation offset in the MSR
IA32_TEMPERATURE_TARGET. OEMs use this to provide alerts long before
critical temperatures are reached.

A test run on a laptop with Intel 8th Gen i5 core for two hours with a
workload resulted in 20K+ thermal interrupts per CPU for core level and
another 20K+ interrupts at package level. The kernel logs were full of
throttling messages.

The real value of these threshold interrupts, is to debug problems with
the external cooling solutions and performance issues due to excessive
throttling.

So the solution here is the following:

  - In the current thermal_throttle folder, show:
    - the maximum time for one throttling event and,
    - the total amount of time the system was in throttling state.

  - Do not log short excursions.

  - Log only when, in spite of thermal throttling, the temperature is rising.
  On the high threshold interrupt trigger a delayed workqueue that
  monitors the threshold violation log bit (THERM_STATUS_PROCHOT_LOG). When
  the log bit is set, this workqueue callback calculates three point moving
  average and logs a warning message when the temperature trend is rising.

  When this log bit is clear and temperature is below threshold
  temperature, then the workqueue callback logs a "Normal" message. Once a
  high threshold event is logged, the logging is rate-limited.

With this patch on the same test laptop, no warnings are printed in the logs
as the max time the processor could bring the temperature under control is
only 280 ms.

This implementation is done with the inputs from Alan Cox and Tony Luck.

 [ bp: Touchups. ]

Signed-off-by: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: bberg@redhat.com
Cc: ckellner@redhat.com
Cc: hdegoede@redhat.com
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191111214312.81365-1-srinivas.pandruvada@linux.intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Add Xeon Icelake to list of CPUs that support PPIN</title>
<updated>2019-11-01T16:29:36+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2019-10-28T16:37:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dc6b025de95bcd22ff37c4fabb022ec8a027abf1'/>
<id>dc6b025de95bcd22ff37c4fabb022ec8a027abf1</id>
<content type='text'>
New CPU model, same MSRs to control and read the inventory number.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191028163719.19708-1-tony.luck@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
New CPU model, same MSRs to control and read the inventory number.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191028163719.19708-1-tony.luck@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Lower throttling MCE messages' priority to warning</title>
<updated>2019-10-17T07:07:09+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>bberg@redhat.com</email>
</author>
<published>2019-10-09T15:54:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7'/>
<id>9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7</id>
<content type='text'>
On modern CPUs it is quite normal that the temperature limits are
reached and the CPU is throttled. In fact, often the thermal design is
not sufficient to cool the CPU at full load and limits can quickly be
reached when a burst in load happens. This will even happen with
technologies like RAPL limitting the long term power consumption of
the package.

Also, these limits are "softer", as Srinivas explains:

"CPU temperature doesn't have to hit max(TjMax) to get these warnings.
OEMs ha[ve] an ability to program a threshold where a thermal interrupt
can be generated. In some systems the offset is 20C+ (Read only value).

In recent systems, there is another offset on top of it which can be
programmed by OS, once some agent can adjust power limits dynamically.
By default this is set to low by the firmware, which I guess the
prime motivation of Benjamin to submit the patch."

So these messages do not usually indicate a hardware issue (e.g.
insufficient cooling). Log them as warnings to avoid confusion about
their severity.

 [ bp: Massage commit mesage. ]

Signed-off-by: Benjamin Berg &lt;bberg@redhat.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Tested-by: Christian Kellner &lt;ckellner@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191009155424.249277-1-bberg@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On modern CPUs it is quite normal that the temperature limits are
reached and the CPU is throttled. In fact, often the thermal design is
not sufficient to cool the CPU at full load and limits can quickly be
reached when a burst in load happens. This will even happen with
technologies like RAPL limitting the long term power consumption of
the package.

Also, these limits are "softer", as Srinivas explains:

"CPU temperature doesn't have to hit max(TjMax) to get these warnings.
OEMs ha[ve] an ability to program a threshold where a thermal interrupt
can be generated. In some systems the offset is 20C+ (Read only value).

In recent systems, there is another offset on top of it which can be
programmed by OS, once some agent can adjust power limits dynamically.
By default this is set to low by the firmware, which I guess the
prime motivation of Benjamin to submit the patch."

So these messages do not usually indicate a hardware issue (e.g.
insufficient cooling). Log them as warnings to avoid confusion about
their severity.

 [ bp: Massage commit mesage. ]

Signed-off-by: Benjamin Berg &lt;bberg@redhat.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Tested-by: Christian Kellner &lt;ckellner@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Srinivas Pandruvada &lt;srinivas.pandruvada@linux.intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191009155424.249277-1-bberg@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Add Zhaoxin LMCE support</title>
<updated>2019-10-01T10:33:33+00:00</updated>
<author>
<name>Tony W Wang-oc</name>
<email>TonyWWang-oc@zhaoxin.com</email>
</author>
<published>2019-09-18T06:19:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=70f0c230031dfef3c9b3e37b2a8c18d3f7186fb2'/>
<id>70f0c230031dfef3c9b3e37b2a8c18d3f7186fb2</id>
<content type='text'>
Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for
that.

 [ bp: Export functions and massage. ]

Signed-off-by: Tony W Wang-oc &lt;TonyWWang-oc@zhaoxin.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zhaoxin.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for
that.

 [ bp: Export functions and massage. ]

Signed-off-by: Tony W Wang-oc &lt;TonyWWang-oc@zhaoxin.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zhaoxin.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Add Zhaoxin CMCI support</title>
<updated>2019-10-01T10:33:09+00:00</updated>
<author>
<name>Tony W Wang-oc</name>
<email>TonyWWang-oc@zhaoxin.com</email>
</author>
<published>2019-09-18T06:19:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5a3d56a034be9e8e87a6cb9ed3f2928184db1417'/>
<id>5a3d56a034be9e8e87a6cb9ed3f2928184db1417</id>
<content type='text'>
All newer Zhaoxin CPUs support CMCI and are compatible with Intel's
Machine-Check Architecture. Add that support for Zhaoxin CPUs.

 [ bp: Massage comments and export intel_init_cmci(). ]

Signed-off-by: Tony W Wang-oc &lt;TonyWWang-oc@zhaoxin.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zhaoxin.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All newer Zhaoxin CPUs support CMCI and are compatible with Intel's
Machine-Check Architecture. Add that support for Zhaoxin CPUs.

 [ bp: Massage comments and export intel_init_cmci(). ]

Signed-off-by: Tony W Wang-oc &lt;TonyWWang-oc@zhaoxin.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zhaoxin.com
</pre>
</div>
</content>
</entry>
</feed>
