<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/cpu/mcheck, branch v3.14</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2014-01-20T20:10:27+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-01-20T20:10:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fab5669d556200c4dd119af705bff14085845d1e'/>
<id>fab5669d556200c4dd119af705bff14085845d1e</id>
<content type='text'>
Pull x86 RAS changes from Ingo Molnar:

 - SCI reporting for other error types not only correctable ones

 - GHES cleanups

 - Add the functionality to override error reporting agents as some
   machines are sporting a new extended error logging capability which,
   if done properly in the BIOS, makes a corresponding EDAC module
   redundant

 - PCIe AER tracepoint severity levels fix

 - Error path correction for the mce device init

 - MCE timer fix

 - Add more flexibility to the error injection (EINJ) debugfs interface

* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, mce: Fix mce_start_timer semantics
  ACPI, APEI, GHES: Cleanup ghes memory error handling
  ACPI, APEI: Cleanup alignment-aware accesses
  ACPI, APEI, GHES: Do not report only correctable errors with SCI
  ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface
  ACPI, eMCA: Combine eMCA/EDAC event reporting priority
  EDAC, sb_edac: Modify H/W event reporting policy
  EDAC: Add an edac_report parameter to EDAC
  PCI, AER: Fix severity usage in aer trace event
  x86, mce: Call put_device on device_register failure
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 RAS changes from Ingo Molnar:

 - SCI reporting for other error types not only correctable ones

 - GHES cleanups

 - Add the functionality to override error reporting agents as some
   machines are sporting a new extended error logging capability which,
   if done properly in the BIOS, makes a corresponding EDAC module
   redundant

 - PCIe AER tracepoint severity levels fix

 - Error path correction for the mce device init

 - MCE timer fix

 - Add more flexibility to the error injection (EINJ) debugfs interface

* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, mce: Fix mce_start_timer semantics
  ACPI, APEI, GHES: Cleanup ghes memory error handling
  ACPI, APEI: Cleanup alignment-aware accesses
  ACPI, APEI, GHES: Do not report only correctable errors with SCI
  ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface
  ACPI, eMCA: Combine eMCA/EDAC event reporting priority
  EDAC, sb_edac: Modify H/W event reporting policy
  EDAC: Add an edac_report parameter to EDAC
  PCI, AER: Fix severity usage in aer trace event
  x86, mce: Call put_device on device_register failure
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, mce: Fix mce_start_timer semantics</title>
<updated>2014-01-12T14:22:25+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2013-12-23T17:05:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4f75d8412792777a314ac5c1393a9ed43d695fd1'/>
<id>4f75d8412792777a314ac5c1393a9ed43d695fd1</id>
<content type='text'>
So mce_start_timer() has a 'cpu' argument which is supposed to mean to
start a timer on that cpu. However, the code currently starts a timer on
the *current* cpu the function runs on and causes the sanity-check in
mce_timer_fn to fire:

	WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn

because it is running on the wrong cpu.

This was triggered by Prarit Bhargava &lt;prarit@redhat.com&gt; by offlining
all the cpus in succession.

Then, we were fiddling with the CMCI storm settings when starting the
timer whereas there's no need for that - if there's storm happening
on this newly restarted cpu, we're going to be in normal CMCI mode
initially and then when the CMCI interrupt starts firing, we're going to
go to the polling mode with the timer real soon.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So mce_start_timer() has a 'cpu' argument which is supposed to mean to
start a timer on that cpu. However, the code currently starts a timer on
the *current* cpu the function runs on and causes the sanity-check in
mce_timer_fn to fire:

	WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn

because it is running on the wrong cpu.

This was triggered by Prarit Bhargava &lt;prarit@redhat.com&gt; by offlining
all the cpus in succession.

Then, we were fiddling with the CMCI storm settings when starting the
timer whereas there's no need for that - if there's storm happening
on this newly restarted cpu, we're going to be in normal CMCI mode
initially and then when the CMCI interrupt starts firing, we're going to
go to the polling mode with the timer real soon.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Delete non-required instances of include &lt;linux/init.h&gt;</title>
<updated>2014-01-07T05:25:18+00:00</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2014-01-07T00:20:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=663b55b9b39fa9c848cca273ca4e12bf29b32c71'/>
<id>663b55b9b39fa9c848cca273ca4e12bf29b32c71</id>
<content type='text'>
None of these files are actually using any __init type directives
and hence don't need to include &lt;linux/init.h&gt;.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
None of these files are actually using any __init type directives
and hence don't need to include &lt;linux/init.h&gt;.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ACPI, APEI, GHES: Do not report only correctable errors with SCI</title>
<updated>2013-12-21T12:31:06+00:00</updated>
<author>
<name>Chen, Gong</name>
<email>gong.chen@linux.intel.com</email>
</author>
<published>2013-11-25T07:15:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=addccbb264e5e0e5762f4893f6df24afad327c8c'/>
<id>addccbb264e5e0e5762f4893f6df24afad327c8c</id>
<content type='text'>
Currently SCI is employed to handle corrected errors - memory corrected
errors, more specifically but in fact SCI still can be used to handle
any errors, e.g. uncorrected or even fatal ones if enabled by the BIOS.
Enable logging for those kinds of errors too.

Signed-off-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: http://lkml.kernel.org/r/1385363701-12387-1-git-send-email-gong.chen@linux.intel.com
[ Boris: massage commit message, rename function arg. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently SCI is employed to handle corrected errors - memory corrected
errors, more specifically but in fact SCI still can be used to handle
any errors, e.g. uncorrected or even fatal ones if enabled by the BIOS.
Enable logging for those kinds of errors too.

Signed-off-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: http://lkml.kernel.org/r/1385363701-12387-1-git-send-email-gong.chen@linux.intel.com
[ Boris: massage commit message, rename function arg. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, mce: Call put_device on device_register failure</title>
<updated>2013-11-30T11:14:51+00:00</updated>
<author>
<name>Levente Kurusa</name>
<email>levex@linux.com</email>
</author>
<published>2013-11-29T20:28:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=853d9b18f1e861d37e9b271742329f8c1176eabe'/>
<id>853d9b18f1e861d37e9b271742329f8c1176eabe</id>
<content type='text'>
This patch adds a call to put_device() when the device_register() call
has failed. This is required so that the last reference to the device is
given up.

Signed-off-by: Levente Kurusa &lt;levex@linux.com&gt;
Link: http://lkml.kernel.org/r/5298F900.9000208@linux.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds a call to put_device() when the device_register() call
has failed. This is required so that the last reference to the device is
given up.

Signed-off-by: Levente Kurusa &lt;levex@linux.com&gt;
Link: http://lkml.kernel.org/r/5298F900.9000208@linux.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ACPI, APEI, CPER: Add UEFI 2.4 support for memory error</title>
<updated>2013-10-23T17:10:20+00:00</updated>
<author>
<name>Chen, Gong</name>
<email>gong.chen@linux.intel.com</email>
</author>
<published>2013-10-18T21:30:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=147de14772ed897727dba7353916b02d1e0f17f4'/>
<id>147de14772ed897727dba7353916b02d1e0f17f4</id>
<content type='text'>
In latest UEFI spec(by now it is 2.4) memory error definition
for CPER (UEFI 2.4 Appendix N Common Platform Error Record)
adds some new fields. These fields help people to locate
memory error to an actual DIMM location.

Original-author: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Reviewed-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Mauro Carvalho Chehab &lt;m.chehab@samsung.com&gt;
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In latest UEFI spec(by now it is 2.4) memory error definition
for CPER (UEFI 2.4 Appendix N Common Platform Error Record)
adds some new fields. These fields help people to locate
memory error to an actual DIMM location.

Original-author: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Chen, Gong &lt;gong.chen@linux.intel.com&gt;
Reviewed-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Mauro Carvalho Chehab &lt;m.chehab@samsung.com&gt;
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'please-pull-mce-f-bit' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras</title>
<updated>2013-08-12T17:51:43+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2013-08-12T17:51:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6356bb0ad6525dae93c06478a098ed3848e9ab01'/>
<id>6356bb0ad6525dae93c06478a098ed3848e9ab01</id>
<content type='text'>
Pull MCE-uncorrected-error fix from Tony Luck:

 "Bit 12 may or may not be set in MCi_STATUS.MCACOD when
  an uncorrected error is reported. Ignore it when checking
  error signatures."

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull MCE-uncorrected-error fix from Tony Luck:

 "Bit 12 may or may not be set in MCi_STATUS.MCACOD when
  an uncorrected error is reported. Ignore it when checking
  error signatures."

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'x86/mce' into x86/ras</title>
<updated>2013-08-12T15:54:05+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2013-08-12T15:54:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0237d7f355eef4d9ab8557e1597e8c9debd6c8c2'/>
<id>0237d7f355eef4d9ab8557e1597e8c9debd6c8c2</id>
<content type='text'>
Pursue a single RAS/MCE topic branch on x86.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pursue a single RAS/MCE topic branch on x86.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Fix mce regression from recent cleanup</title>
<updated>2013-07-29T18:23:27+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2013-07-24T17:09:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1a7f0e3c4fff449f6dd08787beb98a8e57d8cdc7'/>
<id>1a7f0e3c4fff449f6dd08787beb98a8e57d8cdc7</id>
<content type='text'>
In commit 33d7885b594e169256daef652e8d3527b2298e75
   x86/mce: Update MCE severity condition check

We simplified the rules to recognise each classification of recoverable
machine check combining the instruction and data fetch rules into a
single entry based on clarifications in the June 2013 SDM that all
recoverable events would be reported on the unaffected processor with
MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1.  Unfortunately the simplified
rule has a couple of bugs.  Fix them here.

Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit 33d7885b594e169256daef652e8d3527b2298e75
   x86/mce: Update MCE severity condition check

We simplified the rules to recognise each classification of recoverable
machine check combining the instruction and data fetch rules into a
single entry based on clarifications in the June 2013 SDM that all
recoverable events would be reported on the unaffected processor with
MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1.  Unfortunately the simplified
rule has a couple of bugs.  Fix them here.

Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: delete __cpuinit usage from all x86 files</title>
<updated>2013-07-14T23:36:56+00:00</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2013-06-18T22:23:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=148f9bb87745ed45f7a11b2cbd3bc0f017d5d257'/>
<id>148f9bb87745ed45f7a11b2cbd3bc0f017d5d257</id>
<content type='text'>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit  -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: x86@kernel.org
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit  -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: x86@kernel.org
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
