linux-stable.git/arch/x86/kernel, branch linux-3.13.y

x86: Adjust irq remapping quirk for older revisions of 5500/5520 chipsets

2014-04-22T23:49:20+00:00

commit 6f8a1b335fde143b7407036e2368d3cd6eb55674 upstream.

Commit 03bbcb2e7e2 (iommu/vt-d: add quirk for broken interrupt
remapping on 55XX chipsets) properly disables irq remapping on the
5500/5520 chipsets that don't correctly perform that feature.

However, when I wrote it, I followed the errata sheet linked in that
commit too closely, and explicitly tied the activation of the quirk to
revision 0x13 of the chip, under the assumption that earlier revisions
were not in the field.  Recently a system was reported to be suffering
from this remap bug and the quirk hadn't triggered, because the
revision id register read at a lower value than 0x13, so the quirk
test failed improperly.  Given this, it seems only prudent to adjust
this quirk so that any revision less than 0x13 has the quirk asserted.

[ tglx: Removed the 0x12 comparison of pci id 3405 as this is covered
    	by the <= 0x13 check already ]
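
The difference between the old and the adjusted test can be sketched in
plain C. This is only an illustration: the function names and
QUIRK_MAX_REVISION are made up for this example and are not the
identifiers used in the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the commit text; the macro name is hypothetical. */
#define QUIRK_MAX_REVISION 0x13

/* Old behaviour: the quirk fires only on exactly revision 0x13. */
static bool quirk_applies_old(uint8_t revision)
{
	return revision == QUIRK_MAX_REVISION;
}

/* Adjusted behaviour: the quirk fires on 0x13 and every earlier revision. */
static bool quirk_applies_new(uint8_t revision)
{
	return revision <= QUIRK_MAX_REVISION;
}
```

For a chip that reports revision 0x12, the old test leaves the quirk off
while the adjusted test turns it on.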

Signed-off-by: Neil Horman 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1394649873-14913-1-git-send-email-nhorman@tuxdriver.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman

x86, hyperv: Bypass the timer_irq_works() check

2014-04-22T23:49:20+00:00

commit ca3ba2a2f4a49a308e7d78c784d51b2332064f15 upstream.

This patch bypasses the timer_irq_works() check for hyperv guests since:

- It is guaranteed to work.
- timer_irq_works() may sometimes fail because the lpj calibration is
  inaccurate in a hyperv guest or on a buggy host.
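
The bypass can be pictured with a small user-space sketch. The flag
skip_timer_check, the helper names and the always-failing probe are
hypothetical stand-ins, not the kernel's identifiers; the point is only
that a platform-set flag short-circuits the probe.

```c
#include <stdbool.h>

/* Hypothetical flag: set by platform setup when the probe should be skipped. */
static bool skip_timer_check;

/* Stand-in for a probe that fails because its delay calibration is off. */
static bool failing_probe(void)
{
	return false;
}

/* Platform setup: on a hyperv guest, trust the timer instead of probing it. */
static void platform_init_sim(bool on_hyperv)
{
	skip_timer_check = on_hyperv;
}

/* The timer is accepted if the check is bypassed or the probe succeeds. */
static bool timer_irq_ok_sim(bool (*probe)(void))
{
	return skip_timer_check || probe();
}
```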

In the future, we should get the tsc frequency from hypervisor and use preset
lpj instead.

[ hpa: I would prefer to not defer things to "the future" in the future... ]

Cc: K. Y. Srinivasan 
Cc: Haiyang Zhang 
Acked-by: K. Y. Srinivasan 
Signed-off-by: Jason Wang 
Link: http://lkml.kernel.org/r/1393558229-14755-1-git-send-email-jasowang@redhat.com
Signed-off-by: H. Peter Anvin 
Signed-off-by: Greg Kroah-Hartman

x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU

2014-03-24T04:44:19+00:00

commit 731bd6a93a6e9172094a2322bd0ee964bb1f4d63 upstream.

For non-eager fpu mode, thread's fpu state is allocated during the first
fpu usage (in the context of device not available exception). This
(math_state_restore()) can be a blocking call and hence we enable
interrupts (which were originally disabled when the exception happened),
allocate memory and disable interrupts etc.

But eager-fpu mode calls the same math_state_restore() from
kernel_fpu_end(). The assumption is that tsk_used_math() is always
set in eager-fpu mode, which avoids the code path that enables
interrupts, allocates fpu state using a blocking call and disables
interrupts again.

But the below issue was noticed by Maarten Baert, Nate Eldredge and
a few others:

If a user process dumps core on an ecrypt fs while aesni-intel is loaded,
we get a BUG() in __find_get_block() complaining that it was called with
interrupts disabled; then all further accesses to our ecrypt fs hang
and we have to reboot.

The aesni-intel code (encrypting the core file that we are writing) needs
the FPU and quite properly wraps its code in kernel_fpu_{begin,end}(),
the latter of which calls math_state_restore(). So after kernel_fpu_end(),
interrupts may be disabled, which nobody seems to expect, and they stay
that way until we eventually get to __find_get_block() which barfs.

For eager fpu, most of the time, tsk_used_math() is true. In a few
instances (thread exit, signal return handling etc.), tsk_used_math()
might be false.

In kernel_fpu_end(), for eager-fpu, call math_state_restore()
only if tsk_used_math() is set. Otherwise, don't bother. Kernel code
path which cleared tsk_used_math() knows what needs to be done
with the fpu state.
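
The resulting control flow for the eager case can be sketched as follows.
This is a user-space model under stated assumptions: struct task_sim, the
*_sim functions and the restore_calls counter are invented for
illustration and only mirror the decision described above.

```c
#include <stdbool.h>

/* Simplified task: only the "has FPU state" bit matters here. */
struct task_sim {
	bool used_math;
};

/* Counts how often the restore path was entered. */
static int restore_calls;

static void math_state_restore_sim(struct task_sim *tsk)
{
	/* Models the restore path; this sketch only records the call. */
	(void)tsk;
	restore_calls++;
}

/* Eager-fpu variant of the hook that ends kernel FPU use. */
static void kernel_fpu_end_eager_sim(struct task_sim *tsk)
{
	if (tsk->used_math)
		math_state_restore_sim(tsk);
	/* Otherwise: whoever cleared used_math deals with the FPU state. */
}
```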

Reported-by: Maarten Baert 
Reported-by: Nate Eldredge 
Suggested-by: Linus Torvalds 
Signed-off-by: Suresh Siddha 
Link: http://lkml.kernel.org/r/1391410583.3801.6.camel@europa
Cc: George Spelvin 
Signed-off-by: H. Peter Anvin 
Signed-off-by: Greg Kroah-Hartman

x86/amd/numa: Fix northbridge quirk to assign correct NUMA node

2014-03-24T04:44:08+00:00

commit 847d7970defb45540735b3fb4e88471c27cacd85 upstream.

For systems with multiple servers and routed fabric, all
northbridges get assigned to the first server. Fix this by also
using the node reported from the PCI bus. For single-fabric
systems, the northbridges are on PCI bus 0 by definition, which
is on NUMA node 0 by definition, so this is invariant on most
systems.
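
A minimal sketch of why the change is invariant on single-fabric systems.
The commit message does not spell out how the two values are combined, so
the bitwise OR below, the 3-bit northbridge id and both function names are
assumptions made for illustration only.

```c
/*
 * Assumption for this sketch: the node reported for the PCI bus is a
 * base value whose low three bits are zero, and the northbridge id
 * within its fabric fits in those three bits.
 */

/* Old behaviour: only the northbridge id is used. */
static int nb_node_old(int nb_id)
{
	return nb_id & 7;
}

/* New behaviour: the node reported for the PCI bus is folded in as well. */
static int nb_node_new(int bus_node, int nb_id)
{
	return bus_node | (nb_id & 7);
}
```

With bus_node equal to 0, as on a single-fabric system, both functions
return the same node.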

Tested on fam10h and fam15h single and multi-fabric systems and
candidate for stable.

Signed-off-by: Daniel J Blueman 
Acked-by: Steffen Persvold 
Acked-by: Borislav Petkov 
Link: http://lkml.kernel.org/r/1394710981-3596-1-git-send-email-daniel@numascale.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman

x86: fix compile error due to X86_TRAP_NMI use in asm files

2014-03-24T04:44:08+00:00

commit b01d4e68933ec23e43b1046fa35d593cefcf37d1 upstream.

It's an enum, not a #define, you can't use it in asm files.

Introduced in commit 5fa10196bdb5 ("x86: Ignore NMIs that come in during
early boot"), and sadly I didn't compile-test things like I should have
before pushing out.

My weak excuse is that the x86 tree generally doesn't introduce stupid
things like this (and the ARM pull afterwards doesn't cause me to do a
compile-test either, since I don't cross-compile).
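
The enum-versus-#define point can be shown with the C preprocessor alone,
which is the only part of the C toolchain that assembly (.S) files pass
through. The names below are invented for this sketch; the value 2 is the
NMI vector number.

```c
/* Known only to the C compiler proper; the preprocessor never sees it. */
enum { SIM_TRAP_NMI_ENUM = 2 };

/* Known to the preprocessor, so it could also be used from a .S file. */
#define SIM_TRAP_NMI_MACRO 2

#ifdef SIM_TRAP_NMI_ENUM
#define ENUM_VISIBLE_TO_CPP 1
#else
#define ENUM_VISIBLE_TO_CPP 0	/* this branch is taken */
#endif

#ifdef SIM_TRAP_NMI_MACRO
#define MACRO_VISIBLE_TO_CPP 1	/* this branch is taken */
#else
#define MACRO_VISIBLE_TO_CPP 0
#endif
```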

Cc: Don Zickus 
Cc: H. Peter Anvin 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

x86: Ignore NMIs that come in during early boot

2014-03-24T04:44:08+00:00

commit 5fa10196bdb5f190f595ebd048490ee52dddea0f upstream.

Don Zickus reports:

A customer generated an external NMI using their iLO to test kdump
worked.  Unfortunately, the machine hung.  Disabling the nmi_watchdog
made things work.

I speculated the external NMI fired, caused the machine to panic (as
expected) and the perf NMI from the watchdog came in and was latched.
My guess was this somehow caused the hang.

   ----

It appears that the latched NMI stays latched until the early page
table generation on 64 bits, which causes exceptions to happen which
end in IRET, which re-enable NMI.  Therefore, ignore NMIs that come in
during early execution, until we have proper exception handling.

Reported-and-tested-by: Don Zickus 
Link: http://lkml.kernel.org/r/1394221143-29713-1-git-send-email-dzickus@redhat.com
Signed-off-by: H. Peter Anvin 
Signed-off-by: Greg Kroah-Hartman

perf/x86: Fix event scheduling

2014-03-07T06:06:22+00:00

commit 26e61e8939b1fe8729572dabe9a9e97d930dd4f6 upstream.

Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures,
with perf WARN_ON()s triggering. He also provided traces of the failures.

This is I think the relevant bit:

	>    pec_1076_warn-2804  [000] d...   147.926153: x86_pmu_disable: x86_pmu_disable
	>    pec_1076_warn-2804  [000] d...   147.926153: x86_pmu_state: Events: {
	>    pec_1076_warn-2804  [000] d...   147.926156: x86_pmu_state:   0: state: .R config: ffffffffffffffff (          (null))
	>    pec_1076_warn-2804  [000] d...   147.926158: x86_pmu_state:   33: state: AR config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926159: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1
	>    pec_1076_warn-2804  [000] d...   147.926161: x86_pmu_state: Assignment: {
	>    pec_1076_warn-2804  [000] d...   147.926162: x86_pmu_state:   0->33 tag: 1 config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926163: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926166: collect_events: Adding event: 1 (ffff880119ec8800)

So we add the insn:p event (fd[23]).

At this point we should have:

  n_events = 2, n_added = 1, n_txn = 1

	>    pec_1076_warn-2804  [000] d...   147.926170: collect_events: Adding event: 0 (ffff8800c9e01800)
	>    pec_1076_warn-2804  [000] d...   147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00)

We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]).
These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so
that's not visible.

	group_sched_in()
	  pmu->start_txn() /* nop - BP pmu */
	  event_sched_in()
	     event->pmu->add()

So here we should end up with:

  0: n_events = 3, n_added = 2, n_txn = 2
  4: n_events = 4, n_added = 3, n_txn = 3

But seeing the below state on x86_pmu_enable(), they must have failed,
because the 0 and 4 events aren't there anymore.

Looking at group_sched_in(), since the BP is the leader, its
event_sched_in() must have succeeded, for otherwise we would not have
seen the sibling adds.

But since neither 0 nor 4 are in the below state; their event_sched_in()
must have failed; but I don't see why, the complete state: 0,0,1:p,4
fits perfectly fine on a core2.

However, since we try and schedule 4 it means the 0 event must have
succeeded!  Therefore the 4 event must have failed, its failure will
have put group_sched_in() into the fail path, which will call:

	event_sched_out()
	  event->pmu->del()

on 0 and the BP event.

Now x86_pmu_del() will reduce n_events; but it will not reduce n_added;
giving what we see below:

  n_events = 2, n_added = 2, n_txn = 2

	>    pec_1076_warn-2804  [000] d...   147.926177: x86_pmu_enable: x86_pmu_enable
	>    pec_1076_warn-2804  [000] d...   147.926177: x86_pmu_state: Events: {
	>    pec_1076_warn-2804  [000] d...   147.926179: x86_pmu_state:   0: state: .R config: ffffffffffffffff (          (null))
	>    pec_1076_warn-2804  [000] d...   147.926181: x86_pmu_state:   33: state: AR config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926182: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2
	>    pec_1076_warn-2804  [000] d...   147.926184: x86_pmu_state: Assignment: {
	>    pec_1076_warn-2804  [000] d...   147.926186: x86_pmu_state:   0->33 tag: 1 config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926188: x86_pmu_state:   1->0 tag: 1 config: 1 (ffff880119ec8800)
	>    pec_1076_warn-2804  [000] d...   147.926188: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0

So the problem is that x86_pmu_del(), when called from a
group_sched_in() that fails (for whatever reason), and without x86_pmu
TXN support (because the leader is !x86_pmu), will corrupt the n_added
state.
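
The bookkeeping problem can be replayed with a toy model that tracks only
n_events and n_added. The struct and function names are invented for this
sketch, and sim_del_pending() shows just one way to keep the counters
consistent; it is not claimed to be what the patch does.

```c
/* Toy model of the two counters discussed above. */
struct pmu_counts {
	int n_events;	/* events currently collected */
	int n_added;	/* events added since the last enable */
};

/* Adding an event bumps both counters. */
static void sim_add(struct pmu_counts *c)
{
	c->n_events++;
	c->n_added++;
}

/* Behaviour described above: n_events drops, n_added is left alone. */
static void sim_del_buggy(struct pmu_counts *c)
{
	c->n_events--;
}

/* Hypothetical repair: an event that was only pending is un-added too. */
static void sim_del_pending(struct pmu_counts *c)
{
	c->n_events--;
	c->n_added--;
}
```

Starting from n_events = 1, n_added = 0, adding 1:p and 0 and then
deleting 0 with sim_del_buggy() reproduces the n_events = 2, n_added = 2
state from the trace.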

Reported-and-Tested-by: Vince Weaver 
Signed-off-by: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Steven Rostedt 
Cc: Stephane Eranian 
Cc: Dave Jones 
Link: http://lkml.kernel.org/r/20140221150312.GF3104@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman

x86: dma-mapping: fix GFP_ATOMIC macro usage

2014-03-07T06:06:21+00:00

commit c091c71ad2218fc50a07b3d1dab85783f3b77efd upstream.

GFP_ATOMIC is not a single gfp flag, but a macro which expands to other
flags; what is meaningful is the LACK of the __GFP_WAIT flag. To check if
the caller wants to perform an atomic allocation, the code must test for
the absence of the __GFP_WAIT flag. This patch fixes the issue introduced
in v3.5-rc1.
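
The two tests can be compared in a small sketch. The SIM_* bit values
below are illustrative stand-ins chosen for this example and should not
be read as the kernel's actual definitions; only the shape of the two
checks matters.

```c
#include <stdbool.h>

/* Illustrative flag bits (assumed values, not the kernel's). */
#define SIM_GFP_WAIT	0x10u
#define SIM_GFP_HIGH	0x20u
#define SIM_GFP_IO	0x40u
#define SIM_GFP_FS	0x80u

/* Composite macros in the same spirit as GFP_ATOMIC and GFP_KERNEL. */
#define SIM_GFP_ATOMIC	(SIM_GFP_HIGH)
#define SIM_GFP_KERNEL	(SIM_GFP_WAIT | SIM_GFP_IO | SIM_GFP_FS)

/* Wrong: treats the composite macro as if it were a single flag. */
static bool is_atomic_wrong(unsigned int gfp)
{
	return (gfp & SIM_GFP_ATOMIC) != 0;
}

/* Right: an allocation is atomic when it may not wait. */
static bool is_atomic_right(unsigned int gfp)
{
	return !(gfp & SIM_GFP_WAIT);
}
```

A request such as SIM_GFP_KERNEL | SIM_GFP_HIGH may sleep, yet the first
test classifies it as atomic; a request with no flags at all may not
sleep, yet the first test classifies it as non-atomic.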

Signed-off-by: Marek Szyprowski 
Signed-off-by: Greg Kroah-Hartman

ftrace/x86: Use breakpoints for converting function graph caller

2014-02-22T21:34:52+00:00

commit 87fbb2ac6073a7039303517546a76074feb14c84 upstream.

When the conversion was made to remove stop machine and use the breakpoint
logic instead, the modification of the function graph caller was still
done directly, as though it were being done under stop machine.

As it is no longer converted via stop machine, there is a possibility
that the code could be laid across cache lines, and if another CPU is
accessing that function graph call while it is being updated, it could
cause a General Protection Fault.

Convert the update of the function graph caller to use the breakpoint
method as well.
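
As a rough picture of a breakpoint-style update, the sketch below patches
a 5-byte instruction in an ordinary byte buffer in three steps. It is an
assumption-laden model: the function names are invented, the cross-CPU
synchronisation a real implementation needs between the steps is left
out, and it does not reproduce the kernel's code.

```c
#include <stdint.h>
#include <string.h>

#define SIM_INSN_LEN	5	/* length of the jmp being rewritten */
#define SIM_INT3	0xcc	/* one-byte breakpoint opcode */

/* Step 1: cover the first byte so no CPU executes a half-written insn. */
static void patch_add_breakpoint(uint8_t *site)
{
	site[0] = SIM_INT3;
}

/* Step 2: rewrite everything after the first byte. */
static void patch_update_tail(uint8_t *site, const uint8_t *new_insn)
{
	memcpy(site + 1, new_insn + 1, SIM_INSN_LEN - 1);
}

/* Step 3: replace the breakpoint with the new first byte. */
static void patch_finish(uint8_t *site, const uint8_t *new_insn)
{
	site[0] = new_insn[0];
}
```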

Cc: H. Peter Anvin 
Fixes: 08d636b6d4fb "ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine"
Signed-off-by: Steven Rostedt 
Signed-off-by: Greg Kroah-Hartman

x86, smap: Don't enable SMAP if CONFIG_X86_SMAP is disabled

2014-02-22T21:34:51+00:00

commit 03bbd596ac04fef47ce93a730b8f086d797c3021 upstream.

If SMAP support is not compiled into the kernel, don't enable SMAP in
CR4 -- in fact, we should clear it, because the kernel doesn't contain
the proper STAC/CLAC instructions for SMAP support.
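
The intended decision can be sketched as a pure function over a CR4
value. The function name and its boolean parameters are hypothetical;
bit 21 is the SMAP enable bit in CR4.

```c
#include <stdbool.h>

/* CR4.SMAP is bit 21. */
#define SIM_CR4_SMAP	(1UL << 21)

/*
 * Returns the CR4 value to use. cpu_has_smap models the CPU feature
 * bit, config_smap models whether CONFIG_X86_SMAP was compiled in.
 */
static unsigned long setup_smap_sim(unsigned long cr4, bool cpu_has_smap,
				    bool config_smap)
{
	if (!cpu_has_smap)
		return cr4;
	if (config_smap)
		return cr4 | SIM_CR4_SMAP;
	/* No STAC/CLAC in this kernel: make sure SMAP is off. */
	return cr4 & ~SIM_CR4_SMAP;
}
```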

Found by Fengguang Wu's test system.

Reported-by: Fengguang Wu 
Link: http://lkml.kernel.org/r/20140213124550.GA30497@localhost
Signed-off-by: H. Peter Anvin 
Signed-off-by: Greg Kroah-Hartman