<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch, branch v3.0.67</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Purge existing TLB entries in set_pte_at and ptep_set_wrprotect</title>
<updated>2013-02-28T14:32:27+00:00</updated>
<author>
<name>John David Anglin</name>
<email>dave.anglin@bell.net</email>
</author>
<published>2013-01-15T00:45:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ef96576ef50b8bbd4c63b9cebab372cb7cf4ba67'/>
<id>ef96576ef50b8bbd4c63b9cebab372cb7cf4ba67</id>
<content type='text'>
commit 7139bc1579901b53db7e898789e916ee2fb52d78 upstream.

This patch goes a long way toward fixing the minifail bug, and
it  significantly improves the stability of SMP machines such as
the rp3440.  When write  protecting a page for COW, we need to
purge the existing translation.  Otherwise, the COW break
doesn't occur as expected because the TLB may still have a stale entry
which allows writes.

[jejb: fix up checkpatch errors]
Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7139bc1579901b53db7e898789e916ee2fb52d78 upstream.

This patch goes a long way toward fixing the minifail bug, and
it  significantly improves the stability of SMP machines such as
the rp3440.  When write  protecting a page for COW, we need to
purge the existing translation.  Otherwise, the COW break
doesn't occur as expected because the TLB may still have a stale entry
which allows writes.

[jejb: fix up checkpatch errors]
Signed-off-by: John David Anglin &lt;dave.anglin@bell.net&gt;
Signed-off-by: James Bottomley &lt;JBottomley@Parallels.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/kexec: Disable hard IRQ before kexec</title>
<updated>2013-02-28T14:32:26+00:00</updated>
<author>
<name>Phileas Fogg</name>
<email>phileas-fogg@mail.ru</email>
</author>
<published>2013-02-22T23:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=534aaed9080ba82974e8d2a71e43a362918138f8'/>
<id>534aaed9080ba82974e8d2a71e43a362918138f8</id>
<content type='text'>
commit 8520e443aa56cc157b015205ea53e7b9fc831291 upstream.

Disable hard IRQ before kexec a new kernel image.
Not doing it can result in corrupted data in the memory segments
reserved for the new kernel.

Signed-off-by: Phileas Fogg &lt;phileas-fogg@mail.ru&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8520e443aa56cc157b015205ea53e7b9fc831291 upstream.

Disable hard IRQ before kexec a new kernel image.
Not doing it can result in corrupted data in the memory segments
reserved for the new kernel.

Signed-off-by: Phileas Fogg &lt;phileas-fogg@mail.ru&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: PXA3xx: program the CSMSADRCFG register</title>
<updated>2013-02-28T14:32:26+00:00</updated>
<author>
<name>Igor Grinberg</name>
<email>grinberg@compulab.co.il</email>
</author>
<published>2013-01-13T11:49:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=928de5bcadf8540f58ba6b12c6b7547d33dcde89'/>
<id>928de5bcadf8540f58ba6b12c6b7547d33dcde89</id>
<content type='text'>
commit d107a204154ddd79339203c2deeb7433f0cf6777 upstream.

The Chip Select Configuration Register must be programmed to 0x2 in
order to achieve the correct behavior of the Static Memory Controller.

Without this patch devices wired to DFI and accessed through SMC cannot
be accessed after resume from S2.

Do not rely on the boot loader to program the CSMSADRCFG register by
programming it in the kernel smemc module.

Signed-off-by: Igor Grinberg &lt;grinberg@compulab.co.il&gt;
Acked-by: Eric Miao &lt;eric.y.miao@gmail.com&gt;
Signed-off-by: Haojian Zhuang &lt;haojian.zhuang@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d107a204154ddd79339203c2deeb7433f0cf6777 upstream.

The Chip Select Configuration Register must be programmed to 0x2 in
order to achieve the correct behavior of the Static Memory Controller.

Without this patch devices wired to DFI and accessed through SMC cannot
be accessed after resume from S2.

Do not rely on the boot loader to program the CSMSADRCFG register by
programming it in the kernel smemc module.

Signed-off-by: Igor Grinberg &lt;grinberg@compulab.co.il&gt;
Acked-by: Eric Miao &lt;eric.y.miao@gmail.com&gt;
Signed-off-by: Haojian Zhuang &lt;haojian.zhuang@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>s390/kvm: Fix store status for ACRS/FPRS</title>
<updated>2013-02-28T14:32:24+00:00</updated>
<author>
<name>Christian Borntraeger</name>
<email>borntraeger@de.ibm.com</email>
</author>
<published>2013-01-25T14:34:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=58c9ce6fad8e00d9726447f939fe7e78e2aec891'/>
<id>58c9ce6fad8e00d9726447f939fe7e78e2aec891</id>
<content type='text'>
commit 15bc8d8457875f495c59d933b05770ba88d1eacb upstream.

On store status we need to copy the current state of registers
into a save area. Currently we might save stale versions:
The sie state descriptor doesnt have fields for guest ACRS,FPRS,
those registers are simply stored in the host registers. The host
program must copy these away if needed. We do that in vcpu_put/load.

If we now do a store status in KVM code between vcpu_put/load, the
saved values are not up-to-date. Lets collect the ACRS/FPRS before
saving them.

This also fixes some strange problems with hotplug and virtio-ccw,
since the low level machine check handler (on hotplug a machine check
will happen) will revalidate all registers with the content of the
save area.

Signed-off-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 15bc8d8457875f495c59d933b05770ba88d1eacb upstream.

On store status we need to copy the current state of registers
into a save area. Currently we might save stale versions:
The sie state descriptor doesnt have fields for guest ACRS,FPRS,
those registers are simply stored in the host registers. The host
program must copy these away if needed. We do that in vcpu_put/load.

If we now do a store status in KVM code between vcpu_put/load, the
saved values are not up-to-date. Lets collect the ACRS/FPRS before
saving them.

This also fixes some strange problems with hotplug and virtio-ccw,
since the low level machine check handler (on hotplug a machine check
will happen) will revalidate all registers with the content of the
save area.

Signed-off-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>xen: Send spinlock IPI to all waiters</title>
<updated>2013-02-28T14:32:24+00:00</updated>
<author>
<name>Stefan Bader</name>
<email>stefan.bader@canonical.com</email>
</author>
<published>2013-02-15T08:48:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=20807f47cffc5e85f9127c260621d12dc3b7814b'/>
<id>20807f47cffc5e85f9127c260621d12dc3b7814b</id>
<content type='text'>
commit 76eaca031f0af2bb303e405986f637811956a422 upstream.

There is a loophole between Xen's current implementation of
pv-spinlocks and the scheduler. This was triggerable through
a testcase until v3.6 changed the TLB flushing code. The
problem potentially is still there just not observable in the
same way.

What could happen was (is):

1. CPU n tries to schedule task x away and goes into a slow
   wait for the runq lock of CPU n-# (must be one with a lower
   number).
2. CPU n-#, while processing softirqs, tries to balance domains
   and goes into a slow wait for its own runq lock (for updating
   some records). Since this is a spin_lock_irqsave in softirq
   context, interrupts will be re-enabled for the duration of
   the poll_irq hypercall used by Xen.
3. Before the runq lock of CPU n-# is unlocked, CPU n-1 receives
   an interrupt (e.g. endio) and when processing the interrupt,
   tries to wake up task x. But that is in schedule and still
   on_cpu, so try_to_wake_up goes into a tight loop.
4. The runq lock of CPU n-# gets unlocked, but the message only
   gets sent to the first waiter, which is CPU n-# and that is
   busily stuck.
5. CPU n-# never returns from the nested interruption to take and
   release the lock because the scheduler uses a busy wait.
   And CPU n never finishes the task migration because the unlock
   notification only went to CPU n-#.

To avoid this and since the unlocking code has no real sense of
which waiter is best suited to grab the lock, just send the IPI
to all of them. This causes the waiters to return from the hyper-
call (those not interrupted at least) and do active spinlocking.

BugLink: http://bugs.launchpad.net/bugs/1011792

Acked-by: Jan Beulich &lt;JBeulich@suse.com&gt;
Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 76eaca031f0af2bb303e405986f637811956a422 upstream.

There is a loophole between Xen's current implementation of
pv-spinlocks and the scheduler. This was triggerable through
a testcase until v3.6 changed the TLB flushing code. The
problem potentially is still there just not observable in the
same way.

What could happen was (is):

1. CPU n tries to schedule task x away and goes into a slow
   wait for the runq lock of CPU n-# (must be one with a lower
   number).
2. CPU n-#, while processing softirqs, tries to balance domains
   and goes into a slow wait for its own runq lock (for updating
   some records). Since this is a spin_lock_irqsave in softirq
   context, interrupts will be re-enabled for the duration of
   the poll_irq hypercall used by Xen.
3. Before the runq lock of CPU n-# is unlocked, CPU n-1 receives
   an interrupt (e.g. endio) and when processing the interrupt,
   tries to wake up task x. But that is in schedule and still
   on_cpu, so try_to_wake_up goes into a tight loop.
4. The runq lock of CPU n-# gets unlocked, but the message only
   gets sent to the first waiter, which is CPU n-# and that is
   busily stuck.
5. CPU n-# never returns from the nested interruption to take and
   release the lock because the scheduler uses a busy wait.
   And CPU n never finishes the task migration because the unlock
   notification only went to CPU n-#.

To avoid this and since the unlocking code has no real sense of
which waiter is best suited to grab the lock, just send the IPI
to all of them. This causes the waiters to return from the hyper-
call (those not interrupted at least) and do active spinlocking.

BugLink: http://bugs.launchpad.net/bugs/1011792

Acked-by: Jan Beulich &lt;JBeulich@suse.com&gt;
Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86-32, mm: Remove reference to resume_map_numa_kva()</title>
<updated>2013-02-28T14:32:23+00:00</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@linux.intel.com</email>
</author>
<published>2013-01-31T21:53:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dbb694e810c87e7e1760527a783437f26ac5a547'/>
<id>dbb694e810c87e7e1760527a783437f26ac5a547</id>
<content type='text'>
commit bb112aec5ee41427e9b9726e3d57b896709598ed upstream.

Remove reference to removed function resume_map_numa_kva().

Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bb112aec5ee41427e9b9726e3d57b896709598ed upstream.

Remove reference to removed function resume_map_numa_kva().

Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.</title>
<updated>2013-02-17T18:46:20+00:00</updated>
<author>
<name>Jan Beulich</name>
<email>JBeulich@suse.com</email>
</author>
<published>2013-01-24T13:11:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3339af37f35aff045db6a830185dddfac424c937'/>
<id>3339af37f35aff045db6a830185dddfac424c937</id>
<content type='text'>
commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream.

This fixes CVE-2013-0228 / XSA-42

Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the &gt; guest with the panic like this:

-------------
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]

Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
EIP: 0061:[&lt;c0407462&gt;] EFLAGS: 00010086 CPU: 0
EIP is at xen_iret+0x12/0x2b
EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
Stack:
 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
Call Trace:
Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 &lt;8b&gt; 40
10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
EIP: [&lt;c0407462&gt;] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
general protection fault: 0000 [#2]
---[ end trace ab0d29a492dcd330 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1250, comm: r Tainted: G      D    ---------------
2.6.32-356.el6.i686 #1
Call Trace:
 [&lt;c08476df&gt;] ? panic+0x6e/0x122
 [&lt;c084b63c&gt;] ? oops_end+0xbc/0xd0
 [&lt;c084b260&gt;] ? do_general_protection+0x0/0x210
 [&lt;c084a9b7&gt;] ? error_code+0x73/
-------------

Petr says: "
 I've analysed the bug and I think that xen_iret() cannot cope with
 mangled DS, in this case zeroed out (null selector/descriptor) by either
 xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
 entry was invalidated by the reproducer. "

Jan took a look at the preliminary patch and came up a fix that solves
this problem:

"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."

The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.

Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment.  In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.

Reviewed-by: Petr Matousek &lt;pmatouse@redhat.com&gt;
Reported-by: Petr Matousek &lt;pmatouse@redhat.com&gt;
Reviewed-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream.

This fixes CVE-2013-0228 / XSA-42

Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the &gt; guest with the panic like this:

-------------
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]

Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
EIP: 0061:[&lt;c0407462&gt;] EFLAGS: 00010086 CPU: 0
EIP is at xen_iret+0x12/0x2b
EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
Stack:
 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
Call Trace:
Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 &lt;8b&gt; 40
10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
EIP: [&lt;c0407462&gt;] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
general protection fault: 0000 [#2]
---[ end trace ab0d29a492dcd330 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1250, comm: r Tainted: G      D    ---------------
2.6.32-356.el6.i686 #1
Call Trace:
 [&lt;c08476df&gt;] ? panic+0x6e/0x122
 [&lt;c084b63c&gt;] ? oops_end+0xbc/0xd0
 [&lt;c084b260&gt;] ? do_general_protection+0x0/0x210
 [&lt;c084a9b7&gt;] ? error_code+0x73/
-------------

Petr says: "
 I've analysed the bug and I think that xen_iret() cannot cope with
 mangled DS, in this case zeroed out (null selector/descriptor) by either
 xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
 entry was invalidated by the reproducer. "

Jan took a look at the preliminary patch and came up a fix that solves
this problem:

"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."

The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.

Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment.  In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.

Reviewed-by: Petr Matousek &lt;pmatouse@redhat.com&gt;
Reported-by: Petr Matousek &lt;pmatouse@redhat.com&gt;
Reviewed-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mm: Check if PUD is large when validating a kernel address</title>
<updated>2013-02-17T18:46:20+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2013-02-11T14:52:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ec3918604c896df59632d47bd2ed874cbc2f262b'/>
<id>ec3918604c896df59632d47bd2ed874cbc2f262b</id>
<content type='text'>
commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream.

A user reported the following oops when a backup process reads
/proc/kcore:

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [&lt;ffffffff8103157e&gt;] kern_addr_valid+0xbe/0x110
 [...]

 Call Trace:
  [&lt;ffffffff811b8aaa&gt;] read_kcore+0x17a/0x370
  [&lt;ffffffff811ad847&gt;] proc_reg_read+0x77/0xc0
  [&lt;ffffffff81151687&gt;] vfs_read+0xc7/0x130
  [&lt;ffffffff811517f3&gt;] sys_read+0x53/0xa0
  [&lt;ffffffff81449692&gt;] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt-&gt;phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.

The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.

This patch adds the necessary pud_large() check.

Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.coM&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream.

A user reported the following oops when a backup process reads
/proc/kcore:

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [&lt;ffffffff8103157e&gt;] kern_addr_valid+0xbe/0x110
 [...]

 Call Trace:
  [&lt;ffffffff811b8aaa&gt;] read_kcore+0x17a/0x370
  [&lt;ffffffff811ad847&gt;] proc_reg_read+0x77/0xc0
  [&lt;ffffffff81151687&gt;] vfs_read+0xc7/0x130
  [&lt;ffffffff811517f3&gt;] sys_read+0x53/0xa0
  [&lt;ffffffff81449692&gt;] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt-&gt;phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.

The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.

This patch adds the necessary pud_large() check.

Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.coM&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Replace left over sti/cli in ia32 audit exit code</title>
<updated>2013-02-11T16:16:47+00:00</updated>
<author>
<name>Jan Beulich</name>
<email>JBeulich@suse.com</email>
</author>
<published>2013-01-30T07:55:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b19f432c33a8ee49f46e5fab79cf86f745661318'/>
<id>b19f432c33a8ee49f46e5fab79cf86f745661318</id>
<content type='text'>
commit 40a1ef95da85843696fc3ebe5fce39b0db32669f upstream.

For some reason they didn't get replaced so far by their
paravirt equivalents, resulting in code to be run with
interrupts disabled that doesn't expect so (causing, in the
observed case, a BUG_ON() to trigger) when syscall auditing is
enabled.

David (Cc-ed) came up with an identical fix, so likely this can
be taken to count as an ack from him.

Reported-by: Peter Moody &lt;pmoody@google.com&gt;
Signed-off-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Link: http://lkml.kernel.org/r/5108E01902000078000BA9C5@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Tested-by: Peter Moody &lt;pmoody@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 40a1ef95da85843696fc3ebe5fce39b0db32669f upstream.

For some reason they didn't get replaced so far by their
paravirt equivalents, resulting in code to be run with
interrupts disabled that doesn't expect so (causing, in the
observed case, a BUG_ON() to trigger) when syscall auditing is
enabled.

David (Cc-ed) came up with an identical fix, so likely this can
be taken to count as an ack from him.

Reported-by: Peter Moody &lt;pmoody@google.com&gt;
Signed-off-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Link: http://lkml.kernel.org/r/5108E01902000078000BA9C5@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Tested-by: Peter Moody &lt;pmoody@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI</title>
<updated>2013-02-04T00:21:38+00:00</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@linux.intel.com</email>
</author>
<published>2013-01-14T04:56:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d997f40c116cdb098e09f0a140501ffc42b5184e'/>
<id>d997f40c116cdb098e09f0a140501ffc42b5184e</id>
<content type='text'>
commit e43b3cec711a61edf047adf6204d542f3a659ef8 upstream.

early_pci_allowed() and read_pci_config_16() are only available if
CONFIG_PCI is defined.

Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Jesse Barnes &lt;jbarnes@virtuousgeek.org&gt;
Signed-off-by: Abdallah Chatila &lt;abdallah.chatila@ericsson.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e43b3cec711a61edf047adf6204d542f3a659ef8 upstream.

early_pci_allowed() and read_pci_config_16() are only available if
CONFIG_PCI is defined.

Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Jesse Barnes &lt;jbarnes@virtuousgeek.org&gt;
Signed-off-by: Abdallah Chatila &lt;abdallah.chatila@ericsson.com&gt;

</pre>
</div>
</content>
</entry>
</feed>
