<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel, branch linux-3.11.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>exec/ptrace: fix get_dumpable() incorrect tests</title>
<updated>2013-11-29T18:42:17+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2013-11-12T23:11:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4004afd5f51ad0a86be405522b1ecf28ba66f4e5'/>
<id>4004afd5f51ad0a86be405522b1ecf28ba66f4e5</id>
<content type='text'>
commit d049f74f2dbe71354d43d393ac3a188947811348 upstream.

The get_dumpable() return value is not boolean.  Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0).  The SUID_DUMP_ROOT(2) is also considered a
protected state.  Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
    if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
        or
    if (dumpable == 0) { /* be protective */ }
        or
    if (!dumpable) { /* be protective */ }

Correct logic:
    if (dumpable != SUID_DUMP_USER) { /* be protective */ }
        or
    if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user.  (This may have been partially mitigated if Yama was enabled.)

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov &lt;segoon@openwall.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d049f74f2dbe71354d43d393ac3a188947811348 upstream.

The get_dumpable() return value is not boolean.  Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0).  The SUID_DUMP_ROOT(2) is also considered a
protected state.  Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
    if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
        or
    if (dumpable == 0) { /* be protective */ }
        or
    if (!dumpable) { /* be protective */ }

Correct logic:
    if (dumpable != SUID_DUMP_USER) { /* be protective */ }
        or
    if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user.  (This may have been partially mitigated if Yama was enabled.)

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov &lt;segoon@openwall.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>perf/ftrace: Fix paranoid level for enabling function tracer</title>
<updated>2013-11-29T18:42:15+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2013-11-05T17:51:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b8d125faff42139ad14da49479d903ab696a009e'/>
<id>b8d125faff42139ad14da49479d903ab696a009e</id>
<content type='text'>
commit 12ae030d54ef250706da5642fc7697cc60ad0df7 upstream.

The current default perf paranoid level is "1" which has
"perf_paranoid_kernel()" return false, and giving any operations that
use it, access to normal users. Unfortunately, this includes function
tracing and normal users should not be allowed to enable function
tracing by default.

The proper level is defined at "-1" (full perf access), which
"perf_paranoid_tracepoint_raw()" will only give access to. Use that
check instead for enabling function tracing.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Tested-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
CVE: CVE-2013-2930
Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 12ae030d54ef250706da5642fc7697cc60ad0df7 upstream.

The current default perf paranoid level is "1" which has
"perf_paranoid_kernel()" return false, and giving any operations that
use it, access to normal users. Unfortunately, this includes function
tracing and normal users should not be allowed to enable function
tracing by default.

The proper level is defined at "-1" (full perf access), which
"perf_paranoid_tracepoint_raw()" will only give access to. Use that
check instead for enabling function tracing.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Tested-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
CVE: CVE-2013-2930
Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>sched, idle: Fix the idle polling state logic</title>
<updated>2013-11-29T18:42:15+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-09-11T10:43:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f484c09fbbdf2c97175ab7f08ab713e5d88ff99f'/>
<id>f484c09fbbdf2c97175ab7f08ab713e5d88ff99f</id>
<content type='text'>
commit ea8117478918a4734586d35ff530721b682425be upstream.

Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.

The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).

Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).

Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.

Reported-by: Mike Galbraith &lt;bitbucket@online.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ea8117478918a4734586d35ff530721b682425be upstream.

Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.

The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).

Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).

Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.

Reported-by: Mike Galbraith &lt;bitbucket@online.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix perf ring buffer memory ordering</title>
<updated>2013-11-20T20:31:39+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-28T12:55:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=57b0a9d310c0b1373fbbeee2aa0caec12cd74c5e'/>
<id>57b0a9d310c0b1373fbbeee2aa0caec12cd74c5e</id>
<content type='text'>
commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix potential out-of-bounds in trace_get_user()</title>
<updated>2013-11-20T20:31:39+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2013-10-10T02:23:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=33e3df4431c556fab571411bf46a7cac1c6426e5'/>
<id>33e3df4431c556fab571411bf46a7cac1c6426e5</id>
<content type='text'>
commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream.

Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  #4 ffffffff812a1065 (__fput+0x155/0x360)
  #5 ffffffff812a12de (____fput+0x1e/0x30)
  #6 ffffffff8111708d (task_work_run+0x10d/0x140)
  #7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  #8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  #1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  #6 ffffffff8129b668 (finish_open+0x68/0xa0)
  #7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  #8 ffffffff812b7350 (path_openat+0x120/0xb50)
  #9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  #11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=&gt;ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser-&gt;buffer[parser-&gt;idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser-&gt;idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home

Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream.

Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  #4 ffffffff812a1065 (__fput+0x155/0x360)
  #5 ffffffff812a12de (____fput+0x1e/0x30)
  #6 ffffffff8111708d (task_work_run+0x10d/0x140)
  #7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  #8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  #1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  #6 ffffffff8129b668 (finish_open+0x68/0xa0)
  #7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  #8 ffffffff812b7350 (path_openat+0x120/0xb50)
  #9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  #11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=&gt;ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser-&gt;buffer[parser-&gt;idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser-&gt;idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home

Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mutex: Avoid gcc version dependent __builtin_constant_p() usage</title>
<updated>2013-11-13T03:08:12+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2013-10-17T10:45:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=97f61e059bf498022b88fd800ed629018bacfcc2'/>
<id>97f61e059bf498022b88fd800ed629018bacfcc2</id>
<content type='text'>
commit b0267507dfd0187fb7840a0ec461a510a7f041c5 upstream.

Commit 040a0a37 ("mutex: Add support for wound/wait style locks")
used "!__builtin_constant_p(p == NULL)" but gcc 3.x cannot
handle such expression correctly, leading to boot failure when
built with CONFIG_DEBUG_MUTEXES=y.

Fix it by explicitly passing a bool which tells whether p != NULL
or not.

[ PeterZ: This is a sad patch, but provided it actually generates
          similar code I suppose its the best we can do bar whole
	  sale deprecating gcc-3. ]

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Maarten Lankhorst &lt;maarten.lankhorst@canonical.com&gt;
Cc: peterz@infradead.org
Cc: imirkin@alum.mit.edu
Cc: daniel.vetter@ffwll.ch
Cc: robdclark@gmail.com
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/201310171945.AGB17114.FSQVtHOJFOOFML@I-love.SAKURA.ne.jp
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b0267507dfd0187fb7840a0ec461a510a7f041c5 upstream.

Commit 040a0a37 ("mutex: Add support for wound/wait style locks")
used "!__builtin_constant_p(p == NULL)" but gcc 3.x cannot
handle such expression correctly, leading to boot failure when
built with CONFIG_DEBUG_MUTEXES=y.

Fix it by explicitly passing a bool which tells whether p != NULL
or not.

[ PeterZ: This is a sad patch, but provided it actually generates
          similar code I suppose its the best we can do bar whole
	  sale deprecating gcc-3. ]

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Maarten Lankhorst &lt;maarten.lankhorst@canonical.com&gt;
Cc: peterz@infradead.org
Cc: imirkin@alum.mit.edu
Cc: daniel.vetter@ffwll.ch
Cc: robdclark@gmail.com
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/201310171945.AGB17114.FSQVtHOJFOOFML@I-love.SAKURA.ne.jp
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>clockevents: Sanitize ticks to nsec conversion</title>
<updated>2013-11-13T03:08:09+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2013-09-24T19:50:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=728ccd1f9a6e8ec19e001d8b4d1e2930c9034068'/>
<id>728ccd1f9a6e8ec19e001d8b4d1e2930c9034068</id>
<content type='text'>
commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream.

Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
clockevents_config_and_register() where possible" caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

The solution for the issue at hand is simple: adding evt-&gt;mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it. But this only works for the case where for
the scaled math mult/shift pair "mult &lt;= 1 &lt;&lt; shift" is true. For the
case where "mult &gt; 1 &lt;&lt; shift" we can apply the rounding add only for
the minimum delta value to make sure that the backward conversion is
not less than the given hardware limit. For the upper bound we need to
omit the rounding add, because the backwards conversion is always
larger than the original latch value. That would violate the upper
bound of the hardware device.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting "clc" value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

      u64 clc = (latch &lt;&lt; evt-&gt;shift) / evt-&gt;mult;

So there is no sanity check for (latch &lt;&lt; evt-&gt;shift) exceeding the
64bit boundary. The latch argument is "unsigned long", so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

       u64 clc = (latch &lt;&lt; evt-&gt;shift) + evt-&gt;mult - 1;

So we need to make sure, that neither the shift nor the rounding add
is overflowing the u64 boundary.

[ukl: move assignment to rnd after eventually changing mult, fix build
 issue and correct comment with the right math]

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Russell King - ARM Linux &lt;linux@arm.linux.org.uk&gt;
Cc: Marc Kleine-Budde &lt;mkl@pengutronix.de&gt;
Cc: nicolas.ferre@atmel.com
Cc: Marc Pignat &lt;marc.pignat@hevs.ch&gt;
Cc: john.stultz@linaro.org
Cc: kernel@pengutronix.de
Cc: Ronald Wahl &lt;ronald.wahl@raritan.com&gt;
Cc: LAK &lt;linux-arm-kernel@lists.infradead.org&gt;
Cc: Ludovic Desroches &lt;ludovic.desroches@atmel.com&gt;
Link: http://lkml.kernel.org/r/1380052223-24139-1-git-send-email-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream.

Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
clockevents_config_and_register() where possible" caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

The solution for the issue at hand is simple: adding evt-&gt;mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it. But this only works for the case where for
the scaled math mult/shift pair "mult &lt;= 1 &lt;&lt; shift" is true. For the
case where "mult &gt; 1 &lt;&lt; shift" we can apply the rounding add only for
the minimum delta value to make sure that the backward conversion is
not less than the given hardware limit. For the upper bound we need to
omit the rounding add, because the backwards conversion is always
larger than the original latch value. That would violate the upper
bound of the hardware device.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting "clc" value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

      u64 clc = (latch &lt;&lt; evt-&gt;shift) / evt-&gt;mult;

So there is no sanity check for (latch &lt;&lt; evt-&gt;shift) exceeding the
64bit boundary. The latch argument is "unsigned long", so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

       u64 clc = (latch &lt;&lt; evt-&gt;shift) + evt-&gt;mult - 1;

So we need to make sure, that neither the shift nor the rounding add
is overflowing the u64 boundary.

[ukl: move assignment to rnd after eventually changing mult, fix build
 issue and correct comment with the right math]

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Russell King - ARM Linux &lt;linux@arm.linux.org.uk&gt;
Cc: Marc Kleine-Budde &lt;mkl@pengutronix.de&gt;
Cc: nicolas.ferre@atmel.com
Cc: Marc Pignat &lt;marc.pignat@hevs.ch&gt;
Cc: john.stultz@linaro.org
Cc: kernel@pengutronix.de
Cc: Ronald Wahl &lt;ronald.wahl@raritan.com&gt;
Cc: LAK &lt;linux-arm-kernel@lists.infradead.org&gt;
Cc: Ludovic Desroches &lt;ludovic.desroches@atmel.com&gt;
Link: http://lkml.kernel.org/r/1380052223-24139-1-git-send-email-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: fix to break the while loop in cgroup_attach_task() correctly</title>
<updated>2013-11-13T03:08:07+00:00</updated>
<author>
<name>Anjana V Kumar</name>
<email>anjanavk12@gmail.com</email>
</author>
<published>2013-10-12T02:59:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4e6787c87c004aa26866b26e3eb44cf0b478ce0d'/>
<id>4e6787c87c004aa26866b26e3eb44cf0b478ce0d</id>
<content type='text'>
commit ea84753c98a7ac6b74e530b64c444a912b3835ca upstream.

Both Anjana and Eunki reported a stall in the while_each_thread loop
in cgroup_attach_task().

It's because, when we attach a single thread to a cgroup, if the cgroup
is exiting or is already in that cgroup, we won't break the loop.

If the task is already in the cgroup, the bug can lead to another thread
being attached to the cgroup unexpectedly:

  # echo 5207 &gt; tasks
  # cat tasks
  5207
  # echo 5207 &gt; tasks
  # cat tasks
  5207
  5215

What's worse, if the task to be attached isn't the leader of the thread
group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
analysis.

This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")

[ lizf: - fixed the first continue, pointed out by Oleg,
        - rewrote changelog. ]

Reported-by: Eunki Kim &lt;eunki_kim@samsung.com&gt;
Reported-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ea84753c98a7ac6b74e530b64c444a912b3835ca upstream.

Both Anjana and Eunki reported a stall in the while_each_thread loop
in cgroup_attach_task().

It's because, when we attach a single thread to a cgroup, if the cgroup
is exiting or is already in that cgroup, we won't break the loop.

If the task is already in the cgroup, the bug can lead to another thread
being attached to the cgroup unexpectedly:

  # echo 5207 &gt; tasks
  # cat tasks
  5207
  # echo 5207 &gt; tasks
  # cat tasks
  5207
  5215

What's worse, if the task to be attached isn't the leader of the thread
group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
analysis.

This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")

[ lizf: - fixed the first continue, pointed out by Oleg,
        - rewrote changelog. ]

Reported-by: Eunki Kim &lt;eunki_kim@samsung.com&gt;
Reported-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>irq: Force hardirq exit's softirq processing on its own stack</title>
<updated>2013-10-14T01:14:31+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2013-09-23T22:50:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3e1fee600177eed70abbd7e7703a40b5edb6cb8f'/>
<id>3e1fee600177eed70abbd7e7703a40b5edb6cb8f</id>
<content type='text'>
commit ded797547548a5b8e7b92383a41e4c0e6b0ecb7f upstream.

The commit facd8b80c67a3cf64a467c4a2ac5fb31f2e6745b
("irq: Sanitize invoke_softirq") converted irq exit
calls of do_softirq() to __do_softirq() on all architectures,
assuming it was only used there for its irq disablement
properties.

But as a side effect, the softirqs processed in the end
of the hardirq are always called on the inline current
stack that is used by irq_exit() instead of the softirq
stack provided by the archs that override do_softirq().

The result is mostly safe if the architecture runs irq_exit()
on a separate irq stack because then softirqs are processed
on that same stack that is near empty at this stage (assuming
hardirq aren't nesting).

Otherwise irq_exit() runs in the task stack and so does the softirq
too. The interrupted call stack can be randomly deep already and
the softirq can dig through it even further. To add insult to the
injury, this softirq can be interrupted by a new hardirq, maximizing
the chances for a stack overrun as reported in powerpc for example:

	do_IRQ: stack overflow: 1920
	CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
	Call Trace:
	[c0000000050a8740] .show_stack+0x130/0x200 (unreliable)
	[c0000000050a8810] .dump_stack+0x28/0x3c
	[c0000000050a8880] .do_IRQ+0x2b8/0x2c0
	[c0000000050a8930] hardware_interrupt_common+0x154/0x180
	--- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
		LR = .cp_start_xmit+0x390/0x820 [8139cp]
	[c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640
	[c0000000050a8e00] .sch_direct_xmit+0x110/0x260
	[c0000000050a8ea0] .dev_queue_xmit+0x260/0x630
	[c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
	[c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge]
	[c0000000050a9070] .dev_hard_start_xmit+0x394/0x640
	[c0000000050a9130] .dev_queue_xmit+0x428/0x630
	[c0000000050a91d0] .ip_finish_output+0x2a4/0x550
	[c0000000050a9290] .ip_local_out+0x50/0x70
	[c0000000050a9310] .ip_queue_xmit+0x148/0x420
	[c0000000050a93b0] .tcp_transmit_skb+0x4e4/0xaf0
	[c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0
	[c0000000050a9520] .tcp_rcv_established+0x1e8/0x930
	[c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570
	[c0000000050a96c0] .tcp_v4_rcv+0x734/0x930
	[c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360
	[c0000000050a9840] .ip_rcv_finish+0x148/0x400
	[c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00
	[c0000000050a99d0] .netif_receive_skb+0x44/0x110
	[c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge]
	[c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge]
	[c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge]
	[c0000000050a9c70] .nf_iterate+0x114/0x130
	[c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0
	[c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge]
	[c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00
	[c0000000050a9fa0] .netif_receive_skb+0x44/0x110
	[c0000000050aa040] .napi_gro_receive+0xe8/0x120
	[c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp]
	[c0000000050aa1d0] .net_rx_action+0x1dc/0x310
	[c0000000050aa2b0] .__do_softirq+0x158/0x330
	[c0000000050aa3b0] .irq_exit+0xc8/0x110
	[c0000000050aa430] .do_IRQ+0xdc/0x2c0
	[c0000000050aa4e0] hardware_interrupt_common+0x154/0x180
	 --- Exception: 501 at .bad_range+0x1c/0x110
		 LR = .get_page_from_freelist+0x908/0xbb0
	[c0000000050aa7d0] .list_del+0x18/0x50 (unreliable)
	[c0000000050aa850] .get_page_from_freelist+0x908/0xbb0
	[c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0
	[c0000000050aaba0] .alloc_pages_vma+0xd0/0x210
	[c0000000050aac60] .handle_pte_fault+0x814/0xb70
	[c0000000050aad50] .__get_user_pages+0x1a4/0x640
	[c0000000050aae60] .get_user_pages_fast+0xec/0x160
	[c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm]
	[c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm]
	[c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm]
	[c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm]
	[c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm]
	[c0000000050ab320]  kvm_start_lightweight+0x1ec/0x1fc [kvm]
	[c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm]
	[c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
	[c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm]
	[c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
	[c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0
	[c0000000050abd80] .SyS_ioctl+0xd4/0xf0
	[c0000000050abe30] syscall_exit+0x0/0x98

Since this is a regression, this patch proposes a minimalistic
and low-risk solution by blindly forcing the hardirq exit processing of
softirqs on the softirq stack. This way we should reduce significantly
the opportunities for task stack overflow dug by softirqs.

Longer term solutions may involve extending the hardirq stack coverage to
irq_exit(), etc...

Reported-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@au1.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul Mackerras &lt;paulus@au1.ibm.com&gt;
Cc: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: James E.J. Bottomley &lt;jejb@parisc-linux.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ded797547548a5b8e7b92383a41e4c0e6b0ecb7f upstream.

The commit facd8b80c67a3cf64a467c4a2ac5fb31f2e6745b
("irq: Sanitize invoke_softirq") converted irq exit
calls of do_softirq() to __do_softirq() on all architectures,
assuming it was only used there for its irq disablement
properties.

But as a side effect, the softirqs processed in the end
of the hardirq are always called on the inline current
stack that is used by irq_exit() instead of the softirq
stack provided by the archs that override do_softirq().

The result is mostly safe if the architecture runs irq_exit()
on a separate irq stack because then softirqs are processed
on that same stack that is near empty at this stage (assuming
hardirq aren't nesting).

Otherwise irq_exit() runs in the task stack and so does the softirq
too. The interrupted call stack can be randomly deep already and
the softirq can dig through it even further. To add insult to the
injury, this softirq can be interrupted by a new hardirq, maximizing
the chances for a stack overrun as reported in powerpc for example:

	do_IRQ: stack overflow: 1920
	CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
	Call Trace:
	[c0000000050a8740] .show_stack+0x130/0x200 (unreliable)
	[c0000000050a8810] .dump_stack+0x28/0x3c
	[c0000000050a8880] .do_IRQ+0x2b8/0x2c0
	[c0000000050a8930] hardware_interrupt_common+0x154/0x180
	--- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
		LR = .cp_start_xmit+0x390/0x820 [8139cp]
	[c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640
	[c0000000050a8e00] .sch_direct_xmit+0x110/0x260
	[c0000000050a8ea0] .dev_queue_xmit+0x260/0x630
	[c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
	[c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge]
	[c0000000050a9070] .dev_hard_start_xmit+0x394/0x640
	[c0000000050a9130] .dev_queue_xmit+0x428/0x630
	[c0000000050a91d0] .ip_finish_output+0x2a4/0x550
	[c0000000050a9290] .ip_local_out+0x50/0x70
	[c0000000050a9310] .ip_queue_xmit+0x148/0x420
	[c0000000050a93b0] .tcp_transmit_skb+0x4e4/0xaf0
	[c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0
	[c0000000050a9520] .tcp_rcv_established+0x1e8/0x930
	[c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570
	[c0000000050a96c0] .tcp_v4_rcv+0x734/0x930
	[c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360
	[c0000000050a9840] .ip_rcv_finish+0x148/0x400
	[c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00
	[c0000000050a99d0] .netif_receive_skb+0x44/0x110
	[c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge]
	[c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge]
	[c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge]
	[c0000000050a9c70] .nf_iterate+0x114/0x130
	[c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0
	[c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge]
	[c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00
	[c0000000050a9fa0] .netif_receive_skb+0x44/0x110
	[c0000000050aa040] .napi_gro_receive+0xe8/0x120
	[c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp]
	[c0000000050aa1d0] .net_rx_action+0x1dc/0x310
	[c0000000050aa2b0] .__do_softirq+0x158/0x330
	[c0000000050aa3b0] .irq_exit+0xc8/0x110
	[c0000000050aa430] .do_IRQ+0xdc/0x2c0
	[c0000000050aa4e0] hardware_interrupt_common+0x154/0x180
	 --- Exception: 501 at .bad_range+0x1c/0x110
		 LR = .get_page_from_freelist+0x908/0xbb0
	[c0000000050aa7d0] .list_del+0x18/0x50 (unreliable)
	[c0000000050aa850] .get_page_from_freelist+0x908/0xbb0
	[c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0
	[c0000000050aaba0] .alloc_pages_vma+0xd0/0x210
	[c0000000050aac60] .handle_pte_fault+0x814/0xb70
	[c0000000050aad50] .__get_user_pages+0x1a4/0x640
	[c0000000050aae60] .get_user_pages_fast+0xec/0x160
	[c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm]
	[c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm]
	[c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm]
	[c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm]
	[c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm]
	[c0000000050ab320]  kvm_start_lightweight+0x1ec/0x1fc [kvm]
	[c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm]
	[c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
	[c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm]
	[c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
	[c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0
	[c0000000050abd80] .SyS_ioctl+0xd4/0xf0
	[c0000000050abe30] syscall_exit+0x0/0x98

Since this is a regression, this patch proposes a minimalistic
and low-risk solution by blindly forcing the hardirq exit processing of
softirqs on the softirq stack. This way we should reduce significantly
the opportunities for task stack overflow dug by softirqs.

Longer term solutions may involve extending the hardirq stack coverage to
irq_exit(), etc...

Reported-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@au1.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul Mackerras &lt;paulus@au1.ibm.com&gt;
Cc: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: James E.J. Bottomley &lt;jejb@parisc-linux.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/kmod.c: check for NULL in call_usermodehelper_exec()</title>
<updated>2013-10-14T01:14:30+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2013-09-30T20:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=29416e26236a2e6b9cfe67260835300dc07b2d26'/>
<id>29416e26236a2e6b9cfe67260835300dc07b2d26</id>
<content type='text'>
commit 4c1c7be95c345cf2ad537a0c48e9aeadc7304527 upstream.

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
dereference happens upon core dump because argv_split("") returns
argv[0] == NULL.

This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check
subprocess_info-&gt;path != NULL") but was by error reintroduced by commit
7f57cfa4e2aa ("usermodehelper: kill the sub_info-&gt;path[0] check").

This bug seems to exist since 2.6.19 (the version which core dump to
pipe was added).  Depending on kernel version and config, some side
effect might happen immediately after this oops (e.g.  kernel panic with
2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4c1c7be95c345cf2ad537a0c48e9aeadc7304527 upstream.

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
dereference happens upon core dump because argv_split("") returns
argv[0] == NULL.

This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check
subprocess_info-&gt;path != NULL") but was by error reintroduced by commit
7f57cfa4e2aa ("usermodehelper: kill the sub_info-&gt;path[0] check").

This bug seems to exist since 2.6.19 (the version which core dump to
pipe was added).  Depending on kernel version and config, some side
effect might happen immediately after this oops (e.g.  kernel panic with
2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
