<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/x86/kernel/dumpstack_64.c, branch v3.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT</title>
<updated>2011-12-19T21:09:56+00:00</updated>
<author>
<name>Clemens Ladisch</name>
<email>clemens@ladisch.de</email>
</author>
<published>2011-12-19T21:07:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=13f541c10b30fc6529200d7f9a0073217709622f'/>
<id>13f541c10b30fc6529200d7f9a0073217709622f</id>
<content type='text'>
When printing the code bytes in show_registers(), the markers around the
byte at the fault address could make the printk() format string look
like a valid log level and facility code.  This would prevent this byte
from being printed and result in a spurious newline:

[ 7555.765589] Code: 8b 32 e9 94 00 00 00 81 7d 00 ff 00 00 00 0f 87 96 00 00 00 48 8b 83 c0 00 00 00 44 89 e2 44 89 e6 48 89 df 48 8b 80 d8 02 00 00
[ 7555.765683]  8b 48 28 48 89 d0 81 e2 ff 0f 00 00 48 c1 e8 0c 48 c1 e0 04

Add KERN_CONT where needed, and elsewhere in show_registers() for
consistency.

Signed-off-by: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Link: http://lkml.kernel.org/r/4EEFA7AE.9020407@ladisch.de
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When printing the code bytes in show_registers(), the markers around the
byte at the fault address could make the printk() format string look
like a valid log level and facility code.  This would prevent this byte
from being printed and result in a spurious newline:

[ 7555.765589] Code: 8b 32 e9 94 00 00 00 81 7d 00 ff 00 00 00 0f 87 96 00 00 00 48 8b 83 c0 00 00 00 44 89 e2 44 89 e6 48 89 df 48 8b 80 d8 02 00 00
[ 7555.765683]  8b 48 28 48 89 d0 81 e2 ff 0f 00 00 48 c1 e8 0c 48 c1 e0 04

Add KERN_CONT where needed, and elsewhere in show_registers() for
consistency.

Signed-off-by: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Link: http://lkml.kernel.org/r/4EEFA7AE.9020407@ladisch.de
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Don't use frame pointer to save old stack on irq entry</title>
<updated>2011-07-02T16:06:36+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-07-02T14:52:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2bbe75089d5eb9a3a46d50dd5c215e213790288'/>
<id>a2bbe75089d5eb9a3a46d50dd5c215e213790288</id>
<content type='text'>
rbp is used in SAVE_ARGS_IRQ to save the old stack pointer
in order to restore it later in ret_from_intr.

It is convenient because we save its value in the irq regs
and it's easily restored using the leave instruction.

However, this is a kind of abuse of the frame pointer, whose
role is to help unwind the kernel by chaining frames
together, each node following the return address to the
previous frame.

But when we break the frame chain by changing the stack
pointer, there is no return address preceding the new
frame. Hence using the frame pointer to link the two stacks
breaks the stack unwinders, which find a random value instead of
a return address here.

There is no workaround that can work in every case. We are using
the fixup_bp_irq_link() function to dereference that abused frame
pointer in the case of a non-nesting interrupt (which means the
stack changed).
But that doesn't fix the case of interrupts that don't change the
stack (but where we still have the unconditional frame link), which
is the case of a hardirq interrupting a softirq. We have no way to
detect this transition, so the irq frame link is treated as a real
frame pointer and the return address is dereferenced, but it is still
a spurious one.

There are two possible results of this: either the spurious return
address, a random stack value, luckily belongs to the kernel text
and then the unwinding can continue and we just have a weird entry
in the stack trace. Or it doesn't belong to the kernel text and
unwinding stops there.

This is the reason why stacktraces (including perf callchains) on
irqs that interrupted softirqs don't work very well.

To solve this, we don't save the old stack pointer on rbp anymore
but we save it to a scratch register that we push on the new
stack and that we pop back later on irq return.

This preserves the whole frame chain without spurious return addresses
in the middle and drops the need for the horrid fixup_bp_irq_link()
workaround.

And finally, irqs that interrupt softirqs are sanely unwound.

Before:

    99.81%         perf  [kernel.kallsyms]  [k] perf_pending_event
                   |
                   --- perf_pending_event
                       irq_work_run
                       smp_irq_work_interrupt
                       irq_work_interrupt
                      |
                      |--41.60%-- __read
                      |          |
                      |          |--99.90%-- create_worker
                      |          |          bench_sched_messaging
                      |          |          cmd_bench
                      |          |          run_builtin
                      |          |          main
                      |          |          __libc_start_main
                      |           --0.10%-- [...]

After:

     1.64%  swapper  [kernel.kallsyms]  [k] perf_pending_event
            |
            --- perf_pending_event
                irq_work_run
                smp_irq_work_interrupt
                irq_work_interrupt
               |
               |--95.00%-- arch_irq_work_raise
               |          irq_work_queue
               |          __perf_event_overflow
               |          perf_swevent_overflow
               |          perf_swevent_event
               |          perf_tp_event
               |          perf_trace_softirq
               |          __do_softirq
               |          call_softirq
               |          do_softirq
               |          irq_exit
               |          |
               |          |--73.68%-- smp_apic_timer_interrupt
               |          |          apic_timer_interrupt
               |          |          |
               |          |          |--96.43%-- amd_e400_idle
               |          |          |          cpu_idle
               |          |          |          start_secondary

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jan Beulich &lt;JBeulich@novell.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
rbp is used in SAVE_ARGS_IRQ to save the old stack pointer
in order to restore it later in ret_from_intr.

It is convenient because we save its value in the irq regs
and it's easily restored using the leave instruction.

However, this is a kind of abuse of the frame pointer, whose
role is to help unwind the kernel by chaining frames
together, each node following the return address to the
previous frame.

But when we break the frame chain by changing the stack
pointer, there is no return address preceding the new
frame. Hence using the frame pointer to link the two stacks
breaks the stack unwinders, which find a random value instead of
a return address here.

There is no workaround that can work in every case. We are using
the fixup_bp_irq_link() function to dereference that abused frame
pointer in the case of a non-nesting interrupt (which means the
stack changed).
But that doesn't fix the case of interrupts that don't change the
stack (but where we still have the unconditional frame link), which
is the case of a hardirq interrupting a softirq. We have no way to
detect this transition, so the irq frame link is treated as a real
frame pointer and the return address is dereferenced, but it is still
a spurious one.

There are two possible results of this: either the spurious return
address, a random stack value, luckily belongs to the kernel text
and then the unwinding can continue and we just have a weird entry
in the stack trace. Or it doesn't belong to the kernel text and
unwinding stops there.

This is the reason why stacktraces (including perf callchains) on
irqs that interrupted softirqs don't work very well.

To solve this, we don't save the old stack pointer on rbp anymore
but we save it to a scratch register that we push on the new
stack and that we pop back later on irq return.

This preserves the whole frame chain without spurious return addresses
in the middle and drops the need for the horrid fixup_bp_irq_link()
workaround.

And finally, irqs that interrupt softirqs are sanely unwound.

Before:

    99.81%         perf  [kernel.kallsyms]  [k] perf_pending_event
                   |
                   --- perf_pending_event
                       irq_work_run
                       smp_irq_work_interrupt
                       irq_work_interrupt
                      |
                      |--41.60%-- __read
                      |          |
                      |          |--99.90%-- create_worker
                      |          |          bench_sched_messaging
                      |          |          cmd_bench
                      |          |          run_builtin
                      |          |          main
                      |          |          __libc_start_main
                      |           --0.10%-- [...]

After:

     1.64%  swapper  [kernel.kallsyms]  [k] perf_pending_event
            |
            --- perf_pending_event
                irq_work_run
                smp_irq_work_interrupt
                irq_work_interrupt
               |
               |--95.00%-- arch_irq_work_raise
               |          irq_work_queue
               |          __perf_event_overflow
               |          perf_swevent_overflow
               |          perf_swevent_event
               |          perf_tp_event
               |          perf_trace_softirq
               |          __do_softirq
               |          call_softirq
               |          do_softirq
               |          irq_exit
               |          |
               |          |--73.68%-- smp_apic_timer_interrupt
               |          |          apic_timer_interrupt
               |          |          |
               |          |          |--96.43%-- amd_e400_idle
               |          |          |          cpu_idle
               |          |          |          start_secondary

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jan Beulich &lt;JBeulich@novell.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Fetch stack from regs when possible in dump_trace()</title>
<updated>2011-07-02T16:04:20+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-06-30T17:04:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=47ce11a2b6519f9c7843223ea8e561eb71ea5896'/>
<id>47ce11a2b6519f9c7843223ea8e561eb71ea5896</id>
<content type='text'>
When regs are passed to dump_trace(), we fetch the frame
pointer from the regs but the stack pointer is taken from
the current frame.

Thus the frame and stack pointers may not come from the same
context. For example, this can lead the unwinder to
think the context is in irq, due to the current value of
the stack, while the frame pointer coming from the regs points
to a frame from another place. It then tries to fix up
the irq link but ends up dereferencing a random frame
pointer that doesn't belong to the irq stack:

[ 9131.706906] ------------[ cut here ]------------
[ 9131.707003] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2aa/0x330()
[ 9131.707003] Hardware name: AMD690VM-FMH
[ 9131.707003] Perf: bad frame pointer = 0000000000000005 in callchain
[ 9131.707003] Modules linked in:
[ 9131.707003] Pid: 1050, comm: perf Not tainted 3.0.0-rc3+ #181
[ 9131.707003] Call Trace:
[ 9131.707003]  &lt;IRQ&gt;  [&lt;ffffffff8104bd4a&gt;] warn_slowpath_common+0x7a/0xb0
[ 9131.707003]  [&lt;ffffffff8104be21&gt;] warn_slowpath_fmt+0x41/0x50
[ 9131.707003]  [&lt;ffffffff8178b873&gt;] ? bad_to_user+0x6d/0x10be
[ 9131.707003]  [&lt;ffffffff8100c2da&gt;] dump_trace+0x2aa/0x330
[ 9131.707003]  [&lt;ffffffff810107d3&gt;] ? native_sched_clock+0x13/0x50
[ 9131.707003]  [&lt;ffffffff8101b164&gt;] perf_callchain_kernel+0x54/0x70
[ 9131.707003]  [&lt;ffffffff810d391f&gt;] perf_prepare_sample+0x19f/0x2a0
[ 9131.707003]  [&lt;ffffffff810d546c&gt;] __perf_event_overflow+0x16c/0x290
[ 9131.707003]  [&lt;ffffffff810d5430&gt;] ? __perf_event_overflow+0x130/0x290
[ 9131.707003]  [&lt;ffffffff810107d3&gt;] ? native_sched_clock+0x13/0x50
[ 9131.707003]  [&lt;ffffffff8100fbb9&gt;] ? sched_clock+0x9/0x10
[ 9131.707003]  [&lt;ffffffff810752e5&gt;] ? T.375+0x15/0x90
[ 9131.707003]  [&lt;ffffffff81084da4&gt;] ? trace_hardirqs_on_caller+0x64/0x180
[ 9131.707003]  [&lt;ffffffff810817bd&gt;] ? trace_hardirqs_off+0xd/0x10
[ 9131.707003]  [&lt;ffffffff810d5764&gt;] perf_event_overflow+0x14/0x20
[ 9131.707003]  [&lt;ffffffff810d588c&gt;] perf_swevent_hrtimer+0x11c/0x130
[ 9131.707003]  [&lt;ffffffff817821a1&gt;] ? error_exit+0x51/0xb0
[ 9131.707003]  [&lt;ffffffff81072e93&gt;] __run_hrtimer+0x83/0x1e0
[ 9131.707003]  [&lt;ffffffff810d5770&gt;] ? perf_event_overflow+0x20/0x20
[ 9131.707003]  [&lt;ffffffff81073256&gt;] hrtimer_interrupt+0x106/0x250
[ 9131.707003]  [&lt;ffffffff812a3bfd&gt;] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 9131.707003]  [&lt;ffffffff81024833&gt;] smp_apic_timer_interrupt+0x53/0x90
[ 9131.707003]  [&lt;ffffffff81789053&gt;] apic_timer_interrupt+0x13/0x20
[ 9131.707003]  &lt;EOI&gt;  [&lt;ffffffff817821a1&gt;] ? error_exit+0x51/0xb0
[ 9131.707003]  [&lt;ffffffff8178219c&gt;] ? error_exit+0x4c/0xb0
[ 9131.707003] ---[ end trace b2560d4876709347 ]---

Fix this by simply taking the stack pointer from regs-&gt;sp
when regs are provided.
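
As a rough illustration only (not the patch itself), the choice of the
starting stack pointer can be sketched as below. The struct and function
names are hypothetical stand-ins for pt_regs, task_struct and the start
of dump_trace(), and current_sp stands in for the address of a local
variable in the current frame:

```c
/* Hypothetical, simplified stand-ins; not the kernel's real types. */
struct sketch_regs { unsigned long sp; unsigned long bp; };
struct sketch_task { unsigned long saved_sp; };

/*
 * Pick the address at which the stack walk should start.
 * is_current says whether 'task' is the running task.
 */
unsigned long sketch_start_sp(const struct sketch_regs *regs,
			      const struct sketch_task *task,
			      int is_current, unsigned long current_sp)
{
	if (regs)
		return regs->sp;	/* same context as regs->bp */
	if (task) {
		if (!is_current)
			return task->saved_sp;	/* sleeping task */
	}
	return current_sp;		/* fall back to our own frame */
}
```

With regs supplied, both the stack pointer and the frame pointer then
come from the same saved context.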

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When regs are passed to dump_trace(), we fetch the frame
pointer from the regs but the stack pointer is taken from
the current frame.

Thus the frame and stack pointers may not come from the same
context. For example, this can lead the unwinder to
think the context is in irq, due to the current value of
the stack, while the frame pointer coming from the regs points
to a frame from another place. It then tries to fix up
the irq link but ends up dereferencing a random frame
pointer that doesn't belong to the irq stack:

[ 9131.706906] ------------[ cut here ]------------
[ 9131.707003] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2aa/0x330()
[ 9131.707003] Hardware name: AMD690VM-FMH
[ 9131.707003] Perf: bad frame pointer = 0000000000000005 in callchain
[ 9131.707003] Modules linked in:
[ 9131.707003] Pid: 1050, comm: perf Not tainted 3.0.0-rc3+ #181
[ 9131.707003] Call Trace:
[ 9131.707003]  &lt;IRQ&gt;  [&lt;ffffffff8104bd4a&gt;] warn_slowpath_common+0x7a/0xb0
[ 9131.707003]  [&lt;ffffffff8104be21&gt;] warn_slowpath_fmt+0x41/0x50
[ 9131.707003]  [&lt;ffffffff8178b873&gt;] ? bad_to_user+0x6d/0x10be
[ 9131.707003]  [&lt;ffffffff8100c2da&gt;] dump_trace+0x2aa/0x330
[ 9131.707003]  [&lt;ffffffff810107d3&gt;] ? native_sched_clock+0x13/0x50
[ 9131.707003]  [&lt;ffffffff8101b164&gt;] perf_callchain_kernel+0x54/0x70
[ 9131.707003]  [&lt;ffffffff810d391f&gt;] perf_prepare_sample+0x19f/0x2a0
[ 9131.707003]  [&lt;ffffffff810d546c&gt;] __perf_event_overflow+0x16c/0x290
[ 9131.707003]  [&lt;ffffffff810d5430&gt;] ? __perf_event_overflow+0x130/0x290
[ 9131.707003]  [&lt;ffffffff810107d3&gt;] ? native_sched_clock+0x13/0x50
[ 9131.707003]  [&lt;ffffffff8100fbb9&gt;] ? sched_clock+0x9/0x10
[ 9131.707003]  [&lt;ffffffff810752e5&gt;] ? T.375+0x15/0x90
[ 9131.707003]  [&lt;ffffffff81084da4&gt;] ? trace_hardirqs_on_caller+0x64/0x180
[ 9131.707003]  [&lt;ffffffff810817bd&gt;] ? trace_hardirqs_off+0xd/0x10
[ 9131.707003]  [&lt;ffffffff810d5764&gt;] perf_event_overflow+0x14/0x20
[ 9131.707003]  [&lt;ffffffff810d588c&gt;] perf_swevent_hrtimer+0x11c/0x130
[ 9131.707003]  [&lt;ffffffff817821a1&gt;] ? error_exit+0x51/0xb0
[ 9131.707003]  [&lt;ffffffff81072e93&gt;] __run_hrtimer+0x83/0x1e0
[ 9131.707003]  [&lt;ffffffff810d5770&gt;] ? perf_event_overflow+0x20/0x20
[ 9131.707003]  [&lt;ffffffff81073256&gt;] hrtimer_interrupt+0x106/0x250
[ 9131.707003]  [&lt;ffffffff812a3bfd&gt;] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 9131.707003]  [&lt;ffffffff81024833&gt;] smp_apic_timer_interrupt+0x53/0x90
[ 9131.707003]  [&lt;ffffffff81789053&gt;] apic_timer_interrupt+0x13/0x20
[ 9131.707003]  &lt;EOI&gt;  [&lt;ffffffff817821a1&gt;] ? error_exit+0x51/0xb0
[ 9131.707003]  [&lt;ffffffff8178219c&gt;] ? error_exit+0x4c/0xb0
[ 9131.707003] ---[ end trace b2560d4876709347 ]---

Fix this by simply taking the stack pointer from regs-&gt;sp
when regs are provided.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, dumpstack: Correct stack dump info when frame pointer is available</title>
<updated>2011-03-18T09:51:42+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2011-03-18T02:40:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e8e999cf3cc733482e390b02ff25a64cecdc0b64'/>
<id>e8e999cf3cc733482e390b02ff25a64cecdc0b64</id>
<content type='text'>
The current stack dump code scans the entire stack and checks whether
each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y,
it could mark whether the pointer is valid or not based on the value of
the frame pointer. Invalid entries could be preceded by a '?' sign.

However, this was not going to happen because the scan start point
was always higher than the frame pointer, so they could never
meet.

Commit 9c0729dc8062 ("x86: Eliminate bp argument from the stack
tracing routines") delayed the point at which bp is acquired, so bp
was read in a lower frame, and thus all of the entries were marked
invalid.

This patch fixes this by reverting the above commit while retaining
the stack_frame() helper, as suggested by Frederic Weisbecker.

The end result looks like this:

before:

 [    3.508329] Call Trace:
 [    3.508551]  [&lt;ffffffff814f35c9&gt;] ? panic+0x91/0x199
 [    3.508662]  [&lt;ffffffff814f3739&gt;] ? printk+0x68/0x6a
 [    3.508770]  [&lt;ffffffff81a981b2&gt;] ? mount_block_root+0x257/0x26e
 [    3.508876]  [&lt;ffffffff81a9821f&gt;] ? mount_root+0x56/0x5a
 [    3.508975]  [&lt;ffffffff81a98393&gt;] ? prepare_namespace+0x170/0x1a9
 [    3.509216]  [&lt;ffffffff81a9772b&gt;] ? kernel_init+0x1d2/0x1e2
 [    3.509335]  [&lt;ffffffff81003894&gt;] ? kernel_thread_helper+0x4/0x10
 [    3.509442]  [&lt;ffffffff814f6880&gt;] ? restore_args+0x0/0x30
 [    3.509542]  [&lt;ffffffff81a97559&gt;] ? kernel_init+0x0/0x1e2
 [    3.509641]  [&lt;ffffffff81003890&gt;] ? kernel_thread_helper+0x0/0x10

after:

 [    3.522991] Call Trace:
 [    3.523351]  [&lt;ffffffff814f35b9&gt;] panic+0x91/0x199
 [    3.523468]  [&lt;ffffffff814f3729&gt;] ? printk+0x68/0x6a
 [    3.523576]  [&lt;ffffffff81a981b2&gt;] mount_block_root+0x257/0x26e
 [    3.523681]  [&lt;ffffffff81a9821f&gt;] mount_root+0x56/0x5a
 [    3.523780]  [&lt;ffffffff81a98393&gt;] prepare_namespace+0x170/0x1a9
 [    3.523885]  [&lt;ffffffff81a9772b&gt;] kernel_init+0x1d2/0x1e2
 [    3.523987]  [&lt;ffffffff81003894&gt;] kernel_thread_helper+0x4/0x10
 [    3.524228]  [&lt;ffffffff814f6880&gt;] ? restore_args+0x0/0x30
 [    3.524345]  [&lt;ffffffff81a97559&gt;] ? kernel_init+0x0/0x1e2
 [    3.524445]  [&lt;ffffffff81003890&gt;] ? kernel_thread_helper+0x0/0x10

 -v5:
   * fix build breakage with oprofile

 -v4:
   * use 0 instead of regs-&gt;bp
   * separate out printk changes

 -v3:
   * apply comment from Frederic
   * add a couple of printk fixes

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Soren Sandmann &lt;ssp@redhat.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
LKML-Reference: &lt;1300416006-3163-1-git-send-email-namhyung@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current stack dump code scans the entire stack and checks whether
each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y,
it could mark whether the pointer is valid or not based on the value of
the frame pointer. Invalid entries could be preceded by a '?' sign.

However, this was not going to happen because the scan start point
was always higher than the frame pointer, so they could never
meet.

Commit 9c0729dc8062 ("x86: Eliminate bp argument from the stack
tracing routines") delayed the point at which bp is acquired, so bp
was read in a lower frame, and thus all of the entries were marked
invalid.

This patch fixes this by reverting the above commit while retaining
the stack_frame() helper, as suggested by Frederic Weisbecker.

The end result looks like this:

before:

 [    3.508329] Call Trace:
 [    3.508551]  [&lt;ffffffff814f35c9&gt;] ? panic+0x91/0x199
 [    3.508662]  [&lt;ffffffff814f3739&gt;] ? printk+0x68/0x6a
 [    3.508770]  [&lt;ffffffff81a981b2&gt;] ? mount_block_root+0x257/0x26e
 [    3.508876]  [&lt;ffffffff81a9821f&gt;] ? mount_root+0x56/0x5a
 [    3.508975]  [&lt;ffffffff81a98393&gt;] ? prepare_namespace+0x170/0x1a9
 [    3.509216]  [&lt;ffffffff81a9772b&gt;] ? kernel_init+0x1d2/0x1e2
 [    3.509335]  [&lt;ffffffff81003894&gt;] ? kernel_thread_helper+0x4/0x10
 [    3.509442]  [&lt;ffffffff814f6880&gt;] ? restore_args+0x0/0x30
 [    3.509542]  [&lt;ffffffff81a97559&gt;] ? kernel_init+0x0/0x1e2
 [    3.509641]  [&lt;ffffffff81003890&gt;] ? kernel_thread_helper+0x0/0x10

after:

 [    3.522991] Call Trace:
 [    3.523351]  [&lt;ffffffff814f35b9&gt;] panic+0x91/0x199
 [    3.523468]  [&lt;ffffffff814f3729&gt;] ? printk+0x68/0x6a
 [    3.523576]  [&lt;ffffffff81a981b2&gt;] mount_block_root+0x257/0x26e
 [    3.523681]  [&lt;ffffffff81a9821f&gt;] mount_root+0x56/0x5a
 [    3.523780]  [&lt;ffffffff81a98393&gt;] prepare_namespace+0x170/0x1a9
 [    3.523885]  [&lt;ffffffff81a9772b&gt;] kernel_init+0x1d2/0x1e2
 [    3.523987]  [&lt;ffffffff81003894&gt;] kernel_thread_helper+0x4/0x10
 [    3.524228]  [&lt;ffffffff814f6880&gt;] ? restore_args+0x0/0x30
 [    3.524345]  [&lt;ffffffff81a97559&gt;] ? kernel_init+0x0/0x1e2
 [    3.524445]  [&lt;ffffffff81003890&gt;] ? kernel_thread_helper+0x0/0x10

 -v5:
   * fix build breakage with oprofile

 -v4:
   * use 0 instead of regs-&gt;bp
   * separate out printk changes

 -v3:
   * apply comment from Frederic
   * add a couple of printk fixes

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Soren Sandmann &lt;ssp@redhat.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
LKML-Reference: &lt;1300416006-3163-1-git-send-email-namhyung@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Don't use pointer to out-of-scope variable in dump_trace()</title>
<updated>2011-01-24T21:46:15+00:00</updated>
<author>
<name>Jesper Juhl</name>
<email>jj@chaosbits.net</email>
</author>
<published>2011-01-24T21:41:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2e5aa6824d9e0248d734573dad8858a2cc279cfe'/>
<id>2e5aa6824d9e0248d734573dad8858a2cc279cfe</id>
<content type='text'>
In arch/x86/kernel/dumpstack_64.c::dump_trace() we have this code:

...
  		if (!stack) {
  			unsigned long dummy;
  			stack = &amp;dummy;
  			if (task &amp;&amp; task != current)
  				stack = (unsigned long *)task-&gt;thread.sp;
  		}

  		bp = stack_frame(task, regs);
  		/*
  		 * Print function call entries in all stacks, starting at the
  		 * current stack address. If the stacks consist of nested
  		 * exceptions
  		 */
  		tinfo = task_thread_info(task);

  		for (;;) {
  			char *id;
  			unsigned long *estack_end;
  			estack_end = in_exception_stack(cpu, (unsigned long)stack,
  							&amp;used, &amp;id);
...

You'll notice that we assign to 'stack' the address of the variable
'dummy' which is only in-scope inside the 'if (!stack)'. So when we later
access stack (at the end of the above, and assuming we did not take the
'if (task &amp;&amp; task != current)' branch) we'll be using the address of a
variable that is no longer in scope. I believe this patch is the proper
fix, but I freely admit that I'm not 100% certain.
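
A minimal sketch of the idea, assuming the fix amounts to declaring
dummy at function scope. sketch_first_word() is a hypothetical stand-in
and not the kernel's dump_trace():

```c
/* Hypothetical stand-in for the start of dump_trace(); not kernel code. */
unsigned long sketch_first_word(unsigned long *stack)
{
	/*
	 * Declared at function scope, so it stays alive for every later
	 * use of 'stack'. A one-element array is used here so that its
	 * name already yields a pointer to the storage.
	 */
	unsigned long dummy[1] = { 0 };

	if (!stack)
		stack = dummy;

	/* 'stack' never points at an object whose scope has ended. */
	return *stack;
}
```

Had dummy been declared inside the 'if' block instead, the final
dereference would read an object whose lifetime has already ended.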

Signed-off-by: Jesper Juhl &lt;jj@chaosbits.net&gt;
LKML-Reference: &lt;alpine.LNX.2.00.1101242232590.10252@swampdragon.chaosbits.net&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In arch/x86/kernel/dumpstack_64.c::dump_trace() we have this code:

...
  		if (!stack) {
  			unsigned long dummy;
  			stack = &amp;dummy;
  			if (task &amp;&amp; task != current)
  				stack = (unsigned long *)task-&gt;thread.sp;
  		}

  		bp = stack_frame(task, regs);
  		/*
  		 * Print function call entries in all stacks, starting at the
  		 * current stack address. If the stacks consist of nested
  		 * exceptions
  		 */
  		tinfo = task_thread_info(task);

  		for (;;) {
  			char *id;
  			unsigned long *estack_end;
  			estack_end = in_exception_stack(cpu, (unsigned long)stack,
  							&amp;used, &amp;id);
...

You'll notice that we assign to 'stack' the address of the variable
'dummy' which is only in-scope inside the 'if (!stack)'. So when we later
access stack (at the end of the above, and assuming we did not take the
'if (task &amp;&amp; task != current)' branch) we'll be using the address of a
variable that is no longer in scope. I believe this patch is the proper
fix, but I freely admit that I'm not 100% certain.

Signed-off-by: Jesper Juhl &lt;jj@chaosbits.net&gt;
LKML-Reference: &lt;alpine.LNX.2.00.1101242232590.10252@swampdragon.chaosbits.net&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Eliminate bp argument from the stack tracing routines</title>
<updated>2010-11-18T13:37:34+00:00</updated>
<author>
<name>Soeren Sandmann Pedersen</name>
<email>sandmann@redhat.com</email>
</author>
<published>2010-11-05T09:59:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9c0729dc8062bed96189bd14ac6d4920f3958743'/>
<id>9c0729dc8062bed96189bd14ac6d4920f3958743</id>
<content type='text'>
The various stack tracing routines take a 'bp' argument in which the
caller is supposed to provide the base pointer to use, or 0 if it doesn't
have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
defined, this means all callers in principle should either always pass
0, or be conditional on CONFIG_FRAME_POINTER.

However, there are only really three use cases for stack tracing:

(a) Trace the current task, including IRQ stack if any
(b) Trace the current task, but skip IRQ stack
(c) Trace some other task

In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
be 0.  If it _is_ defined, then

- in case (a) bp should be gotten directly from the CPU's register, so
  the caller should pass NULL for regs,

- in case (b) the caller should pass the IRQ registers to
  dump_trace(),

- in case (c) bp should be gotten from the top of the task's stack, so
  the caller should pass NULL for regs.

Hence, the bp argument is not necessary because the combination of
task and regs is sufficient to determine an appropriate value for bp.

This patch introduces a new inline function stack_frame(task, regs)
that computes the desired bp. This function is then called from the
two versions of dump_trace().
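
The three cases above can be sketched roughly as follows. This is an
illustration under assumptions, not the helper added by the patch: the
struct names are hypothetical, cpu_bp stands in for reading the CPU's
frame pointer register, and frame_pointers stands in for
CONFIG_FRAME_POINTER being defined:

```c
/* Hypothetical, simplified stand-ins for pt_regs and task_struct. */
struct sketch_regs { unsigned long bp; };
struct sketch_task { unsigned long *sp; };	/* saved stack pointer */

unsigned long sketch_stack_frame(const struct sketch_task *task,
				 const struct sketch_task *current_task,
				 const struct sketch_regs *regs,
				 unsigned long cpu_bp, int frame_pointers)
{
	if (!frame_pointers)
		return 0;		/* bp would be garbage */
	if (regs)
		return regs->bp;	/* case (b): caller passed registers */
	if (task == current_task)
		return cpu_bp;		/* case (a): current task, no regs */
	return *task->sp;		/* case (c): top of the other task's stack */
}
```

Because task and regs alone select the branch, callers no longer need
to compute bp themselves.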

Signed-off-by: Soren Sandmann &lt;ssp@redhat.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
LKML-Reference: &lt;m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The various stack tracing routines take a 'bp' argument in which the
caller is supposed to provide the base pointer to use, or 0 if it doesn't
have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
defined, this means all callers in principle should either always pass
0, or be conditional on CONFIG_FRAME_POINTER.

However, there are only really three use cases for stack tracing:

(a) Trace the current task, including IRQ stack if any
(b) Trace the current task, but skip IRQ stack
(c) Trace some other task

In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
be 0.  If it _is_ defined, then

- in case (a) bp should be gotten directly from the CPU's register, so
  the caller should pass NULL for regs,

- in case (b) the caller should pass the IRQ registers to
  dump_trace(),

- in case (c) bp should be gotten from the top of the task's stack, so
  the caller should pass NULL for regs.

Hence, the bp argument is not necessary because the combination of
task and regs is sufficient to determine an appropriate value for bp.

This patch introduces a new inline function stack_frame(task, regs)
that computes the desired bp. This function is then called from the
two versions of dump_trace().

Signed-off-by: Soren Sandmann &lt;ssp@redhat.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
LKML-Reference: &lt;m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, printk: Get rid of &lt;0&gt; from stack output</title>
<updated>2010-10-23T18:03:03+00:00</updated>
<author>
<name>Jiri Slaby</name>
<email>jslaby@suse.cz</email>
</author>
<published>2010-10-20T14:48:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e4072a9a9d186fe86293effe8828faa4be75b4a4'/>
<id>e4072a9a9d186fe86293effe8828faa4be75b4a4</id>
<content type='text'>
The stack output currently looks like this:

 7fffffffffffffff 0000000a00000000 ffffffff81093341 0000000000000046
&lt;0&gt; ffff88003a545fd8 0000000000000000 0000000000000000 00007fffa39769c0
&lt;0&gt; ffff88003e403f58 ffffffff8102fc4c ffff88003e403f58 ffff88003e403f78

The superfluous &lt;0&gt; are caused by recent printk KERN_CONT
change. &lt;*&gt; is now ignored in printk unless some text follows
the level and even then it still has to be the first in the
format message.

Note that the log_lvl parameter is now completely ignored in
show_stack_log_lvl and the stack is dumped with the default
level (like for quite some time already). It behaves the same as
the rest of the dump, function traces are dumped in the very
same manner. Only Code and maybe some lines are printed with
EMERG level.

Unfortunately I see no way to fix this conceptually so that
the whole oops/BUG/panic output has the same level, so this
removes only the superfluous characters for the time being.

Just for illustration:

&lt;4&gt;Process kworker/0:0 (pid: 0, threadinfo ffff88003c8a6000, task ffff88003c85c100)
&lt;0&gt;Stack:
&lt;4&gt; ffffffff818022c0 0000000a00000001 0000000000000001 0000000000000046
&lt;4&gt; ffff88003c8a7fd8 0000000000000001 ffff88003c8a7e58 0000000000000000
&lt;4&gt; ffff88003e503f48 ffffffff8102fc4c ffff88003e503f48 ffff88003e503f68
&lt;0&gt;Call Trace:
&lt;0&gt; &lt;IRQ&gt;
&lt;4&gt; [&lt;ffffffff8102fc4c&gt;] ? call_softirq+0x1c/0x30 ...
&lt;0&gt;Code: 00 01 00 00 65 8b 04 25 80 c5 00 00 c7 45 ...

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Cc: jirislaby@gmail.com
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
LKML-Reference: &lt;1287586131-16222-1-git-send-email-jslaby@suse.cz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The stack output currently looks like this:

 7fffffffffffffff 0000000a00000000 ffffffff81093341 0000000000000046
&lt;0&gt; ffff88003a545fd8 0000000000000000 0000000000000000 00007fffa39769c0
&lt;0&gt; ffff88003e403f58 ffffffff8102fc4c ffff88003e403f58 ffff88003e403f78

The superfluous &lt;0&gt; are caused by recent printk KERN_CONT
change. &lt;*&gt; is now ignored in printk unless some text follows
the level and even then it still has to be the first in the
format message.

Note that the log_lvl parameter is now completely ignored in
show_stack_log_lvl and the stack is dumped with the default
level (like for quite some time already). It behaves the same as
the rest of the dump, function traces are dumped in the very
same manner. Only Code and maybe some lines are printed with
EMERG level.

Unfortunately I see no way to fix this conceptually so that
the whole oops/BUG/panic output has the same level, so this
removes only the superfluous characters for the time being.

Just for illustration:

&lt;4&gt;Process kworker/0:0 (pid: 0, threadinfo ffff88003c8a6000, task ffff88003c85c100)
&lt;0&gt;Stack:
&lt;4&gt; ffffffff818022c0 0000000a00000001 0000000000000001 0000000000000046
&lt;4&gt; ffff88003c8a7fd8 0000000000000001 ffff88003c8a7e58 0000000000000000
&lt;4&gt; ffff88003e503f48 ffffffff8102fc4c ffff88003e503f48 ffff88003e503f68
&lt;0&gt;Call Trace:
&lt;0&gt; &lt;IRQ&gt;
&lt;4&gt; [&lt;ffffffff8102fc4c&gt;] ? call_softirq+0x1c/0x30 ...
&lt;0&gt;Code: 00 01 00 00 65 8b 04 25 80 c5 00 00 c7 45 ...

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Cc: jirislaby@gmail.com
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
LKML-Reference: &lt;1287586131-16222-1-git-send-email-jslaby@suse.cz&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Unify dumpstack.h and stacktrace.h</title>
<updated>2010-06-08T21:29:52+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2010-05-19T19:35:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c9cf4dbb4d9ca715d8fedf13301a53296429abc6'/>
<id>c9cf4dbb4d9ca715d8fedf13301a53296429abc6</id>
<content type='text'>
arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
declare headers of objects that deal with the same topic.
Actually most of the files that include stacktrace.h also include
dumpstack.h.

Although dumpstack.h seems more reserved for internals of stack
traces, those are quite often needed to define specialized stack
trace operations. And perf event arch headers are going to need
access to such low level operations anyway. So don't continue to
bother with dumpstack.h as it is no longer about isolated deep
internals.

v2: fix struct stack_frame definition conflict in sysprof

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Soeren Sandmann &lt;sandmann@daimi.au.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
declare headers of objects that deal with the same topic.
Actually most of the files that include stacktrace.h also include
dumpstack.h.

Although dumpstack.h seems more reserved for internals of stack
traces, those are quite often needed to define specialized stack
trace operations. And perf event arch headers are going to need
access to such low level operations anyway. So don't continue to
bother with dumpstack.h as it is no longer about isolated deep
internals.

v2: fix struct stack_frame definition conflict in sysprof

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Soeren Sandmann &lt;sandmann@daimi.au.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/x86-64: Use frame pointer to walk on irq and process stacks</title>
<updated>2010-03-10T13:26:40+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2010-03-03T06:38:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=61e67fb9d3ed13e6a7f58652ae4979b9c872fa57'/>
<id>61e67fb9d3ed13e6a7f58652ae4979b9c872fa57</id>
<content type='text'>
We were using the frame pointer based stack walker in every
context on x86-32, but not on x86-64, where we only use the
seven-league boots on the exception stacks.

Use it also on irq and process stacks. This greatly accelerates
the captures.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We were using the frame pointer based stack walker in every
context on x86-32, but not on x86-64, where we only use the
seven-league boots on the exception stacks.

Use it also on irq and process stacks. This greatly accelerates
the captures.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge commit 'v2.6.34-rc1' into perf/urgent</title>
<updated>2010-03-09T16:11:53+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2010-03-09T16:11:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=548b84166917d6f5e2296123b85ad24aecd3801d'/>
<id>548b84166917d6f5e2296123b85ad24aecd3801d</id>
<content type='text'>
Conflicts:
	tools/perf/util/probe-event.c

Merge reason: Pick up -rc1 and resolve the conflict as well.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	tools/perf/util/probe-event.c

Merge reason: Pick up -rc1 and resolve the conflict as well.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
