<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/trace, branch v4.13</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>perf/ftrace: Fix double traces of perf on ftrace:function</title>
<updated>2017-08-29T11:29:29+00:00</updated>
<author>
<name>Zhou Chengming</name>
<email>zhouchengming1@huawei.com</email>
</author>
<published>2017-08-25T13:49:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=75e8387685f6c65feb195a4556110b58f852b848'/>
<id>75e8387685f6c65feb195a4556110b58f852b848</id>
<content type='text'>
When running perf on the ftrace:function tracepoint, there is a bug
which can be reproduced by:

  perf record -e ftrace:function -a sleep 20 &amp;
  perf record -e ftrace:function ls
  perf script

              ls 10304 [005]   171.853235: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853237: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853239: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853240: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853242: ftrace:function:
  __task_pid_nr_ns
              ls 10304 [005]   171.853244: ftrace:function:
  __task_pid_nr_ns

We can see that all the function traces are doubled.

The problem is caused by the inconsistency of the register
function perf_ftrace_event_register() with the probe function
perf_ftrace_function_call(). The former registers one probe
for every perf_event. And the latter handles all perf_events
on the current cpu. So when two perf_events on the current cpu,
the traces of them will be doubled.

So this patch adds an extra parameter "event" for perf_tp_event,
only send sample data to this event when it's not NULL.

Signed-off-by: Zhou Chengming &lt;zhouchengming1@huawei.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: acme@kernel.org
Cc: alexander.shishkin@linux.intel.com
Cc: huawei.libin@huawei.com
Link: http://lkml.kernel.org/r/1503668977-12526-1-git-send-email-zhouchengming1@huawei.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When running perf on the ftrace:function tracepoint, there is a bug
which can be reproduced by:

  perf record -e ftrace:function -a sleep 20 &amp;
  perf record -e ftrace:function ls
  perf script

              ls 10304 [005]   171.853235: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853237: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853239: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853240: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853242: ftrace:function:
  __task_pid_nr_ns
              ls 10304 [005]   171.853244: ftrace:function:
  __task_pid_nr_ns

We can see that all the function traces are doubled.

The problem is caused by the inconsistency of the register
function perf_ftrace_event_register() with the probe function
perf_ftrace_function_call(). The former registers one probe
for every perf_event. And the latter handles all perf_events
on the current cpu. So when two perf_events on the current cpu,
the traces of them will be doubled.

So this patch adds an extra parameter "event" for perf_tp_event,
only send sample data to this event when it's not NULL.

Signed-off-by: Zhou Chengming &lt;zhouchengming1@huawei.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: acme@kernel.org
Cc: alexander.shishkin@linux.intel.com
Cc: huawei.libin@huawei.com
Link: http://lkml.kernel.org/r/1503668977-12526-1-git-send-email-zhouchengming1@huawei.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2017-08-24T21:08:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-08-24T21:08:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=415be6c256ddb5ef2eccd9f45fc67d73d49f466a'/>
<id>415be6c256ddb5ef2eccd9f45fc67d73d49f466a</id>
<content type='text'>
Pull tracing fixes from Steven Rostedt:
 "Various bug fixes:

   - Two small memory leaks in error paths.

   - A missed return error code on an error path.

   - A fix to check the tracing ring buffer CPU when it doesn't exist
     (caused by setting maxcpus on the command line that is less than
     the actual number of CPUs, and then onlining them manually).

   - A fix to have the reset of boot tracers called by lateinit_sync()
     instead of just lateinit(). As some of the tracers register via
     lateinit(), and if the clear happens before the tracer is
     registered, it will never start even though it was told to via the
     kernel command line"

* tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix freeing of filter in create_filter() when set_str is false
  tracing: Fix kmemleak in tracing_map_array_free()
  ftrace: Check for null ret_stack on profile function graph entry function
  ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
  tracing: Missing error code in tracer_alloc_buffers()
  tracing: Call clear_boot_tracer() at lateinit_sync
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull tracing fixes from Steven Rostedt:
 "Various bug fixes:

   - Two small memory leaks in error paths.

   - A missed return error code on an error path.

   - A fix to check the tracing ring buffer CPU when it doesn't exist
     (caused by setting maxcpus on the command line that is less than
     the actual number of CPUs, and then onlining them manually).

   - A fix to have the reset of boot tracers called by lateinit_sync()
     instead of just lateinit(). As some of the tracers register via
     lateinit(), and if the clear happens before the tracer is
     registered, it will never start even though it was told to via the
     kernel command line"

* tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix freeing of filter in create_filter() when set_str is false
  tracing: Fix kmemleak in tracing_map_array_free()
  ftrace: Check for null ret_stack on profile function graph entry function
  ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
  tracing: Missing error code in tracer_alloc_buffers()
  tracing: Call clear_boot_tracer() at lateinit_sync
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix freeing of filter in create_filter() when set_str is false</title>
<updated>2017-08-24T14:07:38+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-08-23T16:46:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8b0db1a5bdfcee0dbfa89607672598ae203c9045'/>
<id>8b0db1a5bdfcee0dbfa89607672598ae203c9045</id>
<content type='text'>
Performing the following task with kmemleak enabled:

 # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
 # echo 'enable_event:kmem:kmalloc:3 if irq &gt;' &gt; trigger
 # echo 'enable_event:kmem:kmalloc:3 if irq &gt; 31' &gt; trigger
 # echo scan &gt; /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800b9290308 (size 32):
  comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff81cef5aa&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff81357938&gt;] kmem_cache_alloc_trace+0x158/0x290
    [&lt;ffffffff81261c09&gt;] create_filter_start.constprop.28+0x99/0x940
    [&lt;ffffffff812639c9&gt;] create_filter+0xa9/0x160
    [&lt;ffffffff81263bdc&gt;] create_event_filter+0xc/0x10
    [&lt;ffffffff812655e5&gt;] set_trigger_filter+0xe5/0x210
    [&lt;ffffffff812660c4&gt;] event_enable_trigger_func+0x324/0x490
    [&lt;ffffffff812652e2&gt;] event_trigger_write+0x1a2/0x260
    [&lt;ffffffff8138cf87&gt;] __vfs_write+0xd7/0x380
    [&lt;ffffffff8138f421&gt;] vfs_write+0x101/0x260
    [&lt;ffffffff8139187b&gt;] SyS_write+0xab/0x130
    [&lt;ffffffff81cfd501&gt;] entry_SYSCALL_64_fastpath+0x1f/0xbe
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

The function create_filter() is passed a 'filterp' pointer that gets
allocated, and if "set_str" is true, it is up to the caller to free it, even
on error. The problem is that the pointer is not freed by create_filter()
when set_str is false. This is a bug, and it is not up to the caller to free
the filter on error if it doesn't care about the string.

Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 38b78eb85 ("tracing: Factorize filter creation")
Reported-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Tested-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Performing the following task with kmemleak enabled:

 # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
 # echo 'enable_event:kmem:kmalloc:3 if irq &gt;' &gt; trigger
 # echo 'enable_event:kmem:kmalloc:3 if irq &gt; 31' &gt; trigger
 # echo scan &gt; /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800b9290308 (size 32):
  comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff81cef5aa&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff81357938&gt;] kmem_cache_alloc_trace+0x158/0x290
    [&lt;ffffffff81261c09&gt;] create_filter_start.constprop.28+0x99/0x940
    [&lt;ffffffff812639c9&gt;] create_filter+0xa9/0x160
    [&lt;ffffffff81263bdc&gt;] create_event_filter+0xc/0x10
    [&lt;ffffffff812655e5&gt;] set_trigger_filter+0xe5/0x210
    [&lt;ffffffff812660c4&gt;] event_enable_trigger_func+0x324/0x490
    [&lt;ffffffff812652e2&gt;] event_trigger_write+0x1a2/0x260
    [&lt;ffffffff8138cf87&gt;] __vfs_write+0xd7/0x380
    [&lt;ffffffff8138f421&gt;] vfs_write+0x101/0x260
    [&lt;ffffffff8139187b&gt;] SyS_write+0xab/0x130
    [&lt;ffffffff81cfd501&gt;] entry_SYSCALL_64_fastpath+0x1f/0xbe
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

The function create_filter() is passed a 'filterp' pointer that gets
allocated, and if "set_str" is true, it is up to the caller to free it, even
on error. The problem is that the pointer is not freed by create_filter()
when set_str is false. This is a bug, and it is not up to the caller to free
the filter on error if it doesn't care about the string.

Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 38b78eb85 ("tracing: Factorize filter creation")
Reported-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Tested-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix kmemleak in tracing_map_array_free()</title>
<updated>2017-08-24T14:05:51+00:00</updated>
<author>
<name>Chunyu Hu</name>
<email>chuhu@redhat.com</email>
</author>
<published>2017-08-14T10:18:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=475bb3c69ab05df2a6ecef6acc2393703d134180'/>
<id>475bb3c69ab05df2a6ecef6acc2393703d134180</id>
<content type='text'>
kmemleak reported the below leak when I was doing clear of the hist
trigger. With this patch, the kmeamleak is gone.

unreferenced object 0xffff94322b63d760 (size 32):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
    10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
  backtrace:
    [&lt;ffffffff9e96c27a&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff9e424cba&gt;] kmem_cache_alloc_trace+0xca/0x1d0
    [&lt;ffffffff9e377736&gt;] tracing_map_array_alloc+0x26/0x140
    [&lt;ffffffff9e261be0&gt;] kretprobe_trampoline+0x0/0x50
    [&lt;ffffffff9e38b935&gt;] create_hist_data+0x535/0x750
    [&lt;ffffffff9e38bd47&gt;] event_hist_trigger_func+0x1f7/0x420
    [&lt;ffffffff9e38893d&gt;] event_trigger_write+0xfd/0x1a0
    [&lt;ffffffff9e44dfc7&gt;] __vfs_write+0x37/0x170
    [&lt;ffffffff9e44f552&gt;] vfs_write+0xb2/0x1b0
    [&lt;ffffffff9e450b85&gt;] SyS_write+0x55/0xc0
    [&lt;ffffffff9e203857&gt;] do_syscall_64+0x67/0x150
    [&lt;ffffffff9e977ce7&gt;] return_from_SYSCALL_64+0x0/0x6a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff
unreferenced object 0xffff9431f27aa880 (size 128):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
    00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
  backtrace:
    [&lt;ffffffff9e96c27a&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff9e425348&gt;] __kmalloc+0xe8/0x220
    [&lt;ffffffff9e3777c1&gt;] tracing_map_array_alloc+0xb1/0x140
    [&lt;ffffffff9e261be0&gt;] kretprobe_trampoline+0x0/0x50
    [&lt;ffffffff9e38b935&gt;] create_hist_data+0x535/0x750
    [&lt;ffffffff9e38bd47&gt;] event_hist_trigger_func+0x1f7/0x420
    [&lt;ffffffff9e38893d&gt;] event_trigger_write+0xfd/0x1a0
    [&lt;ffffffff9e44dfc7&gt;] __vfs_write+0x37/0x170
    [&lt;ffffffff9e44f552&gt;] vfs_write+0xb2/0x1b0
    [&lt;ffffffff9e450b85&gt;] SyS_write+0x55/0xc0
    [&lt;ffffffff9e203857&gt;] do_syscall_64+0x67/0x150
    [&lt;ffffffff9e977ce7&gt;] return_from_SYSCALL_64+0x0/0x6a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 08d43a5fa063 ("tracing: Add lock-free tracing_map")
Signed-off-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kmemleak reported the below leak when I was doing clear of the hist
trigger. With this patch, the kmeamleak is gone.

unreferenced object 0xffff94322b63d760 (size 32):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
    10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
  backtrace:
    [&lt;ffffffff9e96c27a&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff9e424cba&gt;] kmem_cache_alloc_trace+0xca/0x1d0
    [&lt;ffffffff9e377736&gt;] tracing_map_array_alloc+0x26/0x140
    [&lt;ffffffff9e261be0&gt;] kretprobe_trampoline+0x0/0x50
    [&lt;ffffffff9e38b935&gt;] create_hist_data+0x535/0x750
    [&lt;ffffffff9e38bd47&gt;] event_hist_trigger_func+0x1f7/0x420
    [&lt;ffffffff9e38893d&gt;] event_trigger_write+0xfd/0x1a0
    [&lt;ffffffff9e44dfc7&gt;] __vfs_write+0x37/0x170
    [&lt;ffffffff9e44f552&gt;] vfs_write+0xb2/0x1b0
    [&lt;ffffffff9e450b85&gt;] SyS_write+0x55/0xc0
    [&lt;ffffffff9e203857&gt;] do_syscall_64+0x67/0x150
    [&lt;ffffffff9e977ce7&gt;] return_from_SYSCALL_64+0x0/0x6a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff
unreferenced object 0xffff9431f27aa880 (size 128):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
    00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
  backtrace:
    [&lt;ffffffff9e96c27a&gt;] kmemleak_alloc+0x4a/0xa0
    [&lt;ffffffff9e425348&gt;] __kmalloc+0xe8/0x220
    [&lt;ffffffff9e3777c1&gt;] tracing_map_array_alloc+0xb1/0x140
    [&lt;ffffffff9e261be0&gt;] kretprobe_trampoline+0x0/0x50
    [&lt;ffffffff9e38b935&gt;] create_hist_data+0x535/0x750
    [&lt;ffffffff9e38bd47&gt;] event_hist_trigger_func+0x1f7/0x420
    [&lt;ffffffff9e38893d&gt;] event_trigger_write+0xfd/0x1a0
    [&lt;ffffffff9e44dfc7&gt;] __vfs_write+0x37/0x170
    [&lt;ffffffff9e44f552&gt;] vfs_write+0xb2/0x1b0
    [&lt;ffffffff9e450b85&gt;] SyS_write+0x55/0xc0
    [&lt;ffffffff9e203857&gt;] do_syscall_64+0x67/0x150
    [&lt;ffffffff9e977ce7&gt;] return_from_SYSCALL_64+0x0/0x6a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 08d43a5fa063 ("tracing: Add lock-free tracing_map")
Signed-off-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ftrace: Check for null ret_stack on profile function graph entry function</title>
<updated>2017-08-24T14:04:01+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-08-17T20:37:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a8f0f9e49956a74718874b800251455680085600'/>
<id>a8f0f9e49956a74718874b800251455680085600</id>
<content type='text'>
There's a small race when function graph shutsdown and the calling of the
registered function graph entry callback. The callback must not reference
the task's ret_stack without first checking that it is not NULL. Note, when
a ret_stack is allocated for a task, it stays allocated until the task exits.
The problem here, is that function_graph is shutdown, and a new task was
created, which doesn't have its ret_stack allocated. But since some of the
functions are still being traced, the callbacks can still be called.

The normal function_graph code handles this, but starting with commit
8861dd303c ("ftrace: Access ret_stack-&gt;subtime only in the function
profiler") the profiler code references the ret_stack on function entry, but
doesn't check if it is NULL first.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611

Cc: stable@vger.kernel.org
Fixes: 8861dd303c ("ftrace: Access ret_stack-&gt;subtime only in the function profiler")
Reported-by: lilydjwg@gmail.com
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's a small race when function graph shutsdown and the calling of the
registered function graph entry callback. The callback must not reference
the task's ret_stack without first checking that it is not NULL. Note, when
a ret_stack is allocated for a task, it stays allocated until the task exits.
The problem here, is that function_graph is shutdown, and a new task was
created, which doesn't have its ret_stack allocated. But since some of the
functions are still being traced, the callbacks can still be called.

The normal function_graph code handles this, but starting with commit
8861dd303c ("ftrace: Access ret_stack-&gt;subtime only in the function
profiler") the profiler code references the ret_stack on function entry, but
doesn't check if it is NULL first.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611

Cc: stable@vger.kernel.org
Fixes: 8861dd303c ("ftrace: Access ret_stack-&gt;subtime only in the function profiler")
Reported-by: lilydjwg@gmail.com
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: fix bpf_trace_printk on 32 bit archs</title>
<updated>2017-08-16T00:32:15+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-08-15T23:45:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=88a5c690b66110ad255380d8f629c629cf6ca559'/>
<id>88a5c690b66110ad255380d8f629c629cf6ca559</id>
<content type='text'>
James reported that on MIPS32 bpf_trace_printk() is currently
broken while MIPS64 works fine:

  bpf_trace_printk() uses conditional operators to attempt to
  pass different types to __trace_printk() depending on the
  format operators. This doesn't work as intended on 32-bit
  architectures where u32 and long are passed differently to
  u64, since the result of C conditional operators follows the
  "usual arithmetic conversions" rules, such that the values
  passed to __trace_printk() will always be u64 [causing issues
  later in the va_list handling for vscnprintf()].

  For example the samples/bpf/tracex5 test printed lines like
  below on MIPS32, where the fd and buf have come from the u64
  fd argument, and the size from the buf argument:

    [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)

  Instead of this:

    [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)

One way to get it working is to expand various combinations
of argument types into 8 different combinations for 32 bit
and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64
as well that it resolves the issue.

Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Tested-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
James reported that on MIPS32 bpf_trace_printk() is currently
broken while MIPS64 works fine:

  bpf_trace_printk() uses conditional operators to attempt to
  pass different types to __trace_printk() depending on the
  format operators. This doesn't work as intended on 32-bit
  architectures where u32 and long are passed differently to
  u64, since the result of C conditional operators follows the
  "usual arithmetic conversions" rules, such that the values
  passed to __trace_printk() will always be u64 [causing issues
  later in the va_list handling for vscnprintf()].

  For example the samples/bpf/tracex5 test printed lines like
  below on MIPS32, where the fd and buf have come from the u64
  fd argument, and the size from the buf argument:

    [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)

  Instead of this:

    [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)

One way to get it working is to expand various combinations
of argument types into 8 different combinations for 32 bit
and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64
as well that it resolves the issue.

Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Tested-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU</title>
<updated>2017-08-02T18:23:02+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-08-02T18:20:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a7e52ad7ed82e21273eccff93d1477a7b313aabb'/>
<id>a7e52ad7ed82e21273eccff93d1477a7b313aabb</id>
<content type='text'>
Chunyu Hu reported:
  "per_cpu trace directories and files are created for all possible cpus,
   but only the cpus which have ever been on-lined have their own per cpu
   ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
   open handler for trace file 'trace_pipe_raw' is always trying to access
   field of ring_buffer_per_cpu, and would panic with the NULL pointer.

   Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
   when openning it if that cpu does not have trace ring buffer.

   Reproduce:
   cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
   (cpu31 is never on-lined, this is a 16 cores x86_64 box)

   Tested with:
   1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
      Got -NODEV.
   2) oneline cpu15, read trace_pipe_raw of cpu15.
      Get the raw trace data.

   Call trace:
   [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
   [ 5760.961678]  tracing_buffers_read+0x1f6/0x230
   [ 5760.962695]  __vfs_read+0x37/0x160
   [ 5760.963498]  ? __vfs_read+0x5/0x160
   [ 5760.964339]  ? security_file_permission+0x9d/0xc0
   [ 5760.965451]  ? __vfs_read+0x5/0x160
   [ 5760.966280]  vfs_read+0x8c/0x130
   [ 5760.967070]  SyS_read+0x55/0xc0
   [ 5760.967779]  do_syscall_64+0x67/0x150
   [ 5760.968687]  entry_SYSCALL64_slow_path+0x25/0x25"

This was introduced by the addition of the feature to reuse reader pages
instead of re-allocating them. The problem is that the allocation of a
reader page (which is per cpu) does not check if the cpu is online and set
up for the ring buffer.

Link: http://lkml.kernel.org/r/1500880866-1177-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 73a757e63114 ("ring-buffer: Return reader page back into existing ring buffer")
Reported-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Chunyu Hu reported:
  "per_cpu trace directories and files are created for all possible cpus,
   but only the cpus which have ever been on-lined have their own per cpu
   ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
   open handler for trace file 'trace_pipe_raw' is always trying to access
   field of ring_buffer_per_cpu, and would panic with the NULL pointer.

   Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
   when openning it if that cpu does not have trace ring buffer.

   Reproduce:
   cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
   (cpu31 is never on-lined, this is a 16 cores x86_64 box)

   Tested with:
   1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
      Got -NODEV.
   2) oneline cpu15, read trace_pipe_raw of cpu15.
      Get the raw trace data.

   Call trace:
   [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
   [ 5760.961678]  tracing_buffers_read+0x1f6/0x230
   [ 5760.962695]  __vfs_read+0x37/0x160
   [ 5760.963498]  ? __vfs_read+0x5/0x160
   [ 5760.964339]  ? security_file_permission+0x9d/0xc0
   [ 5760.965451]  ? __vfs_read+0x5/0x160
   [ 5760.966280]  vfs_read+0x8c/0x130
   [ 5760.967070]  SyS_read+0x55/0xc0
   [ 5760.967779]  do_syscall_64+0x67/0x150
   [ 5760.968687]  entry_SYSCALL64_slow_path+0x25/0x25"

This was introduced by the addition of the feature to reuse reader pages
instead of re-allocating them. The problem is that the allocation of a
reader page (which is per cpu) does not check if the cpu is online and set
up for the ring buffer.

Link: http://lkml.kernel.org/r/1500880866-1177-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 73a757e63114 ("ring-buffer: Return reader page back into existing ring buffer")
Reported-by: Chunyu Hu &lt;chuhu@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Missing error code in tracer_alloc_buffers()</title>
<updated>2017-08-02T18:19:57+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2017-08-01T11:02:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=147d88e0b5eb90191bc5c12ca0a3c410b75a13d2'/>
<id>147d88e0b5eb90191bc5c12ca0a3c410b75a13d2</id>
<content type='text'>
If ring_buffer_alloc() or one of the next couple function calls fail
then we should return -ENOMEM but the current code returns success.

Link: http://lkml.kernel.org/r/20170801110201.ajdkct7vwzixahvx@mwanda

Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: stable@vger.kernel.org
Fixes: b32614c03413 ('tracing/rb: Convert to hotplug state machine')
Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If ring_buffer_alloc() or one of the next couple function calls fail
then we should return -ENOMEM but the current code returns success.

Link: http://lkml.kernel.org/r/20170801110201.ajdkct7vwzixahvx@mwanda

Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: stable@vger.kernel.org
Fixes: b32614c03413 ('tracing/rb: Convert to hotplug state machine')
Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Call clear_boot_tracer() at lateinit_sync</title>
<updated>2017-08-02T18:19:57+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-08-01T16:01:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4bb0f0e73c8c30917d169c4a0f1ac083690c545b'/>
<id>4bb0f0e73c8c30917d169c4a0f1ac083690c545b</id>
<content type='text'>
The clear_boot_tracer function is used to reset the default_bootup_tracer
string to prevent it from being accessed after boot, as it originally points
to init data. But since clear_boot_tracer() is called via the
init_lateinit() call, it races with the initcall for registering the hwlat
tracer. If someone adds "ftrace=hwlat" to the kernel command line, depending
on how the linker sets up the text, the saved command line may be cleared,
and the hwlat tracer never is initialized.

Simply have the clear_boot_tracer() be called by initcall_lateinit_sync() as
that's for tasks to be called after lateinit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196551

Cc: stable@vger.kernel.org
Fixes: e7c15cd8a ("tracing: Added hardware latency tracer")
Reported-by: Zamir SUN &lt;sztsian@gmail.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The clear_boot_tracer function is used to reset the default_bootup_tracer
string to prevent it from being accessed after boot, as it originally points
to init data. But since clear_boot_tracer() is called via the
init_lateinit() call, it races with the initcall for registering the hwlat
tracer. If someone adds "ftrace=hwlat" to the kernel command line, depending
on how the linker sets up the text, the saved command line may be cleared,
and the hwlat tracer never is initialized.

Simply have the clear_boot_tracer() be called by initcall_lateinit_sync() as
that's for tasks to be called after lateinit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196551

Cc: stable@vger.kernel.org
Fixes: e7c15cd8a ("tracing: Added hardware latency tracer")
Reported-by: Zamir SUN &lt;sztsian@gmail.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>trace: fix the errors caused by incompatible type of RCU variables</title>
<updated>2017-07-20T13:27:29+00:00</updated>
<author>
<name>Chunyan Zhang</name>
<email>zhang.chunyan@linaro.org</email>
</author>
<published>2017-06-07T08:12:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f86f418059b94aa01f9342611a272ca60c583e89'/>
<id>f86f418059b94aa01f9342611a272ca60c583e89</id>
<content type='text'>
The variables which are processed by RCU functions should be annotated
as RCU, otherwise sparse will report the errors like below:

"error: incompatible types in comparison expression (different
address spaces)"

Link: http://lkml.kernel.org/r/1496823171-7758-1-git-send-email-zhang.chunyan@linaro.org

Signed-off-by: Chunyan Zhang &lt;zhang.chunyan@linaro.org&gt;
[ Updated to not be 100% 80 column strict ]
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The variables which are processed by RCU functions should be annotated
as RCU, otherwise sparse will report the errors like below:

"error: incompatible types in comparison expression (different
address spaces)"

Link: http://lkml.kernel.org/r/1496823171-7758-1-git-send-email-zhang.chunyan@linaro.org

Signed-off-by: Chunyan Zhang &lt;zhang.chunyan@linaro.org&gt;
[ Updated to not be 100% 80 column strict ]
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
