<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/trace, branch linux-6.4.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY</title>
<updated>2023-09-13T07:48:45+00:00</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2023-08-31T12:55:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e447d28a9313efe5f53a236b7cf60d9e65a16fcb'/>
<id>e447d28a9313efe5f53a236b7cf60d9e65a16fcb</id>
<content type='text'>
commit 3d07fa1dd19035eb0b13ae6697efd5caa9033e74 upstream.

The pipe cpumask used to serialize opens between the main and percpu
trace pipes is not zeroed or initialized. This can result in
spurious -EBUSY returns if underlying memory is not fully zeroed.
This has been observed by immediate failure to read the main
trace_pipe file on an otherwise newly booted and idle system:

 # cat /sys/kernel/debug/tracing/trace_pipe
 cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy

Zero the allocation of pipe_cpumask to avoid the problem.

Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com

Cc: stable@vger.kernel.org
Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Reviewed-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3d07fa1dd19035eb0b13ae6697efd5caa9033e74 upstream.

The pipe cpumask used to serialize opens between the main and percpu
trace pipes is not zeroed or initialized. This can result in
spurious -EBUSY returns if underlying memory is not fully zeroed.
This has been observed by immediate failure to read the main
trace_pipe file on an otherwise newly booted and idle system:

 # cat /sys/kernel/debug/tracing/trace_pipe
 cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy

Zero the allocation of pipe_cpumask to avoid the problem.

Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com

Cc: stable@vger.kernel.org
Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Reviewed-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix race issue between cpu buffer write and swap</title>
<updated>2023-09-13T07:48:34+00:00</updated>
<author>
<name>Zheng Yejian</name>
<email>zhengyejian1@huawei.com</email>
</author>
<published>2023-08-31T13:27:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=89c89da92a60028013f9539be0dcce7e44405a43'/>
<id>89c89da92a60028013f9539be0dcce7e44405a43</id>
<content type='text'>
[ Upstream commit 3163f635b20e9e1fb4659e74f47918c9dddfe64e ]

Warning happened in rb_end_commit() at code:
	if (RB_WARN_ON(cpu_buffer, !local_read(&amp;cpu_buffer-&gt;committing)))

  WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142
	rb_commit+0x402/0x4a0
  Call Trace:
   ring_buffer_unlock_commit+0x42/0x250
   trace_buffer_unlock_commit_regs+0x3b/0x250
   trace_event_buffer_commit+0xe5/0x440
   trace_event_buffer_reserve+0x11c/0x150
   trace_event_raw_event_sched_switch+0x23c/0x2c0
   __traceiter_sched_switch+0x59/0x80
   __schedule+0x72b/0x1580
   schedule+0x92/0x120
   worker_thread+0xa0/0x6f0

It is because the race between writing event into cpu buffer and swapping
cpu buffer through file per_cpu/cpu0/snapshot:

  Write on CPU 0             Swap buffer by per_cpu/cpu0/snapshot on CPU 1
  --------                   --------
                             tracing_snapshot_write()
                               [...]

  ring_buffer_lock_reserve()
    cpu_buffer = buffer-&gt;buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
    [...]
    rb_reserve_next_event()
      [...]

                               ring_buffer_swap_cpu()
                                 if (local_read(&amp;cpu_buffer_a-&gt;committing))
                                     goto out_dec;
                                 if (local_read(&amp;cpu_buffer_b-&gt;committing))
                                     goto out_dec;
                                 buffer_a-&gt;buffers[cpu] = cpu_buffer_b;
                                 buffer_b-&gt;buffers[cpu] = cpu_buffer_a;
                                 // 2. cpu_buffer has swapped here.

      rb_start_commit(cpu_buffer);
      if (unlikely(READ_ONCE(cpu_buffer-&gt;buffer)
          != buffer)) { // 3. This check passed due to 'cpu_buffer-&gt;buffer'
        [...]           //    has not changed here.
        return NULL;
      }
                                 cpu_buffer_b-&gt;buffer = buffer_a;
                                 cpu_buffer_a-&gt;buffer = buffer_b;
                                 [...]

      // 4. Reserve event from 'cpu_buffer_a'.

  ring_buffer_unlock_commit()
    [...]
    cpu_buffer = buffer-&gt;buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
    rb_commit(cpu_buffer)
      rb_end_commit()  // 6. WARN for the wrong 'committing' state !!!

Based on above analysis, we can easily reproduce by following testcase:
  ``` bash
  #!/bin/bash

  dmesg -n 7
  sysctl -w kernel.panic_on_warn=1
  TR=/sys/kernel/tracing
  echo 7 &gt; ${TR}/buffer_size_kb
  echo "sched:sched_switch" &gt; ${TR}/set_event
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  ```

To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that above race would be
avoided.

Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com

Cc: &lt;mhiramat@kernel.org&gt;
Fixes: f1affcaaa861 ("tracing: Add snapshot in the per_cpu trace directories")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 3163f635b20e9e1fb4659e74f47918c9dddfe64e ]

Warning happened in rb_end_commit() at code:
	if (RB_WARN_ON(cpu_buffer, !local_read(&amp;cpu_buffer-&gt;committing)))

  WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142
	rb_commit+0x402/0x4a0
  Call Trace:
   ring_buffer_unlock_commit+0x42/0x250
   trace_buffer_unlock_commit_regs+0x3b/0x250
   trace_event_buffer_commit+0xe5/0x440
   trace_event_buffer_reserve+0x11c/0x150
   trace_event_raw_event_sched_switch+0x23c/0x2c0
   __traceiter_sched_switch+0x59/0x80
   __schedule+0x72b/0x1580
   schedule+0x92/0x120
   worker_thread+0xa0/0x6f0

It is because the race between writing event into cpu buffer and swapping
cpu buffer through file per_cpu/cpu0/snapshot:

  Write on CPU 0             Swap buffer by per_cpu/cpu0/snapshot on CPU 1
  --------                   --------
                             tracing_snapshot_write()
                               [...]

  ring_buffer_lock_reserve()
    cpu_buffer = buffer-&gt;buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
    [...]
    rb_reserve_next_event()
      [...]

                               ring_buffer_swap_cpu()
                                 if (local_read(&amp;cpu_buffer_a-&gt;committing))
                                     goto out_dec;
                                 if (local_read(&amp;cpu_buffer_b-&gt;committing))
                                     goto out_dec;
                                 buffer_a-&gt;buffers[cpu] = cpu_buffer_b;
                                 buffer_b-&gt;buffers[cpu] = cpu_buffer_a;
                                 // 2. cpu_buffer has swapped here.

      rb_start_commit(cpu_buffer);
      if (unlikely(READ_ONCE(cpu_buffer-&gt;buffer)
          != buffer)) { // 3. This check passed due to 'cpu_buffer-&gt;buffer'
        [...]           //    has not changed here.
        return NULL;
      }
                                 cpu_buffer_b-&gt;buffer = buffer_a;
                                 cpu_buffer_a-&gt;buffer = buffer_b;
                                 [...]

      // 4. Reserve event from 'cpu_buffer_a'.

  ring_buffer_unlock_commit()
    [...]
    cpu_buffer = buffer-&gt;buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
    rb_commit(cpu_buffer)
      rb_end_commit()  // 6. WARN for the wrong 'committing' state !!!

Based on above analysis, we can easily reproduce by following testcase:
  ``` bash
  #!/bin/bash

  dmesg -n 7
  sysctl -w kernel.panic_on_warn=1
  TR=/sys/kernel/tracing
  echo 7 &gt; ${TR}/buffer_size_kb
  echo "sched:sched_switch" &gt; ${TR}/set_event
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  ```

To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that above race would be
avoided.

Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com

Cc: &lt;mhiramat@kernel.org&gt;
Fixes: f1affcaaa861 ("tracing: Add snapshot in the per_cpu trace directories")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Remove extra space at the end of hwlat_detector/mode</title>
<updated>2023-09-13T07:48:34+00:00</updated>
<author>
<name>Mikhail Kobuk</name>
<email>m.kobuk@ispras.ru</email>
</author>
<published>2023-08-25T10:34:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e0c6a3679e3bc454cd8294a92e51e8c40127d6bb'/>
<id>e0c6a3679e3bc454cd8294a92e51e8c40127d6bb</id>
<content type='text'>
[ Upstream commit 2cf0dee989a8b2501929eaab29473b6b1fa11057 ]

Space is printed after each mode value including the last one:
$ echo \"$(sudo cat /sys/kernel/tracing/hwlat_detector/mode)\"
"none [round-robin] per-cpu "

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lore.kernel.org/linux-trace-kernel/20230825103432.7750-1-m.kobuk@ispras.ru

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 8fa826b7344d ("trace/hwlat: Implement the mode config option")
Signed-off-by: Mikhail Kobuk &lt;m.kobuk@ispras.ru&gt;
Reviewed-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Acked-by: Daniel Bristot de Oliveira &lt;bristot@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2cf0dee989a8b2501929eaab29473b6b1fa11057 ]

Space is printed after each mode value including the last one:
$ echo \"$(sudo cat /sys/kernel/tracing/hwlat_detector/mode)\"
"none [round-robin] per-cpu "

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lore.kernel.org/linux-trace-kernel/20230825103432.7750-1-m.kobuk@ispras.ru

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 8fa826b7344d ("trace/hwlat: Implement the mode config option")
Signed-off-by: Mikhail Kobuk &lt;m.kobuk@ispras.ru&gt;
Reviewed-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Acked-by: Daniel Bristot de Oliveira &lt;bristot@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Clear the probe_addr for uprobe</title>
<updated>2023-09-13T07:48:00+00:00</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2023-07-09T02:56:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=62c4571f2752c52f1abac9251811fd39493056c6'/>
<id>62c4571f2752c52f1abac9251811fd39493056c6</id>
<content type='text'>
[ Upstream commit 5125e757e62f6c1d5478db4c2b61a744060ddf3f ]

To avoid returning uninitialized or random values when querying the file
descriptor (fd) and accessing probe_addr, it is necessary to clear the
variable prior to its use.

Fixes: 41bdc4b40ed6 ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/r/20230709025630.3735-6-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5125e757e62f6c1d5478db4c2b61a744060ddf3f ]

To avoid returning uninitialized or random values when querying the file
descriptor (fd) and accessing probe_addr, it is necessary to clear the
variable prior to its use.

Fixes: 41bdc4b40ed6 ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/r/20230709025630.3735-6-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Introduce pipe_cpumask to avoid race on trace_pipes</title>
<updated>2023-09-13T07:47:55+00:00</updated>
<author>
<name>Zheng Yejian</name>
<email>zhengyejian1@huawei.com</email>
</author>
<published>2023-08-18T02:26:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=afdd9839e8b7d573da5ca0769579d52ddd365ea6'/>
<id>afdd9839e8b7d573da5ca0769579d52ddd365ea6</id>
<content type='text'>
[ Upstream commit c2489bb7e6be2e8cdced12c16c42fa128403ac03 ]

There is race issue when concurrently splice_read main trace_pipe and
per_cpu trace_pipes which will result in data read out being different
from what actually writen.

As suggested by Steven:
  &gt; I believe we should add a ref count to trace_pipe and the per_cpu
  &gt; trace_pipes, where if they are opened, nothing else can read it.
  &gt;
  &gt; Opening trace_pipe locks all per_cpu ref counts, if any of them are
  &gt; open, then the trace_pipe open will fail (and releases any ref counts
  &gt; it had taken).
  &gt;
  &gt; Opening a per_cpu trace_pipe will up the ref count for just that
  &gt; CPU buffer. This will allow multiple tasks to read different per_cpu
  &gt; trace_pipe files, but will prevent the main trace_pipe file from
  &gt; being opened.

But because we only need to know whether per_cpu trace_pipe is open or
not, using a cpumask instead of using ref count may be easier.

After this patch, users will find that:
 - Main trace_pipe can be opened by only one user, and if it is
   opened, all per_cpu trace_pipes cannot be opened;
 - Per_cpu trace_pipes can be opened by multiple users, but each per_cpu
   trace_pipe can only be opened by one user. And if one of them is
   opened, main trace_pipe cannot be opened.

Link: https://lore.kernel.org/linux-trace-kernel/20230818022645.1948314-1-zhengyejian1@huawei.com

Suggested-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c2489bb7e6be2e8cdced12c16c42fa128403ac03 ]

There is race issue when concurrently splice_read main trace_pipe and
per_cpu trace_pipes which will result in data read out being different
from what actually writen.

As suggested by Steven:
  &gt; I believe we should add a ref count to trace_pipe and the per_cpu
  &gt; trace_pipes, where if they are opened, nothing else can read it.
  &gt;
  &gt; Opening trace_pipe locks all per_cpu ref counts, if any of them are
  &gt; open, then the trace_pipe open will fail (and releases any ref counts
  &gt; it had taken).
  &gt;
  &gt; Opening a per_cpu trace_pipe will up the ref count for just that
  &gt; CPU buffer. This will allow multiple tasks to read different per_cpu
  &gt; trace_pipe files, but will prevent the main trace_pipe file from
  &gt; being opened.

But because we only need to know whether per_cpu trace_pipe is open or
not, using a cpumask instead of using ref count may be easier.

After this patch, users will find that:
 - Main trace_pipe can be opened by only one user, and if it is
   opened, all per_cpu trace_pipes cannot be opened;
 - Per_cpu trace_pipes can be opened by multiple users, but each per_cpu
   trace_pipe can only be opened by one user. And if one of them is
   opened, main trace_pipe cannot be opened.

Link: https://lore.kernel.org/linux-trace-kernel/20230818022645.1948314-1-zhengyejian1@huawei.com

Suggested-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix memleak due to race between current_tracer and trace</title>
<updated>2023-08-30T12:52:29+00:00</updated>
<author>
<name>Zheng Yejian</name>
<email>zhengyejian1@huawei.com</email>
</author>
<published>2023-08-17T12:55:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2242640e9bd94e706acf75c60a2ab1d0e150e0fb'/>
<id>2242640e9bd94e706acf75c60a2ab1d0e150e0fb</id>
<content type='text'>
[ Upstream commit eecb91b9f98d6427d4af5fdb8f108f52572a39e7 ]

Kmemleak report a leak in graph_trace_open():

  unreferenced object 0xffff0040b95f4a00 (size 128):
    comm "cat", pid 204981, jiffies 4301155872 (age 99771.964s)
    hex dump (first 32 bytes):
      e0 05 e7 b4 ab 7d 00 00 0b 00 01 00 00 00 00 00 .....}..........
      f4 00 01 10 00 a0 ff ff 00 00 00 00 65 00 10 00 ............e...
    backtrace:
      [&lt;000000005db27c8b&gt;] kmem_cache_alloc_trace+0x348/0x5f0
      [&lt;000000007df90faa&gt;] graph_trace_open+0xb0/0x344
      [&lt;00000000737524cd&gt;] __tracing_open+0x450/0xb10
      [&lt;0000000098043327&gt;] tracing_open+0x1a0/0x2a0
      [&lt;00000000291c3876&gt;] do_dentry_open+0x3c0/0xdc0
      [&lt;000000004015bcd6&gt;] vfs_open+0x98/0xd0
      [&lt;000000002b5f60c9&gt;] do_open+0x520/0x8d0
      [&lt;00000000376c7820&gt;] path_openat+0x1c0/0x3e0
      [&lt;00000000336a54b5&gt;] do_filp_open+0x14c/0x324
      [&lt;000000002802df13&gt;] do_sys_openat2+0x2c4/0x530
      [&lt;0000000094eea458&gt;] __arm64_sys_openat+0x130/0x1c4
      [&lt;00000000a71d7881&gt;] el0_svc_common.constprop.0+0xfc/0x394
      [&lt;00000000313647bf&gt;] do_el0_svc+0xac/0xec
      [&lt;000000002ef1c651&gt;] el0_svc+0x20/0x30
      [&lt;000000002fd4692a&gt;] el0_sync_handler+0xb0/0xb4
      [&lt;000000000c309c35&gt;] el0_sync+0x160/0x180

The root cause is descripted as follows:

  __tracing_open() {  // 1. File 'trace' is being opened;
    ...
    *iter-&gt;trace = *tr-&gt;current_trace;  // 2. Tracer 'function_graph' is
                                        //    currently set;
    ...
    iter-&gt;trace-&gt;open(iter);  // 3. Call graph_trace_open() here,
                              //    and memory are allocated in it;
    ...
  }

  s_start() {  // 4. The opened file is being read;
    ...
    *iter-&gt;trace = *tr-&gt;current_trace;  // 5. If tracer is switched to
                                        //    'nop' or others, then memory
                                        //    in step 3 are leaked!!!
    ...
  }

To fix it, in s_start(), close tracer before switching then reopen the
new tracer after switching. And some tracers like 'wakeup' may not update
'iter-&gt;private' in some cases when reopen, then it should be cleared
to avoid being mistakenly closed again.

Link: https://lore.kernel.org/linux-trace-kernel/20230817125539.1646321-1-zhengyejian1@huawei.com

Fixes: d7350c3f4569 ("tracing/core: make the read callbacks reentrants")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit eecb91b9f98d6427d4af5fdb8f108f52572a39e7 ]

Kmemleak report a leak in graph_trace_open():

  unreferenced object 0xffff0040b95f4a00 (size 128):
    comm "cat", pid 204981, jiffies 4301155872 (age 99771.964s)
    hex dump (first 32 bytes):
      e0 05 e7 b4 ab 7d 00 00 0b 00 01 00 00 00 00 00 .....}..........
      f4 00 01 10 00 a0 ff ff 00 00 00 00 65 00 10 00 ............e...
    backtrace:
      [&lt;000000005db27c8b&gt;] kmem_cache_alloc_trace+0x348/0x5f0
      [&lt;000000007df90faa&gt;] graph_trace_open+0xb0/0x344
      [&lt;00000000737524cd&gt;] __tracing_open+0x450/0xb10
      [&lt;0000000098043327&gt;] tracing_open+0x1a0/0x2a0
      [&lt;00000000291c3876&gt;] do_dentry_open+0x3c0/0xdc0
      [&lt;000000004015bcd6&gt;] vfs_open+0x98/0xd0
      [&lt;000000002b5f60c9&gt;] do_open+0x520/0x8d0
      [&lt;00000000376c7820&gt;] path_openat+0x1c0/0x3e0
      [&lt;00000000336a54b5&gt;] do_filp_open+0x14c/0x324
      [&lt;000000002802df13&gt;] do_sys_openat2+0x2c4/0x530
      [&lt;0000000094eea458&gt;] __arm64_sys_openat+0x130/0x1c4
      [&lt;00000000a71d7881&gt;] el0_svc_common.constprop.0+0xfc/0x394
      [&lt;00000000313647bf&gt;] do_el0_svc+0xac/0xec
      [&lt;000000002ef1c651&gt;] el0_svc+0x20/0x30
      [&lt;000000002fd4692a&gt;] el0_sync_handler+0xb0/0xb4
      [&lt;000000000c309c35&gt;] el0_sync+0x160/0x180

The root cause is descripted as follows:

  __tracing_open() {  // 1. File 'trace' is being opened;
    ...
    *iter-&gt;trace = *tr-&gt;current_trace;  // 2. Tracer 'function_graph' is
                                        //    currently set;
    ...
    iter-&gt;trace-&gt;open(iter);  // 3. Call graph_trace_open() here,
                              //    and memory are allocated in it;
    ...
  }

  s_start() {  // 4. The opened file is being read;
    ...
    *iter-&gt;trace = *tr-&gt;current_trace;  // 5. If tracer is switched to
                                        //    'nop' or others, then memory
                                        //    in step 3 are leaked!!!
    ...
  }

To fix it, in s_start(), close tracer before switching then reopen the
new tracer after switching. And some tracers like 'wakeup' may not update
'iter-&gt;private' in some cases when reopen, then it should be cleared
to avoid being mistakenly closed again.

Link: https://lore.kernel.org/linux-trace-kernel/20230817125539.1646321-1-zhengyejian1@huawei.com

Fixes: d7350c3f4569 ("tracing/core: make the read callbacks reentrants")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/synthetic: Allocate one additional element for size</title>
<updated>2023-08-30T12:52:29+00:00</updated>
<author>
<name>Sven Schnelle</name>
<email>svens@linux.ibm.com</email>
</author>
<published>2023-08-16T15:49:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=49834a2c43d5543631b576345f743c587f5242f3'/>
<id>49834a2c43d5543631b576345f743c587f5242f3</id>
<content type='text'>
[ Upstream commit c4d6b5438116c184027b2e911c0f2c7c406fb47c ]

While debugging another issue I noticed that the stack trace contains one
invalid entry at the end:

&lt;idle&gt;-0       [008] d..4.    26.484201: wake_lat: pid=0 delta=2629976084 000000009cc24024 stack=STACK:
=&gt; __schedule+0xac6/0x1a98
=&gt; schedule+0x126/0x2c0
=&gt; schedule_timeout+0x150/0x2c0
=&gt; kcompactd+0x9ca/0xc20
=&gt; kthread+0x2f6/0x3d8
=&gt; __ret_from_fork+0x8a/0xe8
=&gt; 0x6b6b6b6b6b6b6b6b

This is because the code failed to add the one element containing the
number of entries to field_size.

Link: https://lkml.kernel.org/r/20230816154928.4171614-4-svens@linux.ibm.com

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c4d6b5438116c184027b2e911c0f2c7c406fb47c ]

While debugging another issue I noticed that the stack trace contains one
invalid entry at the end:

&lt;idle&gt;-0       [008] d..4.    26.484201: wake_lat: pid=0 delta=2629976084 000000009cc24024 stack=STACK:
=&gt; __schedule+0xac6/0x1a98
=&gt; schedule+0x126/0x2c0
=&gt; schedule_timeout+0x150/0x2c0
=&gt; kcompactd+0x9ca/0xc20
=&gt; kthread+0x2f6/0x3d8
=&gt; __ret_from_fork+0x8a/0xe8
=&gt; 0x6b6b6b6b6b6b6b6b

This is because the code failed to add the one element containing the
number of entries to field_size.

Link: https://lkml.kernel.org/r/20230816154928.4171614-4-svens@linux.ibm.com

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/synthetic: Skip first entry for stack traces</title>
<updated>2023-08-30T12:52:29+00:00</updated>
<author>
<name>Sven Schnelle</name>
<email>svens@linux.ibm.com</email>
</author>
<published>2023-08-16T15:49:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=009e77a9169082c7c9813d4994530f020326b1e1'/>
<id>009e77a9169082c7c9813d4994530f020326b1e1</id>
<content type='text'>
[ Upstream commit 887f92e09ef34a949745ad26ce82be69e2dabcf6 ]

While debugging another issue I noticed that the stack trace output
contains the number of entries on top:

         &lt;idle&gt;-0       [000] d..4.   203.322502: wake_lat: pid=0 delta=2268270616 stack=STACK:
=&gt; 0x10
=&gt; __schedule+0xac6/0x1a98
=&gt; schedule+0x126/0x2c0
=&gt; schedule_timeout+0x242/0x2c0
=&gt; __wait_for_common+0x434/0x680
=&gt; __wait_rcu_gp+0x198/0x3e0
=&gt; synchronize_rcu+0x112/0x138
=&gt; ring_buffer_reset_online_cpus+0x140/0x2e0
=&gt; tracing_reset_online_cpus+0x15c/0x1d0
=&gt; tracing_set_clock+0x180/0x1d8
=&gt; hist_register_trigger+0x486/0x670
=&gt; event_hist_trigger_parse+0x494/0x1318
=&gt; trigger_process_regex+0x1d4/0x258
=&gt; event_trigger_write+0xb4/0x170
=&gt; vfs_write+0x210/0xad0
=&gt; ksys_write+0x122/0x208

Fix this by skipping the first element. Also replace the pointer
logic with an index variable which is easier to read.

Link: https://lkml.kernel.org/r/20230816154928.4171614-3-svens@linux.ibm.com

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 887f92e09ef34a949745ad26ce82be69e2dabcf6 ]

While debugging another issue I noticed that the stack trace output
contains the number of entries on top:

         &lt;idle&gt;-0       [000] d..4.   203.322502: wake_lat: pid=0 delta=2268270616 stack=STACK:
=&gt; 0x10
=&gt; __schedule+0xac6/0x1a98
=&gt; schedule+0x126/0x2c0
=&gt; schedule_timeout+0x242/0x2c0
=&gt; __wait_for_common+0x434/0x680
=&gt; __wait_rcu_gp+0x198/0x3e0
=&gt; synchronize_rcu+0x112/0x138
=&gt; ring_buffer_reset_online_cpus+0x140/0x2e0
=&gt; tracing_reset_online_cpus+0x15c/0x1d0
=&gt; tracing_set_clock+0x180/0x1d8
=&gt; hist_register_trigger+0x486/0x670
=&gt; event_hist_trigger_parse+0x494/0x1318
=&gt; trigger_process_regex+0x1d4/0x258
=&gt; event_trigger_write+0xb4/0x170
=&gt; vfs_write+0x210/0xad0
=&gt; ksys_write+0x122/0x208

Fix this by skipping the first element. Also replace the pointer
logic with an index variable which is easier to read.

Link: https://lkml.kernel.org/r/20230816154928.4171614-3-svens@linux.ibm.com

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing/synthetic: Use union instead of casts</title>
<updated>2023-08-30T12:52:29+00:00</updated>
<author>
<name>Sven Schnelle</name>
<email>svens@linux.ibm.com</email>
</author>
<published>2023-08-16T15:49:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5c2d886ea8cd1b893cd00138688db6245abe76e3'/>
<id>5c2d886ea8cd1b893cd00138688db6245abe76e3</id>
<content type='text'>
[ Upstream commit ddeea494a16f32522bce16ee65f191d05d4b8282 ]

The current code uses a lot of casts to access the fields member in struct
synth_trace_events with different sizes.  This makes the code hard to
read, and had already introduced an endianness bug. Use a union and struct
instead.

Link: https://lkml.kernel.org/r/20230816154928.4171614-2-svens@linux.ibm.com

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9dd ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Stable-dep-of: 887f92e09ef3 ("tracing/synthetic: Skip first entry for stack traces")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ddeea494a16f32522bce16ee65f191d05d4b8282 ]

The current code uses a lot of casts to access the fields member in struct
synth_trace_events with different sizes.  This makes the code hard to
read, and had already introduced an endianness bug. Use a union and struct
instead.

Link: https://lkml.kernel.org/r/20230816154928.4171614-2-svens@linux.ibm.com

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 00cf3d672a9dd ("tracing: Allow synthetic events to pass around stacktraces")
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Stable-dep-of: 887f92e09ef3 ("tracing/synthetic: Skip first entry for stack traces")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix cpu buffers unavailable due to 'record_disabled' missed</title>
<updated>2023-08-30T12:52:28+00:00</updated>
<author>
<name>Zheng Yejian</name>
<email>zhengyejian1@huawei.com</email>
</author>
<published>2023-08-05T03:38:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=299e0033f1bd5320279b2db971914374499f7e5f'/>
<id>299e0033f1bd5320279b2db971914374499f7e5f</id>
<content type='text'>
[ Upstream commit b71645d6af10196c46cbe3732de2ea7d36b3ff6d ]

Trace ring buffer can no longer record anything after executing
following commands at the shell prompt:

  # cd /sys/kernel/tracing
  # cat tracing_cpumask
  fff
  # echo 0 &gt; tracing_cpumask
  # echo 1 &gt; snapshot
  # echo fff &gt; tracing_cpumask
  # echo 1 &gt; tracing_on
  # echo "hello world" &gt; trace_marker
  -bash: echo: write error: Bad file descriptor

The root cause is that:
  1. After `echo 0 &gt; tracing_cpumask`, 'record_disabled' of cpu buffers
     in 'tr-&gt;array_buffer.buffer' became 1 (see tracing_set_cpumask());
  2. After `echo 1 &gt; snapshot`, 'tr-&gt;array_buffer.buffer' is swapped
     with 'tr-&gt;max_buffer.buffer', then the 'record_disabled' became 0
     (see update_max_tr());
  3. After `echo fff &gt; tracing_cpumask`, the 'record_disabled' become -1;
Then array_buffer and max_buffer are both unavailable due to value of
'record_disabled' is not 0.

To fix it, enable or disable both array_buffer and max_buffer at the same
time in tracing_set_cpumask().

Link: https://lkml.kernel.org/r/20230805033816.3284594-2-zhengyejian1@huawei.com

Cc: &lt;mhiramat@kernel.org&gt;
Cc: &lt;vnagarnaik@google.com&gt;
Cc: &lt;shuah@kernel.org&gt;
Fixes: 71babb2705e2 ("tracing: change CPU ring buffer state from tracing_cpumask")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b71645d6af10196c46cbe3732de2ea7d36b3ff6d ]

Trace ring buffer can no longer record anything after executing
following commands at the shell prompt:

  # cd /sys/kernel/tracing
  # cat tracing_cpumask
  fff
  # echo 0 &gt; tracing_cpumask
  # echo 1 &gt; snapshot
  # echo fff &gt; tracing_cpumask
  # echo 1 &gt; tracing_on
  # echo "hello world" &gt; trace_marker
  -bash: echo: write error: Bad file descriptor

The root cause is that:
  1. After `echo 0 &gt; tracing_cpumask`, 'record_disabled' of cpu buffers
     in 'tr-&gt;array_buffer.buffer' became 1 (see tracing_set_cpumask());
  2. After `echo 1 &gt; snapshot`, 'tr-&gt;array_buffer.buffer' is swapped
     with 'tr-&gt;max_buffer.buffer', then the 'record_disabled' became 0
     (see update_max_tr());
  3. After `echo fff &gt; tracing_cpumask`, the 'record_disabled' become -1;
Then array_buffer and max_buffer are both unavailable due to value of
'record_disabled' is not 0.

To fix it, enable or disable both array_buffer and max_buffer at the same
time in tracing_set_cpumask().

Link: https://lkml.kernel.org/r/20230805033816.3284594-2-zhengyejian1@huawei.com

Cc: &lt;mhiramat@kernel.org&gt;
Cc: &lt;vnagarnaik@google.com&gt;
Cc: &lt;shuah@kernel.org&gt;
Fixes: 71babb2705e2 ("tracing: change CPU ring buffer state from tracing_cpumask")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
