<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/trace/trace_irqsoff.c, branch v7.0</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>tracing: Allow tracer to add more than 32 options</title>
<updated>2025-11-04T12:44:00+00:00</updated>
<author>
<name>Masami Hiramatsu (Google)</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2025-10-31T02:46:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bbec8e28cac5928c20052c489cb2e345e6bd4271'/>
<id>bbec8e28cac5928c20052c489cb2e345e6bd4271</id>
<content type='text'>
Since enum trace_iterator_flags is 32bit, the max number of the
option flags is limited to 32 and it is fully used now. To add
a new option, we need to expand it.

So replace the TRACE_ITER_##flag with TRACE_ITER(flag) macro which
is 64bit bitmask.

Link: https://lore.kernel.org/all/176187877103.994619.166076000668757232.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since enum trace_iterator_flags is 32bit, the max number of the
option flags is limited to 32 and it is fully used now. To add
a new option, we need to expand it.

So replace the TRACE_ITER_##flag with TRACE_ITER(flag) macro which
is 64bit bitmask.

Link: https://lore.kernel.org/all/176187877103.994619.166076000668757232.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix irqoff tracers on failure of acquiring calltime</title>
<updated>2025-10-08T16:10:44+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2025-10-08T15:49:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c834a97962c708ff5bb8582ca76b0e1225feb675'/>
<id>c834a97962c708ff5bb8582ca76b0e1225feb675</id>
<content type='text'>
The functions irqsoff_graph_entry() and irqsoff_graph_return() both call
func_prolog_dec() that will test if the data-&gt;disable is already set and
if not, increment it and return. If it was set, it returns false and the
caller exits.

The caller of this function must decrement the disable counter, but misses
doing so if the calltime fails to be acquired.

Instead of exiting out when calltime is NULL, change the logic to do the
work if it is not NULL and still do the clean up at the end of the
function if it is NULL.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20251008114943.6f60f30f@gandalf.local.home
Fixes: a485ea9e3ef3 ("tracing: Fix irqsoff and wakeup latency tracers when using function graph")
Reported-by: Sasha Levin &lt;sashal@kernel.org&gt;
Closes: https://lore.kernel.org/linux-trace-kernel/20251006175848.1906912-2-sashal@kernel.org/
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The functions irqsoff_graph_entry() and irqsoff_graph_return() both call
func_prolog_dec() that will test if the data-&gt;disable is already set and
if not, increment it and return. If it was set, it returns false and the
caller exits.

The caller of this function must decrement the disable counter, but misses
doing so if the calltime fails to be acquired.

Instead of exiting out when calltime is NULL, change the logic to do the
work if it is not NULL and still do the clean up at the end of the
function if it is NULL.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20251008114943.6f60f30f@gandalf.local.home
Fixes: a485ea9e3ef3 ("tracing: Fix irqsoff and wakeup latency tracers when using function graph")
Reported-by: Sasha Levin &lt;sashal@kernel.org&gt;
Closes: https://lore.kernel.org/linux-trace-kernel/20251006175848.1906912-2-sashal@kernel.org/
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer</title>
<updated>2025-05-09T19:19:10+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2025-05-05T21:21:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c4a80c06154084d52851fe495d40b0e4da032f47'/>
<id>c4a80c06154084d52851fe495d40b0e4da032f47</id>
<content type='text'>
The irqsoff tracer uses the per CPU "disabled" field to prevent corruption
of the accounting when it starts to trace interrupts disabled, but there's
a slight race that could happen if for some reason it was called twice.
Use atomic_inc_return() instead.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lore.kernel.org/20250505212236.567884756@goodmis.org
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The irqsoff tracer uses the per CPU "disabled" field to prevent corruption
of the accounting when it starts to trace interrupts disabled, but there's
a slight race that could happen if for some reason it was called twice.
Use atomic_inc_return() instead.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lore.kernel.org/20250505212236.567884756@goodmis.org
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Convert the per CPU "disabled" counter to local from atomic</title>
<updated>2025-05-09T19:19:10+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2025-05-05T21:21:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=90633c34c36d0c15c9da4e19b2ceb46cab137478'/>
<id>90633c34c36d0c15c9da4e19b2ceb46cab137478</id>
<content type='text'>
The per CPU "disabled" counter is used for the latency tracers and stack
tracers to make sure that their accounting isn't messed up by an NMI or
interrupt coming in and affecting the same CPU data. But the counter is an
atomic_t type. As it only needs to synchronize against the current CPU,
switch it over to local_t type.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lore.kernel.org/20250505212236.394925376@goodmis.org
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The per CPU "disabled" counter is used for the latency tracers and stack
tracers to make sure that their accounting isn't messed up by an NMI or
interrupt coming in and affecting the same CPU data. But the counter is an
atomic_t type. As it only needs to synchronize against the current CPU,
switch it over to local_t type.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lore.kernel.org/20250505212236.394925376@goodmis.org
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix use-after-free in print_graph_function_flags during tracer switching</title>
<updated>2025-03-22T09:42:42+00:00</updated>
<author>
<name>Tengda Wu</name>
<email>wutengda@huaweicloud.com</email>
</author>
<published>2025-03-20T12:21:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7f81f27b1093e4895e87b74143c59c055c3b1906'/>
<id>7f81f27b1093e4895e87b74143c59c055c3b1906</id>
<content type='text'>
Kairui reported a UAF issue in print_graph_function_flags() during
ftrace stress testing [1]. This issue can be reproduced if puting a
'mdelay(10)' after 'mutex_unlock(&amp;trace_types_lock)' in s_start(),
and executing the following script:

  $ echo function_graph &gt; current_tracer
  $ cat trace &gt; /dev/null &amp;
  $ sleep 5  # Ensure the 'cat' reaches the 'mdelay(10)' point
  $ echo timerlat &gt; current_tracer

The root cause lies in the two calls to print_graph_function_flags
within print_trace_line during each s_show():

  * One through 'iter-&gt;trace-&gt;print_line()';
  * Another through 'event-&gt;funcs-&gt;trace()', which is hidden in
    print_trace_fmt() before print_trace_line returns.

Tracer switching only updates the former, while the latter continues
to use the print_line function of the old tracer, which in the script
above is print_graph_function_flags.

Moreover, when switching from the 'function_graph' tracer to the
'timerlat' tracer, s_start only calls graph_trace_close of the
'function_graph' tracer to free 'iter-&gt;private', but does not set
it to NULL. This provides an opportunity for 'event-&gt;funcs-&gt;trace()'
to use an invalid 'iter-&gt;private'.

To fix this issue, set 'iter-&gt;private' to NULL immediately after
freeing it in graph_trace_close(), ensuring that an invalid pointer
is not passed to other tracers. Additionally, clean up the unnecessary
'iter-&gt;private = NULL' during each 'cat trace' when using wakeup and
irqsoff tracers.

 [1] https://lore.kernel.org/all/20231112150030.84609-1-ryncsn@gmail.com/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Link: https://lore.kernel.org/20250320122137.23635-1-wutengda@huaweicloud.com
Fixes: eecb91b9f98d ("tracing: Fix memleak due to race between current_tracer and trace")
Closes: https://lore.kernel.org/all/CAMgjq7BW79KDSCyp+tZHjShSzHsScSiJxn5ffskp-QzVM06fxw@mail.gmail.com/
Reported-by: Kairui Song &lt;kasong@tencent.com&gt;
Signed-off-by: Tengda Wu &lt;wutengda@huaweicloud.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kairui reported a UAF issue in print_graph_function_flags() during
ftrace stress testing [1]. This issue can be reproduced if puting a
'mdelay(10)' after 'mutex_unlock(&amp;trace_types_lock)' in s_start(),
and executing the following script:

  $ echo function_graph &gt; current_tracer
  $ cat trace &gt; /dev/null &amp;
  $ sleep 5  # Ensure the 'cat' reaches the 'mdelay(10)' point
  $ echo timerlat &gt; current_tracer

The root cause lies in the two calls to print_graph_function_flags
within print_trace_line during each s_show():

  * One through 'iter-&gt;trace-&gt;print_line()';
  * Another through 'event-&gt;funcs-&gt;trace()', which is hidden in
    print_trace_fmt() before print_trace_line returns.

Tracer switching only updates the former, while the latter continues
to use the print_line function of the old tracer, which in the script
above is print_graph_function_flags.

Moreover, when switching from the 'function_graph' tracer to the
'timerlat' tracer, s_start only calls graph_trace_close of the
'function_graph' tracer to free 'iter-&gt;private', but does not set
it to NULL. This provides an opportunity for 'event-&gt;funcs-&gt;trace()'
to use an invalid 'iter-&gt;private'.

To fix this issue, set 'iter-&gt;private' to NULL immediately after
freeing it in graph_trace_close(), ensuring that an invalid pointer
is not passed to other tracers. Additionally, clean up the unnecessary
'iter-&gt;private = NULL' during each 'cat trace' when using wakeup and
irqsoff tracers.

 [1] https://lore.kernel.org/all/20231112150030.84609-1-ryncsn@gmail.com/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Link: https://lore.kernel.org/20250320122137.23635-1-wutengda@huaweicloud.com
Fixes: eecb91b9f98d ("tracing: Fix memleak due to race between current_tracer and trace")
Closes: https://lore.kernel.org/all/CAMgjq7BW79KDSCyp+tZHjShSzHsScSiJxn5ffskp-QzVM06fxw@mail.gmail.com/
Reported-by: Kairui Song &lt;kasong@tencent.com&gt;
Signed-off-by: Tengda Wu &lt;wutengda@huaweicloud.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ftrace: Add arguments to function tracer</title>
<updated>2025-03-04T16:27:24+00:00</updated>
<author>
<name>Sven Schnelle</name>
<email>svens@linux.ibm.com</email>
</author>
<published>2025-02-27T18:58:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=76fe0337c2199988cb9ed7e41c05d687d95f2e18'/>
<id>76fe0337c2199988cb9ed7e41c05d687d95f2e18</id>
<content type='text'>
Wire up the code to print function arguments in the function tracer.
This functionality can be enabled/disabled during runtime with
options/func-args.

        ping-689     [004] b....    77.170220: dummy_xmit(skb = 0x82904800, dev = 0x882d0000) &lt;-dev_hard_start_xmit

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Donglin Peng &lt;dolinux.peng@gmail.com&gt;
Cc: Zheng Yejian &lt;zhengyejian@huaweicloud.com&gt;
Link: https://lore.kernel.org/20250227185823.154996172@goodmis.org
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Co-developed-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Wire up the code to print function arguments in the function tracer.
This functionality can be enabled/disabled during runtime with
options/func-args.

        ping-689     [004] b....    77.170220: dummy_xmit(skb = 0x82904800, dev = 0x882d0000) &lt;-dev_hard_start_xmit

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Donglin Peng &lt;dolinux.peng@gmail.com&gt;
Cc: Zheng Yejian &lt;zhengyejian@huaweicloud.com&gt;
Link: https://lore.kernel.org/20250227185823.154996172@goodmis.org
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Co-developed-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fgraph: Remove calltime and rettime from generic operations</title>
<updated>2025-01-22T02:55:49+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2025-01-22T00:44:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=66611c0475709607f398e2a5d691b1fc72fe9dfc'/>
<id>66611c0475709607f398e2a5d691b1fc72fe9dfc</id>
<content type='text'>
The function graph infrastructure is now generic so that kretprobes,
fprobes and BPF can use it. But there is still some leftover logic that
only the function graph tracer itself uses. This is the calculation of the
calltime and return time of the functions. The calculation of the calltime
has been moved into the function graph tracer and those users that need it
so that it doesn't cause overhead to the other users. But the return
function timestamp was still called.

Instead of just moving the taking of the timestamp into the function graph
trace remove the calltime and rettime completely from the ftrace_graph_ret
structure. Instead, move it into the function graph return entry event
structure and this also moves all the calltime and rettime logic out of
the generic fgraph.c code and into the tracing code that uses it.

This has been reported to decrease the overhead by ~27%.

Link: https://lore.kernel.org/all/Z3aSuql3fnXMVMoM@krava/
Link: https://lore.kernel.org/all/173665959558.1629214.16724136597211810729.stgit@devnote2/

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20250121194436.15bdf71a@gandalf.local.home
Reported-by: Jiri Olsa &lt;olsajiri@gmail.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function graph infrastructure is now generic so that kretprobes,
fprobes and BPF can use it. But there is still some leftover logic that
only the function graph tracer itself uses. This is the calculation of the
calltime and return time of the functions. The calculation of the calltime
has been moved into the function graph tracer and those users that need it
so that it doesn't cause overhead to the other users. But the return
function timestamp was still called.

Instead of just moving the taking of the timestamp into the function graph
trace remove the calltime and rettime completely from the ftrace_graph_ret
structure. Instead, move it into the function graph return entry event
structure and this also moves all the calltime and rettime logic out of
the generic fgraph.c code and into the tracing code that uses it.

This has been reported to decrease the overhead by ~27%.

Link: https://lore.kernel.org/all/Z3aSuql3fnXMVMoM@krava/
Link: https://lore.kernel.org/all/173665959558.1629214.16724136597211810729.stgit@devnote2/

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/20250121194436.15bdf71a@gandalf.local.home
Reported-by: Jiri Olsa &lt;olsajiri@gmail.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace</title>
<updated>2025-01-21T23:15:28+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-21T23:15:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2e04247f7cce8b8cd8381a29078701691fec684d'/>
<id>2e04247f7cce8b8cd8381a29078701691fec684d</id>
<content type='text'>
Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) has
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer is it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:&lt;module&gt;" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches &lt;module&gt;, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) has
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer is it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:&lt;module&gt;" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches &lt;module&gt;, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Fix irqsoff and wakeup latency tracers when using function graph</title>
<updated>2025-01-14T14:38:09+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2025-01-13T23:31:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a485ea9e3ef31ac4e3a2245cdb11fa73352b950f'/>
<id>a485ea9e3ef31ac4e3a2245cdb11fa73352b950f</id>
<content type='text'>
The function graph tracer has become generic so that kretprobes and BPF
can use it along with function graph tracing itself. Some of the
infrastructure was specific for function graph tracing such as recording
the calltime and return time of the functions. Calling the clock code on a
high volume function does add overhead. The calculation of the calltime
was removed from the generic code and placed into the function graph
tracer itself so that the other users did not incur this overhead as they
did not need that timestamp.

The calltime field was still kept in the generic return entry structure
and the function graph return entry callback filled it as that structure
was passed to other code.

But this broke both irqsoff and wakeup latency tracer as they still
depended on the trace structure containing the calltime when the option
display-graph is set as it used some of those same functions that the
function graph tracer used. But now the calltime was not set and was just
zero. This caused the calculation of the function time to be the absolute
value of the return timestamp and not the length of the function.

 # cd /sys/kernel/tracing
 # echo 1 &gt; options/display-graph
 # echo irqsoff &gt; current_tracer

The tracers went from:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   4)    &lt;idle&gt;-0    |  d..1. |   0.000 us    |  irqentry_enter();
        3 us |   4)    &lt;idle&gt;-0    |  d..2. |               |  irq_enter_rcu() {
        4 us |   4)    &lt;idle&gt;-0    |  d..2. |   0.431 us    |    preempt_count_add();
        5 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |    tick_irq_enter() {
        5 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.433 us    |      tick_check_oneshot_broadcast_this_cpu();
        6 us |   4)    &lt;idle&gt;-0    |  d.h2. |   2.426 us    |      ktime_get();
        9 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |      tick_nohz_stop_idle() {
       10 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.398 us    |        nr_iowait_cpu();
       11 us |   4)    &lt;idle&gt;-0    |  d.h1. |   1.903 us    |      }
       11 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |      tick_do_update_jiffies64() {
       12 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |        _raw_spin_lock() {
       12 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.360 us    |          preempt_count_add();
       13 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.354 us    |          do_raw_spin_lock();
       14 us |   4)    &lt;idle&gt;-0    |  d.h2. |   2.207 us    |        }
       15 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.428 us    |        calc_global_load();
       16 us |   4)    &lt;idle&gt;-0    |  d.h3. |               |        _raw_spin_unlock() {
       16 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.380 us    |          do_raw_spin_unlock();
       17 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.334 us    |          preempt_count_sub();
       18 us |   4)    &lt;idle&gt;-0    |  d.h1. |   1.768 us    |        }
       18 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |        update_wall_time() {
      [..]

To:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   5)    &lt;idle&gt;-0    |  d.s2. |   0.000 us    |  _raw_spin_lock_irqsave();
        0 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159583 us |      preempt_count_add();
        2 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159585 us |      do_raw_spin_lock();
        3 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |      _raw_spin_unlock() {
        3 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159586 us |        do_raw_spin_unlock();
        4 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159587 us |        preempt_count_sub();
        4 us |   5)    &lt;idle&gt;-0    |  d.s2. |   312159587 us |      }
        5 us |   5)    &lt;idle&gt;-0    |  d.s3. |               |      _raw_spin_lock() {
        5 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159588 us |        preempt_count_add();
        6 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159589 us |        do_raw_spin_lock();
        7 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159590 us |      }
        8 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159591 us |      calc_wheel_index();
        9 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |      enqueue_timer() {
        9 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |        wake_up_nohz_cpu() {
       11 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |          native_smp_send_reschedule() {
       11 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312171987 us |            default_send_IPI_single_phys();
    12408 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171990 us |          }
    12408 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171991 us |        }
    12409 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171991 us |      }

Where the calculation of the time for each function was the return time
minus zero and not the time of when the function returned.

Have these tracers also save the calltime in the fgraph data section and
retrieve it again on the return to get the correct timings again.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Link: https://lore.kernel.org/20250113183124.61767419@gandalf.local.home
Fixes: f1f36e22bee9 ("ftrace: Have calltime be saved in the fgraph storage")
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function graph tracer has become generic so that kretprobes and BPF
can use it along with function graph tracing itself. Some of the
infrastructure was specific for function graph tracing such as recording
the calltime and return time of the functions. Calling the clock code on a
high volume function does add overhead. The calculation of the calltime
was removed from the generic code and placed into the function graph
tracer itself so that the other users did not incur this overhead as they
did not need that timestamp.

The calltime field was still kept in the generic return entry structure
and the function graph return entry callback filled it as that structure
was passed to other code.

But this broke both irqsoff and wakeup latency tracer as they still
depended on the trace structure containing the calltime when the option
display-graph is set as it used some of those same functions that the
function graph tracer used. But now the calltime was not set and was just
zero. This caused the calculation of the function time to be the absolute
value of the return timestamp and not the length of the function.

 # cd /sys/kernel/tracing
 # echo 1 &gt; options/display-graph
 # echo irqsoff &gt; current_tracer

The tracers went from:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   4)    &lt;idle&gt;-0    |  d..1. |   0.000 us    |  irqentry_enter();
        3 us |   4)    &lt;idle&gt;-0    |  d..2. |               |  irq_enter_rcu() {
        4 us |   4)    &lt;idle&gt;-0    |  d..2. |   0.431 us    |    preempt_count_add();
        5 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |    tick_irq_enter() {
        5 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.433 us    |      tick_check_oneshot_broadcast_this_cpu();
        6 us |   4)    &lt;idle&gt;-0    |  d.h2. |   2.426 us    |      ktime_get();
        9 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |      tick_nohz_stop_idle() {
       10 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.398 us    |        nr_iowait_cpu();
       11 us |   4)    &lt;idle&gt;-0    |  d.h1. |   1.903 us    |      }
       11 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |      tick_do_update_jiffies64() {
       12 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |        _raw_spin_lock() {
       12 us |   4)    &lt;idle&gt;-0    |  d.h2. |   0.360 us    |          preempt_count_add();
       13 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.354 us    |          do_raw_spin_lock();
       14 us |   4)    &lt;idle&gt;-0    |  d.h2. |   2.207 us    |        }
       15 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.428 us    |        calc_global_load();
       16 us |   4)    &lt;idle&gt;-0    |  d.h3. |               |        _raw_spin_unlock() {
       16 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.380 us    |          do_raw_spin_unlock();
       17 us |   4)    &lt;idle&gt;-0    |  d.h3. |   0.334 us    |          preempt_count_sub();
       18 us |   4)    &lt;idle&gt;-0    |  d.h1. |   1.768 us    |        }
       18 us |   4)    &lt;idle&gt;-0    |  d.h2. |               |        update_wall_time() {
      [..]

To:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   5)    &lt;idle&gt;-0    |  d.s2. |   0.000 us    |  _raw_spin_lock_irqsave();
        0 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159583 us |      preempt_count_add();
        2 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159585 us |      do_raw_spin_lock();
        3 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |      _raw_spin_unlock() {
        3 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159586 us |        do_raw_spin_unlock();
        4 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159587 us |        preempt_count_sub();
        4 us |   5)    &lt;idle&gt;-0    |  d.s2. |   312159587 us |      }
        5 us |   5)    &lt;idle&gt;-0    |  d.s3. |               |      _raw_spin_lock() {
        5 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159588 us |        preempt_count_add();
        6 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159589 us |        do_raw_spin_lock();
        7 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312159590 us |      }
        8 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312159591 us |      calc_wheel_index();
        9 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |      enqueue_timer() {
        9 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |        wake_up_nohz_cpu() {
       11 us |   5)    &lt;idle&gt;-0    |  d.s4. |               |          native_smp_send_reschedule() {
       11 us |   5)    &lt;idle&gt;-0    |  d.s4. |   312171987 us |            default_send_IPI_single_phys();
    12408 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171990 us |          }
    12408 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171991 us |        }
    12409 us |   5)    &lt;idle&gt;-0    |  d.s3. |   312171991 us |      }

Where the calculation of the time for each function was the return time
minus zero and not the time of when the function returned.

Have these tracers also save the calltime in the fgraph data section and
retrieve it again on the return to get the correct timings again.

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Link: https://lore.kernel.org/20250113183124.61767419@gandalf.local.home
Fixes: f1f36e22bee9 ("ftrace: Have calltime be saved in the fgraph storage")
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fgraph: Pass ftrace_regs to retfunc</title>
<updated>2024-12-26T15:50:03+00:00</updated>
<author>
<name>Masami Hiramatsu (Google)</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2024-12-26T05:12:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2ca8c112c9676e2394d76760db78ffddf21d93b5'/>
<id>2ca8c112c9676e2394d76760db78ffddf21d93b5</id>
<content type='text'>
Pass ftrace_regs to the fgraph_ops::retfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.

Cc: Alexei Starovoitov &lt;alexei.starovoitov@gmail.com&gt;
Cc: Florent Revest &lt;revest@chromium.org&gt;
Cc: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Cc: bpf &lt;bpf@vger.kernel.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alan Maguire &lt;alan.maguire@oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Link: https://lore.kernel.org/173518992972.391279.14055405490327765506.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pass ftrace_regs to the fgraph_ops::retfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.

Cc: Alexei Starovoitov &lt;alexei.starovoitov@gmail.com&gt;
Cc: Florent Revest &lt;revest@chromium.org&gt;
Cc: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Cc: bpf &lt;bpf@vger.kernel.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alan Maguire &lt;alan.maguire@oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Link: https://lore.kernel.org/173518992972.391279.14055405490327765506.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
