<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/tools/perf/builtin-annotate.c, branch v6.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>perf hist: Add symbol_conf.skip_empty</title>
<updated>2024-06-16T04:04:04+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-06-07T20:29:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=411ee13598ef322c1a7f4a4022a84d995873f235'/>
<id>411ee13598ef322c1a7f4a4022a84d995873f235</id>
<content type='text'>
Add the skip_empty flag to symbol_conf and set the value from the report
command to preserve the existing behavior.  This makes the code simpler
and will be needed other code which is hard to add a new argument.

Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607202918.2357459-4-namhyung@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the skip_empty flag to symbol_conf and set the value from the report
command to preserve the existing behavior.  This makes the code simpler
and will be needed other code which is hard to add a new argument.

Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607202918.2357459-4-namhyung@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>perf dso: Add reference count checking and accessor functions</title>
<updated>2024-05-06T18:28:49+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-05-04T21:38:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee756ef7491eafd70f390343a1d90930af125a51'/>
<id>ee756ef7491eafd70f390343a1d90930af125a51</id>
<content type='text'>
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&amp;dso-&gt;symbols, false);
  -       symbols__fixup_duplicate(&amp;dso-&gt;symbols);
  -       dso-&gt;adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ahelenia Ziemiańska &lt;nabijaczleweli@nabijaczleweli.xyz&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Ben Gainey &lt;ben.gainey@arm.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Chengen Du &lt;chengen.du@canonical.com&gt;
Cc: Colin Ian King &lt;colin.i.king@gmail.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Li Dong &lt;lidong@vivo.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Paran Lee &lt;p4ranlee@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Sun Haiyong &lt;sunhaiyong@loongson.cn&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Cc: Yanteng Si &lt;siyanteng@loongson.cn&gt;
Cc: zhaimingbing &lt;zhaimingbing@cmss.chinamobile.com&gt;
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&amp;dso-&gt;symbols, false);
  -       symbols__fixup_duplicate(&amp;dso-&gt;symbols);
  -       dso-&gt;adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ahelenia Ziemiańska &lt;nabijaczleweli@nabijaczleweli.xyz&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Ben Gainey &lt;ben.gainey@arm.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Chengen Du &lt;chengen.du@canonical.com&gt;
Cc: Colin Ian King &lt;colin.i.king@gmail.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Ilkka Koskinen &lt;ilkka@os.amperecomputing.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Li Dong &lt;lidong@vivo.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Paran Lee &lt;p4ranlee@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Sun Haiyong &lt;sunhaiyong@loongson.cn&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Cc: Yanteng Si &lt;siyanteng@loongson.cn&gt;
Cc: zhaimingbing &lt;zhaimingbing@cmss.chinamobile.com&gt;
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Fix data type profiling on stdio</title>
<updated>2024-04-27T01:13:10+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-04-23T02:06:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b87383c885c52f051a90574fb857ca3b76b3f5c'/>
<id>2b87383c885c52f051a90574fb857ca3b76b3f5c</id>
<content type='text'>
The loop in hists__find_annotations() never set the 'nd' pointer to NULL
and it makes stdio output repeating the last element forever.  I think
it doesn't set to NULL for TUI to prevent it from exiting unexpectedly.
But it should just set on stdio mode.

Fixes: d001c7a7f4736743 ("perf annotate-data: Add hist_entry__annotate_data_tui()")
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240423020643.740029-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The loop in hists__find_annotations() never set the 'nd' pointer to NULL
and it makes stdio output repeating the last element forever.  I think
it doesn't set to NULL for TUI to prevent it from exiting unexpectedly.
But it should just set on stdio mode.

Fixes: d001c7a7f4736743 ("perf annotate-data: Add hist_entry__annotate_data_tui()")
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240423020643.740029-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate-data: Add hist_entry__annotate_data_tui()</title>
<updated>2024-04-12T15:02:06+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-04-11T03:32:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d001c7a7f473674353311a79cf140d054b26afcd'/>
<id>d001c7a7f473674353311a79cf140d054b26afcd</id>
<content type='text'>
Support data type profiling output on TUI.

Testing from Arnaldo:

First make sure that the debug information for your workload binaries
in embedded in them by building it with '-g' or install the debuginfo
packages, since our workload is 'find':

  root@number:~# type find
  find is hashed (/usr/bin/find)
  root@number:~# rpm -qf /usr/bin/find
  findutils-4.9.0-5.fc39.x86_64
  root@number:~# dnf debuginfo-install findutils
  &lt;SNIP&gt;
  root@number:~#

Then collect some data:

  root@number:~# echo 1 &gt; /proc/sys/vm/drop_caches
  root@number:~# perf mem record find / &gt; /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
  root@number:~#

Finally do data-type annotation with the following command, that will
default, as 'perf report' to the --tui mode, with lines colored to
highlight the hotspots, etc.

  root@number:~# perf annotate --data-type
  Annotate type: 'struct predicate' (58 samples)
      Percent     Offset       Size  Field
       100.00          0        312  struct predicate {
         0.00          0          8      PRED_FUNC        pred_func;
         0.00          8          8      char*    p_name;
         0.00         16          4      enum predicate_type      p_type;
         0.00         20          4      enum predicate_precedence        p_prec;
         0.00         24          1      _Bool    side_effects;
         0.00         25          1      _Bool    no_default_print;
         0.00         26          1      _Bool    need_stat;
         0.00         27          1      _Bool    need_type;
         0.00         28          1      _Bool    need_inum;
         0.00         32          4      enum EvaluationCost      p_cost;
         0.00         36          4      float    est_success_rate;
         0.00         40          1      _Bool    literal_control_chars;
         0.00         41          1      _Bool    artificial;
         0.00         48          8      char*    arg_text;
  &lt;SNIP&gt;

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support data type profiling output on TUI.

Testing from Arnaldo:

First make sure that the debug information for your workload binaries
in embedded in them by building it with '-g' or install the debuginfo
packages, since our workload is 'find':

  root@number:~# type find
  find is hashed (/usr/bin/find)
  root@number:~# rpm -qf /usr/bin/find
  findutils-4.9.0-5.fc39.x86_64
  root@number:~# dnf debuginfo-install findutils
  &lt;SNIP&gt;
  root@number:~#

Then collect some data:

  root@number:~# echo 1 &gt; /proc/sys/vm/drop_caches
  root@number:~# perf mem record find / &gt; /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
  root@number:~#

Finally do data-type annotation with the following command, that will
default, as 'perf report' to the --tui mode, with lines colored to
highlight the hotspots, etc.

  root@number:~# perf annotate --data-type
  Annotate type: 'struct predicate' (58 samples)
      Percent     Offset       Size  Field
       100.00          0        312  struct predicate {
         0.00          0          8      PRED_FUNC        pred_func;
         0.00          8          8      char*    p_name;
         0.00         16          4      enum predicate_type      p_type;
         0.00         20          4      enum predicate_precedence        p_prec;
         0.00         24          1      _Bool    side_effects;
         0.00         25          1      _Bool    no_default_print;
         0.00         26          1      _Bool    need_stat;
         0.00         27          1      _Bool    need_type;
         0.00         28          1      _Bool    need_inum;
         0.00         32          4      enum EvaluationCost      p_cost;
         0.00         36          4      float    est_success_rate;
         0.00         40          1      _Bool    literal_control_chars;
         0.00         41          1      _Bool    artificial;
         0.00         48          8      char*    arg_text;
  &lt;SNIP&gt;

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate-data: Add hist_entry__annotate_data_tty()</title>
<updated>2024-04-12T15:02:06+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-04-11T03:32:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9b561be15febda6f9b314c9eab51e157f8f34dea'/>
<id>9b561be15febda6f9b314c9eab51e157f8f34dea</id>
<content type='text'>
And move the related code into util/annotate-data.c file.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
And move the related code into util/annotate-data.c file.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Show progress of sample processing</title>
<updated>2024-04-12T15:02:06+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-04-11T03:32:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d9aedc12d3477ab72ec48bb7abb0f038c902c937'/>
<id>d9aedc12d3477ab72ec48bb7abb0f038c902c937</id>
<content type='text'>
Like 'perf report', it can take a while to process samples.

Show a progress window to inform users how that it is not stuck.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like 'perf report', it can take a while to process samples.

Show a progress window to inform users how that it is not stuck.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240411033256.2099646-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Honor output options with --data-type</title>
<updated>2024-04-03T14:48:56+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-03-22T22:43:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bdeaf6ffec8b3590740973712c6be673c28a54a4'/>
<id>bdeaf6ffec8b3590740973712c6be673c28a54a4</id>
<content type='text'>
For data type profiling output, it should be in sync with normal output
so make it display percentage for each field.  Also use coloring scheme
for users to identify fields with big overhead easily.

Users can use --show-total-period or --show-nr-samples to change the
output style like in the normal perf annotate output.

Before:

  $ perf annotate --data-type
  Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
  ============================================================================
      samples     offset       size  field
           34          0       9792  struct task_struct    {
            2          0         24      struct thread_info       thread_info {
            0          0          8          long unsigned int    flags;
            1          8          8          long unsigned int    syscall_work;
            0         16          4          u32  status;
            1         20          4          u32  cpu;
                                         };

After:

  $ perf annotate --data-type
  Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
  ============================================================================
   Percent     offset       size  field
    100.00          0       9792  struct task_struct       {
      3.55          0         24      struct thread_info  thread_info {
      0.00          0          8          long unsigned int       flags;
      1.63          8          8          long unsigned int       syscall_work;
      0.00         16          4          u32     status;
      1.91         20          4          u32     cpu;
                                      };

Committer testing:

First collect a suitable perf.data file for use with 'perf annotate --data-type':

  root@number:~# perf mem record -a sleep 1s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 11.047 MB perf.data (3466 samples) ]
  root@number:~#

Then, before:

  root@number:~# perf annotate --data-type
  Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
  ============================================================================
      samples     offset       size  field
            6          0         40  union         {
            6          0         40      struct __pthread_mutex_s __data {
            2          0          4          int  __lock;
            0          4          4          unsigned int __count;
            0          8          4          int  __owner;
            1         12          4          unsigned int __nusers;
            2         16          4          int  __kind;
            1         20          2          short int    __spins;
            0         22          2          short int    __elision;
            0         24         16          __pthread_list_t     __list {
            0         24          8              struct __pthread_internal_list*  __prev;
            0         32          8              struct __pthread_internal_list*  __next;
                                             };
                                         };
            0          0          0      char*    __size;
            2          0          8      long int __align;
                                     };
  &lt;SNIP&gt;

And after:

  Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
  ============================================================================
   Percent     offset       size  field
    100.00          0         40  union    {
    100.00          0         40      struct __pthread_mutex_s    __data {
     31.27          0          4          int     __lock;
      0.00          4          4          unsigned int    __count;
      0.00          8          4          int     __owner;
      7.67         12          4          unsigned int    __nusers;
     53.10         16          4          int     __kind;
      7.96         20          2          short int       __spins;
      0.00         22          2          short int       __elision;
      0.00         24         16          __pthread_list_t        __list {
      0.00         24          8              struct __pthread_internal_list*     __prev;
      0.00         32          8              struct __pthread_internal_list*     __next;
                                          };
                                      };
      0.00          0          0      char*       __size;
     31.27          0          8      long int    __align;
                                  };
  &lt;SNIP&gt;

The lines with percentages &gt;= 7.67 have its percentages red colored.

Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240322224313.423181-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For data type profiling output, it should be in sync with normal output
so make it display percentage for each field.  Also use coloring scheme
for users to identify fields with big overhead easily.

Users can use --show-total-period or --show-nr-samples to change the
output style like in the normal perf annotate output.

Before:

  $ perf annotate --data-type
  Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
  ============================================================================
      samples     offset       size  field
           34          0       9792  struct task_struct    {
            2          0         24      struct thread_info       thread_info {
            0          0          8          long unsigned int    flags;
            1          8          8          long unsigned int    syscall_work;
            0         16          4          u32  status;
            1         20          4          u32  cpu;
                                         };

After:

  $ perf annotate --data-type
  Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
  ============================================================================
   Percent     offset       size  field
    100.00          0       9792  struct task_struct       {
      3.55          0         24      struct thread_info  thread_info {
      0.00          0          8          long unsigned int       flags;
      1.63          8          8          long unsigned int       syscall_work;
      0.00         16          4          u32     status;
      1.91         20          4          u32     cpu;
                                      };

Committer testing:

First collect a suitable perf.data file for use with 'perf annotate --data-type':

  root@number:~# perf mem record -a sleep 1s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 11.047 MB perf.data (3466 samples) ]
  root@number:~#

Then, before:

  root@number:~# perf annotate --data-type
  Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
  ============================================================================
      samples     offset       size  field
            6          0         40  union         {
            6          0         40      struct __pthread_mutex_s __data {
            2          0          4          int  __lock;
            0          4          4          unsigned int __count;
            0          8          4          int  __owner;
            1         12          4          unsigned int __nusers;
            2         16          4          int  __kind;
            1         20          2          short int    __spins;
            0         22          2          short int    __elision;
            0         24         16          __pthread_list_t     __list {
            0         24          8              struct __pthread_internal_list*  __prev;
            0         32          8              struct __pthread_internal_list*  __next;
                                             };
                                         };
            0          0          0      char*    __size;
            2          0          8      long int __align;
                                     };
  &lt;SNIP&gt;

And after:

  Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
  ============================================================================
   Percent     offset       size  field
    100.00          0         40  union    {
    100.00          0         40      struct __pthread_mutex_s    __data {
     31.27          0          4          int     __lock;
      0.00          4          4          unsigned int    __count;
      0.00          8          4          int     __owner;
      7.67         12          4          unsigned int    __nusers;
     53.10         16          4          int     __kind;
      7.96         20          2          short int       __spins;
      0.00         22          2          short int       __elision;
      0.00         24         16          __pthread_list_t        __list {
      0.00         24          8              struct __pthread_internal_list*     __prev;
      0.00         32          8              struct __pthread_internal_list*     __next;
                                          };
                                      };
      0.00          0          0      char*       __size;
     31.27          0          8      long int    __align;
                                  };
  &lt;SNIP&gt;

The lines with percentages &gt;= 7.67 have its percentages red colored.

Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240322224313.423181-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Get rid of duplicate --group option item</title>
<updated>2024-04-03T14:48:56+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-03-22T22:43:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=374af9f1f06b5e991c810d2e4983d6f58df32136'/>
<id>374af9f1f06b5e991c810d2e4983d6f58df32136</id>
<content type='text'>
The options array in cmd_annotate() has duplicate --group options.  It
only needs one and let's get rid of the other.

  $ perf annotate -h 2&gt;&amp;1 | grep group
        --group           Show event group information together
        --group           Show event group information together

Fixes: 7ebaf4890f63eb90 ("perf annotate: Support '--group' option")
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240322224313.423181-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The options array in cmd_annotate() has duplicate --group options.  It
only needs one and let's get rid of the other.

  $ perf annotate -h 2&gt;&amp;1 | grep group
        --group           Show event group information together
        --group           Show event group information together

Fixes: 7ebaf4890f63eb90 ("perf annotate: Support '--group' option")
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20240322224313.423181-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate-data: Implement instruction tracking</title>
<updated>2024-03-21T13:41:29+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-03-19T05:51:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eb8a55e01de9a671e130b4c0e5b483997f4f491f'/>
<id>eb8a55e01de9a671e130b4c0e5b483997f4f491f</id>
<content type='text'>
If it failed to find a variable for the location directly, it might be
due to a missing variable in the source code.  For example, accessing
pointer variables in a chain can result in the case like below:

  struct foo *foo = ...;

  int i = foo-&gt;bar-&gt;baz;

The DWARF debug information is created for each variable so it'd have
one for 'foo'.  But there's no variable for 'foo-&gt;bar' and then it
cannot know the type of 'bar' and 'baz'.

The above source code can be compiled to the follow x86 instructions:

  mov  0x8(%rax), %rcx
  mov  0x4(%rcx), %rdx   &lt;=== PMU sample
  mov  %rdx, -4(%rbp)

Let's say 'foo' is located in the %rax and it has a pointer to struct
foo.  But perf sample is captured in the second instruction and there
is no variable or type info for the %rcx.

It'd be great if compiler could generate debug info for %rcx, but we
should handle it on our side.  So this patch implements the logic to
iterate instructions and update the type table for each location.

As it already collected a list of scopes including the target
instruction, we can use it to construct the type table smartly.

  +----------------  scope[0] subprogram
  |
  | +--------------  scope[1] lexical_block
  | |
  | | +------------  scope[2] inlined_subroutine
  | | |
  | | | +----------  scope[3] inlined_subroutine
  | | | |
  | | | | +--------  scope[4] lexical_block
  | | | | |
  | | | | |     ***  target instruction
  ...

Image the target instruction has 5 scopes, each scope will have its own
variables and parameters.  Then it can start with the innermost scope
(4).  So it'd search the shortest path from the start of scope[4] to
the target address and build a list of basic blocks.  Then it iterates
the basic blocks with the variables in the scope and update the table.
If it finds a type at the target instruction, then returns it.

Otherwise, it moves to the upper scope[3].  Now it'd search the shortest
path from the start of scope[3] to the start of scope[4].  Then connect
it to the existing basic block list.  Then it'd iterate the blocks with
variables for both scopes.  It can repeat this until it finds a type at
the target instruction or reaches to the top scope[0].

As the basic blocks contain the shortest path, it won't worry about
branches and can update the table simply.

The final check will be done by find_matching_type() in the next patch.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20240319055115.4063940-15-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If it failed to find a variable for the location directly, it might be
due to a missing variable in the source code.  For example, accessing
pointer variables in a chain can result in the case like below:

  struct foo *foo = ...;

  int i = foo-&gt;bar-&gt;baz;

The DWARF debug information is created for each variable so it'd have
one for 'foo'.  But there's no variable for 'foo-&gt;bar' and then it
cannot know the type of 'bar' and 'baz'.

The above source code can be compiled to the follow x86 instructions:

  mov  0x8(%rax), %rcx
  mov  0x4(%rcx), %rdx   &lt;=== PMU sample
  mov  %rdx, -4(%rbp)

Let's say 'foo' is located in the %rax and it has a pointer to struct
foo.  But perf sample is captured in the second instruction and there
is no variable or type info for the %rcx.

It'd be great if compiler could generate debug info for %rcx, but we
should handle it on our side.  So this patch implements the logic to
iterate instructions and update the type table for each location.

As it already collected a list of scopes including the target
instruction, we can use it to construct the type table smartly.

  +----------------  scope[0] subprogram
  |
  | +--------------  scope[1] lexical_block
  | |
  | | +------------  scope[2] inlined_subroutine
  | | |
  | | | +----------  scope[3] inlined_subroutine
  | | | |
  | | | | +--------  scope[4] lexical_block
  | | | | |
  | | | | |     ***  target instruction
  ...

Image the target instruction has 5 scopes, each scope will have its own
variables and parameters.  Then it can start with the innermost scope
(4).  So it'd search the shortest path from the start of scope[4] to
the target address and build a list of basic blocks.  Then it iterates
the basic blocks with the variables in the scope and update the table.
If it finds a type at the target instruction, then returns it.

Otherwise, it moves to the upper scope[3].  Now it'd search the shortest
path from the start of scope[3] to the start of scope[4].  Then connect
it to the existing basic block list.  Then it'd iterate the blocks with
variables for both scopes.  It can repeat this until it finds a type at
the target instruction or reaches to the top scope[0].

As the basic blocks contain the shortest path, it won't worry about
branches and can update the table simply.

The final check will be done by find_matching_type() in the next patch.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20240319055115.4063940-15-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Add --insn-stat option for debugging</title>
<updated>2023-12-24T01:40:17+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-12-13T00:13:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=58824fa0087e1cb732edbf1f112a5ea0b2205c8b'/>
<id>58824fa0087e1cb732edbf1f112a5ea0b2205c8b</id>
<content type='text'>
This is for a debugging purpose.  It'd be useful to see per-instrucion
level success/failure stats.

  $ perf annotate --data-type --insn-stat
  Annotate Instruction stats
  total 264, ok 143 (54.2%), bad 121 (45.8%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    45    31
    movl      :    22    11
    popq      :     0    19
    cmpl      :    16     3
    addq      :     8     7
    cmpq      :    11     3
    cmpxchgl  :     3     7
    cmpxchgq  :     8     0
    incl      :     3     3
    movzbl    :     4     2
    incq      :     4     2
    decl      :     6     0
    ...

Committer notes:

So these are about being able to find the type for accesses from these
instructions, we should improve the naming, but it is for debugging, we
can improve this later:

  @@ -3726,6 +3759,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
                          continue;

                  mem_type = find_data_type(ms, ip, op_loc-&gt;reg, op_loc-&gt;offset);
  +               if (mem_type)
  +                       istat-&gt;good++;
  +               else
  +                       istat-&gt;bad++;

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-18-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is for a debugging purpose.  It'd be useful to see per-instrucion
level success/failure stats.

  $ perf annotate --data-type --insn-stat
  Annotate Instruction stats
  total 264, ok 143 (54.2%), bad 121 (45.8%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    45    31
    movl      :    22    11
    popq      :     0    19
    cmpl      :    16     3
    addq      :     8     7
    cmpq      :    11     3
    cmpxchgl  :     3     7
    cmpxchgq  :     8     0
    incl      :     3     3
    movzbl    :     4     2
    incq      :     4     2
    decl      :     6     0
    ...

Committer notes:

So these are about being able to find the type for accesses from these
instructions, we should improve the naming, but it is for debugging, we
can improve this later:

  @@ -3726,6 +3759,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
                          continue;

                  mem_type = find_data_type(ms, ip, op_loc-&gt;reg, op_loc-&gt;offset);
  +               if (mem_type)
  +                       istat-&gt;good++;
  +               else
  +                       istat-&gt;bad++;

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-18-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
