<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/tools/perf/util/annotate.c, branch v6.15</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>perf annotate: Implement code + data type annotation</title>
<updated>2025-03-13T07:19:51+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-10T22:49:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=30c5a3941d0f1633a6c4d6529eb3c6ff9b465b4a'/>
<id>30c5a3941d0f1633a6c4d6529eb3c6ff9b465b4a</id>
<content type='text'>
Sometimes it's useful to see both instructions and their data type
together.  Let's extend the annotate code to use data type profiling
functions.

To make it easy to pass more arguments, introduce a struct to carry
necessary information together.  Also add a new annotation_option called
'code_with_type' to control the behavior.  This is not enabled yet but
it'll be set later from the command line.

For simplicity, this is implemented for --stdio only.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sometimes it's useful to see both instructions and their data type
together.  Let's extend the annotate code to use data type profiling
functions.

To make it easy to pass more arguments, introduce a struct to carry
necessary information together.  Also add a new annotation_option called
'code_with_type' to control the behavior.  This is not enabled yet but
it'll be set later from the command line.

For simplicity, this is implemented for --stdio only.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Factor out __hist_entry__get_data_type()</title>
<updated>2025-03-13T07:19:51+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-10T22:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=236ee2569a5de7ae3bf2bce94a4101f528ce7de8'/>
<id>236ee2569a5de7ae3bf2bce94a4101f528ce7de8</id>
<content type='text'>
So that it can only handle a single disasm_line and hopefully make the
code simpler.  This is also a preparation to be called from different
places later.

The NO_TYPE macro was added to distinguish when it failed or needs retry.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So that it can only handle a single disasm_line and hopefully make the
code simpler.  This is also a preparation to be called from different
places later.

The NO_TYPE macro was added to distinguish when it failed or needs retry.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Pass hist_entry to annotate functions</title>
<updated>2025-03-13T07:19:51+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-10T22:49:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe8da6692aa8c1279b97a609f67dd56048b68bec'/>
<id>fe8da6692aa8c1279b97a609f67dd56048b68bec</id>
<content type='text'>
It's a preparation to support code annotation and data type
annotation at the same time.  Data type annotation needs more
information in the hist_entry so it needs to be passed deeper.

Also rename a function with the same name in the builtin-annotate.c
to hist_entry__stdio_annotate since it matches better to the command
line option.  And change the condition inside to be simpler.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's a preparation to support code annotation and data type
annotation at the same time.  Data type annotation needs more
information in the hist_entry so it needs to be passed deeper.

Also rename a function with the same name in the builtin-annotate.c
to hist_entry__stdio_annotate since it matches better to the command
line option.  And change the condition inside to be simpler.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Pass annotation_options to annotation_line__print()</title>
<updated>2025-03-13T07:19:51+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-10T22:49:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9aa3cbbffb166ffac6b276c00a6a991dc8de2f17'/>
<id>9aa3cbbffb166ffac6b276c00a6a991dc8de2f17</id>
<content type='text'>
The annotation_line__print() has many arguments.  But min_percent,
max_lines and percent_type are from struct annotation_options.  So let's
pass a pointer to the option instead of passing them separately to
reduce the number of function arguments.

Actually it has a recursive call if 'queue' is set.  Add a new option
instance to pass different values for the case.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The annotation_line__print() has many arguments.  But min_percent,
max_lines and percent_type are from struct annotation_options.  So let's
pass a pointer to the option instead of passing them separately to
reduce the number of function arguments.

Actually it has a recursive call if 'queue' is set.  Add a new option
instance to pass different values for the case.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Remove unused len parameter from annotation_line__print()</title>
<updated>2025-03-13T07:19:51+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-10T22:49:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1f284082b167d275ca1d291869da4ef3b2261fb9'/>
<id>1f284082b167d275ca1d291869da4ef3b2261fb9</id>
<content type='text'>
It's not used anywhere, let's get rid of it.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's not used anywhere, let's get rid of it.

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250310224925.799005-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Use an array for the disassembler preference</title>
<updated>2025-01-27T23:58:01+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2025-01-24T04:38:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bde4ccfd5ab5361490514fc4af7497989cfbee17'/>
<id>bde4ccfd5ab5361490514fc4af7497989cfbee17</id>
<content type='text'>
Prior to this change a string was used which could cause issues with
an unrecognized disassembler in symbol__disassembler. Change to
initializing an array of perf_disassembler enum values. If a value
already exists then adding it a second time is ignored to avoid array
out of bounds problems present in the previous code. It also allows a
statically sized array and removes memory allocation needs. Errors in
the disassembler string are reported when the config is parsed during
perf annotate or perf top start up. If the array is uninitialized
after processing the config file the default llvm, capstone then
objdump values are added but without a need to parse a string.

Fixes: a6e8a58de629 ("perf disasm: Allow configuring what disassemblers to use")
Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Prior to this change a string was used which could cause issues with
an unrecognized disassembler in symbol__disassembler. Change to
initializing an array of perf_disassembler enum values. If a value
already exists then adding it a second time is ignored to avoid array
out of bounds problems present in the previous code. It also allows a
statically sized array and removes memory allocation needs. Errors in
the disassembler string are reported when the config is parsed during
perf annotate or perf top start up. If the array is uninitialized
after processing the config file the default llvm, capstone then
objdump values are added but without a need to parse a string.

Fixes: a6e8a58de629 ("perf disasm: Allow configuring what disassemblers to use")
Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Prefer passing evsel to evsel-&gt;core.idx</title>
<updated>2025-01-18T18:02:10+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2025-01-17T18:18:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=035f0c279bcfc07314240de273d90f4061aef04d'/>
<id>035f0c279bcfc07314240de273d90f4061aef04d</id>
<content type='text'>
An evsel idx may not be stable due to sorting, evlist removal,
etc. Try to reduce it being part of APIs by explicitly passing the
evsel in annotate code. Internally the code just reads evsel-&gt;core.idx
so behavior is unchanged.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Chen Ni &lt;nichen@iscas.ac.cn&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
An evsel idx may not be stable due to sorting, evlist removal,
etc. Try to reduce it being part of APIs by explicitly passing the
evsel in annotate code. Internally the code just reads evsel-&gt;core.idx
so behavior is unchanged.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Chen Ni &lt;nichen@iscas.ac.cn&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf disasm: Allow configuring what disassemblers to use</title>
<updated>2024-11-13T19:27:35+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2024-11-11T15:17:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a6e8a58de6294578195447596fb975a9027b4d2c'/>
<id>a6e8a58de6294578195447596fb975a9027b4d2c</id>
<content type='text'>
For a long time the perf tools annotation code relied on parsing the
output of binutils's objdump (or its reimplementations, like llvm's),
then augmenting it with samples, allowing navigation, etc.

More recently disassemblers from the capstone and llvm (libraries, not
parsing the output of tools using those libraries to mimic binutils's
objdump output) were introduced.

So when all those methods are available, there is a static preference
for a series of attempts of disassembling a binary, with the 'llvm,
capstone, objdump' sequence being hard coded.

This patch allows users to change that sequence, specifying via a 'perf
config' 'annotate.disassemblers' entry which and in what order
disassemblers should be attempted.

As alluded to in the comments in the source code of this series, this
flexibility is useful for users and developers alike, eliminating the
requirement to rebuild the tool with some specific set of libraries to
see how the output of disassembling would be for one of these methods.

  root@x1:~# rm -f ~/.perfconfig
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  &lt;SNIP&gt;
  symbol__disassemble:
    filename=/usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux,
    sym=update_load_avg, start=0xffffffffb6148fe0, en&gt;
  annotating [0x6ff7170]
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux :
    [0x7407ca0] update_load_avg
  Disassembled with llvm
  annotate.disassemblers=llvm,capstone,objdump
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
	Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 &lt;update_load_avg&gt;:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax

  root@x1:~# perf config annotate.disassemblers=capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = capstone
  root@x1:~#
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  &lt;SNIP&gt;
  Disassembled with capstone
  annotate.disassemblers=capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 &lt;update_load_avg&gt;:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax
  root@x1:~# perf config annotate.disassemblers=objdump,capstone
  root@x1:~# perf config annotate.disassemblers
  annotate.disassemblers=objdump,capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = objdump,capstone
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  Executing: objdump  --start-address=0xffffffff81148fe0 \
		      --stop-address=0xffffffff811497aa  \
		      -d --no-show-raw-insn -S -C "$1"
  Disassembled with objdump
  annotate.disassemblers=objdump,capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent

                Disassembly of section .text:

                ffffffff81148fe0 &lt;update_load_avg&gt;:
                #define DO_ATTACH       0x4

                ffffffff81148fe0 &lt;update_load_avg&gt;:
                #define DO_ATTACH       0x4
                #define DO_DETACH       0x8

                /* Update task and its cfs_rq load average */
                static inline void update_load_avg(struct cfs_rq *cfs_rq,
						   struct sched_entity *se,
						   int flags)
                {
     1.61         push   %r15
                  push   %r14
     1.00         push   %r13
                  mov    %edx,%r13d
     1.90         push   %r12
                  push   %rbp
                  mov    %rsi,%rbp
                  push   %rbx
                  mov    %rdi,%rbx
                  sub    $0x18,%rsp
                }

                /* rq-&gt;task_clock normalized against any time
		   this cfs_rq has spent throttled */
                static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
                {
                if (unlikely(cfs_rq-&gt;throttle_count))
    15.14         mov    0x1a4(%rdi),%eax
  root@x1:~#

After adding a way to select the disassembler from the command line a
'perf test' comparing the output of the various disassemblers should be
introduced, to test these codebases.

Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Link: https://lore.kernel.org/r/20241111151734.1018476-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a long time the perf tools annotation code relied on parsing the
output of binutils's objdump (or its reimplementations, like llvm's),
then augmenting it with samples, allowing navigation, etc.

More recently disassemblers from the capstone and llvm (libraries, not
parsing the output of tools using those libraries to mimic binutils's
objdump output) were introduced.

So when all those methods are available, there is a static preference
for a series of attempts of disassembling a binary, with the 'llvm,
capstone, objdump' sequence being hard coded.

This patch allows users to change that sequence, specifying via a 'perf
config' 'annotate.disassemblers' entry which and in what order
disassemblers should be attempted.

As alluded to in the comments in the source code of this series, this
flexibility is useful for users and developers alike, eliminating the
requirement to rebuild the tool with some specific set of libraries to
see how the output of disassembling would be for one of these methods.

  root@x1:~# rm -f ~/.perfconfig
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  &lt;SNIP&gt;
  symbol__disassemble:
    filename=/usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux,
    sym=update_load_avg, start=0xffffffffb6148fe0, en&gt;
  annotating [0x6ff7170]
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux :
    [0x7407ca0] update_load_avg
  Disassembled with llvm
  annotate.disassemblers=llvm,capstone,objdump
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
	Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
    /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 &lt;update_load_avg&gt;:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax

  root@x1:~# perf config annotate.disassemblers=capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = capstone
  root@x1:~#
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  &lt;SNIP&gt;
  Disassembled with capstone
  annotate.disassemblers=capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent       0xffffffff81148fe0 &lt;update_load_avg&gt;:
     1.61         pushq   %r15
                  pushq   %r14
     1.00         pushq   %r13
                  movl    %edx,%r13d
     1.90         pushq   %r12
                  pushq   %rbp
                  movq    %rsi,%rbp
                  pushq   %rbx
                  movq    %rdi,%rbx
                  subq    $0x18,%rsp
    15.14         movl    0x1a4(%rdi),%eax
  root@x1:~# perf config annotate.disassemblers=objdump,capstone
  root@x1:~# perf config annotate.disassemblers
  annotate.disassemblers=objdump,capstone
  root@x1:~# cat ~/.perfconfig
  # this file is auto-generated.
  [annotate]
	  disassemblers = objdump,capstone
  root@x1:~# perf annotate -v --stdio2 update_load_avg
  Executing: objdump  --start-address=0xffffffff81148fe0 \
		      --stop-address=0xffffffff811497aa  \
		      -d --no-show-raw-insn -S -C "$1"
  Disassembled with objdump
  annotate.disassemblers=objdump,capstone
  Samples: 66  of event 'cpu_atom/cycles/P', 10000 Hz,
  Event count (approx.): 5185444, [percent: local period]
  update_load_avg()
  /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
  Percent

                Disassembly of section .text:

                ffffffff81148fe0 &lt;update_load_avg&gt;:
                #define DO_ATTACH       0x4

                ffffffff81148fe0 &lt;update_load_avg&gt;:
                #define DO_ATTACH       0x4
                #define DO_DETACH       0x8

                /* Update task and its cfs_rq load average */
                static inline void update_load_avg(struct cfs_rq *cfs_rq,
						   struct sched_entity *se,
						   int flags)
                {
     1.61         push   %r15
                  push   %r14
     1.00         push   %r13
                  mov    %edx,%r13d
     1.90         push   %r12
                  push   %rbp
                  mov    %rsi,%rbp
                  push   %rbx
                  mov    %rdi,%rbx
                  sub    $0x18,%rsp
                }

                /* rq-&gt;task_clock normalized against any time
		   this cfs_rq has spent throttled */
                static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
                {
                if (unlikely(cfs_rq-&gt;throttle_count))
    15.14         mov    0x1a4(%rdi),%eax
  root@x1:~#

After adding a way to select the disassembler from the command line a
'perf test' comparing the output of the various disassemblers should be
introduced, to test these codebases.

Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Link: https://lore.kernel.org/r/20241111151734.1018476-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf dwarf-regs: Pass accurate disassembly machine to get_dwarf_regnum</title>
<updated>2024-11-09T16:39:13+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-11-08T23:45:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9fc4489a16f41d9306af6c94ca97be6364d51ea9'/>
<id>9fc4489a16f41d9306af6c94ca97be6364d51ea9</id>
<content type='text'>
Rather than pass 0/EM_NONE, use the value computed in the disasm
struct arch. Switch the EM_NONE case to EM_HOST, rewriting EM_NONE if
it were passed to get_dwarf_regnum. Pass a flags value as
architectures like csky need the flags to determine the ABI variant.

Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Anup Patel &lt;anup@brainfault.org&gt;
Cc: Yang Jihong &lt;yangjihong@bytedance.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Shenlin Liang &lt;liangshenlin@eswincomputing.com&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Guilherme Amadio &lt;amadio@gentoo.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Chen Pei &lt;cp0613@linux.alibaba.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Cc: Aditya Gupta &lt;adityag@linux.ibm.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao &lt;maobibo@loongson.cn&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: Atish Patra &lt;atishp@rivosinc.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-6-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than pass 0/EM_NONE, use the value computed in the disasm
struct arch. Switch the EM_NONE case to EM_HOST, rewriting EM_NONE if
it were passed to get_dwarf_regnum. Pass a flags value as
architectures like csky need the flags to determine the ABI variant.

Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Anup Patel &lt;anup@brainfault.org&gt;
Cc: Yang Jihong &lt;yangjihong@bytedance.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Shenlin Liang &lt;liangshenlin@eswincomputing.com&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Guilherme Amadio &lt;amadio@gentoo.org&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Chen Pei &lt;cp0613@linux.alibaba.com&gt;
Cc: Leo Yan &lt;leo.yan@linux.dev&gt;
Cc: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Cc: Aditya Gupta &lt;adityag@linux.ibm.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao &lt;maobibo@loongson.cn&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: Atish Patra &lt;atishp@rivosinc.com&gt;
Cc: Dima Kogan &lt;dima@secretsauce.net&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-6-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf env: Find correct branch counter info on hybrid</title>
<updated>2024-09-11T16:08:46+00:00</updated>
<author>
<name>Kan Liang</name>
<email>kan.liang@linux.intel.com</email>
</author>
<published>2024-09-09T18:42:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=edf3ce0ed38e2d04a817984e4ea7f05b18102926'/>
<id>edf3ce0ed38e2d04a817984e4ea7f05b18102926</id>
<content type='text'>
No event is printed in the "Branch Counter" column on hybrid machines.

For example,

  $ perf record -e "{cpu_core/branch-instructions/pp,cpu_core/branches/}:S" -j any,counter
  $ perf report --total-cycles

  # Branch counter abbr list:
  # cpu_core/branch-instructions/pp = A
  # cpu_core/branches/ = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter
  # ...............  ..............  ...........  ..........  ..............
            44.54%          727.1K        0.00%           1   |+   |+   |
            36.31%          592.7K        0.00%           2   |+   |+   |
            17.83%          291.1K        0.00%           1   |+   |+   |

The branch counter information (br_cntr_width and br_cntr_nr) in the
perf_env is retrieved from the CPU_PMU_CAPS. However, the CPU_PMU_CAPS
is not available on hybrid machines. Without the width information, the
number of occurrences of an event cannot be calculated.

For a hybrid machine, the caps information should be retrieved from the
PMU_CAPS, and stored in the perf_env-&gt;pmu_caps.

Add a perf_env__find_br_cntr_info() to return the correct branch counter
information from the corresponding fields.

Committer notes:

While testing I couldn't see those "Branch counter" columns enabled by
pressing 'B' on the TUI, after reporting it to the list Kan explained
the situation:

&lt;quote Kan Liang&gt;
For a hybrid client, the "Branch Counter" feature is only supported
starting from the just released Lunar Lake. Perf falls back to only
"ANY" on your Raptor Lake.

The "The branch counter is not available" message is expected.

Here is the 'perf evlist' result from my Lunar Lake machine,

  # perf evlist -v
  cpu_core/branch-instructions/pp: type: 4 (cpu_core), size: 136, config: 0xc4 (branch-instructions), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|READ|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID|GROUP|LOST, disabled: 1, freq: 1, enable_on_exec: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, branch_sample_type: ANY|COUNTERS
  #
&lt;/quote&gt;

Fixes: 6f9d8d1de2c61288 ("perf script: Add branch counters")
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240909184201.553519-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No event is printed in the "Branch Counter" column on hybrid machines.

For example,

  $ perf record -e "{cpu_core/branch-instructions/pp,cpu_core/branches/}:S" -j any,counter
  $ perf report --total-cycles

  # Branch counter abbr list:
  # cpu_core/branch-instructions/pp = A
  # cpu_core/branches/ = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter
  # ...............  ..............  ...........  ..........  ..............
            44.54%          727.1K        0.00%           1   |+   |+   |
            36.31%          592.7K        0.00%           2   |+   |+   |
            17.83%          291.1K        0.00%           1   |+   |+   |

The branch counter information (br_cntr_width and br_cntr_nr) in the
perf_env is retrieved from the CPU_PMU_CAPS. However, the CPU_PMU_CAPS
is not available on hybrid machines. Without the width information, the
number of occurrences of an event cannot be calculated.

For a hybrid machine, the caps information should be retrieved from the
PMU_CAPS, and stored in the perf_env-&gt;pmu_caps.

Add a perf_env__find_br_cntr_info() to return the correct branch counter
information from the corresponding fields.

Committer notes:

While testing I couldn't see those "Branch counter" columns enabled by
pressing 'B' on the TUI, after reporting it to the list Kan explained
the situation:

&lt;quote Kan Liang&gt;
For a hybrid client, the "Branch Counter" feature is only supported
starting from the just released Lunar Lake. Perf falls back to only
"ANY" on your Raptor Lake.

The "The branch counter is not available" message is expected.

Here is the 'perf evlist' result from my Lunar Lake machine,

  # perf evlist -v
  cpu_core/branch-instructions/pp: type: 4 (cpu_core), size: 136, config: 0xc4 (branch-instructions), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|READ|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID|GROUP|LOST, disabled: 1, freq: 1, enable_on_exec: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, branch_sample_type: ANY|COUNTERS
  #
&lt;/quote&gt;

Fixes: 6f9d8d1de2c61288 ("perf script: Add branch counters")
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240909184201.553519-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
