linux-stable.git/tools/perf/util, branch v6.14

perf cpumap: Fix die and cluster IDs

2025-01-28T18:03:26+00:00

Now that filename__read_int() returns -errno instead of -1, these
statements need to be updated; otherwise error values will be used as
die IDs.

This appears as a -2 die ID when the platform doesn't export one:

  $ perf stat --per-core -a -- true

  S36-D-2-C0            1               9.45 msec cpu-clock

And the session topology test fails:

  $ perf test -vvv topology

  CPU 0, core 0, socket 36
  CPU 1, core 1, socket 36
  CPU 2, core 2, socket 36
  CPU 3, core 3, socket 36
  FAILED tests/topology.c:137 Cpu map - Die ID doesn't match
  ---- end(-1) ----
  38: Session topology                                                : FAILED!
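
The failure mode above can be sketched in plain C. This is only an
illustration, not the perf code: read_id_stub() is a hypothetical
stand-in that models the new -errno return convention of
filename__read_int(), and the two die_id_*() helpers are assumed shapes
of the old and fixed callers.

```c
#include <errno.h>

/* Hypothetical stand-in for the sysfs read: 0 on success, -errno on failure. */
static int read_id_stub(int exported, int file_value, int *value)
{
	if (!exported)
		return -ENOENT;
	*value = file_value;
	return 0;
}

/* Old pattern: passes the error code through, so -ENOENT shows up as an ID. */
static int die_id_old(int exported, int file_value)
{
	int value = 0;
	int ret = read_id_stub(exported, file_value, &value);

	return ret ? ret : value;
}

/* Fixed pattern: any failure maps to -1, the "no ID" marker callers expect. */
static int die_id_fixed(int exported, int file_value)
{
	int value = 0;
	int ret = read_id_stub(exported, file_value, &value);

	return ret < 0 ? -1 : value;
}
```

With the old pattern a platform that does not export the file yields
-ENOENT as the ID, which matches the "D-2" seen in the output above.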

Fixes: 05be17eed774 ("tool api fs: Correctly encode errno for read/write open failures")
Reported-by: Thomas Richter 
Signed-off-by: James Clark 
Acked-by: Namhyung Kim 
Link: https://lore.kernel.org/r/20241218115552.912517-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim

perf annotate: Use an array for the disassembler preference

2025-01-27T23:58:01+00:00

Prior to this change a string was used, which could cause issues with
an unrecognized disassembler in symbol__disassembler. Change to
initializing an array of perf_disassembler enum values. If a value
already exists then adding it a second time is ignored, which avoids the
array out-of-bounds problems present in the previous code. It also
allows a statically sized array and removes the need for memory
allocation. Errors in the disassembler string are reported when the
config is parsed during perf annotate or perf top start up. If the array
is uninitialized after processing the config file, the default llvm,
capstone, then objdump values are added without needing to parse a
string.
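
A minimal sketch of the idea, assuming a small enum and a fixed-size
preference array. The names below (enum disasm, struct disasm_prefs,
prefs_add, prefs_set_defaults) are hypothetical and are not the actual
perf identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum; the real perf_disassembler values may differ. */
enum disasm {
	DISASM_UNKNOWN = 0,
	DISASM_LLVM,
	DISASM_CAPSTONE,
	DISASM_OBJDUMP,
	DISASM_MAX,
};

#define MAX_PREFS (DISASM_MAX - 1)

struct disasm_prefs {
	enum disasm order[MAX_PREFS];	/* statically sized, no allocation */
	size_t nr;
};

/* Append d unless it is invalid or already present; true if it was added. */
static bool prefs_add(struct disasm_prefs *p, enum disasm d)
{
	if (d <= DISASM_UNKNOWN || d >= DISASM_MAX)
		return false;
	for (size_t i = 0; i < p->nr; i++)
		if (p->order[i] == d)
			return false;	/* duplicate: ignored */
	/* Cannot overflow: at most MAX_PREFS distinct valid values exist. */
	p->order[p->nr++] = d;
	return true;
}

/* Fill in the default order only when the config left the array empty. */
static void prefs_set_defaults(struct disasm_prefs *p)
{
	if (p->nr)
		return;
	prefs_add(p, DISASM_LLVM);
	prefs_add(p, DISASM_CAPSTONE);
	prefs_add(p, DISASM_OBJDUMP);
}
```

Because duplicates are rejected, the array can never hold more entries
than there are enum values, which is what makes the static size safe.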

Fixes: a6e8a58de629 ("perf disasm: Allow configuring what disassemblers to use")
Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/
Signed-off-by: Ian Rogers 
Tested-by: Namhyung Kim 
Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com
Signed-off-by: Namhyung Kim

perf trace: Fix BPF loading failure (-E2BIG)

2025-01-23T23:55:52+00:00

As reported by Namhyung Kim and acknowledged by Qiao Zhao (link:
https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@kernel.org/),
on certain machines, perf trace failed to load the BPF program into the
kernel. The verifier runs perf trace's BPF program for up to 1 million
instructions, returning an E2BIG error, whereas the perf trace BPF
program should be much less complex than that. This patch aims to fix
the issue described above.

The E2BIG problem from clang-15 to clang-16 is caused by this line:
 } else if (size < 0 && size >= -6) { /* buffer */

Specifically this check: size < 0. It seems like clang generates a cool
optimization for this sign check that breaks things.

Making 'size' s64 and using
 } else if ((int)size < 0 && size >= -6) { /* buffer */

Solves the problem. This is some Hogwarts magic.

And the unbounded access of clang-12 and clang-14 (clang-13 works this
time) is fixed by making variable 'aug_size' s64.

As for this:
-if (aug_size > TRACE_AUG_MAX_BUF)
-	aug_size = TRACE_AUG_MAX_BUF;
+aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index];

This makes the BPF skel generated by clang-18 work. Yes, new clangs
introduce problems too.

Sorry, I only know that it works, but I don't know how it works. I'm not
an expert in the BPF verifier. I really hope this is not a kernel
version issue, as that would make the test case (kernel_nr) *
(clang_nr), a true horror story. I will test it on more kernel versions
in the future.

Fixes: 395d38419f18 ("perf trace augmented_raw_syscalls: Add more checks to pass the verifier")
Reported-by: Namhyung Kim 
Signed-off-by: Howard Chu 
Tested-by: Namhyung Kim 
Link: https://lore.kernel.org/r/20241213023047.541218-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim

perf annotate: Prefer passing evsel to evsel->core.idx

2025-01-18T18:02:10+00:00

An evsel idx may not be stable due to sorting, evlist removal,
etc. Try to reduce it being part of APIs by explicitly passing the
evsel in annotate code. Internally the code just reads evsel->core.idx
so behavior is unchanged.

Signed-off-by: Ian Rogers 
Cc: Chen Ni 
Cc: Athira Rajeev 
Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com
Signed-off-by: Namhyung Kim

perf hist: Fix bogus profiles when filters are enabled

2025-01-16T21:43:28+00:00

When a filtered column is not present in the sort order, profiles become
arbitrarily broken. Filtered and non-filtered entries are collapsed
together, and the filtered-by field ends up with a random value (either
from a filtered or non-filtered entry). If we end up with a filtered
entry/value, then the whole collapsed entry will be filtered out and will
be missing from the profile. If we end up with a non-filtered entry/value,
then the overhead value will be wrongly larger (it includes some subset
of the filtered-out samples).

This leads to very confusing profiles. The problem is hard to notice,
and, if noticed, hard to understand. If the filter is for a single value,
then it can be fixed by adding the corresponding field to the sort order
(provided the user understood the problem). But if the filter is for
multiple values, it's impossible to fix because there is no concept of
binary sorting based on a filter predicate (we want to group all
non-filtered values in one bucket, and all filtered values in another).

Examples of affected commands:
perf report --tid=123
perf report --sort overhead,symbol --comm=foo,bar

Fix this by considering filtered status as the highest priority
sort/collapse predicate.
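
The ordering rule can be sketched with a toy comparator. This is an
illustration under assumed names (struct entry, entry_cmp, sort_entries
are hypothetical), not the hist_entry code itself: the filtered flag is
compared first, and the regular sort key only breaks ties within the
same filtered status.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical entry: one filter flag plus a single sort key. */
struct entry {
	bool filtered;
	int key;
};

/* Filtered status is the highest-priority predicate; key comes second. */
static int entry_cmp(const void *a, const void *b)
{
	const struct entry *l = a, *r = b;

	if (l->filtered != r->filtered)
		return l->filtered ? 1 : -1;
	return (l->key > r->key) - (l->key < r->key);
}

static void sort_entries(struct entry *e, size_t n)
{
	qsort(e, n, sizeof(*e), entry_cmp);
}
```

Since two entries with different filtered status never compare equal,
a filtered and a non-filtered entry can no longer be collapsed into one.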

As a side effect this effectively adds a new feature of showing a profile
where several lines are combined based on an arbitrary filtering predicate.
For example, showing symbols from binaries foo and bar combined together,
but not from other binaries; or showing combined overhead of several
particular threads.

Signed-off-by: Dmitry Vyukov 
Link: https://lore.kernel.org/r/359dc444ce94d20e59d3a9e360c36fbeac833a04.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim

perf hist: Deduplicate cmp/sort/collapse code

2025-01-16T21:43:28+00:00

Application of cmp/sort/collapse fmt callbacks is duplicated 6 times.
Factor it into a common helper function. NFC.
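
One possible shape for such a helper, shown as a sketch only. The
struct fmt, enum fmt_op and apply_fmts() names are hypothetical and the
real helper's signature may differ; the point is that a single loop picks
the callback to apply instead of repeating the loop per callback kind.

```c
#include <stddef.h>

struct item {
	int a, b;
};

typedef int (*cmp_fn)(const struct item *l, const struct item *r);

/* Hypothetical format descriptor with the three callback kinds. */
struct fmt {
	cmp_fn cmp, collapse, sort;
};

enum fmt_op { OP_CMP, OP_COLLAPSE, OP_SORT };

static int cmp_a(const struct item *l, const struct item *r)
{
	return (l->a > r->a) - (l->a < r->a);
}

static int cmp_b(const struct item *l, const struct item *r)
{
	return (l->b > r->b) - (l->b < r->b);
}

/* Apply the selected callback of each fmt in order; first non-zero wins. */
static int apply_fmts(const struct fmt *fmts, size_t nr, enum fmt_op op,
		      const struct item *l, const struct item *r)
{
	for (size_t i = 0; i < nr; i++) {
		cmp_fn fn = op == OP_CMP ? fmts[i].cmp :
			    op == OP_COLLAPSE ? fmts[i].collapse : fmts[i].sort;
		int ret = fn ? fn(l, r) : 0;	/* missing callback: skip */

		if (ret)
			return ret;
	}
	return 0;
}
```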

Signed-off-by: Dmitry Vyukov 
Link: https://lore.kernel.org/r/84c4b55614e24a344f86ae0db62e8fa8f251f874.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim

perf config: Add a function to set one variable in .perfconfig

2025-01-14T18:05:56+00:00

This allows setting a variable from some other tool. For example, the
"wallclock" patchset needs it so that the user can opt in to having
that key in the sort order for 'perf report'.

Cc: Adrian Hunter 
Cc: Dmitriy Vyukov 
Cc: Ian Rogers 
Cc: Ingo Molnar 
Cc: James Clark 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Link: https://lore.kernel.org/lkml/Z4akewi7UPXpagce@x1
Signed-off-by: Arnaldo Carvalho de Melo

perf probe: Rename err label

2025-01-14T17:57:19+00:00

Rename err to out to avoid confusion because buf is still supposed to be
freed in non-error cases.
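
The naming point can be illustrated with a generic single-exit function.
This is a sketch with a hypothetical helper (copy_len), unrelated to the
actual probe code: the label is reached on success as well as on failure,
so "out" describes it better than "err".

```c
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success, -1 on error; buf is freed on both paths. */
static int copy_len(const char *src, size_t *len)
{
	int ret = -1;
	char *buf = NULL;
	size_t n;

	if (!src)
		goto out;

	n = strlen(src);
	buf = malloc(n + 1);
	if (!buf)
		goto out;

	memcpy(buf, src, n + 1);
	*len = strlen(buf);
	ret = 0;
out:
	/* Shared exit: runs for success and failure alike. */
	free(buf);
	return ret;
}
```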

Reviewed-by: Arnaldo Carvalho de Melo 
Signed-off-by: James Clark 
Tested-by: Namhyung Kim 
Acked-by: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Dr. David Alan Gilbert 
Cc: Ian Rogers 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Leo Yan 
Cc: Mark Rutland 
Cc: Peter Zijlstra 
Link: https://lore.kernel.org/r/20241211085525.519458-3-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo

perf record: Fix segfault with --off-cpu when debuginfo is not enabled

2025-01-14T17:57:19+00:00

When the kernel is built without debuginfo, running 'perf record' with
--off-cpu results in a segfault as below:

   ./perf record --off-cpu -e dummy sleep 1
   libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
   libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
   libbpf: failed to find valid kernel BTF
   Segmentation fault (core dumped)

The backtrace pointed to:

   #0  0x00000000100fb17c in btf.type_cnt ()
   #1  0x00000000100fc1a8 in btf_find_by_name_kind ()
   #2  0x00000000100fc38c in btf.find_by_name_kind ()
   #3  0x00000000102ee3ac in off_cpu_prepare ()
   #4  0x000000001002f78c in cmd_record ()
   #5  0x00000000100aee78 in run_builtin ()
   #6  0x00000000100af3e4 in handle_internal_command ()
   #7  0x000000001001004c in main ()

Code sequence is:

   static void check_sched_switch_args(void)
   {
        struct btf *btf = btf__load_vmlinux_btf();
        const struct btf_type *t1, *t2, *t3;
        u32 type_id;

        type_id = btf__find_by_name_kind(btf, "btf_trace_sched_switch",
                                         BTF_KIND_TYPEDEF);

btf__load_vmlinux_btf() fails when CONFIG_DEBUG_INFO_BTF is not enabled.

Here btf__find_by_name_kind() calls btf__type_cnt() with a NULL btf value,
which results in a segfault.

To fix this, add a check to see if btf is not NULL before invoking
btf__find_by_name_kind().
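
The guard can be sketched without libbpf. Everything below is a
hypothetical stand-in (struct type_table, load_table, find_type,
check_args); it only models a loader that may return NULL and a lookup
that dereferences its argument unconditionally.

```c
#include <stddef.h>

/* Hypothetical stand-in for the type information object. */
struct type_table {
	int nr_types;
};

static struct type_table vmlinux_table = { .nr_types = 3 };

/* Models a loader that returns NULL when type info is not available. */
static struct type_table *load_table(int available)
{
	return available ? &vmlinux_table : NULL;
}

/* Models a lookup that dereferences its argument without checking it. */
static int find_type(const struct type_table *t)
{
	return t->nr_types;
}

static int check_args(int available)
{
	struct type_table *t = load_table(available);

	if (!t)		/* guard: never pass NULL to the lookup */
		return -1;
	return find_type(t);
}
```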

Reviewed-by: Namhyung Kim 
Signed-off-by: Athira Rajeev 
Cc: Adrian Hunter 
Cc: Disha Goel 
Cc: Hari Bathini 
Cc: Ian Rogers 
Cc: Jiri Olsa 
Cc: Kajol Jain 
Cc: Madhavan Srinivasan 
Link: https://lore.kernel.org/r/20241223135813.8175-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo

perf tools: Fixup end address of modules

2025-01-10T13:59:42+00:00

In machine__create_module(), it reads /proc/modules to get a list of
modules in the system.  The file shows the start address (of text) and
the size of the module so it uses the info to reconstruct system memory
maps for symbol resolution.

But module memory consists of multiple segments and they can be
scattered.  Currently perf tools assume they are contiguous and see some
overlaps.  This can confuse the tool when it looks for the map containing
a given address.

As we mostly care about the function symbols in the text segment, it can
fix up the size or end address of modules when there's an overlap.  We
can use maps__fixup_end() which updates the end address using the start
address of the next map.
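
The end-address fixup can be sketched on a plain array of ranges. This
is a simplified illustration with hypothetical names (struct range,
fixup_ends), not the maps__fixup_end() implementation; it assumes the
ranges are already sorted by start address and only trims a range when
it overlaps the next one.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start, end;
};

/* Trim each range that overlaps its successor to end at the next start. */
static void fixup_ends(struct range *r, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++) {
		if (r[i].end > r[i + 1].start)
			r[i].end = r[i + 1].start;
	}
}
```

After the fixup an address falls into at most one range, so a lookup by
address is unambiguous.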

Ideally it should be able to track other segments (like data/rodata),
but that would require some changes in /proc/modules IMHO.

Reported-by: Blake Jones 
Signed-off-by: Namhyung Kim 
Acked-by: Ian Rogers 
Cc: Adrian Hunter 
Cc: Daniel Gomez 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Luis Chamberlain 
Cc: Peter Zijlstra 
Cc: Petr Pavlu 
Cc: Sami Tolvanen 
Link: https://lore.kernel.org/r/20241218220453.203069-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo