summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2025-11-13perf build: Don't fail fast path feature detection when binutils-devel is ↵Arnaldo Carvalho de Melo
not available This is one more remnant of the BUILD_NONDISTRO series to make building with binutils-devel opt-in due to license incompatibility. In this case just the references at link time were still in place, which make building the test-all.bin file fail, which wasn't detected before probably because the last test was done with binutils-devel available, doh. Now: $ rpm -q binutils-devel package binutils-devel is not installed $ file /tmp/build/perf-tools/feature/test-all.bin /tmp/build/perf-tools/feature/test-all.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4b5388a346b51f1b993f0b0dbd49f4570769b03c, for GNU/Linux 3.2.0, not stripped $ Fixes: 970ae86307718c34 ("perf build: The bfd features are opt-in, stop testing for them by default") Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13perf header: Write bpf_prog (infos|btfs)_cnt to data fileThomas Falcon
With commit f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info"), the write_bpf_( prog_info() | btf() ) functions exit without writing anything if env->bpf_prog.(infos| btfs)_cnt is zero. process_bpf_( prog_info() | btf() ), however, still expect a "count" value to exist in the data file. If btf information is empty, for example, process_bpf_btf will read garbage or some other data as the number of btf nodes in the data file. As a result, the data file will not be processed correctly. Instead, write the count to the data file and exit if it is zero. Fixes: f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13Merge tag 'v6.18-rc5' into objtool/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-11-12perf test: Add a perf event fallback testZide Chen
This adds test cases to verify the precise ip fallback logic: - If the system supports precise ip, for an event given with the maximum precision level, it should be able to decrease precise_ip to find a supported level. - The same fallback behavior should also work in more complex scenarios, such as event groups or when PEBS is involved Additional fallback tests, such as those covering missing feature cases, can be added in the future. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Ian Rogers <irogers!@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Align metric output without eventsNamhyung Kim
One of my concern in the perf stat output was the alignment in the metrics and shadow stats. I think it missed to calculate the basic output length using COUNTS_LEN and EVNAME_LEN but missed to add the unit length like "msec" and surround 2 spaces. I'm not sure why it's not printed below though. But anyway, now it shows correctly aligned metric output. $ perf stat true Performance counter stats for 'true': 859,772 task-clock # 0.395 CPUs utilized 0 context-switches # 0.000 /sec 0 cpu-migrations # 0.000 /sec 56 page-faults # 65.134 K/sec 1,075,022 instructions # 0.86 insn per cycle 1,255,911 cycles # 1.461 GHz 220,573 branches # 256.548 M/sec 7,381 branch-misses # 3.35% of all branches TopdownL1 # 19.2 % tma_retiring # 28.6 % tma_backend_bound # 9.5 % tma_bad_speculation # 42.6 % tma_frontend_bound 0.002174871 seconds time elapsed ^ | 0.002154000 seconds user | 0.000000000 seconds sys here Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf tool_pmu: Make core_wide and target_cpu json eventsIan Rogers
For the sake of better documentation, add core_wide and target_cpu to the tool.json. When the values of system_wide and user_requested_cpu_list are unknown, use the values from the global stat_config. Example output showing how '-a' modifies the values in `perf stat`: ``` $ perf stat -e core_wide,target_cpu true Performance counter stats for 'true': 0 core_wide 0 target_cpu 0.000993787 seconds time elapsed 0.001128000 seconds user 0.000000000 seconds sys $ perf stat -e core_wide,target_cpu -a true Performance counter stats for 'system wide': 1 core_wide 1 target_cpu 0.002271723 seconds time elapsed $ perf list ... tool: core_wide [1 if not SMT,if SMT are events being gathered on all SMT threads 1 otherwise 0. Unit: tool] ... target_cpu [1 if CPUs being analyzed,0 if threads/processes. Unit: tool] ... ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat csv: Update test expectations and eventsIan Rogers
Explicitly use a metric rather than implicitly expecting '-e instructions,cycles' to produce a metric. Use a metric with software events to make it more compatible. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat: Update test expectations and eventsIan Rogers
test_stat_record_report and test_stat_record_script used default output which triggers a bug when sending metrics. As this isn't relevant to the test switch to using named software events. Update the match in test_hybrid as the cycles event is now cpu-cycles to workaround potential ARM issues. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat: Update shadow test to use metricsIan Rogers
Previously '-e cycles,instructions' would implicitly create an IPC metric. This now has to be explicit with '-M insn_per_cycle'. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test metrics: Update all metrics for possibly failing default metricsIan Rogers
Default metrics may use unsupported events and be ignored. These metrics shouldn't cause metric testing to fail. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat: Update std_output testing metric expectationsIan Rogers
Make the expectations match json metrics rather than the previous hard coded ones. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat: Ignore failures in Default[234] metricgroupsIan Rogers
The Default[234] metric groups may contain unsupported legacy events. Allow those metric groups to fail. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf test stat+json: Improve metric-only testingIan Rogers
When testing metric-only, pass a metric to perf rather than expecting a hard coded metric value to be generated. Remove keys that were really metric-only units and instead don't expect metric only to have a matching json key as it encodes metrics as {"metric_name", "metric_value"}. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Remove "unit" workarounds for metric-onlyIan Rogers
Remove code that tested the "unit" as in KB/sec for certain hard coded metric values and did workarounds. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Sort default events/metricsIan Rogers
To improve the readability of default events/metrics, sort the evsels after the Default metric groups have be parsed. Before: ``` $ perf stat -a sleep 1 Performance counter stats for 'system wide': 22,087 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 10.3 % tma_bad_speculation # 25.8 % tma_frontend_bound # 34.5 % tma_backend_bound # 29.3 % tma_retiring 7,829 page-faults # nan faults/sec page_faults_per_second 880,144,270 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (50.10%) 1,693,081,235 cpu_core/cpu-cycles/ # nan GHz cycles_frequency TopdownL1 (cpu_atom) # 20.5 % tma_bad_speculation # 13.8 % tma_retiring (50.26%) # 34.6 % tma_frontend_bound (50.23%) 89,326,916 cpu_atom/branches/ # nan M/sec branch_frequency (60.19%) 538,123,088 cpu_core/branches/ # nan M/sec branch_frequency 1,368 cpu-migrations # nan migrations/sec migrations_per_second # 31.1 % tma_backend_bound (60.19%) 0.00 msec cpu-clock # 0.0 CPUs CPUs_utilized 485,744,856 cpu_atom/instructions/ # 0.6 instructions insn_per_cycle (59.87%) 3,093,112,283 cpu_core/instructions/ # 1.8 instructions insn_per_cycle 4,939,427 cpu_atom/branch-misses/ # 5.0 % branch_miss_rate (49.77%) 7,632,248 cpu_core/branch-misses/ # 1.4 % branch_miss_rate 1.005084693 seconds time elapsed ``` After: ``` $ perf stat -a sleep 1 Performance counter stats for 'system wide': 22,165 context-switches # nan cs/sec cs_per_second 0.00 msec cpu-clock # 0.0 CPUs CPUs_utilized 2,260 cpu-migrations # nan migrations/sec migrations_per_second 20,476 page-faults # nan faults/sec page_faults_per_second 17,052,357 cpu_core/branch-misses/ # 1.5 % branch_miss_rate 1,120,090,590 cpu_core/branches/ # nan M/sec branch_frequency 3,402,892,275 cpu_core/cpu-cycles/ # nan GHz cycles_frequency 6,129,236,701 cpu_core/instructions/ # 1.8 instructions insn_per_cycle 6,159,523 cpu_atom/branch-misses/ # 3.1 % branch_miss_rate (49.86%) 222,158,812 cpu_atom/branches/ # nan M/sec branch_frequency (50.25%) 1,547,610,244 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (50.40%) 1,304,901,260 cpu_atom/instructions/ # 0.8 instructions insn_per_cycle (50.41%) TopdownL1 (cpu_core) # 13.7 % tma_bad_speculation # 23.5 % tma_frontend_bound # 33.3 % tma_backend_bound # 29.6 % tma_retiring TopdownL1 (cpu_atom) # 32.1 % tma_backend_bound (59.65%) # 30.1 % tma_frontend_bound (59.51%) # 22.3 % tma_bad_speculation # 15.5 % tma_retiring (59.53%) 1.008405429 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Fix default metricgroup display on hybridIan Rogers
The logic to skip output of a default metric line was firing on Alderlake and not displaying 'TopdownL1 (cpu_atom)'. Remove the need_full_name check as it is equivalent to the different PMU test in the cases we care about, merge the 'if's and flip the evsel of the PMU test. The 'if' is now basically saying, if the output matches the last printed output then skip the output. Before: ``` TopdownL1 (cpu_core) # 11.3 % tma_bad_speculation # 24.3 % tma_frontend_bound TopdownL1 (cpu_core) # 33.9 % tma_backend_bound # 30.6 % tma_retiring # 42.2 % tma_backend_bound # 25.0 % tma_frontend_bound (49.81%) # 12.8 % tma_bad_speculation # 20.0 % tma_retiring (59.46%) ``` After: ``` TopdownL1 (cpu_core) # 8.3 % tma_bad_speculation # 43.7 % tma_frontend_bound # 30.7 % tma_backend_bound # 17.2 % tma_retiring TopdownL1 (cpu_atom) # 31.9 % tma_backend_bound # 37.6 % tma_frontend_bound (49.66%) # 18.0 % tma_bad_speculation # 12.6 % tma_retiring (59.58%) ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Remove hard coded shadow metricsIan Rogers
Now that the metrics are encoded in common json the hard coded printing means the metrics are shown twice. Remove the hard coded version. This means that when specifying events, and those events correspond to a hard coded metric, the metric will no longer be displayed. The metric will be displayed if the metric is requested. Due to the adhoc printing in the previous approach it was often found frustrating, the new approach avoids this. The default perf stat output on an alderlake now looks like: ``` $ perf stat -a -- sleep 1 Performance counter stats for 'system wide': 19,697 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 10.7 % tma_bad_speculation # 24.9 % tma_frontend_bound TopdownL1 (cpu_core) # 34.3 % tma_backend_bound # 30.1 % tma_retiring 6,593 page-faults # nan faults/sec page_faults_per_second 729,065,658 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (49.79%) 1,605,131,101 cpu_core/cpu-cycles/ # nan GHz cycles_frequency # 19.7 % tma_bad_speculation # 14.2 % tma_retiring (50.14%) # 37.3 % tma_frontend_bound (50.31%) 87,302,268 cpu_atom/branches/ # nan M/sec branch_frequency (60.27%) 512,046,956 cpu_core/branches/ # nan M/sec branch_frequency 1,111 cpu-migrations # nan migrations/sec migrations_per_second # 28.8 % tma_backend_bound (60.26%) 0.00 msec cpu-clock # 0.0 CPUs CPUs_utilized 392,509,323 cpu_atom/instructions/ # 0.6 instructions insn_per_cycle (60.19%) 2,990,369,310 cpu_core/instructions/ # 1.9 instructions insn_per_cycle 3,493,478 cpu_atom/branch-misses/ # 5.9 % branch_miss_rate (49.69%) 7,297,531 cpu_core/branch-misses/ # 1.4 % branch_miss_rate 1.006621701 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf script: Change metric format to use json metricsIan Rogers
The metric format option isn't properly supported. This change improves that by making the sample events update the counts of an evsel, where the shadow metric code expects to read the values. To support printing metrics, metrics need to be found. This is done on the first attempt to print a metric. Every metric is parsed and then the evsels in the metric's evlist compared to those in perf script using the perf_event_attr type and config. If the metric matches then it is added for printing. As an event in the perf script's evlist may have >1 metric id, or different leader for aggregation, the first metric matched will be displayed in those cases. An example use is: ``` $ perf record -a -e '{instructions,cpu-cycles}:S' -a -- sleep 1 $ perf script -F period,metric ... 867817 metric: 0.30 insn per cycle 125394 metric: 0.04 insn per cycle 313516 metric: 0.11 insn per cycle metric: 1.00 insn per cycle ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf stat: Add detail -d,-dd,-ddd metricsIan Rogers
Add metrics for the stat-shadow -d, -dd and -ddd events and hard coded metrics. Remove the events as these now come from the metrics. Following this change a detailed perf stat output looks like: ``` $ perf stat -a -ddd -- sleep 1 Performance counter stats for 'system wide': 21,089 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 14.1 % tma_bad_speculation # 27.3 % tma_frontend_bound (30.56%) TopdownL1 (cpu_core) # 31.5 % tma_backend_bound # 27.2 % tma_retiring (30.56%) 6,302 page-faults # nan faults/sec page_faults_per_second 928,495,163 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (28.41%) 1,841,409,834 cpu_core/cpu-cycles/ # nan GHz cycles_frequency (38.51%) # 14.5 % tma_bad_speculation # 16.0 % tma_retiring (28.41%) # 36.8 % tma_frontend_bound (35.57%) 100,859,118 cpu_atom/branches/ # nan M/sec branch_frequency (42.73%) 572,657,734 cpu_core/branches/ # nan M/sec branch_frequency (54.43%) 1,527 cpu-migrations # nan migrations/sec migrations_per_second # 32.7 % tma_backend_bound (42.73%) 0.00 msec cpu-clock # 0.000 CPUs utilized # 0.0 CPUs CPUs_utilized 498,668,509 cpu_atom/instructions/ # 0.57 insn per cycle # 0.6 instructions insn_per_cycle (42.97%) 3,281,762,225 cpu_core/instructions/ # 1.84 insn per cycle # 1.8 instructions insn_per_cycle (62.20%) 4,919,511 cpu_atom/branch-misses/ # 5.43% of all branches # 5.4 % branch_miss_rate (35.80%) 7,431,776 cpu_core/branch-misses/ # 1.39% of all branches # 1.4 % branch_miss_rate (62.20%) 2,517,007 cpu_atom/LLC-loads/ # 0.1 % llc_miss_rate (28.62%) 3,931,318 cpu_core/LLC-loads/ # 40.4 % llc_miss_rate (45.98%) 14,918,674 cpu_core/L1-dcache-load-misses/ # 2.25% of all L1-dcache accesses # nan % l1d_miss_rate (37.80%) 27,067,264 cpu_atom/L1-icache-load-misses/ # 15.92% of all L1-icache accesses # 15.9 % l1i_miss_rate (21.47%) 116,848,994 cpu_atom/dTLB-loads/ # 0.8 % dtlb_miss_rate (21.47%) 764,870,407 cpu_core/dTLB-loads/ # 0.1 % dtlb_miss_rate (15.12%) 1.006181526 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf jevents: Add metric DefaultShowEventsIan Rogers
Some Default group metrics require their events showing for consistency with perf's previous behavior. Add a flag to indicate when this is the case and use it in stat-display. As events are coming from Default metrics remove that default hardware and software events from perf stat. Following this change the default perf stat output on an alderlake looks like: ``` $ perf stat -a -- sleep 1 Performance counter stats for 'system wide': 20,550 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 9.0 % tma_bad_speculation # 28.1 % tma_frontend_bound TopdownL1 (cpu_core) # 29.2 % tma_backend_bound # 33.7 % tma_retiring 6,685 page-faults # nan faults/sec page_faults_per_second 790,091,064 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (49.83%) 2,563,918,366 cpu_core/cpu-cycles/ # nan GHz cycles_frequency # 12.3 % tma_bad_speculation # 14.5 % tma_retiring (50.20%) # 33.8 % tma_frontend_bound (50.24%) 76,390,322 cpu_atom/branches/ # nan M/sec branch_frequency (60.20%) 1,015,173,047 cpu_core/branches/ # nan M/sec branch_frequency 1,325 cpu-migrations # nan migrations/sec migrations_per_second # 39.3 % tma_backend_bound (60.17%) 0.00 msec cpu-clock # 0.000 CPUs utilized # 0.0 CPUs CPUs_utilized 554,347,072 cpu_atom/instructions/ # 0.64 insn per cycle # 0.6 instructions insn_per_cycle (60.14%) 5,228,931,991 cpu_core/instructions/ # 2.04 insn per cycle # 2.0 instructions insn_per_cycle 4,308,874 cpu_atom/branch-misses/ # 5.65% of all branches # 5.6 % branch_miss_rate (49.76%) 9,890,606 cpu_core/branch-misses/ # 0.97% of all branches # 1.0 % branch_miss_rate 1.005477803 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf jevents: Add set of common metrics based on default onesIan Rogers
Add support to getting a common set of metrics from a default table. It simplifies the generation to add json metrics at the same time. The metrics added are CPUs_utilized, cs_per_second, migrations_per_second, page_faults_per_second, insn_per_cycle, stalled_cycles_per_instruction, frontend_cycles_idle, backend_cycles_idle, cycles_frequency, branch_frequency and branch_miss_rate based on the shadow metric definitions. Following this change the default perf stat output on an alderlake looks like: ``` $ perf stat -a -- sleep 2 Performance counter stats for 'system wide': 0.00 msec cpu-clock # 0.000 CPUs utilized 77,739 context-switches 15,033 cpu-migrations 321,313 page-faults 14,355,634,225 cpu_atom/instructions/ # 1.40 insn per cycle (35.37%) 134,561,560,583 cpu_core/instructions/ # 3.44 insn per cycle (57.85%) 10,263,836,145 cpu_atom/cycles/ (35.42%) 39,138,632,894 cpu_core/cycles/ (57.60%) 2,989,658,777 cpu_atom/branches/ (42.60%) 32,170,570,388 cpu_core/branches/ (57.39%) 29,789,870 cpu_atom/branch-misses/ # 1.00% of all branches (42.69%) 165,991,152 cpu_core/branch-misses/ # 0.52% of all branches (57.19%) (software) # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 11.9 % tma_bad_speculation # 19.6 % tma_frontend_bound (63.97%) TopdownL1 (cpu_core) # 18.8 % tma_backend_bound # 49.7 % tma_retiring (63.97%) (software) # nan faults/sec page_faults_per_second # nan GHz cycles_frequency (42.88%) # nan GHz cycles_frequency (69.88%) TopdownL1 (cpu_atom) # 11.7 % tma_bad_speculation # 29.9 % tma_retiring (50.07%) TopdownL1 (cpu_atom) # 31.3 % tma_frontend_bound (43.09%) (cpu_atom) # nan M/sec branch_frequency (43.09%) # nan M/sec branch_frequency (70.07%) # nan migrations/sec migrations_per_second TopdownL1 (cpu_atom) # 27.1 % tma_backend_bound (43.08%) (software) # 0.0 CPUs CPUs_utilized # 1.4 instructions insn_per_cycle (43.04%) # 3.5 instructions insn_per_cycle (69.99%) # 1.0 % branch_miss_rate (35.46%) # 0.5 % branch_miss_rate (65.02%) 2.005626564 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf expr: Add #target_cpu literalIan Rogers
For CPU nanoseconds a lot of the stat-shadow metrics use either task-clock or cpu-clock, the latter being used when target__has_cpu. Add a #target_cpu literal so that json metrics can perform the same test. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf metricgroup: Add care to picking the evsel for displaying a metricIan Rogers
Rather than using the first evsel in the matched events, try to find the least shared non-tool evsel. The aim is to pick the first evsel that typifies the metric within the list of metrics. This addresses an issue where Default metric group metrics may lose their counter value due to how the stat displaying hides counters for default event/metric output. For a metricgroup like TopdownL1 on an Intel Alderlake the change is, before there are 4 events with metrics: ``` $ perf stat -M topdownL1 -a sleep 1 Performance counter stats for 'system wide': 7,782,334,296 cpu_core/TOPDOWN.SLOTS/ # 10.4 % tma_bad_speculation # 19.7 % tma_frontend_bound 2,668,927,977 cpu_core/topdown-retiring/ # 35.7 % tma_backend_bound # 34.1 % tma_retiring 803,623,987 cpu_core/topdown-bad-spec/ 167,514,386 cpu_core/topdown-heavy-ops/ 1,555,265,776 cpu_core/topdown-fe-bound/ 2,792,733,013 cpu_core/topdown-be-bound/ 279,769,310 cpu_atom/TOPDOWN_RETIRING.ALL/ # 12.2 % tma_retiring # 15.1 % tma_bad_speculation 457,917,232 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 38.4 % tma_backend_bound # 34.2 % tma_frontend_bound 783,519,226 cpu_atom/TOPDOWN_FE_BOUND.ALL/ 10,790,192 cpu_core/INT_MISC.UOP_DROPPING/ 879,845,633 cpu_atom/TOPDOWN_BE_BOUND.ALL/ ``` After there are 6 events with metrics: ``` $ perf stat -M topdownL1 -a sleep 1 Performance counter stats for 'system wide': 2,377,551,258 cpu_core/TOPDOWN.SLOTS/ # 7.9 % tma_bad_speculation # 36.4 % tma_frontend_bound 480,791,142 cpu_core/topdown-retiring/ # 35.5 % tma_backend_bound 186,323,991 cpu_core/topdown-bad-spec/ 65,070,590 cpu_core/topdown-heavy-ops/ # 20.1 % tma_retiring 871,733,444 cpu_core/topdown-fe-bound/ 848,286,598 cpu_core/topdown-be-bound/ 260,936,456 cpu_atom/TOPDOWN_RETIRING.ALL/ # 12.4 % tma_retiring # 17.6 % tma_bad_speculation 419,576,513 cpu_atom/CPU_CLK_UNHALTED.CORE/ 797,132,597 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 38.0 % tma_frontend_bound 3,055,447 cpu_core/INT_MISC.UOP_DROPPING/ 671,014,164 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 32.0 % tma_backend_bound ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-11perf tools: Fix missing feature check for inherit + SAMPLE_READNamhyung Kim
It should also have PERF_SAMPLE_TID to enable inherit and PERF_SAMPLE_READ on recent kernels. Not having _TID makes the feature check wrongly detect the inherit and _READ support. It was reported that the following command failed due to the error in the missing feature check on Intel SPR machines. $ perf record -e '{cpu/mem-loads-aux/S,cpu/mem-loads,ldlat=3/PS}' -- ls Error: Failure to open event 'cpu/mem-loads,ldlat=3/PS' on PMU 'cpu' which will be removed. Invalid event (cpu/mem-loads,ldlat=3/PS) in per-thread mode, enable system wide with '-a'. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 3b193a57baf15c468 ("perf tools: Detect missing kernel features properly") Reported-and-tested-by: Chen, Zide <zide.chen@intel.com> Closes: https://lore.kernel.org/lkml/20251022220802.1335131-1-zide.chen@intel.com/ Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-10perf symbol: Remove unneeded semicolonChen Ni
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-09perf test: Add test that command line period overrides sysfs/json valuesIan Rogers
The behavior of weak terms is subtle, add a test that they aren't accidentally broken. The test finds an event with a weak 'period' and then overrides it. In no such event is present then the test skips. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-09perf pmu: Make pmu_alias_terms weak againIan Rogers
The terms for a json event should be weak so they don't override command line options. Before: ``` $ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1 |grep "{ sample_period, sample_freq }" { sample_period, sample_freq } 200003 { sample_period, sample_freq } 2000003 { sample_period, sample_freq } 1000 ``` After: ``` $ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1 |grep "{ sample_period, sample_freq }" { sample_period, sample_freq } 1000 { sample_period, sample_freq } 1000 { sample_period, sample_freq } 1000 ``` Fixes: 84bae3af20d0 ("perf pmu: Don't eagerly parse event terms") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-07perf tool: Add a delegate_tool that just delegates actions to another toolIan Rogers
Add an ability to be able to compose perf_tools, by having one perform an action and then calling a delegate. Currently the perf_tools have if-then-elses setting the callback and then if-then-elses within the callback. Understanding the behavior is complex as it is in two places and logic for numerous operations, within things like perf inject, is interwoven. By chaining perf_tools together based on command line options this kind of code can be avoided. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-07perf tool: Add the perf_tool argument to all callbacksIan Rogers
Getting context for what a tool is doing, such as the perf_inject instance, using container_of the tool is a common pattern in the code. This isn't possible event_op2, event_op3 and event_op4 callbacks as the tool isn't passed. Add the argument and then fix function signatures to match. As tools maybe reading a tool from somewhere else, change that code to use the passed in tool. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-06perf vendor events arm64:: Add i.MX94 DDR Performance Monitor metricsXu Yang
Add JSON metrics for i.MX94 DDR Performance Monitor. Reviewed-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-06perf stat: Add ScaleUnit to {cpu,task}-clock JSON descriptionNamhyung Kim
This changes the output of the event like below. In fact, that's the output it used to have before the JSON conversion. Before: $ perf stat -e task-clock true Performance counter stats for 'true': 313,848 task-clock # 0.290 CPUs utilized 0.001081223 seconds time elapsed 0.001122000 seconds user 0.000000000 seconds sys After: $ perf stat -e task-clock true Performance counter stats for 'true': 0.36 msec task-clock # 0.297 CPUs utilized 0.001225435 seconds time elapsed 0.001268000 seconds user 0.000000000 seconds sys Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 9957d8c801fe0cb90 ("perf jevents: Add common software event json") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-06perf record: Make sure to update build-ID cacheNamhyung Kim
Recent change on enabling --buildid-mmap by default brought an issue with build-id handling. With build-ID in MMAP2 records, we don't need to save the build-ID table in the header of a perf data file. But the actual file contents still need to be cached in the debug directory for annotation etc. Split the build-ID header processing and caching and make sure perf record to save hit DSOs in the build-ID cache by moving perf_session__cache_build_ids() to the end of the record__ finish_output(). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-04net: Convert struct sockaddr to fixed-size "sa_data[14]"Kees Cook
Revert struct sockaddr from flexible array to fixed 14-byte "sa_data", to solve over 36,000 -Wflex-array-member-not-at-end warnings, since struct sockaddr is embedded within many network structs. With socket/proto sockaddr-based internal APIs switched to use struct sockaddr_unsized, there should be no more uses of struct sockaddr that depend on reading beyond the end of struct sockaddr::sa_data that might trigger bounds checking. Comparing an x86_64 "allyesconfig" vmlinux build before and after this patch showed no new "ud1" instructions from CONFIG_UBSAN_BOUNDS nor any new "field-spanning" memcpy CONFIG_FORTIFY_SOURCE instrumentations. Cc: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-8-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03perf jevents: Make all tables staticIan Rogers
The tables created by jevents.py are only used within the pmu-events.c file. Change the declarations of those global variables to be static to encapsulate this. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: When copy metrics copy default informationIan Rogers
When copy metrics into a group also copy default information from the original metrics. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: Missed free on error pathIan Rogers
If an out-of-memory occurs the expr also needs freeing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: Update comment on location of metric_event listIan Rogers
Update comment as the stat_config no longer holds all metrics. Signed-off-by: Ian Rogers <irogers@google.com> Fixes: faebee18d720 ("perf stat: Move metric list from config to evlist") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf evsel: Remove unused metric_events variableIan Rogers
The metric_events exist in the metric_expr list and so this variable has been unused for a while. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf symbols: Handle '1' symbols in /proc/kallsymsArnaldo Carvalho de Melo
I started seeing this in recent Fedora 42 kernels: root@x1:~# uname -a Linux x1 6.17.4-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Oct 19 18:47:49 UTC 2025 x86_64 GNU/Linux root@x1:~# root@x1:~# perf test 1 1: vmlinux symtab matches kallsyms : FAILED! root@x1:~# Related to: root@x1:~# grep ' 1 ' /proc/kallsyms ffffffffb098bc00 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ ffffffffb098bc10 1 _RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ root@x1:~# That is found in: root@x1:~# pahole --running_kernel_vmlinux /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux root@x1:~# root@x1:~# readelf -sW /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux | grep __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ 150649: ffffffff81f8bc00 16 FUNC LOCAL DEFAULT 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ root@x1:~# But was being filtered out when reading /proc/kallsyms, as the '1' symbol type was not being handled, do it, there are just two of them at this point. Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <lossin@kernel.org> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Trevor Gross <tmgross@umich.edu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers x86: Sync table due to introducion of uprobe syscallArnaldo Carvalho de Melo
To pick the changes in this cset: 56101b69c9190667 ("uprobes/x86: Add uprobe syscall to speed up uprobe") That add support for this new 'uprobe' syscall in tools such as 'perf trace'. Now it is possible to do a system wide 'perf trace' to look if this new syscall is being used: root@number:~# perf trace -v -e uprobe <SNIP> event qualifier tracepoint filter: (common_pid != 33989) && (id == 336) ^C root@number# $ grep -w uprobe tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 336 common uprobe sys_uprobe $ This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Please see tools/include/uapi/README for further details. Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers: Sync uapi/linux/fcntl.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in this cset: e83f0b5d10dcf628 ("nsfs: support exhaustive file handles") That doesn't introduce anything of interest for tools/, just addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README for further details. Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers: Sync uapi/linux/prctl.h with the kernel sourceArnaldo Carvalho de Melo
To pick up the changes in these csets: 8cdc4d27019356b0 ("mm/huge_memory: respect MADV_COLLAPSE with PR_THP_DISABLE_EXCEPT_ADVISED") 9dc21bbd62edeae6 ("prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGE") That don't introduce anything of interest for the tools/, just addressing these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Please see tools/include/uapi/README for further details. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers uapi: Update fs.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up changes from: db2ab24a341ce893 ("Add RWF_NOSIGNAL flag for pwritev2") These are used to beautify fs syscall arguments, albeit the changes in this update are not affecting those beautifiers. This addresses these tools/ build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lauri Vasama <git@vasama.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-31perf tools: Cache counter names for raw samples on s390Ian Rogers
Searching all event names is slower now that legacy names are included. Add a cache to avoid long iterative searches. Note, the cache isn't cleaned up and is as such a memory leak, however, globally reachable leaks like this aren't treated as leaks by leak sanitizer. Reported-by: Thomas Richter <tmricht@linux.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/09943f4f-516c-4b93-877c-e4a64ed61d38@linux.ibm.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf trace: Increase syscall handler map size to 1024Namhyung Kim
The syscalls_sys_{enter,exit} map in augmented_raw_syscalls.bpf.c has max entries of 512. Usually syscall numbers are smaller than this but x86 has x32 ABI where syscalls start from 512. That makes trace__init_syscalls_bpf_prog_array_maps() fail in the middle of the loop when it accesses those keys. As the loop iteration is not ordered by syscall numbers anymore, the failure can affect non-x32 syscalls. Let's increase the map size to 1024 so that it can handle those ABIs too. While most systems won't need this, increasing the size will be safer for potential future changes. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf vendor events AmpereOneX: Fix spelling typo in the metrics fileChu Guangqing
The json file incorrectly used "acceses" instead of "accesses". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf vendor events arm64: Fix typo in Ampere eMag json fileChu Guangqing
Correct instruction spelling errors. Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf record: skip synthesize event when open evsel failedShuai Xue
When using perf record with the `--overwrite` option, a segmentation fault occurs if an event fails to open. For example: perf record -e cycles-ct -F 1000 -a --overwrite Error: cycles-ct:H: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' perf: Segmentation fault #0 0x6466b6 in dump_stack debug.c:366 #1 0x646729 in sighandler_dump_stack debug.c:378 #2 0x453fd1 in sigsegv_handler builtin-record.c:722 #3 0x7f8454e65090 in __restore_rt libc-2.32.so[54090] #4 0x6c5671 in __perf_event__synthesize_id_index synthetic-events.c:1862 #5 0x6c5ac0 in perf_event__synthesize_id_index synthetic-events.c:1943 #6 0x458090 in record__synthesize builtin-record.c:2075 #7 0x45a85a in __cmd_record builtin-record.c:2888 #8 0x45deb6 in cmd_record builtin-record.c:4374 #9 0x4e5e33 in run_builtin perf.c:349 #10 0x4e60bf in handle_internal_command perf.c:401 #11 0x4e6215 in run_argv perf.c:448 #12 0x4e653a in main perf.c:555 #13 0x7f8454e4fa72 in __libc_start_main libc-2.32.so[3ea72] #14 0x43a3ee in _start ??:0 The --overwrite option implies --tail-synthesize, which collects non-sample events reflecting the system status when recording finishes. However, when evsel opening fails (e.g., unsupported event 'cycles-ct'), session->evlist is not initialized and remains NULL. The code unconditionally calls record__synthesize() in the error path, which iterates through the NULL evlist pointer and causes a segfault. To fix it, move the record__synthesize() call inside the error check block, so it's only called when there was no error during recording, ensuring that evlist is properly initialized. Fixes: 4ea648aec019 ("perf record: Add --tail-synthesize option") Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf lock contention: Load kernel map before lookupNamhyung Kim
On some machines, it caused troubles when it tried to find kernel symbols. I think it's because kernel modules and kallsyms are messed up during load and split. Basically we want to make sure the kernel map is loaded and the code has it in the lock_contention_read(). But recently we added more lookups in the lock_contention_prepare() which is called before _read(). Also the kernel map (kallsyms) may not be the first one in the group like on ARM. Let's use machine__kernel_map() rather than just loading the first map. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 688d2e8de231c54e ("perf lock contention: Add -l/--lock-addr option") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-28perf test workload: Add thread count argument to thloopIan Rogers
Allow the number of threads for the thloop workload to be increased beyond the normal 2. Add error checking to the parsed time and thread count values. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>