| author | Dmitry Ilvokhin <d@ilvokhin.com> | 2026-05-17 22:01:33 +0200 |
|---|---|---|
| committer | Thomas Gleixner <tglx@kernel.org> | 2026-05-26 16:21:11 +0200 |
| commit | 115bbf0c1b60cb7bed348c64694eb88e21e7d458 (patch) | |
| tree | a1e6cec0e2b5e0889d37252fc22c24303b877116 /drivers/platform/wmi/tests/git@git.tavy.me:linux.git | |
| parent | c2c7983c93f5d86962318be7e7298f1bc3feb1a6 (diff) | |
x86/irq: Optimize interrupts decimals printing
Monitoring tools periodically scan /proc/interrupts to export metrics as
time series for later analysis and investigation.
In large fleets, /proc/interrupts is polled (often every few seconds) on
every machine. The cumulative overhead adds up quickly across thousands
of nodes, so reducing the cost of generating these stats does have a
measurable operational impact. With the ongoing trend toward higher core
counts per machine, this cost becomes even more noticeable over time,
since interrupt counters are per-CPU. In Meta's fleet, we have observed
this overhead at scale.
Although a binary /proc interface would be a better long-term solution
due to lower formatting (kernel side) and parsing (userspace side)
overhead, the text interface will remain in use for some time, even
after better solutions become available. Optimizing the /proc/interrupts
printing code is therefore still beneficial.
seq_printf() supports rich format strings for printing decimals, but
that flexibility is not required for the per-CPU counters in
/proc/interrupts. Because these counters need only very limited
formatting, seq_put_decimal_ull_width() can be used to print them instead.
A similar optimization is already used in show_interrupts().
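
The idea above can be sketched in userspace C. This is a minimal
illustration, not the kernel implementation: put_decimal_width() is a
hypothetical stand-in for seq_put_decimal_ull_width(), and the " %10llu"
format with a field width of 10 is an assumption chosen for the comparison.
The point is that a fixed-purpose helper can skip format-string parsing
while producing the same bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for seq_put_decimal_ull_width():
 * write the delimiter, then num in decimal, left-padded with spaces to
 * at least `width` characters. Returns the number of bytes written
 * (excluding the terminating NUL), or 0 if the buffer is too small.
 */
static size_t put_decimal_width(char *buf, size_t size, const char *delim,
                                unsigned long long num, unsigned int width)
{
    /* 3 decimal digits per byte is a safe upper bound for any integer */
    char digits[3 * sizeof(num) + 1];
    size_t ndigits = 0, dlen, pad, pos;

    assert(buf != NULL && delim != NULL);
    dlen = strlen(delim);

    /* Generate digits in reverse order; no format string is parsed. */
    do {
        digits[ndigits++] = (char)('0' + num % 10);
        num /= 10;
    } while (num != 0);

    pad = width > ndigits ? width - ndigits : 0;
    if (dlen + pad + ndigits + 1 > size)
        return 0;

    memcpy(buf, delim, dlen);
    pos = dlen;
    memset(buf + pos, ' ', pad);
    pos += pad;
    while (ndigits > 0)
        buf[pos++] = digits[--ndigits];
    buf[pos] = '\0';
    return pos;
}

/* Reference path: the same field produced via a generic format string. */
static size_t put_decimal_printf(char *buf, size_t size,
                                 unsigned long long num)
{
    int n = snprintf(buf, size, " %10llu", num);

    return (n < 0 || (size_t)n >= size) ? 0 : (size_t)n;
}
```

Calling both helpers with the same counter value should yield identical
strings; only the amount of work needed to produce them differs.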
As a side effect, this aligns the x86 descriptions with the generic
interrupt event descriptions.
Performance counter stats (truncated) for 'sh -c cat /proc/interrupts':
Before:
3.42 msec task-clock # 0.802 CPUs utilized ( +- 0.05% )
1 context-switches # 291.991 /sec ( +- 0.74% )
0 cpu-migrations # 0.000 /sec
343 page-faults # 100.153 K/sec ( +- 0.01% )
8,932,242 instructions # 1.66 insn per cycle ( +- 0.34% )
5,374,427 cycles # 1.569 GHz ( +- 0.04% )
1,483,154 branches # 433.068 M/sec ( +- 0.22% )
28,768 branch-misses # 1.94% of all branches ( +- 0.31% )
0.00427182 +- 0.00000215 seconds time elapsed ( +- 0.05% )
After:
2.39 msec task-clock # 0.796 CPUs utilized ( +- 0.06% )
1 context-switches # 418.541 /sec ( +- 0.70% )
0 cpu-migrations # 0.000 /sec
343 page-faults # 143.560 K/sec ( +- 0.01% )
7,020,982 instructions # 1.30 insn per cycle ( +- 0.52% )
5,397,266 cycles # 2.259 GHz ( +- 0.06% )
1,569,648 branches # 656.962 M/sec ( +- 0.08% )
25,419 branch-misses # 1.62% of all branches ( +- 0.72% )
0.00299996 +- 0.00000206 seconds time elapsed ( +- 0.07% )
Relative speed up in time elapsed is around 29%.
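
The figure above follows from the two "seconds time elapsed" values. As a
small sketch (the helper names are hypothetical and not part of the patch),
the reduction in elapsed time and the equivalent speedup factor can be
computed as:

```c
#include <assert.h>

/* Fraction of elapsed time saved: (before - after) / before. */
static double relative_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before;
}

/* Equivalent speedup factor: before / after. */
static double speedup_factor(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}
```

With before = 0.00427182 and after = 0.00299996, the first helper gives
roughly 0.298 and the second roughly 1.42.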
[ tglx: Fixed it up so it applies to current mainline ]
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Radu Rendec <radu@rendec.net>
Link: https://patch.msgid.link/aQj5mGZ6_BBlAm3B@shell.ilvokhin.com
Link: https://patch.msgid.link/20260517194930.949709489@kernel.org
