Age | Commit message | Author
2026-01-13 | ACPI: PM: s2idle: Add module parameter for LPS0 constraints checking | Rafael J. Wysocki

Commit 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed") attempted to avoid useless evaluation of LPS0 _DSM Function 1 in lps0_device_attach() because pm_debug_messages_on might never be set (and that is the case on production systems most of the time), but it turns out that LPS0 _DSM Function 1 is generally problematic on some platforms and now causes suspend issues when pm_debug_messages_on is set.

In Linux, LPS0 _DSM Function 1 is only useful for diagnostics, and only in cases when the system does not reach the deepest platform idle state during suspend-to-idle for some reason. If such diagnostics are not necessary, evaluating it is a waste of time, so using it along with the other pm_debug_messages_on diagnostics is questionable because the latter is expected to be suitable for collecting debug information even during production use of system suspend.

For this reason, add a module parameter called check_lps0_constraints to control whether or not the list of LPS0 constraints will be checked in acpi_s2idle_prepare_late_lps0() and so whether or not to evaluate LPS0 _DSM Function 1 (once) in acpi_s2idle_begin_lps0().

Fixes: 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/2827214.mvXUDI8C0e@rafael.j.wysocki
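The gating described in this commit can be pictured with a small userspace model. This is only a sketch of the control flow: the class, its method names, the returned constraint string and the default value of the parameter are all assumptions made for illustration, and none of it is the kernel code.

```python
class Lps0Model:
    """Toy model of the check_lps0_constraints gating (not kernel code)."""

    def __init__(self, check_lps0_constraints=False):
        # Assumed default: checking is off unless the parameter is set.
        self.check_lps0_constraints = check_lps0_constraints
        self.constraints = None
        self.dsm_evaluations = 0

    def _evaluate_dsm_function_1(self):
        # Stand-in for evaluating LPS0 _DSM Function 1; the value is made up.
        self.dsm_evaluations += 1
        return ["hypothetical-device-constraint"]

    def begin(self):
        # Models acpi_s2idle_begin_lps0(): evaluate once, and only if asked to.
        if self.check_lps0_constraints and self.constraints is None:
            self.constraints = self._evaluate_dsm_function_1()

    def prepare_late(self):
        # Models acpi_s2idle_prepare_late_lps0(): check the list only if enabled.
        if not self.check_lps0_constraints:
            return []
        return list(self.constraints or [])
```

With the parameter off, repeated suspend cycles never touch the _DSM; with it on, the evaluation happens a single time and the cached list is reused.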
2026-01-13 | nvme: expose active quirks in sysfs | Maurizio Lombardi

Currently, there is no straightforward way for a user to inspect which quirks are active for a given device from userspace.

Add a new "quirks" sysfs attribute to the nvme controller device. Reading this file will display a human-readable list of all active quirks, with each quirk name on a new line. If no quirks are active, it will display "none".

Tested-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
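The attribute format described in this commit (one quirk name per line, or "none") is simple to consume from a script. A minimal sketch follows; the sysfs path used in read_quirks() is an assumption, since the commit message only says the attribute is on the nvme controller device.

```python
from pathlib import Path


def parse_quirks(text):
    """Turn the contents of the "quirks" attribute into a list of names.

    Per the commit message the file holds one quirk name per line, or the
    single word "none" when no quirks are active.
    """
    names = [line.strip() for line in text.splitlines() if line.strip()]
    return [] if names == ["none"] else names


def read_quirks(ctrl="nvme0"):
    # Hypothetical location of the attribute; adjust to the real device path.
    return parse_quirks(Path(f"/sys/class/nvme/{ctrl}/quirks").read_text())
```

Keeping the parsing separate from the file read makes the format handling testable on machines without an NVMe controller.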
2026-01-13 | nvmet: do not copy beyond sybsysnqn string length | Shin'ichiro Kawasaki

Commit edd17206e363 ("nvmet: remove redundant subsysnqn field from ctrl") replaced ctrl->subsysnqn with ctrl->subsys->subsysnqn. This change works as expected because both point to strings with the same data. However, their memory allocation lengths differ. ctrl->subsysnqn had the fixed size defined as NVMF_NQN_FIELD_LEN, while ctrl->subsys->subsysnqn has a variable length determined by kstrndup(). Due to this difference, a KASAN slab-out-of-bounds report occurs at memcpy() in nvmet_passthru_override_id_ctrl() after the commit. The failure can be recreated by running the blktests test case nvme/033.

To prevent such failures, replace memcpy() with strscpy(), which copies no more than the string length and so avoids overruns.

Fixes: edd17206e363 ("nvmet: remove redundant subsysnqn field from ctrl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
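The difference between the two copies can be shown with a toy model in which a bytes object stands in for the source allocation. The function names, the example NQN string and the field size of 256 are assumptions for illustration; the models only mimic the relevant behavior of memcpy() and strscpy() and are not the kernel implementations.

```python
NQN_FIELD_LEN = 256  # assumed size of the fixed destination field


def memcpy_model(src_alloc, n):
    """Model memcpy(dst, src, n): reads exactly n bytes from the source."""
    if n > len(src_alloc):
        # The real memcpy() would read past the allocation here, which is
        # what KASAN reports as slab-out-of-bounds.
        raise IndexError(f"read of {n} bytes from a {len(src_alloc)}-byte allocation")
    return src_alloc[:n]


def strscpy_model(src_alloc, dst_size):
    """Model strscpy(dst, src, dst_size): stop at the NUL, always terminate."""
    nul = src_alloc.index(b"\0")        # end of the source string
    length = min(nul, dst_size - 1)     # leave room for the terminator
    return src_alloc[:length] + b"\0"


def overruns(src_alloc, n):
    """Return True if a memcpy of n bytes would read past the allocation."""
    try:
        memcpy_model(src_alloc, n)
    except IndexError:
        return True
    return False


# An allocation sized to the string, as kstrndup() would produce.
subsysnqn = b"nqn.2014-08.org.example:subsys1\0"
```

Copying NQN_FIELD_LEN bytes from the short allocation overruns it, while the strscpy-style copy stops at the terminator regardless of how large the destination is.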
2026-01-13 | PCI/portdrv: Use bus-type functions | Uwe Kleine-König

Instead of assigning the probe function for each driver individually, use .probe() and .remove() from the pci_express bus. Rename the functions for consistency.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/83d1edc7d619423331fa6802f0e7da3919a308a9.1764688034.git.u.kleine-koenig@baylibre.com
2026-01-13 | PCI/portdrv: Don't check for valid device and driver in bus callbacks | Uwe Kleine-König

The driver core ensures that in .probe() and .remove() both dev and dev->driver are valid. So drop the respective check.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/2cc2e15e05318b9f0d7b6a2b69b3169d2a6f0bd3.1764688034.git.u.kleine-koenig@baylibre.com
2026-01-13 | nvme/host: fixup some typos | Wilfred Mallawa

Fix up some minor typos in the nvme host driver and a comment style to conform to the standard kernel style.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
2026-01-13 | PCI/portdrv: Move pcie_port_bus_type to pcie source file | Uwe Kleine-König

Conceptually the pci_express bus doesn't belong in generic PCI code. Move pcie_port_bus_match() and pcie_port_bus_type to pcie/portdrv.c.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/420d771f0091dea7cf18f445b94301576dcee4c8.1764688034.git.u.kleine-koenig@baylibre.com
2026-01-13 | PCI/portdrv: Don't check for the driver's and device's bus | Uwe Kleine-König

The driver core ensures that the match function is only called for drivers and devices of the right bus. So drop the useless check.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/09ca261912a37d2b253f43359a5dfeec42c016dc.1764688034.git.u.kleine-koenig@baylibre.com
2026-01-13 | PCI/portdrv: Drop empty shutdown callback | Uwe Kleine-König

.shutdown() is an optional callback and the core only calls it if the pointer in struct device_driver is non-NULL. So do nothing in slightly less time by removing the empty function.

Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/283fef06ac51efbb7df25f347d6f3a2967f96429.1764688034.git.u.kleine-koenig@baylibre.com
2026-01-13 | PCI/portdrv: Fix potential resource leak | Uwe Kleine-König

pcie_port_probe_service() unconditionally calls get_device() (unless it fails). So also drop that reference unconditionally, because it is fine for a PCIe driver not to have a remove callback.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/e1c68c3b3f1af8427e98ca5e2c79f8bf0ebe2ce4.1764688034.git.u.kleine-koenig@baylibre.com
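The leak can be illustrated with a toy reference counter. This is a sketch under an assumption: the commit message implies, but does not show, that the old remove path dropped the reference only when the driver provided a remove callback. The class and function names below are hypothetical stand-ins, not the kernel functions.

```python
class Device:
    """Minimal stand-in for a refcounted struct device."""

    def __init__(self):
        self.refcount = 1  # reference held by whoever created the device

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1


def probe_service(dev):
    # Models the unconditional get_device() on a successful probe.
    dev.get()


def remove_service_old(dev, remove_cb=None):
    # Assumed old behavior: the put was tied to the optional callback.
    if remove_cb is not None:
        remove_cb(dev)
        dev.put()


def remove_service_fixed(dev, remove_cb=None):
    # Fixed behavior: the callback stays optional, the put is unconditional.
    if remove_cb is not None:
        remove_cb(dev)
    dev.put()
```

For a driver without a remove callback, the old path leaves the count one higher than it started, which is the leak the commit closes.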
2026-01-13 | pinctrl: lynxpoint: Convert to use intel_gpio_add_pin_ranges() | Andy Shevchenko

The driver is ready to use intel_gpio_add_pin_ranges() directly instead of a custom approach. Convert it now.

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2026-01-13 | pinctrl: baytrail: Convert to use intel_gpio_add_pin_ranges() | Andy Shevchenko

The driver is ready to use intel_gpio_add_pin_ranges() directly instead of a custom approach. Convert it now.

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2026-01-13 | selinux: add support for BPF token access control | Eric Suen

BPF token support was introduced to allow a privileged process to delegate limited BPF functionality (such as map creation and program loading) to an unprivileged process:

  https://lore.kernel.org/linux-security-module/20231130185229.2688956-1-andrii@kernel.org/

This patch adds SELinux support for controlling BPF token access. With this change, SELinux policies can now enforce constraints on BPF token usage based on both the delegating (privileged) process and the recipient (unprivileged) process.

Supported operations currently include:
  - map_create
  - prog_load

High-level workflow:
  1. An unprivileged process creates a VFS context via `fsopen()` and obtains a file descriptor.
  2. This descriptor is passed to a privileged process, which configures BPF token delegation options and mounts a BPF filesystem.
  3. SELinux records the `creator_sid` of the privileged process during mount setup.
  4. The unprivileged process then uses this BPF fs mount to create a token and attach it to subsequent BPF syscalls.
  5. During verification of `map_create` and `prog_load`, SELinux uses `creator_sid` and the current SID to check policy permissions via:

       avc_has_perm(creator_sid, current_sid, SECCLASS_BPF,
                    BPF__MAP_CREATE, NULL);

The implementation introduces two new permissions:
  - map_create_as
  - prog_load_as

At token creation time, SELinux verifies that the current process has the appropriate `*_as` permission (depending on the `allowed_cmds` value in the bpf_token) to act on behalf of the `creator_sid`.

Example SELinux policy:

  allow test_bpf_t self:bpf {
      map_create map_read map_write prog_load prog_run
      map_create_as prog_load_as
  };

Additionally, a new policy capability bpf_token_perms is added to ensure backward compatibility. If disabled, previous behavior (checks based on current process SID) is preserved.

Signed-off-by: Eric Suen <ericsu@linux.microsoft.com>
Tested-by: Daniel Durning <danieldurning.work@gmail.com>
Reviewed-by: Daniel Durning <danieldurning.work@gmail.com>
[PM: merge fuzz, subject tweaks, whitespace tweaks, line length tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
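The two checks in this commit (one at token creation, one when a delegated command is verified) can be modeled with a policy held as a set of tuples. This is a sketch, not the SELinux implementation: the helper names are hypothetical, and the source/target order used for the `*_as` check at token creation is an assumption, since the commit message only spells out the argument order for the `map_create` check.

```python
def allow(source, target, tclass, perms):
    """Expand one allow rule into (source, target, class, perm) tuples."""
    return {(source, target, tclass, perm) for perm in perms}


def avc_has_perm(policy, ssid, tsid, tclass, perm):
    # Toy stand-in for the kernel function of the same name.
    return (ssid, tsid, tclass, perm) in policy


AS_PERM = {"map_create": "map_create_as", "prog_load": "prog_load_as"}


def may_create_token(policy, current_sid, creator_sid, allowed_cmds):
    # Token creation: the current process needs the matching *_as permission
    # for every command the token allows (argument order assumed).
    return all(
        avc_has_perm(policy, current_sid, creator_sid, "bpf", AS_PERM[cmd])
        for cmd in allowed_cmds
    )


def may_run_cmd(policy, creator_sid, current_sid, cmd):
    # Command verification: creator_sid is the source, the current SID the
    # target, as in the avc_has_perm() call quoted in the commit message.
    return avc_has_perm(policy, creator_sid, current_sid, "bpf", cmd)


# The example rule from the commit message ("self" means source == target).
policy = allow(
    "test_bpf_t", "test_bpf_t", "bpf",
    ["map_create", "map_read", "map_write", "prog_load", "prog_run",
     "map_create_as", "prog_load_as"],
)
```

Because the example rule uses `self`, it only covers the case where the delegating and the receiving process run in the same domain; a policy for two different domains would need rules naming both.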
2026-01-13 | perf addr_location: Update outdated comment | Julia Lawall

The function addr_location__put() was renamed addr_location__exit() in commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions"). Make the comment preceding the function consistent with the function itself.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kexin Sun <kexinsun@smail.nju.edu.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ratnadira Widyasari <ratnadiraw@smu.edu.sg>
Cc: Xutong Ma <xutong.ma@inria.fr>
Cc: Yumbo Lyu <yunbolyu@smu.edu.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13 | perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls | Jan H. Schönherr

There are typically a lot of PMUs registered, but in many cases only a few of them have an event registered (like the "cpu" PMU in the presence of the watchdog). As the mutex is already held, it's safe to just check for existing events before doing the cross CPU call.

This change saves tens of milliseconds from kexec time (perceived as steal time during a hypervisor host update), with <2ms remaining for this step in the shutdown. There might be additional potential for parallelization or we could just disable performance monitoring during the actual shutdown and be less graceful about it.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
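The optimization amounts to checking for registered events before paying for the expensive call. A toy sketch of that pattern follows; the dictionary layout, the "uncore_example" PMU name and the callback are assumptions for illustration and do not reflect the kernel data structures.

```python
def shutdown_pmus(pmus, cross_cpu_call):
    """Issue the cross-CPU call only for PMUs that have events registered.

    Returns the number of cross-CPU calls actually made.
    """
    calls = 0
    for pmu in pmus:
        if not pmu["events"]:
            # Nothing to tear down for this PMU, so skip the expensive call.
            continue
        cross_cpu_call(pmu)
        calls += 1
    return calls


# Hypothetical PMU list: only the "cpu" PMU has an event (the watchdog).
pmus = [
    {"name": "cpu", "events": ["watchdog"]},
    {"name": "uncore_example", "events": []},
]
```

The saving scales with the number of event-free PMUs, which the commit message says is usually most of them.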
2026-01-13 | perf tools: Dump callchain context marker names | James Clark

These are hard to interpret in the raw output because they are printed as hex but are defined in perf_event.h as decimal. Make it much easier to read the raw callchains by just printing their names.

For example:

  $ perf report -D

  1798195372321 0x4638 [0xb0]: PERF_RECORD_SAMPLE(IP, 0x4002): 44922/44922: 0x7c8046dd3400 period: 120218 addr: 0
  ... FP chain: nr:12
  ..... 0: fffffffffffffe00 (PERF_CONTEXT_USER)
  ..... 1: 00007c8046dd3400
  ..... 2: 00007c8046db86d3

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ Add PERF_CONTEXT_USER_DEFERRED too, as per Namhyung's review comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
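The mismatch between the decimal definitions and the hex dump comes from the markers being negative numbers stored in unsigned 64-bit fields. The sketch below shows the conversion and a name lookup in the spirit of this change; it is not the perf code, the output format string is an approximation, and PERF_CONTEXT_USER_DEFERRED is left out because its value does not appear in this log.

```python
MASK64 = (1 << 64) - 1

# Marker values as defined (in decimal) in perf_event.h.
_MARKERS = {
    "PERF_CONTEXT_HV": -32,
    "PERF_CONTEXT_KERNEL": -128,
    "PERF_CONTEXT_USER": -512,
    "PERF_CONTEXT_GUEST": -2048,
    "PERF_CONTEXT_GUEST_KERNEL": -2176,
    "PERF_CONTEXT_GUEST_USER": -2560,
    "PERF_CONTEXT_MAX": -4095,
}

# Reinterpret each value as an unsigned 64-bit integer, as seen in the dump.
MARKER_NAMES = {value & MASK64: name for name, value in _MARKERS.items()}


def format_entry(index, ip):
    """Format one callchain entry, appending the marker name if there is one."""
    ip &= MASK64
    name = MARKER_NAMES.get(ip)
    suffix = f" ({name})" if name else ""
    return f"..... {index:2d}: {ip:016x}{suffix}"


for i, ip in enumerate([0xFFFFFFFFFFFFFE00, 0x00007C8046DD3400, 0x00007C8046DB86D3]):
    print(format_entry(i, ip))
```

Here -512 reinterpreted as a 64-bit unsigned value is 0xfffffffffffffe00, which is why the first entry in the example output is labeled PERF_CONTEXT_USER.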
2026-01-13 | perf vendor events arm64: Remove uncountable events | James Clark

These events are never countable by the PMU and are only intended to be used as external inputs to trace. Therefore showing them in 'perf list' is misleading so remove them. The generator script doesn't emit these events when used with the new telemetry-solution input files [1].

'perf list' should only show countable events because there are events that are sometimes implemented, sometimes countable and sometimes not, for example TRB_TRIG. If we always include any implemented events whether they are countable or not then it's not possible to tell whether they are usable in perf without going to the docs, defeating the point of 'perf list'.

It's also not useful yet to display implemented events that are not countable (for help in using trace rather than perf stat), because PMU_OVFS and PMU_HOVFS are practically always implemented and TRB_TRIG is always implemented when there is TRBE.

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/data/pmu/cpu

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13 | perf Documentation: Correct branch stack sampling call-stack option | Dapeng Mi

The correct call-stack option for branch stack sampling should be "stack" instead of "call_stack". Correct it.

  $ perf record -e instructions -j call_stack -- sleep 1
  unknown branch filter call_stack, check man page

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -j, --branch-filter <branch filter mask>
                            branch stack filter modes

Fixes: 955f6def5590ce6c ("perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index"")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13 | perf test: Do not skip when some metric-group tests succeed | Namhyung Kim

I think the SKIP return value (2) should be used only when the entire test suite was skipped, rather than just a few of its tests, while FAIL should be reserved for when any of the tests failed.

  $ perf test -vv 109
  109: perf all metricgroups test:
  --- start ---
  test child forked, pid 2493003
  Testing Backend Testing Bad Testing BadSpec Testing BigFootprint Testing BrMispredicts Testing Branches Testing BvBC Testing BvBO Testing BvCB Testing BvFB Testing BvIO Testing BvMB Testing BvML Testing BvMP Testing BvMS Testing BvMT Testing BvOB Testing BvUW Testing CacheHits Testing CacheMisses Testing CodeGen Testing Compute Testing Cor Testing DSB Testing DSBmiss Testing DataSharing Testing Default Testing Default2 Testing Default3 Testing Default4 Ignoring failures in Default4 that may contain unsupported legacy events Testing Fed Testing FetchBW Testing FetchLat Testing Flops Testing FpScalar Testing FpVector Testing Frontend Testing HPC Testing IcMiss Testing InsType Testing LSD Testing LockCont Testing MachineClears Testing Machine_Clears Testing Mem Testing MemOffcore Testing MemoryBW Testing MemoryBound Testing MemoryLat Testing MemoryTLB Testing Memory_BW Testing Memory_Lat Testing MicroSeq Testing OS Testing Offcore Testing PGO Testing Pipeline Testing PortsUtil Testing Power Testing Prefetches Testing Ret Testing Retire Testing SMT Testing Snoop Testing SoC Testing Summary Testing TmaL1 Testing TmaL2 Testing TmaL3mem Testing TopdownL1 Testing TopdownL2 Testing TopdownL3 Testing TopdownL4 Testing TopdownL5 Testing TopdownL6 Testing smi Testing tma_L1_group Testing tma_L2_group Testing tma_L3_group Testing tma_L4_group Testing tma_L5_group Testing tma_L6_group Testing tma_alu_op_utilization_group Testing tma_assists_group Testing tma_backend_bound_group Testing tma_bad_speculation_group Testing tma_branch_mispredicts_group Testing tma_branch_resteers_group Testing tma_code_stlb_miss_group Testing tma_core_bound_group Testing tma_divider_group Testing tma_dram_bound_group Testing tma_dtlb_load_group Testing tma_dtlb_store_group Testing tma_fetch_bandwidth_group Testing tma_fetch_latency_group Testing tma_fp_arith_group Testing tma_fp_vector_group Testing tma_frontend_bound_group Testing tma_heavy_operations_group Testing tma_icache_misses_group Testing tma_issue2P Testing tma_issueBM Testing tma_issueBW Testing tma_issueComp Testing tma_issueD0 Testing tma_issueFB Testing tma_issueFL Testing tma_issueL1 Testing tma_issueLat Testing tma_issueMC Testing tma_issueMS Testing tma_issueMV Testing tma_issueRFO Testing tma_issueSL Testing tma_issueSO Testing tma_issueSmSt Testing tma_issueSpSt Testing tma_issueSyncxn Testing tma_issueTLB Testing tma_itlb_misses_group Testing tma_l1_bound_group Testing tma_l2_bound_group Testing tma_l3_bound_group Testing tma_light_operations_group Testing tma_load_stlb_miss_group Testing tma_machine_clears_group Testing tma_memory_bound_group Testing tma_microcode_sequencer_group Testing tma_mite_group Testing tma_other_light_ops_group Testing tma_ports_utilization_group Testing tma_ports_utilized_0_group Testing tma_ports_utilized_3m_group Testing tma_retiring_group Testing tma_serializing_operation_group Testing tma_store_bound_group Testing tma_store_stlb_miss_group Testing transaction
  ---- end(0) ----
  109: perf all metricgroups test : Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
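The rule argued for in this commit (fail if anything failed, skip only if everything was skipped) can be written as a small aggregation function. This is a sketch of the rule and not the perf test script; the SKIP value of 2 comes from the commit message, while the OK and FAIL values and the treatment of an empty result list are assumptions.

```python
OK, FAIL, SKIP = 0, 1, 2  # SKIP (2) is from the commit message; OK and FAIL are assumed


def overall_status(results):
    """Combine per-metric results into one status for the whole test."""
    if any(result == FAIL for result in results):
        return FAIL  # any single failure fails the whole test
    if results and all(result == SKIP for result in results):
        return SKIP  # skip only when nothing at all was run
    return OK        # some tests passed, the rest were at most skipped
```

Under this rule a run where most metric groups pass and a few are skipped reports Ok, which matches the example output above.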
2026-01-13 | perf test: Do not skip when some metrics tests succeeded | Namhyung Kim

I think the SKIP return value (2) should be used only when the entire test suite was skipped, rather than just a few of its tests, while FAIL should be reserved for when any of the tests failed.

  $ perf test -vv 110
  110: perf all metrics test:
  --- start ---
  test child forked, pid 2496399
  Testing tma_core_bound Testing tma_info_core_ilp Testing tma_info_memory_l2mpki Testing tma_memory_bound Testing tma_bottleneck_irregular_overhead Testing tma_bottleneck_mispredictions Testing tma_info_bad_spec_branch_misprediction_cost Testing tma_info_bad_spec_ipmisp_cond_ntaken Testing tma_info_bad_spec_ipmisp_cond_taken Testing tma_info_bad_spec_ipmisp_indirect Testing tma_info_bad_spec_ipmisp_ret Testing tma_info_bad_spec_ipmispredict Testing tma_info_branches_callret Testing tma_info_branches_cond_nt Testing tma_info_branches_cond_tk Testing tma_info_branches_jump Testing tma_info_branches_other_branches Testing tma_branch_mispredicts Testing tma_clears_resteers Testing tma_machine_clears Testing tma_mispredicts_resteers Testing tma_bottleneck_big_code Testing tma_icache_misses Testing tma_itlb_misses Testing tma_unknown_branches Testing tma_info_bad_spec_spec_clears_ratio Testing tma_other_mispredicts Testing tma_branch_instructions Testing tma_info_frontend_tbpc Testing tma_info_inst_mix_bptkbranch Testing tma_info_inst_mix_ipbranch Testing tma_info_inst_mix_ipcall Testing tma_info_inst_mix_iptb Testing tma_info_system_ipfarbranch Testing tma_info_thread_uptb Testing tma_bottleneck_branching_overhead Testing tma_nop_instructions Testing tma_bottleneck_compute_bound_est Testing tma_divider Testing tma_ports_utilized_3m Testing tma_bottleneck_instruction_fetch_bw Testing tma_frontend_bound Testing tma_assists Testing tma_other_nukes Testing tma_serializing_operation Testing tma_bottleneck_data_cache_memory_bandwidth Testing tma_fb_full Testing tma_mem_bandwidth Testing tma_sq_full Testing tma_bottleneck_data_cache_memory_latency Testing tma_l1_latency_dependency Testing tma_l2_bound Testing
tma_l3_hit_latency Testing tma_mem_latency Testing tma_store_latency Testing tma_bottleneck_memory_synchronization Testing tma_contested_accesses Testing tma_data_sharing Testing tma_false_sharing Testing tma_bottleneck_memory_data_tlbs Testing tma_dtlb_load Testing tma_dtlb_store Testing tma_backend_bound Testing tma_bottleneck_other_bottlenecks Testing tma_bottleneck_useful_work Testing tma_retiring Testing tma_info_memory_fb_hpki Testing tma_info_memory_l1mpki Testing tma_info_memory_l1mpki_load Testing tma_info_memory_l2hpki_all Testing tma_info_memory_l2hpki_load Testing tma_info_memory_l2mpki_all Testing tma_info_memory_l2mpki_load Testing tma_l1_bound Testing tma_l3_bound Testing tma_info_memory_l2mpki_rfo Testing tma_fp_scalar Testing tma_fp_vector Testing tma_fp_vector_128b Testing tma_fp_vector_256b Testing tma_fp_vector_512b Testing tma_port_0 Testing tma_x87_use Testing tma_info_botlnk_l0_core_bound_likely Testing tma_info_core_fp_arith_utilization Testing tma_info_pipeline_execute Testing tma_info_system_gflops Testing tma_info_thread_execute_per_issue Testing tma_dsb Testing tma_info_botlnk_l2_dsb_bandwidth Testing tma_info_frontend_dsb_coverage Testing tma_decoder0_alone Testing tma_dsb_switches Testing tma_info_botlnk_l2_dsb_misses Testing tma_info_frontend_dsb_switch_cost Testing tma_info_frontend_ipdsb_miss_ret Testing tma_mite Testing tma_mite_4wide Testing CPUs_utilized Testing backend_cycles_idle [Ignored backend_cycles_idle] failed but as a Default metric this can be expected Performance counter stats for 'perf test -w noploop': <not counted> cpu-cycles:u <not supported> stalled-cycles-backend:u 1.014051473 seconds time elapsed 1.005718000 seconds user 0.008013000 seconds sys Testing branch_frequency Testing branch_miss_rate Testing cs_per_second Testing cycles_frequency Testing frontend_cycles_idle [Ignored frontend_cycles_idle] failed but as a Default metric this can be expected Performance counter stats for 'perf test -w noploop': <not 
counted> cpu-cycles:u <not supported> stalled-cycles-frontend:u 1.012813656 seconds time elapsed 1.004603000 seconds user 0.008004000 seconds sys Testing insn_per_cycle Testing migrations_per_second Testing page_faults_per_second Testing stalled_cycles_per_instruction [Ignored stalled_cycles_per_instruction] failed but as a Default metric this can be expected Error: No supported events found. The stalled-cycles-backend:u event is not supported. Testing tma_bad_speculation Testing l1d_miss_rate Testing llc_miss_rate Testing dtlb_miss_rate Testing itlb_miss_rate [Ignored itlb_miss_rate] failed but as a Default metric this can be expected Performance counter stats for 'perf test -w noploop': <not supported> iTLB-loads:u 3,097 iTLB-load-misses:u 1.012766732 seconds time elapsed 1.004318000 seconds user 0.008002000 seconds sys Testing l1i_miss_rate [Ignored l1i_miss_rate] failed but as a Default metric this can be expected Performance counter stats for 'perf test -w noploop': <not counted> L1-icache-load-misses:u <not supported> L1-icache-loads:u 1.013606395 seconds time elapsed 1.001371000 seconds user 0.011968000 seconds sys Testing l1_prefetch_miss_rate [Ignored l1_prefetch_miss_rate] failed but as a Default metric this can be expected Error: No supported events found. The L1-dcache-prefetches:u event is not supported. 
Testing tma_info_botlnk_l2_ic_misses Testing tma_info_frontend_fetch_upc Testing tma_info_frontend_icache_miss_latency Testing tma_info_frontend_ipunknown_branch Testing tma_info_frontend_lsd_coverage Testing tma_info_memory_tlb_code_stlb_mpki Testing tma_info_pipeline_fetch_dsb Testing tma_info_pipeline_fetch_lsd Testing tma_info_pipeline_fetch_mite Testing tma_info_pipeline_fetch_ms Testing tma_fetch_bandwidth Testing tma_lsd Testing tma_branch_resteers Testing tma_code_l2_hit Testing tma_code_l2_miss Testing tma_code_stlb_hit Testing tma_code_stlb_miss Testing tma_code_stlb_miss_2m Testing tma_code_stlb_miss_4k Testing tma_lcp Testing tma_ms_switches Testing tma_info_core_flopc Testing tma_info_inst_mix_iparith Testing tma_info_inst_mix_iparith_avx128 Testing tma_info_inst_mix_iparith_avx256 Testing tma_info_inst_mix_iparith_avx512 Testing tma_info_inst_mix_iparith_scalar_dp Testing tma_info_inst_mix_iparith_scalar_sp Testing tma_info_inst_mix_ipflop Testing tma_info_inst_mix_ippause Testing tma_fetch_latency Testing tma_fp_arith Testing tma_fp_assists Testing tma_info_system_cpu_utilization Testing tma_info_system_dram_bw_use [Skipped tma_info_system_dram_bw_use] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_TRK_REQUESTS.ALL:u <not supported> UNC_ARB_COH_TRK_REQUESTS.ALL:u 1,013,554,749 duration_time 1.013527265 seconds time elapsed 1.005417000 seconds user 0.008011000 seconds sys Testing tma_info_frontend_l2mpki_code Testing tma_info_frontend_l2mpki_code_all Testing tma_info_inst_mix_ipload Testing tma_info_inst_mix_ipstore Testing tma_info_memory_latency_load_l2_miss_latency Testing tma_lock_latency Testing tma_info_memory_core_l1d_cache_fill_bw_2t Testing tma_info_memory_core_l2_cache_fill_bw_2t Testing tma_info_memory_core_l3_cache_access_bw_2t Testing tma_info_memory_core_l3_cache_fill_bw_2t Testing tma_info_memory_l1d_cache_fill_bw Testing tma_info_memory_l2_cache_fill_bw Testing 
tma_info_memory_l3_cache_access_bw Testing tma_info_memory_l3_cache_fill_bw Testing tma_info_memory_l3mpki Testing tma_info_memory_load_miss_real_latency Testing tma_info_memory_mix_bus_lock_pki Testing tma_info_memory_mix_uc_load_pki Testing tma_info_memory_mlp Testing tma_info_memory_tlb_load_stlb_mpki Testing tma_info_memory_tlb_page_walks_utilization Testing tma_info_memory_tlb_store_stlb_mpki Testing tma_info_system_mem_parallel_reads [Skipped tma_info_system_mem_parallel_reads] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_DAT_OCCUPANCY.RD/cmask=1/ 1.013354884 seconds time elapsed 1.009239000 seconds user 0.004004000 seconds sys Testing tma_info_system_mem_read_latency [Skipped tma_info_system_mem_read_latency] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_TRK_OCCUPANCY.RD <not counted> UNC_ARB_TRK_REQUESTS.RD 1.012882143 seconds time elapsed 1.004600000 seconds user 0.008036000 seconds sys Testing tma_info_thread_cpi Testing tma_streaming_stores Testing tma_dram_bound Testing tma_store_bound Testing tma_l2_hit_latency Testing tma_load_stlb_hit Testing tma_load_stlb_miss Testing tma_load_stlb_miss_1g Testing tma_load_stlb_miss_2m Testing tma_load_stlb_miss_4k Testing tma_store_stlb_hit Testing tma_store_stlb_miss Testing tma_store_stlb_miss_1g Testing tma_store_stlb_miss_2m Testing tma_store_stlb_miss_4k Testing tma_info_memory_latency_data_l2_mlp Testing tma_info_memory_latency_load_l2_mlp Testing tma_info_pipeline_ipassist Testing tma_microcode_sequencer Testing tma_ms Testing tma_info_system_kernel_cpi [Failed tma_info_system_kernel_cpi] Metric contains missing events Error: No supported events found. Access to performance monitoring and observability operations is limited. 
Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>) Testing tma_info_system_kernel_utilization [Failed tma_info_system_kernel_utilization] Metric contains missing events Error: No supported events found. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. 
kernel.perf_event_paranoid = <setting>) Testing tma_info_pipeline_retire Testing tma_info_thread_clks Testing tma_info_thread_uoppi Testing tma_memory_operations Testing tma_other_light_ops Testing tma_ports_utilization Testing tma_ports_utilized_0 Testing tma_ports_utilized_1 Testing tma_ports_utilized_2 Testing C10_Pkg_Residency [Failed C10_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c10-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c10-residency/u) in per-thread mode, enable system wide with '-a'. Testing C2_Pkg_Residency [Failed C2_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c2-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c2-residency/u) in per-thread mode, enable system wide with '-a'. Testing C3_Pkg_Residency [Failed C3_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { msr/tsc/, cstate_pkg/c3-residency/ } Error: No supported events found. Invalid event (msr/tsc/u) in per-thread mode, enable system wide with '-a'. Testing C6_Core_Residency [Failed C6_Core_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c6-residency/u) in per-thread mode, enable system wide with '-a'. Testing C6_Pkg_Residency [Failed C6_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. 
anon group { cstate_pkg/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c6-residency/u) in per-thread mode, enable system wide with '-a'. Testing C7_Core_Residency [Failed C7_Core_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c7-residency/u) in per-thread mode, enable system wide with '-a'. Testing C7_Pkg_Residency [Failed C7_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c7-residency/u) in per-thread mode, enable system wide with '-a'. Testing C8_Pkg_Residency [Failed C8_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c8-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c8-residency/u) in per-thread mode, enable system wide with '-a'. Testing C9_Pkg_Residency [Failed C9_Pkg_Residency] Metric contains missing events WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c9-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c9-residency/u) in per-thread mode, enable system wide with '-a'. 
Testing tma_info_core_epc Testing tma_info_system_core_frequency Testing tma_info_system_power [Skipped tma_info_system_power] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> Joules power/energy-pkg/u 1,013,238,256 duration_time 1.013223072 seconds time elapsed 0.995924000 seconds user 0.011903000 seconds sys Testing tma_info_system_power_license0_utilization Testing tma_info_system_power_license1_utilization Testing tma_info_system_power_license2_utilization Testing tma_info_system_turbo_utilization Testing tma_info_inst_mix_ipswpf Testing tma_info_memory_prefetches_useless_hwpf Testing tma_info_core_coreipc Testing tma_info_thread_ipc Testing tma_heavy_operations Testing tma_light_operations Testing tma_info_core_core_clks Testing tma_info_system_smt_2t_utilization Testing tma_info_thread_slots_utilization Testing UNCORE_FREQ [Skipped UNCORE_FREQ] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> UNC_CLOCK.SOCKET:u 1,015,993,466 duration_time 1.015949387 seconds time elapsed 1.007676000 seconds user 0.008029000 seconds sys Testing tma_info_system_socket_clks [Failed tma_info_system_socket_clks] Metric contains missing events Error: No supported events found. Invalid event (UNC_CLOCK.SOCKET:u) in per-thread mode, enable system wide with '-a'. 
Testing tma_info_inst_mix_instructions Testing tma_info_system_cpus_utilized Testing tma_info_system_mux Testing tma_info_system_time Testing tma_info_thread_slots Testing tma_few_uops_instructions Testing tma_4k_aliasing Testing tma_cisc Testing tma_fp_divider Testing tma_int_divider Testing tma_slow_pause Testing tma_split_loads Testing tma_split_stores Testing tma_store_fwd_blk Testing tma_alu_op_utilization Testing tma_load_op_utilization Testing tma_mixing_vectors Testing tma_store_op_utilization Testing tma_port_1 Testing tma_port_5 Testing tma_port_6 Testing smi_cycles [Skipped smi_cycles] Not supported events Performance counter stats for 'perf test -w noploop': <not supported> msr/smi/u <not supported> msr/aperf/u 3,965,789,327 cycles:u 1.012779591 seconds time elapsed 1.004579000 seconds user 0.007972000 seconds sys Testing smi_num [Failed smi_num] Metric contains missing events Error: No supported events found. Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'. Testing tsx_aborted_cycles Testing tsx_cycles_per_elision Testing tsx_cycles_per_transaction Testing tsx_transactional_cycles ---- end(-1) ---- 110: perf all metrics test : FAILED! Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13perf test: Use shelldir to refer perf source locationNamhyung Kim
The dlfilter test uses tools/perf/include, which assumes that perf is run from the root of the Linux kernel source tree. But perf can also be run from other places such as tools/perf, and then the include path won't match. Use the shelldir variable to locate the test script in the tree.

  $ cd tools/perf
  $ ./perf test dlfilter
   63: dlfilter C API : Ok
  101: perf script --dlfilter tests : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13perf test: Skip dlfilter test for build failuresNamhyung Kim
Building the dlfilter may fail for reasons unrelated to perf itself, for example when perf test is run without the source code or from a different directory. Skip the test in that case, as it's not an error in perf. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13genirq/cpuhotplug: Notify about affinity changes breaking the affinity maskImran Khan
During CPU offlining the interrupts affined to that CPU are moved to other online CPUs, which might break the original affinity mask if the outgoing CPU was the last online CPU in that mask. This change is not propagated to irq_desc::affinity_notify(), which leaves users of the affinity notifier mechanism with stale information. Avoid this by scheduling affinity change notification work for interrupts that were affined to the CPU being offlined, if the new target CPU is not part of the original affinity mask. Since irq_set_affinity_locked() uses the same logic to schedule affinity change notification work, split out this logic into a dedicated function and use that at both places. [ tglx: Removed the EXPORT(), removed the !SMP stub, moved the prototype, added a lockdep assert instead of a comment, fixed up coding style and name space. Polished and clarified the change log ] Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260113143727.1041265-1-imran.f.khan@oracle.com
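The condition that triggers the notification can be sketched in userspace C. This is only an illustration, assuming a CPU mask that fits in one 64-bit word; `cpu_mask_t` and `migration_breaks_affinity()` are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t cpu_mask_t;

/*
 * Return true when moving an interrupt to new_cpu takes it outside the
 * originally requested affinity mask, i.e. the case in which users of
 * the affinity notifier would otherwise keep stale information.
 * new_cpu must be below 64 in this sketch.
 */
static bool migration_breaks_affinity(cpu_mask_t orig_affinity,
				      unsigned int new_cpu)
{
	assert(new_cpu < 64);
	return (orig_affinity & ((cpu_mask_t)1 << new_cpu)) == 0;
}
```

With a mask containing only CPU 2, migrating to CPU 0 breaks the mask and would warrant a notification; with a mask of CPUs 1 and 2, migrating to CPU 1 would not.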
2026-01-13switch {alloc,free}_bprm() to CLASS()Al Viro
All linux_binprm instances come from alloc_bprm() and are unconditionally destroyed by free_bprm() at the end of the same scope. IOW, CLASS() machinery is a decent fit for those. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
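The scope-bound pattern that CLASS() relies on can be approximated in userspace with the GCC/Clang `cleanup` variable attribute. The sketch below is not the kernel code; `struct fake_bprm`, `fake_alloc_bprm()`, `fake_free_bprm()` and `fake_exec()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct linux_binprm. */
struct fake_bprm {
	int argc;
};

static int live_bprms;	/* objects allocated and not yet freed */

static struct fake_bprm *fake_alloc_bprm(void)
{
	struct fake_bprm *bprm = calloc(1, sizeof(*bprm));

	if (bprm)
		live_bprms++;
	return bprm;
}

/* A cleanup handler receives a pointer to the variable leaving scope. */
static void fake_free_bprm(struct fake_bprm **p)
{
	if (*p) {
		free(*p);
		live_bprms--;
	}
}

static int fake_exec(int fail_early)
{
	struct fake_bprm *bprm __attribute__((cleanup(fake_free_bprm))) =
		fake_alloc_bprm();

	if (!bprm)
		return -1;
	if (fail_early)
		return -2;	/* bprm is freed here with no explicit call */
	bprm->argc = 1;
	return 0;		/* ... and here as well */
}
```

Every return path releases the object, which is why an "allocated at the top, destroyed at the end of the same scope" object suits this style.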
2026-01-13do_execveat_common(): don't consume filename referenceAl Viro
... and convert its callers to CLASS(filename...) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13execve: fold {compat_,}do_execve{,at}() into their sole callersAl Viro
All of them are wrappers for do_execveat_common() and each has exactly one caller. The only difference is in the way they are constructing argv/envp arguments for do_execveat_common() and that's easy to do with less boilerplate. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13simplify the callers of alloc_bprm()Al Viro
alloc_bprm() starts with do_open_execat() and it will do the right thing if given ERR_PTR() for name. That allows dropping such checks in its callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13simplify the callers of do_open_execat()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13simplify the callers of file_open_name()Al Viro
It accepts ERR_PTR() for name and does the right thing in that case. That allows simplifying the logic in callers, making them trivial to switch to CLASS(filename). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13do_sys_openat2(): get rid of useless check, switch to CLASS(filename)Al Viro
do_file_open() will do the right thing when given ERR_PTR() as name... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13rename do_filp_open() to do_file_open()Al Viro
"filp" thing never made sense; seeing that there are exactly 4 callers in the entire tree (and it's neither exported nor even declared in linux/*/*.h), there's no point keeping that ugliness. FWIW, the 'filp' thing did originate in OSD&I; for some reason Tanenbaum decided to call the object representing an opened file 'struct filp', the last letter standing for 'position'. In all Unices, Linux included, the corresponding object had always been 'struct file'... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13do_filp_open(): DTRT when getting ERR_PTR() as pathnameAl Viro
The rest of the set_nameidata() callers treat IS_ERR(pathname) as "bail out immediately with PTR_ERR(pathname) as error". Makes life simpler for callers; do_filp_open() is the only exception and its callers would also benefit from such calling conventions change. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
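A userspace sketch of this calling convention follows. The three pointer helpers are re-created here to mirror the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(); `import_name()`, `open_by_name()` and `open_user_path()` are hypothetical, and the 16-byte limit is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical import step: rejects names longer than 16 bytes. */
static const char *import_name(const char *user)
{
	if (strlen(user) > 16)
		return ERR_PTR(-ENAMETOOLONG);
	return user;
}

/* Callee that bails out immediately when handed an ERR_PTR(). */
static long open_by_name(const char *name)
{
	if (IS_ERR(name))
		return PTR_ERR(name);
	return (long)strlen(name);	/* stand-in for the real work */
}

/* The caller needs no check of its own between the two steps. */
static long open_user_path(const char *user)
{
	return open_by_name(import_name(user));
}
```

Because the callee propagates the encoded error, the caller collapses to a single expression, which is the simplification the later patches in this series rely on.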
2026-01-13ksmbd_vfs_rename(): vfs_path_parent_lookup() accepts ERR_PTR() as nameAl Viro
no need to check in the caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13ksmbd_vfs_path_lookup(): vfs_path_parent_lookup() accepts ERR_PTR() as nameAl Viro
no need to check in the caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13move_mount(): filename_lookup() accepts ERR_PTR() as filenameAl Viro
no need to check it in the caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13file_setattr(): filename_lookup() accepts ERR_PTR() as filenameAl Viro
no need to check it in the caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13file_getattr(): filename_lookup() accepts ERR_PTR() as filenameAl Viro
no need to check it in the caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13struct filename ->refcnt doesn't need to be atomicAl Viro
... or visible outside of audit, really. Note that references held in delayed_filename always have refcount 1, and from the moment of complete_getname() or equivalent point in getname...() there won't be any references to struct filename instance left in places visible to other threads. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13allow incomplete imports of filenamesAl Viro
There are two filename-related problems in io_uring and its interplay with audit.

Filenames are imported when request is submitted and used when it is processed. Unfortunately, the latter may very well happen in a different thread. In that case the reference to filename is put into the wrong audit_context - that of submitting thread, not the processing one. Audit logics is called by the latter, and it really wants to be able to find the names in audit_context current (== processing) thread.

Another related problem is the headache with refcounts - normally all references to given struct filename are visible only to one thread (the one that uses that struct filename). io_uring violates that - an extra reference is stashed in audit_context of submitter. It gets dropped when submitter returns to userland, which can happen simultaneously with processing thread deciding to drop the reference it got.

We paper over that by making refcount atomic, but that means pointless headache for everyone.

Solution: the notion of partially imported filenames. Namely, already copied from userland, but *not* exposed to audit yet.

io_uring can create that in submitter thread, and complete the import (obtaining the usual reference to struct filename) in processing thread.

Object: struct delayed_filename.

Primitives for working with it:

delayed_getname(&delayed_filename, user_string) - copies the name from userland, returning 0 and stashing the address of (still incomplete) struct filename in delayed_filename on success and returning -E... on error.

delayed_getname_uflags(&delayed_filename, user_string, atflags) - similar, in the same relation to delayed_getname() as getname_uflags() is to getname()

complete_getname(&delayed_filename) - completes the import of filename stashed in delayed_filename and returns struct filename to caller, emptying delayed_filename.

CLASS(filename_complete_delayed, name)(&delayed_filename) - variant of CLASS(filename) with complete_getname() for constructor.

dismiss_delayed_filename(&delayed_filename) - destructor; drops whatever might be stashed in delayed_filename, emptying it.

putname_to_delayed(&delayed_filename, name) - if name is shared, stashes its copy into delayed_filename and drops the reference to name, otherwise stashes the name itself in there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
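The two-phase import described above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: every `fake_*` name is hypothetical, "exposed to audit" is reduced to a flag, and the submitter/processor threads are collapsed into sequential calls.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for struct filename / struct delayed_filename. */
struct fake_filename {
	char *name;
	int exposed_to_audit;	/* set only once the import is completed */
};

struct fake_delayed {
	struct fake_filename *fn;	/* NULL when empty */
};

/* Phase 1 (submitter): copy the string, do not expose it yet. */
static int fake_delayed_getname(struct fake_delayed *d, const char *user)
{
	size_t len = strlen(user) + 1;
	struct fake_filename *fn = malloc(sizeof(*fn));

	if (!fn)
		return -ENOMEM;
	fn->name = malloc(len);
	if (!fn->name) {
		free(fn);
		return -ENOMEM;
	}
	memcpy(fn->name, user, len);
	fn->exposed_to_audit = 0;
	d->fn = fn;
	return 0;
}

/* Phase 2 (processor): finish the import and empty the holder. */
static struct fake_filename *fake_complete_getname(struct fake_delayed *d)
{
	struct fake_filename *fn = d->fn;

	d->fn = NULL;
	if (fn)
		fn->exposed_to_audit = 1;
	return fn;
}

static void fake_putname(struct fake_filename *fn)
{
	if (fn) {
		free(fn->name);
		free(fn);
	}
}

/* Destructor: drop whatever is still stashed, leaving the holder empty. */
static void fake_dismiss_delayed(struct fake_delayed *d)
{
	fake_putname(d->fn);
	d->fn = NULL;
}
```

Calling the destructor after a completed import is harmless in this model because completion empties the holder, which matches the "emptying delayed_filename" wording in the description.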
2026-01-13switch __getname_maybe_null() to CLASS(filename_flags)Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13fs: hide names_cache behind runtime const machineryMateusz Guzik
s/names_cachep/names_cache/ for consistency with dentry cache. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13struct filename: saner handling of long namesAl Viro
Always allocate struct filename from names_cachep, long name or short; short names would be embedded into struct filename. Longer ones do not cannibalize the original struct filename - put them into kmalloc'ed buffers (PATH_MAX-sized for import from userland, strlen() + 1 - for ones originating kernel-side, where we know the length beforehand). Cutoff length for short names is chosen so that struct filename would be 192 bytes long - that's both a multiple of 64 and large enough to cover the majority of real-world uses. Simplifies the logic in getname()/putname() and friends. [fixed an embarrassing braino in EMBEDDED_NAME_MAX, first reported by Dan Carpenter] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
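The sizing rule can be illustrated with a compile-time check. The header fields below are invented for the example and do not reproduce the real struct filename layout; only the 192-byte target and the idea of deriving the embedded-name cutoff from it come from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_FILENAME_SIZE 192	/* target size from the description */

/* Hypothetical bookkeeping fields; the kernel's actual fields differ. */
struct fake_filename_hdr {
	const char *name;	/* points at iname[] or at a separate buffer */
	int refcnt;
	void *audit;
};

/* Whatever the header does not use is left for the embedded name. */
#define FAKE_EMBEDDED_NAME_MAX \
	(FAKE_FILENAME_SIZE - sizeof(struct fake_filename_hdr))

struct fake_filename {
	struct fake_filename_hdr hdr;
	char iname[FAKE_EMBEDDED_NAME_MAX];
};

/* Catches the kind of off-by-something mistake the changelog mentions. */
_Static_assert(sizeof(struct fake_filename) == FAKE_FILENAME_SIZE,
	       "struct fake_filename must be exactly 192 bytes");

/* A name of length len fits inline if it and its NUL fit in iname[]. */
static int fake_name_fits_inline(size_t len)
{
	return len < FAKE_EMBEDDED_NAME_MAX;
}
```

Deriving the cutoff from the target size, rather than hard-coding it, keeps the total at 192 bytes even if the header fields change.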
2026-01-13struct filename: use names_cachep only for getname() and friendsAl Viro
Instances of struct filename come from names_cachep (via __getname()). That is done by getname_flags() and getname_kernel() and these two are the main callers of __getname(). However, there are other callers that simply want to allocate PATH_MAX bytes for uses that have nothing to do with struct filename. We want saner allocation rules for long pathnames, so that struct filename would *always* come from names_cachep, with the out-of-line pathname getting kmalloc'ed. For that we need to be able to change the size of objects allocated by getname_flags()/getname_kernel(). That requires the rest of __getname() users to stop using names_cachep; we could explicitly switch all of those to kmalloc(), but that would cause quite a bit of noise. So the plan is to switch getname_...() to new helpers and turn __getname() into a wrapper for kmalloc(). Remaining __getname() users could be converted to explicit kmalloc() at leisure, hopefully along with figuring out what size they really want - PATH_MAX is overkill for some of them, used out of laziness ("we have a convenient helper that does 4K allocations and that's large enough, let's use it"). As a side benefit, names_cachep is no longer used outside of fs/namei.c, so we can move it there and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13getname_flags() massage, part 2Al Viro
Take the "long name" case into a helper (getname_long()). In case of failure have the caller deal with freeing the original struct filename. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13getname_flags() massage, part 1Al Viro
In case of long name don't reread what we'd already copied. memmove() it instead. That avoids the possibility of ending up with empty name there and the need to look at the flags on the slow path. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
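The "move what was already copied, then continue" idea looks roughly like this in userspace. `fake_copy_str()` is a hypothetical stand-in for a bounded copy from userland, the buffer sizes are arbitrary, and the two buffers here are separate allocations, so this is an analogy rather than the kernel's layout.

```c
#include <stdlib.h>
#include <string.h>

#define SHORT_MAX 8	/* arbitrary size of the small buffer */
#define BIG_MAX   64	/* arbitrary size of the long-name buffer */

/*
 * Copy at most count bytes including the NUL. Returns the string length
 * if a NUL was found within count bytes, or count if it was not.
 */
static long fake_copy_str(char *dst, const char *src, long count)
{
	long i;

	for (i = 0; i < count; i++) {
		dst[i] = src[i];
		if (!src[i])
			return i;
	}
	return count;
}

/* Returns short_buf for short names, a malloc'ed buffer for long ones. */
static char *fake_import(const char *user, char *short_buf)
{
	long len = fake_copy_str(short_buf, user, SHORT_MAX);
	char *big;

	if (len < SHORT_MAX)
		return short_buf;

	big = malloc(BIG_MAX);
	if (!big)
		return NULL;
	/* Keep the bytes already copied instead of re-reading the source,
	 * which might have changed in the meantime. */
	memmove(big, short_buf, SHORT_MAX);
	len = fake_copy_str(big + SHORT_MAX, user + SHORT_MAX,
			    BIG_MAX - SHORT_MAX);
	if (len == BIG_MAX - SHORT_MAX) {	/* still no NUL: too long */
		free(big);
		return NULL;
	}
	return big;
}
```

Since the first SHORT_MAX bytes are known to be non-NUL on the slow path, the result can never turn out empty there, which is why the flags no longer need to be consulted on that path.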
2026-01-13ntfs: ->d_compare() must not blockAl Viro
... so don't use __getname() there. Switch it (and ntfs_d_hash(), while we are at it) to kmalloc(PATH_MAX, GFP_NOWAIT). Yes, ntfs_d_hash() almost certainly can do with smaller allocations, but let ntfs folks deal with that - keep the allocation size as-is for now. Stop abusing names_cachep in ntfs, period - various uses of that thing in there have nothing to do with pathnames; just use k[mz]alloc() and be done with that. For now let's keep sizes as-is, but AFAICS none of the users actually want PATH_MAX. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13get rid of audit_reusename()Al Viro
Originally we tried to avoid multiple insertions into audit names array during retry loop by a cute hack - memorize the userland pointer and if there already is a match, just grab an extra reference to it. Cute as it had been, it had problems - two identical pointers had audit aux entries merged, two identical strings did not. Having different behaviour for syscalls that differ only by addresses of otherwise identical string arguments is obviously wrong - if nothing else, compiler can decide to merge identical string literals. Besides, this hack does nothing for non-audited processes - they get a fresh copy for retry. It's not time-critical, but having behaviour subtly differ that way is bogus. These days we have very few places that import filename more than once (9 functions total) and it's easy to massage them so we get rid of all re-imports. With that done, we don't need audit_reusename() anymore. There's no need to memorize userland pointer either. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13do_readlinkat(): import pathname only onceAl Viro
Take getname_flags() and putname() outside of retry loop. Since getname_flags() is the only thing that cares about LOOKUP_EMPTY, don't bother with setting LOOKUP_EMPTY in lookup_flags - just pass it to getname_flags() and be done with that. Things could be further simplified by use of cleanup.h stuff, but let's not clutter the patch with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13do_sys_truncate(): import pathname only onceAl Viro
Convert the user_path_at() call inside a retry loop into getname_flags() + filename_lookup() + putname() and leave only filename_lookup() inside the loop. In this case we never pass LOOKUP_EMPTY, so getname_flags() is equivalent to plain getname(). Things could be further simplified by use of cleanup.h stuff, but let's not clutter the patch with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13user_statfs(): import pathname only onceAl Viro
Convert the user_path_at() call inside a retry loop into getname_flags() + filename_lookup() + putname() and leave only filename_lookup() inside the loop. In this case we never pass LOOKUP_EMPTY, so getname_flags() is equivalent to plain getname(). Things could be further simplified by use of cleanup.h stuff, but let's not clutter the patch with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
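The shape of the resulting code, with the import hoisted out of the retry loop, can be modelled in userspace as below. All `fake_*` functions are hypothetical; the lookup is rigged to fail once with -ESTALE so that the retry path is exercised, and the counters exist only to make the behaviour observable.

```c
#include <errno.h>

static int imports;	/* how many times the name was imported */
static int lookups;	/* how many times the lookup ran */

/* Stand-ins for getname_flags() and putname(). */
static const char *fake_getname(const char *user)
{
	imports++;
	return user;
}

static void fake_putname(const char *name)
{
	(void)name;
}

/* Rigged lookup: fails with -ESTALE unless revalidation was requested. */
static int fake_filename_lookup(const char *name, int reval)
{
	(void)name;
	lookups++;
	return reval ? 0 : -ESTALE;
}

static int fake_statfs(const char *user)
{
	const char *name = fake_getname(user);	/* imported exactly once */
	int reval = 0;
	int error;

retry:
	error = fake_filename_lookup(name, reval);
	if (error == -ESTALE && !reval) {
		reval = 1;
		goto retry;	/* only the lookup is repeated */
	}
	fake_putname(name);
	return error;
}
```

After one call the lookup has run twice while the name was imported once, which is the property that makes re-import tricks in audit unnecessary.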