summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
11 daysnet: usb: lan78xx: restore VLAN and hash filters after link upNicolai Buchwitz
Configured VLANs intermittently stop receiving traffic after a link down/up cycle, e.g. when the network cable is unplugged and plugged back in. VLAN filtering stays enabled but all VLAN-tagged frames are dropped until a VLAN is added or removed again. The LAN7801 datasheet (DS00002123E) states: "A portion of the MAC operates on clocks generated by the Ethernet PHY. During a PHY reset event, this portion of the MAC is designed to not be taken out of reset until the PHY clocks are operational" (section 8.10, MAC Reset Watchdog Timer) "After a reset event, the RFE will automatically initialize the contents of the VHF to 0h." (section 7.1.4, VHF Organization) Thus a link down/up cycle stops and restarts the PHY clock, resets the PHY-clocked portion of the MAC, and the RFE clears its VLAN/DA hash filter (VHF) memory. The VHF holds both the VLAN filter table and the multicast hash table, but the driver never reprograms either from its shadow copy once the link is back, so both stay empty. Reprogram the VLAN filter and multicast hash tables on link up. Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de> Closes: https://lore.kernel.org/netdev/BEZP281MB224501E38B30BFDC4BD3D364D9E32@BEZP281MB2245.DEUP281.PROD.OUTLOOK.COM/T/#u Tested-by: Sven Schuchmann <schuchmann@schleissheimer.de> Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260622102911.484045-1-nb@tipi-net.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysveth: fix NAPI leak in XDP enable error pathEric Dumazet
During XDP enablement in veth, if xdp_rxq_info_reg() or xdp_rxq_info_reg_mem_model() fails, the driver rolls back the changes. However, the rollback loop: for (i--; i >= start; i--) { decrements the loop index 'i' before the first iteration. This correctly skips unregistering the rxq for the failed index 'i' (as registration failed or was already cleaned up), but it also erroneously skips calling netif_napi_deli() for rq[i].xdp_napi. Since netif_napi_add() was already called for index 'i', this leaves a dangling napi_struct in the device's napi_list. When the veth device is later destroyed, the freed queue memory (which contains the leaked NAPI structure) can be reused. The subsequent device teardown iterates the NAPI list and corrupts the reallocated memory, leading to UAF. Fix this by explicitly deleting the NAPI association for the failed index 'i' before rolling back the successfully configured queues. Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path") Reported-by: Guenter Roeck <groeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20260622111825.88337-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: ti: icssg: Fix XSK zero copy TX during application wakeupMeghana Malladi
emac_xsk_xmit_zc() handles tx xmit for zero copy and gets called inside napi context. User application wakes up the kernel while initiating the transmit which triggers napi to start processing the tx packets. The num_tx check inside emac_tx_complete_packets() returns early if no packet transfer happen hindering the call to emac_xsk_xmit_zc(). Remove this check to let application wakeup initiate zero copy xmit traffic. Add __netif_tx_lock() to ensure that the TX queue is protected from concurrent access during the transmission of XDP frames. This fixes netdev watchdog timeout for long runs. Fixes: e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20260618100348.2209907-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: dsa: sja1105: round up PTP perout pin durationAleksandrova Alyona
pin_duration is converted from the user-provided period to SJA1105 clock ticks and is later passed as the cycle_time argument to future_base_time(). Very small period values may become zero after the conversion, which can lead to a division by zero in future_base_time(). Round zero pin_duration up to 1 tick so that the smallest unsupported periods use the minimum non-zero hardware duration instead of passing zero to future_base_time(). Fixes: 747e5eb31d59 ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT") Signed-off-by: Aleksandrova Alyona <aga@itb.spb.ru> Link: https://patch.msgid.link/20260618110508.53094-1-aga@itb.spb.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: do not acquire dev->tx_global_lock in netdev_watchdog_up()Eric Dumazet
Marek Szyprowski reported a deadlock during system resume when virtio_net driver is used. The deadlock occurs because netif_device_attach() is called while holding dev->tx_global_lock (via netif_tx_lock_bh() in virtnet_restore_up()). netif_device_attach() calls __netdev_watchdog_up(), which now also tries to acquire dev->tx_global_lock to synchronize with dev_watchdog(). This recursive lock acquisition results in a deadlock. Fix this by removing the tx_global_lock acquisition from netdev_watchdog_up(). The critical state (watchdog_timer and watchdog_ref_held) is already protected by dev->watchdog_lock, which was introduced in the blamed commit. Fixes: 8eed5519e496 ("net: watchdog: fix refcount tracking races") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Closes: https://lore.kernel.org/netdev/a443376e-5187-4268-93b3-58047ef113a8@samsung.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260622110108.69541-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysMerge tag 'soundwire-7.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: - Improvements in handling of soundwire groups - Additional checks flagged by various tools - Intel driver updates for ghost Realtek device handling in firmware and adding devices to wake lists * tag 'soundwire-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: dmi-quirks: Disable ghost Realtek devices soundwire: only handle alert events when the peripheral is attached soundwire: intel_ace2x: release bpt_stream when close it soundwire: intel: Move suspend tracking from trigger to pm suspend soundwire: intel_auxdevice: Add es9356 to wake_capable_list soundwire: use krealloc_array to prevent integer overflow soundwire: increase group->max_size after allocation soundwire: fix bug in sdw_add_element_group_count found by syzkaller soundwire: don't program SDW_SCP_BUSCLOCK_SCALE on a unattached Peripheral soundwire: validate DT compatible before parsing it soundwire: intel_auxdevice: Add cs42l43b to wake_capable_list soundwire: stream: sdw_stream_remove_slave(): Check stream is valid
11 daysdocs: tools: Fix typo 'ackward' to 'awkward' in unittest.rstDeclan Wale
Signed-off-by: Declan Wale <decwale37@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <CADz3o9mbM60-p1PV8t=nOm7099KnFeYQOyo5J+bC2iiP9PtBJQ@mail.gmail.com>
11 daysnet, bpf: check master for NULL in xdp_master_redirect()Xiang Mei
xdp_master_redirect() dereferences the result of netdev_master_upper_dev_get_rcu() without a NULL check, but that helper returns NULL when the receiving device has no upper-master adjacency. The reach guard only checks netif_is_bond_slave(). On bond slave release bond_upper_dev_unlink() drops the upper-master adjacency before clearing IFF_SLAVE, so an XDP_TX reaching xdp_master_redirect() in that window still passes netif_is_bond_slave() while master is already NULL, and faults on master->flags at offset 0xb0: BUG: kernel NULL pointer dereference, address: 00000000000000b0 RIP: 0010:xdp_master_redirect (net/core/filter.c:4432) Call Trace: xdp_master_redirect (net/core/filter.c:4432) bpf_prog_run_generic_xdp (include/net/xdp.h:700) do_xdp_generic (net/core/dev.c:5608) __netif_receive_skb_one_core (net/core/dev.c:6204) process_backlog (net/core/dev.c:6319) __napi_poll (net/core/dev.c:7729) net_rx_action (net/core/dev.c:7792) handle_softirqs (kernel/softirq.c:622) __dev_queue_xmit (include/linux/bottom_half.h:33) packet_sendmsg (net/packet/af_packet.c:3082) __sys_sendto (net/socket.c:2252) Kernel panic - not syncing: Fatal exception in interrupt The missing check dates back to the original code; commit 1921f91298d1 ("net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master") later added the master->flags read where the fault now lands but kept the unconditional deref. Check master for NULL before use; a NULL master is treated the same as one that is not up. Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave device") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://patch.msgid.link/20260620201531.180123-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 dayskdoc: xforms: ignore special static/inline macrosRandy Dunlap
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c contains 7 (for now) functions that use STATIC_IFN_KUNIT or INLINE_IFN_KUNIT macros for function qualifiers (static or not, inline or not). These cause parse warnings from kernel-doc: Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 29] STATIC_IFN_KUNIT const struct drm_color_lut * __extract_blob_lut (const struct drm_property_blob *blob, uint32_t *size) Handle these in kernel-doc to prevent multiple warnings. Fixes: 647d1fd04652 ("drm/amd/display: Add KUnit test for color helpers") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260612234458.1084156-1-rdunlap@infradead.org>
11 daysMerge branch 'selftests-xsk-stabilize-timeout-test-behavior'Jakub Kicinski
Tushar Vyavahare says: ==================== selftests/xsk: stabilize timeout test behavior This series improves AF_XDP selftests by making timeout handling explicit and fixing sources of non-determinism in xsk timeout tests. Patch 1 introduces test_spec::poll_tmout and removes implicit dependence on RX UMEM setup state for timeout behavior. Patch 2 fixes thread harness sequencing by attaching XDP programs before worker startup, removing signal-based termination, and using barrier synchronization only for dual-thread runs. Patch 3 restores shared_umem after POLL_TXQ_FULL so test-local configuration does not leak into subsequent cases on shared-netdev runs. Together these changes make timeout handling easier to follow and improve selftest stability, especially on real NIC runs. ==================== Link: https://patch.msgid.link/20260616154955.1492560-1-tushar.vyavahare@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysselftests/xsk: restore shared_umem after POLL_TXQ_FULLTushar Vyavahare
POLL_TXQ_FULL temporarily disables shared_umem on TX to exercise the TX timeout path in isolation. With shared_umem enabled, TX setup expects RX UMEM to be initialized first and fails with: "RX UMEM is not initialized before shared-UMEM TX setup". Save and restore shared_umem around POLL_TXQ_FULL execution, and restore it on both success and pkt_stream_replace() failure paths. Also add an in-code comment explaining why shared_umem is temporarily disabled in this test. This keeps timeout setup local and prevents cross-test state leakage. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260616154955.1492560-4-tushar.vyavahare@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysselftests/xsk: fix timeout thread harness sequencingTushar Vyavahare
Prevent workers from running before XDP program attachment completes. The previous ordering allowed races between worker startup and setup. Attach XDP programs before entering traffic validation. Remove SIGUSR1-based worker termination and use pthread_join() for thread shutdown so blocking syscalls are not interrupted. Use barriers only for dual-thread runs so participants match and teardown ordering stays deterministic. This removes setup/startup races and stabilizes harness sequencing. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260616154955.1492560-3-tushar.vyavahare@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysselftests/xsk: make poll timeout mode explicitTushar Vyavahare
Stop inferring timeout behavior from RX UMEM initialization state. That ties timeout semantics to setup internals and obscures intent. Use test_spec::poll_tmout as the explicit timeout-mode selector in TX and RX paths. In RX, treat poll timeout as expected only in timeout mode. In TX, let send_pkts() own loop completion in non-timeout mode and use __send_pkts() only for progress and timeout detection. This makes timeout logic explicit and keeps control flow predictable. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260616154955.1492560-2-tushar.vyavahare@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 dayskdoc: xforms_lists: handle DECLARE_PER_CPU() in kernel-docRandy Dunlap
Add support for DECLARE_PER_CPU() as a var (variable) as used in <linux/netfilter/x_tables.h>. Warning: include/linux/netfilter/x_tables.h:345 function parameter 'seqcount_t' not described in 'DECLARE_PER_CPU' Warning: include/linux/netfilter/x_tables.h:345 function parameter 'xt_recseq' not described in 'DECLARE_PER_CPU' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260614052452.1557987-1-rdunlap@infradead.org>
11 daysMAINTAINERS: Fix regex for kdocMatthew Wilcox (Oracle)
The trailing '*' means "all files in this directory, but not subdirectories" which excluded tools/lib/python/kdoc/. This is surely not intended. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260615154057.2156589-1-willy@infradead.org>
11 daysMerge tag 'sched_ext-for-7.2-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext tree reorg from Tejun Heo: "Pure source reorganization with no functional change: - the kernel/sched/ext* files move into a new kernel/sched/ext/ subdirectory - the headers and sources are made self-contained so editor tooling can parse each file on its own" * tag 'sched_ext-for-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Move shared helpers from ext.c into internal.h and cid.h sched_ext: Make kernel/sched/ext/ sources self-contained for clangd sched_ext: Move sources under kernel/sched/ext/
11 daysdocs: kgdb: Fix path of driver optionsZenghui Yu
The correct path of driver options should be /sys/module/<driver>/parameters/<option>. Fix it. Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260620234035.9917-1-zenghui.yu@linux.dev>
11 daysDocumentation: tracing: fix typo in events documentationYudistira Putra
Fix a typo in the tracing events documentation: "can by built up" should be "can be built up". Signed-off-by: Yudistira Putra <pyudistira519@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260622143735.71778-1-pyudistira519@gmail.com>
11 daysDocs/driver-api/uio-howto: document mmap_prepare callbackDoehyun Baek
The UIO howto still documents an mmap callback in struct uio_info. That field was replaced by mmap_prepare, which takes a struct vm_area_desc. A UIO driver following the current howto no longer builds because struct uio_info has no mmap member. Update the documented callback signature and matching text to match the current API. Fixes: 933f05f58ac6 ("uio: replace deprecated mmap hook with mmap_prepare in uio_info") Signed-off-by: Doehyun Baek <doehyunbaek@gmail.com> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260622181821.1195257-1-doehyunbaek@gmail.com>
11 daysdocs/mm: clarify that we are not looking for LLM generated contentDavid Hildenbrand (Arm)
Let's make it clear that we are not looking for LLM generated content from contributors not familiar with the details of MM, as it shifts the real work onto reviewers. Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Dev Jain <dev.jain@arm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Harry Yoo (Oracle) <harry@kernel.org> Acked-by: SeongJae Park <sj@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260420-llmdoc-v1-1-47d2091177c4@kernel.org>
11 dayskernel-doc: xforms: support __SYSFS_FUNCTION_ALTERNATIVE()Randy Dunlap
Add support for __SYSFS_FUNCTION_ALTERNATIVE() to create a union of its members (as though CONFIG_CFI is unset). Fixes these docs build warnings: WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf) WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show' not described in 'device_attribute' WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store)(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store' not described in 'device_attribute' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260623190006.406571-1-rdunlap@infradead.org>
11 daysPCI/sysfs: Use kstrtobool() to parse the ROM attribute inputKrzysztof Wilczyński
pci_write_rom() controls access to the ROM content through the corresponding sysfs attribute, and treats the input as a request to disable only when it matches the string "0\n" exactly: if ((off == 0) && (*buf == '0') && (count == 2)) The count == 2 condition encodes the trailing newline that echo(1) appends. This was found when userspace wrote "0" without a trailing newline aiming to disable access, which failed to match the condition above and enabled access instead. For example: $ echo 0 > rom # "0\n", count 2, access disabled $ echo -n 0 > rom # "0", count 1, access enabled $ echo > rom # "", count 1, access enabled (likely not desirable) Parse the input with kstrtobool(), which handles common boolean inputs such as "0", "1", "n", "y" or "off", "on", with or without a trailing newline, so both of the above disable access, and update the now stale comment. As a side effect, input that does not parse as a boolean is rejected with -EINVAL rather than enabling access. The documented "0" and "1" continue to work as before, and rejecting malformed input brings the attribute in line with how sysfs attributes typically handle it. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260612182448.552406-1-kwilczynski@kernel.org
11 daysPCI/sysfs: Limit BAR resize attribute scope to platforms with PCI mmapKrzysztof Wilczyński
Currently, __resource_resize_store() uses sysfs_remove_groups() and sysfs_create_groups() on pci_dev_resource_attr_groups to tear down and recreate the resourceN files after a BAR resize, so the updated BAR sizes are visible in sysfs. The resourceN files only exist on platforms that define HAVE_PCI_MMAP or ARCH_GENERIC_PCI_MMAP_RESOURCE. On platforms that define neither, pci_dev_resource_attr_groups is NULL and the sysfs_remove_groups() and sysfs_create_groups() calls in __resource_resize_store() become no-ops. Resizable BAR (ReBAR) is a PCI Express Extended Capability (PCI_EXT_CAP_ID_REBAR) that requires PCIe extended config space. Every PCIe-capable architecture defines HAVE_PCI_MMAP or ARCH_GENERIC_PCI_MMAP_RESOURCE (via arch headers or the asm-generic/pci.h fallback). Architectures without either only support conventional PCI and cannot have any ReBAR-capable devices. Move the resize show and store helpers, the per-BAR attribute definitions, and the attribute group behind the existing #ifdef HAVE_PCI_MMAP || ARCH_GENERIC_PCI_MMAP_RESOURCE guard, and fold the group reference in pci_dev_groups[] into the existing #if block. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-25-kwilczynski@kernel.org
11 daysPCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()Krzysztof Wilczyński
Currently, pci_create_legacy_files() and pci_remove_legacy_files() are no-op stubs. With legacy attributes now handled by static groups registered via pcibus_groups[], no call site needs them. Remove both functions, their declarations, and the call sites in pci_register_host_bridge(), pci_alloc_child_bus(), and pci_remove_bus(). Remove the pci_sysfs_init() late_initcall and sysfs_initialized. The late_initcall originally existed to create all the dynamic PCI sysfs files, but with both resource and legacy attributes now handled by static groups, it is no longer needed. Remove the legacy_io and legacy_mem fields from struct pci_bus which were used to track the dynamically allocated legacy attributes. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-24-kwilczynski@kernel.org
11 daysPCI/sysfs: Convert legacy I/O and memory attributes to static definitionsKrzysztof Wilczyński
Currently, legacy_io and legacy_mem are dynamically allocated and created by pci_create_legacy_files(), with pci_adjust_legacy_attr() updating the attributes at runtime on Alpha to rename them and shift the size for sparse addressing. Convert to four static const attributes (legacy_io, legacy_io_sparse, legacy_mem, legacy_mem_sparse) with .is_bin_visible() callbacks that use pci_legacy_has_sparse() to select the appropriate variant per bus. The sizes are compile-time constants and .size is set directly on each attribute. Register the groups in pcibus_groups[] under a HAVE_PCI_LEGACY guard so the driver model handles creation and removal automatically. Stub out pci_create_legacy_files() and pci_remove_legacy_files() as the dynamic creation is no longer needed. Remove the __weak pci_adjust_legacy_attr(), Alpha's override, and its declaration from both Alpha and PowerPC asm/pci.h headers. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-23-kwilczynski@kernel.org
11 daysPCI/sysfs: Add __weak pci_legacy_has_sparse() helperKrzysztof Wilczyński
Currently, Alpha's sparse/dense legacy attribute handling is done via pci_adjust_legacy_attr(), which updates dynamically allocated attributes at runtime. The upcoming conversion to static attributes needs a way to determine sparse support at visibility check time. Add a __weak pci_legacy_has_sparse() that returns false by default. Alpha overrides it to check has_sparse() on the bus host controller. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-22-kwilczynski@kernel.org
11 daysalpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()Krzysztof Wilczyński
Currently, pci_mmap_legacy_page_range() reads the legacy resource size from bus->legacy_mem->size or bus->legacy_io->size. This couples the mmap bounds check to the struct pci_bus fields that will be removed when legacy attributes are converted to static definitions. Compute the size directly using PCI_LEGACY_MEM_SIZE (0x100000) and PCI_LEGACY_IO_SIZE (0xffff) macros, and shift by 5 bits for sparse systems. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-21-kwilczynski@kernel.org
11 daysPCI: Add macros for legacy I/O and memory address space sizesKrzysztof Wilczyński
Add defines for the standard PCI legacy address space sizes, replacing the raw literals used by the legacy sysfs attributes. Then, replace open-coded values with the newly added macros. No functional changes intended. Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-20-kwilczynski@kernel.org
11 daysPCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()Krzysztof Wilczyński
Currently, pci_create_sysfs_dev_files() and pci_remove_sysfs_dev_files() are no-op stubs. With both the generic and Alpha resource files now handled by static attribute groups, no platform needs dynamic per-device sysfs file creation. Remove both functions, their declarations, and the call sites in pci_bus_add_device() and pci_stop_dev(). Remove __weak pci_create_resource_files() and pci_remove_resource_files() stubs and their declarations in pci.h, as no architecture overrides them anymore. Remove the res_attr[] and res_attr_wc[] fields from struct pci_dev which were used to track dynamically allocated resource attributes. Finally, simplify pci_sysfs_init() to only handle legacy file creation under HAVE_PCI_LEGACY, removing the per-device loop and the HAVE_PCI_SYSFS_INIT helper added earlier. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-19-kwilczynski@kernel.org
11 daysalpha/PCI: Convert resource files to static attributesKrzysztof Wilczyński
Previously, Alpha's PCI resource files (resourceN, resourceN_sparse, resourceN_dense) were dynamically created by pci_create_resource_files(), which overrides the generic __weak implementation. The previous code allocated bin_attributes at runtime and managed them via the res_attr[] and res_attr_wc[] fields in struct pci_dev. Convert to static const attributes with three attribute groups (plain, sparse, dense), each with an .is_bin_visible() callback that checks resource length, has_sparse(), and sparse_mem_mmap_fits(). A .bin_size() callback provides the resource size to the kernfs node, with the sparse variant shifting by 5 bits for byte-level addressing. Register the groups via ARCH_PCI_DEV_GROUPS so the driver model handles creation and removal automatically. Use the new pci_resource_is_mem() helper for the type check, replacing the open-coded bitwise flag test. Finally, remove pci_create_resource_files(), pci_remove_resource_files(), pci_create_attr(), and pci_create_one_attr() which are no longer needed. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-18-kwilczynski@kernel.org
11 daysalpha/PCI: Add static PCI resource attribute macrosKrzysztof Wilczyński
Add macros for declaring static binary attributes for Alpha's PCI resource files: - pci_dev_resource_attr(), for dense/BWX systems (mmap dense) - pci_dev_resource_sparse_attr(), for sparse systems (mmap sparse) - pci_dev_resource_dense_attr(), for dense companion files (mmap dense) Each macro creates a const bin_attribute with the BAR index stored in the .private property and the appropriate .mmap() callback. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-17-kwilczynski@kernel.org
11 daysalpha/PCI: Remove WARN from __pci_mmap_fits() and __legacy_mmap_fits()Krzysztof Wilczyński
Remove the WARN() that fires when userspace attempts to mmap beyond the BAR bounds. The check still returns 0 to reject the mapping, but the warning is excessive for normal operation. A similar warning was removed from the PCI core in the commit 3b519e4ea618 ("PCI: fix size checks for mmap() on /proc/bus/pci files"). Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: squash https://lore.kernel.org/all/20260508045824.GA3160093@rocinante] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-16-kwilczynski@kernel.org
11 daysalpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARsKrzysztof Wilczyński
Currently, __pci_mmap_fits() computes the BAR size using "pci_resource_len() - 1", which wraps to a large value when the BAR length is zero, causing the bounds check to incorrectly succeed. Add an early return for empty resources. Fixes: 10a0ef39fbd1 ("PCI/alpha: pci sysfs resources") Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-15-kwilczynski@kernel.org
11 daysalpha/PCI: Use PCI resource accessor macrosKrzysztof Wilczyński
Replace direct pdev->resource[] accesses with pci_resource_n(), and open-coded res->flags type checks with pci_resource_is_mem() and pci_resource_start() helpers. While at it, move the pci_resource_n() call directly into pcibios_resource_to_bus() and drop the local struct resource pointer. No functional changes intended. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-14-kwilczynski@kernel.org
11 daysalpha/PCI: Use BAR index in sysfs attr->private instead of resource pointerKrzysztof Wilczyński
Currently, Alpha's pci_create_one_attr() stores a resource pointer in attr->private, and pci_mmap_resource() loops through all BARs to find the matching index. Store the BAR index directly in attr->private and retrieve the resource via pci_resource_n(). This eliminates the loop and aligns with the convention used by the generic PCI sysfs code. The PCI core change was first added in the commit dca40b186b75 ("PCI: Use BAR index in sysfs attr->private instead of resource pointer"). Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-13-kwilczynski@kernel.org
11 daysalpha/PCI: Add security_locked_down() check to pci_mmap_resource()Krzysztof Wilczyński
Currently, Alpha's pci_mmap_resource() does not check security_locked_down(LOCKDOWN_PCI_ACCESS) before allowing userspace to mmap PCI BARs. The generic version has had this check since commit eb627e17727e ("PCI: Lock down BAR access when the kernel is locked down") to prevent DMA attacks when the kernel is locked down. Add the same check to Alpha's pci_mmap_resource(). Fixes: eb627e17727e ("PCI: Lock down BAR access when the kernel is locked down") Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> Link: https://patch.msgid.link/20260508043543.217179-12-kwilczynski@kernel.org
11 daysPCI/sysfs: Limit pci_sysfs_init() late_initcall compile scopeKrzysztof Wilczyński
Currently, pci_sysfs_init() and sysfs_initialized compile unconditionally, even on platforms where static attribute groups handle all resource file creation. Place them behind a new HAVE_PCI_SYSFS_INIT macro, especially as the late_initcall is only needed when: - HAVE_PCI_LEGACY is set, to iterate buses and create legacy I/O and memory files. - Neither HAVE_PCI_MMAP nor ARCH_GENERIC_PCI_MMAP_RESOURCE is set, to iterate devices and create resource files via the __weak pci_create_resource_files() stub override (this is how the Alpha architecture handles this currently). On most systems both conditions are false and the entire late_initcall compiles away. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-11-kwilczynski@kernel.org
11 daysPCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()Krzysztof Wilczyński
On platforms with HAVE_PCI_MMAP or ARCH_GENERIC_PCI_MMAP_RESOURCE, resource files are now handled by static attribute groups registered via pci_dev_groups[]. Stub out the pci_create_sysfs_dev_files() and pci_remove_sysfs_dev_files(), as the dynamic resource file creation is no longer needed. Also, simplify pci_sysfs_init() on these platforms to only iterate buses for legacy attributes creation, skipping the per-device loop. Move the __weak stubs for pci_create_resource_files() and pci_remove_resource_files() into the #else branch since only platforms without HAVE_PCI_MMAP (such as Alpha architecture) still need them. Guard the res_attr[] and res_attr_wc[] fields in struct pci_dev the same way. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-10-kwilczynski@kernel.org
11 daysPCI/sysfs: Warn about BAR resize failure in __resource_resize_store()Krzysztof Wilczyński
Add a pci_warn() to __resource_resize_store(), so that BAR resize failures are visible to the user, which can help troubleshoot any potential resource resize issues. While at it, rename the resource_resize_is_visible() to resource_resize_attr_is_visible() along with the corresponding group variable to align with the naming convention used by the resource attribute groups. Also, change the order of pci_dev_groups[] such that the resize group is now located alongside the other resource groups. Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-9-kwilczynski@kernel.org
11 daysPCI/sysfs: Convert PCI resource files to static attributesKrzysztof Wilczyński
Currently, the PCI resource files (resourceN, resourceN_wc) are dynamically created by pci_create_sysfs_dev_files(), called from both pci_bus_add_device() and the pci_sysfs_init() late_initcall, with only a sysfs_initialized flag for synchronisation. This has caused warnings and boot panics when both paths race on the same device, e.g.: sysfs: cannot create duplicate filename '/devices/pci0000:3c/0000:3c:01.0/0000:3e:00.2/resource2' This is especially likely on Devicetree-based platforms, where the PCI host controllers are platform drivers that probe via the driver model, which can happen during or after the late_initcall. As such, pci_bus_add_device() and pci_sysfs_init() are more likely to overlap. Convert to static const attributes with three attribute groups (I/O, UC, WC), each with an .is_bin_visible() callback that checks resource flags, BAR length, and non_mappable_bars. A .bin_size() callback provides pci_resource_len() to the kernfs node for correct stat and lseek behaviour. As part of this conversion: - Rename pci_read_resource_io() and pci_write_resource_io() to pci_read_resource() and pci_write_resource() since the callbacks are no longer I/O-specific in the static attribute context. - Update __resource_resize_store() to use sysfs_create_groups() and sysfs_remove_groups(), which re-evaluates visibility and runs the .bin_size() callback for the static resource attribute groups. - Remove pci_create_resource_files(), pci_remove_resource_files(), and pci_create_attr() which are no longer needed. - Move the __weak stubs outside the #if guard so they remain available for callers converted in subsequent commits. Platforms that do not define the HAVE_PCI_MMAP macro or the ARCH_GENERIC_PCI_MMAP_RESOURCE macro, such as Alpha architecture, continue using their platform-specific resource file creation. For reference, the dynamic creation dates back to the pre-Git era: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/drivers/pci/pci-sysfs.c?id=42298be0eeb5ae98453b3374c36161b05a46c5dc The write-combine support was added in commit 45aec1ae72fc ("x86: PAT export resource_wc in pci sysfs"). Many other reports mentioned in the cover letter (first Link: below). Link: https://lore.kernel.org/r/20260508043543.217179-1-kwilczynski@kernel.org/ Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215515 Closes: https://github.com/openwrt/openwrt/issues/17143 Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://patch.msgid.link/20260508043543.217179-8-kwilczynski@kernel.org
11 daysPCI/proc: Fix race between pci_proc_init() and pci_bus_add_device()Krzysztof Wilczyński
pci_proc_attach_device() creates procfs entries for PCI devices and is called from pci_bus_add_device(). It lazily creates the per-bus procfs directory (bus->procdir) via proc_mkdir() on first use, and returns early if proc_initialized is not yet set. On x86 with ACPI, PCI enumeration occurs at subsys_initcall, before pci_proc_init() sets proc_initialized at device_initcall. The for_each_pci_dev() loop in pci_proc_init() then creates procfs entries for these already-enumerated devices, but runs without holding pci_rescan_remove_lock. On ARM64 with devicetree, PCI host bridges probe at device_initcall. With async probing enabled, pci_bus_add_device() can run concurrently with pci_proc_init(), and both may call pci_proc_attach_device() for the same device or for different devices on the same bus. As pci_host_probe() holds pci_rescan_remove_lock while pci_proc_init() does not, there is no serialisation between the two paths. When two threads concurrently call pci_proc_attach_device() for devices on the same bus, both observe bus->procdir as NULL and both call proc_mkdir(). The proc filesystem serialises directory creation internally, so only one caller succeeds. The other results in a warning like: proc_dir_entry '000c:00/00.0' already registered The caller receives NULL (duplicate entry) and unconditionally stores it to bus->procdir, corrupting the valid pointer set by the first caller. Serialise access to proc_initialized, proc_bus_pci_dir, bus->procdir and dev->procent with a new mutex local to drivers/pci/proc.c, and store the created entries to bus->procdir and dev->procent only on success, so a failed creation can never overwrite a valid pointer. Additionally, wrap the for_each_pci_dev() loop in pci_proc_init() with pci_lock_rescan_remove() to serialise against concurrent PCI bus operations, add an early return in pci_proc_attach_device() when dev->procent is already set to make the function idempotent, and clear bus->procdir in pci_proc_detach_bus() to prevent use of a dangling pointer after proc_remove(). Reported-by: Shuan He <heshuan@bytedance.com> Closes: https://lore.kernel.org/linux-pci/20250702155112.40124-2-heshuan@bytedance.com/ Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20260611150543.511422-1-kwilczynski@kernel.org
11 daysPCI: rzg3s-host: Use common pci_host_common_link_train_delay() helperHans Zhang
Replace the unconditional msleep(100) with the common helper pci_host_common_link_train_delay(). The helper only waits when max_link_speed > 2, as required by PCIe r6.0 sec 6.6.1. This avoids unnecessary delay for Gen1/Gen2 links while retaining the mandatory 100 ms for higher speeds. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260518004246.1384532-8-18255117159@163.com
11 daysPCI: mediatek-gen3: Add 100 ms delay after link upHans Zhang
The MediaTek Gen3 PCIe host driver lacks the required 100 ms delay after link training completes for speeds > 5.0 GT/s, as specified in PCIe r6.0 sec 6.6.1. The driver already stores max_link_speed (from the device tree). After mtk_pcie_startup_port() successfully brings up the link, call pci_host_common_link_train_delay() to comply with the specification. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260518004246.1384532-7-18255117159@163.com
11 daysPCI: aardvark: Add 100 ms delay after link trainingHans Zhang
The Aardvark PCIe controller driver waits for the link to come up but does not implement the mandatory 100 ms delay after link training completes for speeds greater than 5.0 GT/s (PCIe r6.0 sec 6.6.1). The driver already maintains a 'link_gen' field that holds the negotiated link speed. Use it together with pci_host_common_link_train_delay() to insert the required delay immediately after confirming that the link is up. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260518004246.1384532-6-18255117159@163.com
11 daysPCI: dwc: Use common pci_host_common_link_train_delay() helperHans Zhang
The DWC driver already implements the 100 ms delay required by PCIe r6.0 sec 6.6.1 by checking pci->max_link_speed and calling msleep(100). Replace the open-coded msleep() with the new common helper pci_host_common_link_train_delay() to reduce code duplication and improve maintainability. No functional change intended. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260518004246.1384532-5-18255117159@163.com
11 daysPCI: cadence-hpa: Add post-link delayHans Zhang
The Cadence HPA (High Performance Architecture IP) specific link setup function cdns_pcie_hpa_host_link_setup() waits for the link to come up but does not implement the required 100 ms delay after link training completes for speeds > 5.0 GT/s (PCIe r6.0 sec 6.6.1). Add a call to pci_host_common_link_train_delay() immediately after the link is confirmed to be up, using the max_link_speed field. Also, in the HPA host setup function, read the device tree property "max-link-speed" to initialize max_link_speed if not already set by a glue driver. This ensures compliance for HPA-based platforms. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: driver tag "cadence: HPA:" -> "cadence-hpa:"] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260518004246.1384532-4-18255117159@163.com
11 daysMerge tag 'mm-stable-2026-06-23-08-55' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - "khugepaged: add mTHP collapse support" (Nico Pache) Provide khugepaged with the capability to collapse anonymous memory regions to mTHPs - "Remove CONFIG_READ_ONLY_THP_FOR_FS and enable file THP for writable files" (Zi Yan) Remove the READ_ONLY_THP_FOR_FS check in file_thp_enabled(), so that khugepaged and MADV_COLLAPSE can run on filesystems with PMD THP pagecache support even without READ_ONLY_THP_FOR_FS enabled - "make MM selftests more CI friendly" (Mike Rapoport) General fixes and cleanups to the MM selftests. Also move more MM selftests under the kselftest framework, making them more amenable to ongoing CI testing - "selftests/mm: fix failures and robustness improvements" and "selftests/mm: assorted fixes for hmm-tests" (Sayali Patil) Fix several issues in MM selftests which were revealed by powerpc 64k pagesize * tag 'mm-stable-2026-06-23-08-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (118 commits) Revert "mm: limit filemap_fault readahead to VMA boundaries" mm/vmscan: pass NULL to trace vmscan node reclaim mm: use mapping_mapped to simplify the code selftests/mm: fix exclusive_cow test fork() handling selftests/mm: remove hardcoded THP sizing assumptions in hmm tests selftests/mm: allow PUD-level entries in compound testcase of hmm tests mm/gup_test: reject wrapped user ranges mm/page_frag: reject invalid CPUs in page_frag_test mm/damon/core: always put unsuccessfully committed target pids mm: page_isolation: avoid unsafe folio reads while scanning compound pages mm/shrinker: do not hold RCU lock in shrinker_debugfs_count_show() selftests: mm: fix and speedup "droppable" test mm: merge writeout into pageout MAINTAINERS: add Hao Ge as reviewer for codetag and alloc_tag selftests/mm: clarify alternate unmapping in compaction_test selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported selftests/mm: ensure destination is hugetlb-backed in hugetlb-mremap selftest/mm: register existing mapping with userfaultfd in hugetlb-mremap ...
11 daysMerge tag 'perf-tools-for-v7.2-1-2026-06-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: - Introduce 'perf inject --aslr' to remap ASLR-randomized addresses in perf.data files, enabling reproducible analysis across runs with different address space layouts - Refactor evsel out of sample processing paths: store evsel in struct perf_sample and remove the redundant evsel parameter from tool APIs, tracepoint handlers, hist entry iterators, and db-export, simplifying the entire tool callback chain - Switch architecture detection from string-based perf_env__arch() comparisons to the numeric ELF e_machine field across the codebase (capstone, print_insn, c2c, lock-contention, sort, sample-raw, machine, header), making cross-analysis more robust - Overhaul ARM CoreSight ETM tests: add deterministic and named_threads workloads, speed up basic and disassembly tests, add process attribution and concurrent threads tests, remove unused workloads and duplicate tests, queue context packets for the frontend decoder - Add ARM SPE IMPDEF event decoding for Arm Neoverse N1, store MIDR in arm_spe_pkt for per-CPU event mapping, handle missing CPU IDs gracefully - Refactor libunwind support: remove the libunwind-local backend, make register reading cross-platform, add RISC-V libunwind support, allow dynamic selection between libdw and libunwind unwinding at runtime - Extensive hardening of perf.data parsing against crafted files: add bounds checks and byte-swap validation for session records, feature sections, header attributes, BPF metadata, auxtrace errors, compressed events, CPU maps, build ID notes, and ELF program headers. Add minimum event size validation and file offset diagnostics - Fix libdw API contract violations across dwarf-aux, libdw, probe-finder, annotate-data, and debuginfo subsystems. Fix callchain parent update in ORDER_CALLER mode, support DWARF line 0 in inline lists, handle multiple address spaces in callchains - Fix numerous 'perf sched' bugs: thread reference leaks, memory leaks, heap overflows with cross-machine recordings, NULL dereferences, replace BUG_ON assertions with graceful error handling, bounds-check CPU indices, fix SIGCHLD vs pause() races in sched stats - Overhaul the build system: move BPF skeleton generation out of Makefile.perf into bpf_skel.mak, decouple pmu-events from the prepare target, make beauty generated C code standalone .o files, compile BPF skeletons with -mcpu=v3, fix continuous rebuilds, various cleanups - Add 'perf test' JUnit XML reporting with -j/--junit option, split monolithic test suites into sub-tests, add summary reporting, refactor parallel poll loop, fix test failures on musl-based systems - Fix 'perf c2c' memory leaks in hist entry and format list handling, use-after-free in error paths, bounds-check CPU and node IDs - Fix 'perf bpf' metadata leaks on duplicate insert and alloc failure, bounds-check array offsets, validate event sizes and func_info fields, add NULL checks - Fix hwmon PMU: off-by-one null termination on sysfs reads, strlcpy buffer overflow in parse_hwmon_filename(), fd 0 check, empty label reads, scnprintf usage - Fix symbols subsystem: bounds-check ELF and sysfs build ID note iteration, validate p_filesz, fix 32-bit ELF bswap error, fix signed overflow in size checks, bounds-check .gnu_debuglink section - Fix tools lib api: null termination in filename__read_int/ull(), uninitialized stack data in filename__write_int(), snprintf truncation in mount_overload() - Replace libbabeltrace with babeltrace2-ctf-writer for CTF conversion in 'perf data' - Add RISC-V SDT argument parsing for static tracepoints - Add 'perf trace --show-cpu' option to display CPU id - Add 'perf bench sched pipe --write-size' option - Add a perf-specific .clang-format that overrides some kernel style behaviors - Update Intel vendor events for Alder Lake, Arrow Lake, Clearwater Forest, Emerald Rapids, Granite Rapids, Grand Ridge, Lunar Lake, Meteor Lake, Panther Lake, Sapphire Rapids, Sierra Forest - Add IOMMU metrics for AMD and Intel - Fix AMD event: switch l2_itlb_misses to bp_l1_tlb_miss_l2_tlb_miss.all - Add AMD IBS improvements: decode Streaming-store and Remote-Socket flags, suppress bogus fields on Zen4+, skip privilege test on Zen6+ - Fix 'perf lock contention' SIGCHLD vs pause() race, allow 'mmap_lock' in -L filter, enable end-timestamp for cgroup aggregation, fix non-atomic data updates - Fix 'perf stat' false NMI watchdog warning in aggregation modes, bounds-check CPU index in topology callbacks, add aggr_nr metric parser support for uncore scaling - Fix 'perf timechart' memory leaks, CPU bounds checking, use-after-free on corrupted callchains - Fix 'perf inject' itrace branch stack synthesis, fix synthesized sample size with branch stacks - Fix DSO heap overflow on decompressed paths, uninitialized pathname on fallback, set proper error codes - Fix various snprintf/scnprintf usages to prevent buffer overflows and truncation across the codebase - Fix off-by-one stack buffer overflow in kallsyms__parse() - Fix 'perf kwork' memory management, address sanitizer issues, bounds check work->cpu - Fix 'perf tpebs' concurrent stop races and PID reuse hazards - Add O_CLOEXEC to open() calls and use mkostemp() for temporary files to prevent file descriptor leaks to child processes - Fix s390 Python extension TEXTREL by compiling as PIC - Fix build with ASAN for jitdump - Fix build failure due to btf_vlen() return type change * tag 'perf-tools-for-v7.2-1-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (343 commits) perf bpf: Fix up build failure due to change of btf_vlen() return type perf dso: Set standard errno on decompression failure perf bpf: Validate array presence before casting BPF prog info pointers perf c2c: Fix hist entry and format list leaks in c2c_he_free() perf c2c: Free format list entries when c2c_hists__init() fails perf cs-etm: Bounds-check CPU in cs_etm__get_queue() perf cs-etm: Require full global header in auxtrace_info size check perf cs-etm: Validate num_cpu before metadata allocation perf machine: Use snprintf() for guestmount path construction perf machine: Propagate machine__init() error to callers perf trace: Guard __probe_ip suppression with evsel__is_probe() perf evsel: Add lazy-initialized probe type detection helpers perf evsel: Add no-libtraceevent stubs for evsel__field() and evsel__common_field() perf cs-etm: Reject CPU IDs that would overflow signed comparison perf c2c: Free format list entries when releasing c2c hist entries perf bpf: Bounds-check array offsets in bpil_offs_to_addr() perf bpf: Reject oversized BPF metadata events that truncate header.size perf bpf: Validate func_info_rec_size and sub_id in synthesize_bpf_prog_name() perf sched: Replace (void*)1 sentinel with proper runtime allocation perf hwmon: Fix fd check to accept fd 0 in hwmon_pmu__describe_items() ...
11 daysdt-bindings: PCI: qcom,pcie-sm8550: Add Eliza compatibleKrishna Chaitanya Chundru
PCIe controller present in Eliza SoC is backwards compatible with the controller present in SM8550 SoC. Hence, add the compatible with SM8550 fallback. Eliza requires 6 reg entries, 8 clocks and 9 interrupts, so add the corresponding allOf constraints. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20260608-eliza-v3-2-9bdeb7434b28@oss.qualcomm.com
11 daysdt-bindings: PCI: renesas,r9a08g045-pcie: Add RZ/V2N supportLad Prabhakar
Document the Renesas RZ/V2N PCIe host controller, which is compatible with the RZ/G3E PCIe IP and therefore uses it as a fallback compatible. The only difference is that it uses device ID 0x003B. Make the binding title generic to avoid extending the title for each new SoC, and update the description to list the supported SoCs and their capabilities. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260501102407.29462-1-prabhakar.mahadev-lad.rj@bp.renesas.com