summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
45 hoursidpf: Fix RSS LUT configuration on down interfacesSreedevi Joshi
[ Upstream commit 445b49d13787da2fe8d51891ee196e5077feef44 ] RSS LUT provisioning and queries on a down interface currently return silently without effect. Users should be able to configure RSS settings even when the interface is down. Fix by maintaining RSS configuration changes in the driver's soft copy and deferring HW programming until the interface comes up. Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: Fix RSS LUT NULL pointer crash on early ethtool operationsSreedevi Joshi
[ Upstream commit 83f38f210b85676f40ba8586b5a8edae19b56995 ] The RSS LUT is not initialized until the interface comes up, causing the following NULL pointer crash when ethtool operations like rxhash on/off are performed before the interface is brought up for the first time. Move RSS LUT initialization from ndo_open to vport creation to ensure LUT is always available. This enables RSS configuration via ethtool before bringing the interface up. Simplify LUT management by maintaining all changes in the driver's soft copy and programming zeros to the indirection table when rxhash is disabled. Defer HW programming until the interface comes up if it is down during rxhash and LUT configuration changes. Steps to reproduce: ** Load idpf driver; interfaces will be created modprobe idpf ** Before bringing the interfaces up, turn rxhash off ethtool -K eth2 rxhash off [89408.371875] BUG: kernel NULL pointer dereference, address: 0000000000000000 [89408.371908] #PF: supervisor read access in kernel mode [89408.371924] #PF: error_code(0x0000) - not-present page [89408.371940] PGD 0 P4D 0 [89408.371953] Oops: Oops: 0000 [#1] SMP NOPTI <snip> [89408.372052] RIP: 0010:memcpy_orig+0x16/0x130 [89408.372310] Call Trace: [89408.372317] <TASK> [89408.372326] ? idpf_set_features+0xfc/0x180 [idpf] [89408.372363] __netdev_update_features+0x295/0xde0 [89408.372384] ethnl_set_features+0x15e/0x460 [89408.372406] genl_family_rcv_msg_doit+0x11f/0x180 [89408.372429] genl_rcv_msg+0x1ad/0x2b0 [89408.372446] ? __pfx_ethnl_set_features+0x10/0x10 [89408.372465] ? __pfx_genl_rcv_msg+0x10/0x10 [89408.372482] netlink_rcv_skb+0x58/0x100 [89408.372502] genl_rcv+0x2c/0x50 [89408.372516] netlink_unicast+0x289/0x3e0 [89408.372533] netlink_sendmsg+0x215/0x440 [89408.372551] __sys_sendto+0x234/0x240 [89408.372571] __x64_sys_sendto+0x28/0x30 [89408.372585] x64_sys_call+0x1909/0x1da0 [89408.372604] do_syscall_64+0x7a/0xfa0 [89408.373140] ? clear_bhb_loop+0x60/0xb0 [89408.373647] entry_SYSCALL_64_after_hwframe+0x76/0x7e [89408.378887] </TASK> <snip> Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: fix issue with ethtool -n command displayErik Gabriel Carrillo
[ Upstream commit 36aae2ea6bd76b8246caa50e34a4f4824f0a3be8 ] When ethtool -n is executed on an interface to display the flow steering rules, "rxclass: Unknown flow type" error is generated. The flow steering list maintained in the driver currently stores only the location and q_index but other fields of the ethtool_rx_flow_spec are not stored. This may be enough for the virtchnl command to delete the entry. However, when the ethtool -n command is used to query the flow steering rules, the ethtool_rx_flow_spec returned is not complete causing the error below. Resolve this by storing the flow spec (fsp) when rules are added and returning the complete flow spec when rules are queried. Also, change the return value from EINVAL to ENOENT when flow steering entry is not found during query by location or when deleting an entry. Add logic to detect and reject duplicate filter entries at the same location and change logic to perform upfront validation of all error conditions before adding flow rules through virtchnl. This avoids the need for additional virtchnl delete messages when subsequent operations fail, which was missing in the original upstream code. Example: Before the fix: ethtool -n eth1 2 RX rings available Total 2 rules rxclass: Unknown flow type rxclass: Unknown flow type After the fix: ethtool -n eth1 2 RX rings available Total 2 rules Filter: 0 Rule Type: TCP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Filter: 1 Rule Type: UDP over IPv4 Src IP addr: 10.0.0.1 mask: 0.0.0.0 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff Action: Direct to queue 0 Fixes: ada3e24b84a0 ("idpf: add flow steering support") Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Co-developed-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: fix memory leak of flow steer list on rmmodSreedevi Joshi
[ Upstream commit f9841bd28b600526ca4f6713b0ca49bf7bb98452 ] The flow steering list maintains entries that are added and removed as ethtool creates and deletes flow steering rules. Module removal with active entries causes memory leak as the list is not properly cleaned up. Prevent this by iterating through the remaining entries in the list and freeing the associated memory during module removal. Add a spinlock (flow_steer_list_lock) to protect the list access from multiple threads. Fixes: ada3e24b84a0 ("idpf: add flow steering support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: fix error handling in the init_task on loadEmil Tantilov
[ Upstream commit 4d792219fe6f891b5b557a607ac8a0a14eda6e38 ] If the init_task fails during a driver load, we end up without vports and netdevs, effectively failing the entire process. In that state a subsequent reset will result in a crash as the service task attempts to access uninitialized resources. Following trace is from an error in the init_task where the CREATE_VPORT (op 501) is rejected by the FW: [40922.763136] idpf 0000:83:00.0: Device HW Reset initiated [40924.449797] idpf 0000:83:00.0: Transaction failed (op 501) [40958.148190] idpf 0000:83:00.0: HW reset detected [40958.161202] BUG: kernel NULL pointer dereference, address: 00000000000000a8 ... [40958.168094] Workqueue: idpf-0000:83:00.0-vc_event idpf_vc_event_task [idpf] [40958.168865] RIP: 0010:idpf_vc_event_task+0x9b/0x350 [idpf] ... [40958.177932] Call Trace: [40958.178491] <TASK> [40958.179040] process_one_work+0x226/0x6d0 [40958.179609] worker_thread+0x19e/0x340 [40958.180158] ? __pfx_worker_thread+0x10/0x10 [40958.180702] kthread+0x10f/0x250 [40958.181238] ? __pfx_kthread+0x10/0x10 [40958.181774] ret_from_fork+0x251/0x2b0 [40958.182307] ? __pfx_kthread+0x10/0x10 [40958.182834] ret_from_fork_asm+0x1a/0x30 [40958.183370] </TASK> Fix the error handling in the init_task to make sure the service and mailbox tasks are disabled if the error happens during load. These are started in idpf_vc_core_init(), which spawns the init_task and has no way of knowing if it failed. If the error happens on reset, following successful driver load, the tasks can still run, as that will allow the netdevs to attempt recovery through another reset. Stop the PTP callbacks either way as those will be restarted by the call to idpf_vc_core_init() during a successful reset. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Reported-by: Vivek Kumar <iamvivekkumar@google.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: fix memory leak in idpf_vc_core_deinit()Emil Tantilov
[ Upstream commit e111cbc4adf9f9974eed040aeece7e17460f6bff ] Make sure to free hw->lan_regs. Reported by kmemleak during reset: unreferenced object 0xff1b913d02a936c0 (size 96): comm "kworker/u258:14", pid 2174, jiffies 4294958305 hex dump (first 32 bytes): 00 00 00 c0 a8 ba 2d ff 00 00 00 00 00 00 00 00 ......-......... 00 00 40 08 00 00 00 00 00 00 25 b3 a8 ba 2d ff ..@.......%...-. backtrace (crc 36063c4f): __kmalloc_noprof+0x48f/0x890 idpf_vc_core_init+0x6ce/0x9b0 [idpf] idpf_vc_event_task+0x1fb/0x350 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: fix memory leak in idpf_vport_rel()Emil Tantilov
[ Upstream commit f6242b354605faff263ca45882b148200915a3f6 ] Free vport->rx_ptype_lkup in idpf_vport_rel() to avoid leaking memory during a reset. Reported by kmemleak: unreferenced object 0xff450acac838a000 (size 4096): comm "kworker/u258:5", pid 7732, jiffies 4296830044 hex dump (first 32 bytes): 00 00 00 00 00 10 00 00 00 10 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 ................ backtrace (crc 3da81902): __kmalloc_cache_noprof+0x469/0x7a0 idpf_send_get_rx_ptype_msg+0x90/0x570 [idpf] idpf_init_task+0x1ec/0x8d0 [idpf] process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x251/0x2b0 ret_from_fork_asm+0x1a/0x30 Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: detach and close netdevs while handling a resetEmil Tantilov
[ Upstream commit 2e281e1155fc476c571c0bd2ffbfe28ab829a5c3 ] Protect the reset path from callbacks by setting the netdevs to detached state and close any netdevs in UP state until the reset handling has completed. During a reset, the driver will de-allocate resources for the vport, and there is no guarantee that those will recover, which is why the existing vport_ctrl_lock does not provide sufficient protection. idpf_detach_and_close() is called right before reset handling. If the reset handling succeeds, the netdevs state is recovered via call to idpf_attach_and_open(). If the reset handling fails the netdevs remain down. The detach/down calls are protected with RTNL lock to avoid racing with callbacks. On the recovery side the attach can be done without holding the RTNL lock as there are no callbacks expected at that point, due to detach/close always being done first in that flow. The previous logic restoring the netdevs state based on the IDPF_VPORT_UP_REQUESTED flag in the init task is not needed anymore, hence the removal of idpf_set_vport_state(). The IDPF_VPORT_UP_REQUESTED is still being used to restore the state of the netdevs following the reset, but has no use outside of the reset handling flow. idpf_init_hard_reset() is converted to void, since it was used as such and there is no error handling being done based on its return value. Before this change, invoking hard and soft resets simultaneously will cause the driver to lose the vport state: ip -br a <inf> UP echo 1 > /sys/class/net/ens801f0/device/reset& \ ethtool -L ens801f0 combined 8 ip -br a <inf> DOWN ip link set <inf> up ip -br a <inf> DOWN Also in case of a failure in the reset path, the netdev is left exposed to external callbacks, while vport resources are not initialized, leading to a crash on subsequent ifup/down: [408471.398966] idpf 0000:83:00.0: HW reset detected [408471.411744] idpf 0000:83:00.0: Device HW Reset initiated [408472.277901] idpf 0000:83:00.0: The driver was unable to contact the device's firmware. Check that the FW is running. Driver state= 0x2 [408508.125551] BUG: kernel NULL pointer dereference, address: 0000000000000078 [408508.126112] #PF: supervisor read access in kernel mode [408508.126687] #PF: error_code(0x0000) - not-present page [408508.127256] PGD 2aae2f067 P4D 0 [408508.127824] Oops: Oops: 0000 [#1] SMP NOPTI ... [408508.130871] RIP: 0010:idpf_stop+0x39/0x70 [idpf] ... [408508.139193] Call Trace: [408508.139637] <TASK> [408508.140077] __dev_close_many+0xbb/0x260 [408508.140533] __dev_change_flags+0x1cf/0x280 [408508.140987] netif_change_flags+0x26/0x70 [408508.141434] dev_change_flags+0x3d/0xb0 [408508.141878] devinet_ioctl+0x460/0x890 [408508.142321] inet_ioctl+0x18e/0x1d0 [408508.142762] ? _copy_to_user+0x22/0x70 [408508.143207] sock_do_ioctl+0x3d/0xe0 [408508.143652] sock_ioctl+0x10e/0x330 [408508.144091] ? find_held_lock+0x2b/0x80 [408508.144537] __x64_sys_ioctl+0x96/0xe0 [408508.144979] do_syscall_64+0x79/0x3d0 [408508.145415] entry_SYSCALL_64_after_hwframe+0x76/0x7e [408508.145860] RIP: 0033:0x7f3e0bb4caff Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: convert vport state to bitmapEmil Tantilov
[ Upstream commit 8dd72ebc73f37b216410db17340f15e6fb2cdb7b ] Convert vport state to a bitmap and remove the DOWN state which is redundant in the existing logic. There are no functional changes aside from the use of bitwise operations when setting and checking the states. Removed the double underscore to be consistent with the naming of other bitmaps in the header and renamed current_state to vport_is_up to match the meaning of the new variable. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Chittim Madhu <madhu.chittim@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 2e281e1155fc ("idpf: detach and close netdevs while handling a reset") Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursidpf: keep the netdev when a reset failsEmil Tantilov
[ Upstream commit 083029bd8b445595222a3cd14076b880781c1765 ] During a successful reset the driver would re-allocate vport resources while keeping the netdevs intact. However, in case of an error in the init task, the netdev of the failing vport will be unregistered, effectively removing the network interface: [ 121.211076] idpf 0000:83:00.0: enabling device (0100 -> 0102) [ 121.221976] idpf 0000:83:00.0: Device HW Reset initiated [ 124.161229] idpf 0000:83:00.0 ens801f0: renamed from eth0 [ 124.163364] idpf 0000:83:00.0 ens801f0d1: renamed from eth1 [ 125.934656] idpf 0000:83:00.0 ens801f0d2: renamed from eth2 [ 128.218429] idpf 0000:83:00.0 ens801f0d3: renamed from eth3 ip -br a ens801f0 UP ens801f0d1 UP ens801f0d2 UP ens801f0d3 UP echo 1 > /sys/class/net/ens801f0/device/reset [ 145.885537] idpf 0000:83:00.0: resetting [ 145.990280] idpf 0000:83:00.0: reset done [ 146.284766] idpf 0000:83:00.0: HW reset detected [ 146.296610] idpf 0000:83:00.0: Device HW Reset initiated [ 211.556719] idpf 0000:83:00.0: Transaction timed-out (op:526 cookie:7700 vc_op:526 salt:77 timeout:60000ms) [ 272.996705] idpf 0000:83:00.0: Transaction timed-out (op:502 cookie:7800 vc_op:502 salt:78 timeout:60000ms) ip -br a ens801f0d1 DOWN ens801f0d2 DOWN ens801f0d3 DOWN Re-shuffle the logic in the error path of the init task to make sure the netdevs remain intact. This will allow the driver to attempt recovery via subsequent resets, provided the FW is still functional. The main change is to make sure that idpf_decfg_netdev() is not called should the init task fail during a reset. The error handling is consolidated under unwind_vports, as the removed labels had the same cleanup logic split depending on the point of failure. Fixes: ce1b75d0635c ("idpf: add ptypes and MAC filter support") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursPCI/VGA: Don't assume the only VGA device on a system is `boot_vga`Mario Limonciello (AMD)
[ Upstream commit fd390ff144513eb0310c350b1cf5fa8d6ddd0c53 ] Some systems ship with multiple display class devices but not all of them are VGA devices. If the "only" VGA device on the system is not used for displaying the image on the screen marking it as `boot_vga` because nothing was found is totally wrong. This behavior actually leads to mistakes of the wrong device being advertised to userspace and then userspace can make incorrect decisions. As there is an accurate `boot_display` sysfs file stop lying about `boot_vga` by assuming if nothing is found it's the right device. Reported-by: Aaron Erhardt <aer@tuxedocomputers.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220712 Tested-by: Aaron Erhardt <aer@tuxedocomputers.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: ad90860bd10ee ("fbcon: Use screen info to find primary device") Tested-by: Luke D. Jones <luke@ljones.dev> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260106044638.52906-1-superm1@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: fix memory leak in skb_segment_list for GRO packetsMohammad Heib
[ Upstream commit 238e03d0466239410b72294b79494e43d4fabe77 ] When skb_segment_list() is called during packet forwarding, it handles packets that were aggregated by the GRO engine. Historically, the segmentation logic in skb_segment_list assumes that individual segments are split from a parent SKB and may need to carry their own socket memory accounting. Accordingly, the code transfers truesize from the parent to the newly created segments. Prior to commit ed4cccef64c1 ("gro: fix ownership transfer"), this truesize subtraction in skb_segment_list() was valid because fragments still carry a reference to the original socket. However, commit ed4cccef64c1 ("gro: fix ownership transfer") changed this behavior by ensuring that fraglist entries are explicitly orphaned (skb->sk = NULL) to prevent illegal orphaning later in the stack. This change meant that the entire socket memory charge remained with the head SKB, but the corresponding accounting logic in skb_segment_list() was never updated. As a result, the current code unconditionally adds each fragment's truesize to delta_truesize and subtracts it from the parent SKB. Since the fragments are no longer charged to the socket, this subtraction results in an effective under-count of memory when the head is freed. This causes sk_wmem_alloc to remain non-zero, preventing socket destruction and leading to a persistent memory leak. The leak can be observed via KMEMLEAK when tearing down the networking environment: unreferenced object 0xffff8881e6eb9100 (size 2048): comm "ping", pid 6720, jiffies 4295492526 backtrace: kmem_cache_alloc_noprof+0x5c6/0x800 sk_prot_alloc+0x5b/0x220 sk_alloc+0x35/0xa00 inet6_create.part.0+0x303/0x10d0 __sock_create+0x248/0x640 __sys_socket+0x11b/0x1d0 Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST packets constructed by GRO, the truesize adjustment is removed. The call to skb_release_head_state() must be preserved. As documented in commit cf673ed0e057 ("net: fix fraglist segmentation reference count leak"), it is still required to correctly drop references to SKB extensions that may be overwritten during __copy_skb_header(). Fixes: ed4cccef64c1 ("gro: fix ownership transfer") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursriscv: pgtable: Cleanup useless VA_USER_XXX definitionsGuo Ren (Alibaba DAMO Academy)
[ Upstream commit 5e5be092ffadcab0093464ccd9e30f0c5cce16b9 ] These marcos are not used after commit b5b4287accd7 ("riscv: mm: Use hint address in mmap if available"). Cleanup VA_USER_XXX definitions in asm/pgtable.h. Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursriscv: cpufeature: Fix Zk bundled extension missing ZknhGuodong Xu
[ Upstream commit 8632180daf735074a746ce2b3808a8f2c079310e ] The Zk extension is a bundle consisting of Zkn, Zkr, and Zkt. The Zkn extension itself is a bundle consisting of Zbkb, Zbkc, Zbkx, Zknd, Zkne, and Zknh. The current implementation of riscv_zk_bundled_exts manually listed the dependencies but missed RISCV_ISA_EXT_ZKNH. Fix this by introducing a RISCV_ISA_EXT_ZKN macro that lists the Zkn components and using it in both riscv_zk_bundled_exts and riscv_zkn_bundled_exts. This adds the missing Zknh extension to Zk and reduces code duplication. Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto") Link: https://patch.msgid.link/20231114141256.126749-4-cleger@rivosinc.com/ Signed-off-by: Guodong Xu <guodong@riscstar.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Link: https://patch.msgid.link/20251223-zk-missing-zknh-v1-1-b627c990ee1a@riscstar.com Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: airoha: Fix npu rx DMA definitionsLorenzo Bianconi
[ Upstream commit a7fc8c641cab855824c45e5e8877e40fd528b5df ] Fix typos in npu rx DMA descriptor definitions. Fixes: b3ef7bdec66fb ("net: airoha: Add airoha_offload.h header") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260102-airoha-npu-dma-rx-def-fixes-v1-1-205fc6bf7d94@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursbtrfs: fix NULL pointer dereference in do_abort_log_replay()Suchit Karunakaran
[ Upstream commit 530e3d4af566ca44807d79359b90794dea24c4f3 ] Coverity reported a NULL pointer dereference issue (CID 1666756) in do_abort_log_replay(). When btrfs_alloc_path() fails in replay_one_buffer(), wc->subvol_path is NULL, but btrfs_abort_log_replay() calls do_abort_log_replay() which unconditionally dereferences wc->subvol_path when attempting to print debug information. Fix this by adding a NULL check before dereferencing wc->subvol_path in do_abort_log_replay(). Fixes: 2753e4917624 ("btrfs: dump detailed info and specific messages on log replay failures") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursbtrfs: only enforce free space tree if v1 cache is required for bs < ps casesQu Wenruo
[ Upstream commit 30bcf4e824aa37d305502f52e1527c7b1eabef3d ] [BUG] Since the introduction of btrfs bs < ps support, v1 cache was never on the plan due to its hard coded PAGE_SIZE usage, and the future plan to properly deprecate it. However for bs < ps cases, even if 'nospace_cache,clear_cache' mount option is specified, it's never respected and free space tree is always enabled: mkfs.btrfs -f -O ^bgt,fst $dev mount $dev $mnt -o clear_cache,nospace_cache umount $mnt btrfs ins dump-super $dev ... compat_ro_flags 0x3 ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID ) ... This means a different behavior compared to bs >= ps cases. [CAUSE] The forcing usage of v2 space cache is done inside btrfs_set_free_space_cache_settings(), however it never checks if we're even using space cache but always enabling v2 cache. [FIX] Instead unconditionally enable v2 cache, only forcing v2 cache if the old v1 cache is required. Now v2 space cache can be properly disabled on bs < ps cases: mkfs.btrfs -f -O ^bgt,fst $dev mount $dev $mnt -o clear_cache,nospace_cache umount $mnt btrfs ins dump-super $dev ... compat_ro_flags 0x0 ... Fixes: 9f73f1aef98b ("btrfs: force v2 space cache usage for subpage mount") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursbtrfs: release path before initializing extent tree in btrfs_read_locked_inode()Filipe Manana
[ Upstream commit 8731f2c50b0b1d2b58ed5b9671ef2c4bdc2f8347 ] In btrfs_read_locked_inode() we are calling btrfs_init_file_extent_tree() while holding a path with a read locked leaf from a subvolume tree, and btrfs_init_file_extent_tree() may do a GFP_KERNEL allocation, which can trigger reclaim. This can create a circular lock dependency which lockdep warns about with the following splat: [6.1433] ====================================================== [6.1574] WARNING: possible circular locking dependency detected [6.1583] 6.18.0+ #4 Tainted: G U [6.1591] ------------------------------------------------------ [6.1599] kswapd0/117 is trying to acquire lock: [6.1606] ffff8d9b6333c5b8 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1625] but task is already holding lock: [6.1633] ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60 [6.1646] which lock already depends on the new lock. [6.1657] the existing dependency chain (in reverse order) is: [6.1667] -> #2 (fs_reclaim){+.+.}-{0:0}: [6.1677] fs_reclaim_acquire+0x9d/0xd0 [6.1685] __kmalloc_cache_noprof+0x59/0x750 [6.1694] btrfs_init_file_extent_tree+0x90/0x100 [6.1702] btrfs_read_locked_inode+0xc3/0x6b0 [6.1710] btrfs_iget+0xbb/0xf0 [6.1716] btrfs_lookup_dentry+0x3c5/0x8e0 [6.1724] btrfs_lookup+0x12/0x30 [6.1731] lookup_open.isra.0+0x1aa/0x6a0 [6.1739] path_openat+0x5f7/0xc60 [6.1746] do_filp_open+0xd6/0x180 [6.1753] do_sys_openat2+0x8b/0xe0 [6.1760] __x64_sys_openat+0x54/0xa0 [6.1768] do_syscall_64+0x97/0x3e0 [6.1776] entry_SYSCALL_64_after_hwframe+0x76/0x7e [6.1784] -> #1 (btrfs-tree-00){++++}-{3:3}: [6.1794] lock_release+0x127/0x2a0 [6.1801] up_read+0x1b/0x30 [6.1808] btrfs_search_slot+0x8e0/0xff0 [6.1817] btrfs_lookup_inode+0x52/0xd0 [6.1825] __btrfs_update_delayed_inode+0x73/0x520 [6.1833] btrfs_commit_inode_delayed_inode+0x11a/0x120 [6.1842] btrfs_log_inode+0x608/0x1aa0 [6.1849] btrfs_log_inode_parent+0x249/0xf80 [6.1857] btrfs_log_dentry_safe+0x3e/0x60 [6.1865] btrfs_sync_file+0x431/0x690 [6.1872] do_fsync+0x39/0x80 [6.1879] __x64_sys_fsync+0x13/0x20 [6.1887] do_syscall_64+0x97/0x3e0 [6.1894] entry_SYSCALL_64_after_hwframe+0x76/0x7e [6.1903] -> #0 (&delayed_node->mutex){+.+.}-{3:3}: [6.1913] __lock_acquire+0x15e9/0x2820 [6.1920] lock_acquire+0xc9/0x2d0 [6.1927] __mutex_lock+0xcc/0x10a0 [6.1934] __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1944] btrfs_evict_inode+0x20b/0x4b0 [6.1952] evict+0x15a/0x2f0 [6.1958] prune_icache_sb+0x91/0xd0 [6.1966] super_cache_scan+0x150/0x1d0 [6.1974] do_shrink_slab+0x155/0x6f0 [6.1981] shrink_slab+0x48e/0x890 [6.1988] shrink_one+0x11a/0x1f0 [6.1995] shrink_node+0xbfd/0x1320 [6.1002] balance_pgdat+0x67f/0xc60 [6.1321] kswapd+0x1dc/0x3e0 [6.1643] kthread+0xff/0x240 [6.1965] ret_from_fork+0x223/0x280 [6.1287] ret_from_fork_asm+0x1a/0x30 [6.1616] other info that might help us debug this: [6.1561] Chain exists of: &delayed_node->mutex --> btrfs-tree-00 --> fs_reclaim [6.1503] Possible unsafe locking scenario: [6.1110] CPU0 CPU1 [6.1411] ---- ---- [6.1707] lock(fs_reclaim); [6.1998] lock(btrfs-tree-00); [6.1291] lock(fs_reclaim); [6.1581] lock(&delayed_node->mutex); [6.1874] *** DEADLOCK *** [6.1716] 2 locks held by kswapd0/117: [6.1999] #0: ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60 [6.1294] #1: ffff8d998344b0e0 (&type->s_umount_key#40){++++}- {3:3}, at: super_cache_scan+0x37/0x1d0 [6.1596] stack backtrace: [6.1183] CPU: 11 UID: 0 PID: 117 Comm: kswapd0 Tainted: G U 6.18.0+ #4 PREEMPT(lazy) [6.1185] Tainted: [U]=USER [6.1186] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [6.1187] Call Trace: [6.1187] <TASK> [6.1189] dump_stack_lvl+0x6e/0xa0 [6.1192] print_circular_bug.cold+0x17a/0x1c0 [6.1194] check_noncircular+0x175/0x190 [6.1197] __lock_acquire+0x15e9/0x2820 [6.1200] lock_acquire+0xc9/0x2d0 [6.1201] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1204] __mutex_lock+0xcc/0x10a0 [6.1206] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1208] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1211] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1213] __btrfs_release_delayed_node.part.0+0x39/0x2f0 [6.1215] btrfs_evict_inode+0x20b/0x4b0 [6.1217] ? lock_acquire+0xc9/0x2d0 [6.1220] evict+0x15a/0x2f0 [6.1222] prune_icache_sb+0x91/0xd0 [6.1224] super_cache_scan+0x150/0x1d0 [6.1226] do_shrink_slab+0x155/0x6f0 [6.1228] shrink_slab+0x48e/0x890 [6.1229] ? shrink_slab+0x2d2/0x890 [6.1231] shrink_one+0x11a/0x1f0 [6.1234] shrink_node+0xbfd/0x1320 [6.1236] ? shrink_node+0xa2d/0x1320 [6.1236] ? shrink_node+0xbd3/0x1320 [6.1239] ? balance_pgdat+0x67f/0xc60 [6.1239] balance_pgdat+0x67f/0xc60 [6.1241] ? finish_task_switch.isra.0+0xc4/0x2a0 [6.1246] kswapd+0x1dc/0x3e0 [6.1247] ? __pfx_autoremove_wake_function+0x10/0x10 [6.1249] ? __pfx_kswapd+0x10/0x10 [6.1250] kthread+0xff/0x240 [6.1251] ? __pfx_kthread+0x10/0x10 [6.1253] ret_from_fork+0x223/0x280 [6.1255] ? __pfx_kthread+0x10/0x10 [6.1257] ret_from_fork_asm+0x1a/0x30 [6.1260] </TASK> This is because: 1) The fsync task is holding an inode's delayed node mutex (for a directory) while calling __btrfs_update_delayed_inode() and that needs to do a search on the subvolume's btree (therefore read lock some extent buffers); 2) The lookup task, at btrfs_lookup(), triggered reclaim with the GFP_KERNEL allocation done by btrfs_init_file_extent_tree() while holding a read lock on a subvolume leaf; 3) The reclaim triggered kswapd which is doing inode eviction for the directory inode the fsync task is using as an argument to btrfs_commit_inode_delayed_inode() - but in that call chain we are trying to read lock the same leaf that the lookup task is holding while calling btrfs_init_file_extent_tree() and doing the GFP_KERNEL allocation. Fix this by calling btrfs_init_file_extent_tree() after we don't need the path anymore and release it in btrfs_read_locked_inode(). Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/linux-btrfs/6e55113a22347c3925458a5d840a18401a38b276.camel@linux.intel.com/ Fixes: 8679d2687c35 ("btrfs: initialize inode::file_extent_tree after i_mode has been set") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursvsock: Make accept()ed sockets use custom setsockopt()Michal Luczaj
[ Upstream commit ce5e612dd411de096aa041b9e9325ba1bec5f9f4 ] SO_ZEROCOPY handling in vsock_connectible_setsockopt() does not get called on accept()ed sockets due to a missing flag. Flip it. Fixes: e0718bd82e27 ("vsock: enable setting SO_ZEROCOPY") Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-1-64778d6c4f88@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursdrm/amd/pm: force send pcie parmater on navi1xYang Wang
[ Upstream commit dc8a887de1a7d397ab4131f45676e89565417aa8 ] v1: the PMFW didn't initialize the PCIe DPM parameters and requires the KMD to actively provide these parameters. v2: clean & remove unused code logic (lijo) Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b0dbd5db7cf1f81e4aaedd25cb5e72ce369387b2) Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursdrm/amd/pm: fix wrong pcie parameter on navi1xYang Wang
[ Upstream commit 4f74c2dd970611d3ec3bb0d58215e73af5cd7214 ] fix wrong pcie dpm parameter on navi1x Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Co-developed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5c5189cf4b0cc0a22bac74a40743ee711cff07f8) Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursperf: Ensure swevent hrtimer is properly destroyedPeter Zijlstra
[ Upstream commit ff5860f5088e9076ebcccf05a6ca709d5935cfa9 ] With the change to hrtimer_try_to_cancel() in perf_swevent_cancel_hrtimer() it appears possible for the hrtimer to still be active by the time the event gets freed. Make sure the event does a full hrtimer_cancel() on the free path by installing a perf_event::destroy handler. Fixes: eb3182ef0405 ("perf/core: Fix system hang caused by cpu-clock usage") Reported-by: CyberUnicorns <a101e_iotvul@163.com> Tested-by: CyberUnicorns <a101e_iotvul@163.com> Debugged-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursinet: frags: drop fraglist conntrack referencesFlorian Westphal
[ Upstream commit 2ef02ac38d3c17f34a00c4b267d961a8d4b45d1a ] Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging leaked skbs/conntrack references more obvious. syzbot reports this as triggering, and I can also reproduce this via ip_defrag.sh selftest: conntrack cleanup blocked for 60s WARNING: net/netfilter/nf_conntrack_core.c:2512 [..] conntrack clenups gets stuck because there are skbs with still hold nf_conn references via their frag_list. net.core.skb_defer_max=0 makes the hang disappear. Eric Dumazet points out that skb_release_head_state() doesn't follow the fraglist. ip_defrag.sh can only reproduce this problem since commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this problem could happen with TCP as well if pmtu discovery is off. The relevant problem path for udp is: 1. netns emits fragmented packets 2. nf_defrag_v6_hook reassembles them (in output hook) 3. reassembled skb is tracked (skb owns nf_conn reference) 4. ip6_output refragments 5. refragmented packets also own nf_conn reference (ip6_fragment calls ip6_copy_metadata()) 6. on input path, nf_defrag_v6_hook skips defragmentation: the fragments already have skb->nf_conn attached 7. skbs are reassembled via ipv6_frag_rcv() 8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up in pcpu freelist, but still has nf_conn reference. Possible solutions: 1 let defrag engine drop nf_conn entry, OR 2 export kick_defer_list_purge() and call it from the conntrack netns exit callback, OR 3 add skb_has_frag_list() check to skb_attempt_defer_free() 2 & 3 also solve ip_defrag.sh hang but share same drawback: Such reassembled skbs, queued to socket, can prevent conntrack module removal until userspace has consumed the packet. While both tcp and udp stack do call nf_reset_ct() before placing skb on socket queue, that function doesn't iterate frag_list skbs. Therefore drop nf_conn entries when they are placed in defrag queue. Keep the nf_conn entry of the first (offset 0) skb so that reassembled skb retains nf_conn entry for sake of TX path. Note that fixes tag is incorrect; it points to the commit introducing the 'ip_defrag.sh reproducible problem': no need to backport this patch to every stable kernel. Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/ Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260102140030.32367-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursvirtio_net: fix device mismatch in devm_kzalloc/devm_kfreeKommula Shiva Shankar
[ Upstream commit acb4bc6e1ba34ae1a34a9334a1ce8474c909466e ] Initial rss_hdr allocation uses virtio_device->device, but virtnet_set_queues() frees using net_device->device. This device mismatch causing below devres warning [ 3788.514041] ------------[ cut here ]------------ [ 3788.514044] WARNING: drivers/base/devres.c:1095 at devm_kfree+0x84/0x98, CPU#16: vdpa/1463 [ 3788.514054] Modules linked in: octep_vdpa virtio_net virtio_vdpa [last unloaded: virtio_vdpa] [ 3788.514064] CPU: 16 UID: 0 PID: 1463 Comm: vdpa Tainted: G W 6.18.0 #10 PREEMPT [ 3788.514067] Tainted: [W]=WARN [ 3788.514069] Hardware name: Marvell CN106XX board (DT) [ 3788.514071] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) [ 3788.514074] pc : devm_kfree+0x84/0x98 [ 3788.514076] lr : devm_kfree+0x54/0x98 [ 3788.514079] sp : ffff800084e2f220 [ 3788.514080] x29: ffff800084e2f220 x28: ffff0003b2366000 x27: 000000000000003f [ 3788.514085] x26: 000000000000003f x25: ffff000106f17c10 x24: 0000000000000080 [ 3788.514089] x23: ffff00045bb8ab08 x22: ffff00045bb8a000 x21: 0000000000000018 [ 3788.514093] x20: ffff0004355c3080 x19: ffff00045bb8aa00 x18: 0000000000080000 [ 3788.514098] x17: 0000000000000040 x16: 000000000000001f x15: 000000000007ffff [ 3788.514102] x14: 0000000000000488 x13: 0000000000000005 x12: 00000000000fffff [ 3788.514106] x11: ffffffffffffffff x10: 0000000000000005 x9 : ffff800080c8c05c [ 3788.514110] x8 : ffff800084e2eeb8 x7 : 0000000000000000 x6 : 000000000000003f [ 3788.514115] x5 : ffff8000831bafe0 x4 : ffff800080c8b010 x3 : ffff0004355c3080 [ 3788.514119] x2 : ffff0004355c3080 x1 : 0000000000000000 x0 : 0000000000000000 [ 3788.514123] Call trace: [ 3788.514125] devm_kfree+0x84/0x98 (P) [ 3788.514129] virtnet_set_queues+0x134/0x2e8 [virtio_net] [ 3788.514135] virtnet_probe+0x9c0/0xe00 [virtio_net] [ 3788.514139] virtio_dev_probe+0x1e0/0x338 [ 3788.514144] really_probe+0xc8/0x3a0 [ 3788.514149] __driver_probe_device+0x84/0x170 [ 3788.514152] driver_probe_device+0x44/0x120 [ 3788.514155] __device_attach_driver+0xc4/0x168 [ 3788.514158] bus_for_each_drv+0x8c/0xf0 [ 3788.514161] __device_attach+0xa4/0x1c0 [ 3788.514164] device_initial_probe+0x1c/0x30 [ 3788.514168] bus_probe_device+0xb4/0xc0 [ 3788.514170] device_add+0x614/0x828 [ 3788.514173] register_virtio_device+0x214/0x258 [ 3788.514175] virtio_vdpa_probe+0xa0/0x110 [virtio_vdpa] [ 3788.514179] vdpa_dev_probe+0xa8/0xd8 [ 3788.514183] really_probe+0xc8/0x3a0 [ 3788.514186] __driver_probe_device+0x84/0x170 [ 3788.514189] driver_probe_device+0x44/0x120 [ 3788.514192] __device_attach_driver+0xc4/0x168 [ 3788.514195] bus_for_each_drv+0x8c/0xf0 [ 3788.514197] __device_attach+0xa4/0x1c0 [ 3788.514200] device_initial_probe+0x1c/0x30 [ 3788.514203] bus_probe_device+0xb4/0xc0 [ 3788.514206] device_add+0x614/0x828 [ 3788.514209] _vdpa_register_device+0x58/0x88 [ 3788.514211] octep_vdpa_dev_add+0x104/0x228 [octep_vdpa] [ 3788.514215] vdpa_nl_cmd_dev_add_set_doit+0x2d0/0x3c0 [ 3788.514218] genl_family_rcv_msg_doit+0xe4/0x158 [ 3788.514222] genl_rcv_msg+0x218/0x298 [ 3788.514225] netlink_rcv_skb+0x64/0x138 [ 3788.514229] genl_rcv+0x40/0x60 [ 3788.514233] netlink_unicast+0x32c/0x3b0 [ 3788.514237] netlink_sendmsg+0x170/0x3b8 [ 3788.514241] __sys_sendto+0x12c/0x1c0 [ 3788.514246] __arm64_sys_sendto+0x30/0x48 [ 3788.514249] invoke_syscall.constprop.0+0x58/0xf8 [ 3788.514255] do_el0_svc+0x48/0xd0 [ 3788.514259] el0_svc+0x48/0x210 [ 3788.514264] el0t_64_sync_handler+0xa0/0xe8 [ 3788.514268] el0t_64_sync+0x198/0x1a0 [ 3788.514271] ---[ end trace 0000000000000000 ]--- Fix by using virtio_device->device consistently for allocation and deallocation Fixes: 4944be2f5ad8c ("virtio_net: Allocate rss_hdr with devres") Signed-off-by: Kommula Shiva Shankar <kshankar@marvell.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20260102101900.692770-1-kshankar@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursbnxt_en: Fix potential data corruption with HW GRO/LROSrijit Bose
[ Upstream commit ffeafa65b2b26df2f5b5a6118d3174f17bd12ec5 ] Fix the max number of bits passed to find_first_zero_bit() in bnxt_alloc_agg_idx(). We were incorrectly passing the number of long words. find_first_zero_bit() may fail to find a zero bit and cause a wrong ID to be used. If the wrong ID is already in use, this can cause data corruption. Sometimes an error like this can also be seen: bnxt_en 0000:83:00.0 enp131s0np0: TPA end agg_buf 2 != expected agg_bufs 1 Fix it by passing the correct number of bits MAX_TPA_P5. Use DECLARE_BITMAP() to more cleanly define the bitmap. Add a sanity check to warn if a bit cannot be found and reset the ring [MChan]. Fixes: ec4d8e7cf024 ("bnxt_en: Add TPA ID mapping logic for 57500 chips.") Reviewed-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Srijit Bose <srijit.bose@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251231083625.3911652-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: wwan: iosm: Fix memory leak in ipc_mux_deinit()Zilin Guan
[ Upstream commit 92e6e0a87f6860a4710f9494f8c704d498ae60f8 ] Commit 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") allocated memory for pp_qlt in ipc_mux_init() but did not free it in ipc_mux_deinit(). This results in a memory leak when the driver is unloaded. Free the allocated memory in ipc_mux_deinit() to fix the leak. Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Co-developed-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20251230071853.1062223-1-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet/ena: fix missing lock when update devlink paramsFrank Liang
[ Upstream commit 8da901ffe497a53fa4ecc3ceed0e6d771586f88e ] Fix assert lock warning while calling devl_param_driverinit_value_set() in ena. WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9 CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy) Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017 Workqueue: events work_for_cpu_fn RIP: 0010:devl_assert_locked+0x62/0x90 Call Trace: <TASK> devl_param_driverinit_value_set+0x15/0x1c0 ena_devlink_alloc+0x18c/0x220 [ena] ? __pfx_ena_devlink_alloc+0x10/0x10 [ena] ? trace_hardirqs_on+0x18/0x140 ? lockdep_hardirqs_on+0x8c/0x130 ? __raw_spin_unlock_irqrestore+0x5d/0x80 ? __raw_spin_unlock_irqrestore+0x46/0x80 ? devm_ioremap_wc+0x9a/0xd0 ena_probe+0x4d2/0x1b20 [ena] ? __lock_acquire+0x56a/0xbd0 ? __pfx_ena_probe+0x10/0x10 [ena] ? local_clock+0x15/0x30 ? __lock_release.isra.0+0x1c9/0x340 ? mark_held_locks+0x40/0x70 ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170 ? trace_hardirqs_on+0x18/0x140 ? lockdep_hardirqs_on+0x8c/0x130 ? __raw_spin_unlock_irqrestore+0x5d/0x80 ? __raw_spin_unlock_irqrestore+0x46/0x80 ? __pfx_ena_probe+0x10/0x10 [ena] ...... </TASK> Fixes: 816b52624cf6 ("net: ena: Control PHC enable through devlink") Signed-off-by: Frank Liang <xiliang@redhat.com> Reviewed-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251231145808.6103-1-xiliang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet/mlx5e: Dealloc forgotten PSP RX modify headerCosmin Ratiu
[ Upstream commit 0462a15d2d1fafd3d48cf3c7c67393e42d03908c ] The commit which added RX steering rules for PSP forgot to free a modify header HW object on the cleanup path, which lead to health errors when reloading the driver and uninitializing the device: mlx5_core 0000:08:00.0: poll_health:803:(pid 3021): Fatal error 3 detected Fix that by saving the modify header pointer in the PSP steering struct and deallocating it after freeing the rule which references it. Fixes: 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20251225132717.358820-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet/mlx5e: Don't print error message due to invalid moduleGal Pressman
[ Upstream commit 144297e2a24e3e54aee1180ec21120ea38822b97 ] Dumping module EEPROM on newer modules is supported through the netlink interface only. Querying with old userspace ethtool (or other tools, such as 'lshw') which still uses the ioctl interface results in an error message that could flood dmesg (in addition to the expected error return value). The original message was added under the assumption that the driver should be able to handle all module types, but now that such flows are easily triggered from userspace, it doesn't serve its purpose. Change the log level of the print in mlx5_query_module_eeprom() to debug. Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20251225132717.358820-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet/mlx5e: Don't gate FEC histograms on ppcnt_statistical_groupAlexei Lazar
[ Upstream commit 6c75dc9de40ff91ec2b621b78f6cd9031762067c ] Currently, the ppcnt_statistical_group capability check incorrectly gates access to FEC histogram statistics. This capability applies only to statistical and physical counter groups, not for histogram data. Restrict the ppcnt_statistical_group check to the Physical_Layer_Counters and Physical_Layer_Statistical_Counters groups. Histogram statistics access remains gated by the pphcr capability. The issue is harmless as of today, as it happens that ppcnt_statistical_group is set on all existing devices that have pphcr set. Fixes: 6b81b8a0b197 ("net/mlx5e: Don't query FEC statistics when FEC is disabled") Signed-off-by: Alexei Lazar <alazar@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20251225132717.358820-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet/mlx5: Lag, multipath, give priority for routes with smaller network prefixPatrisious Haddad
[ Upstream commit 31057979cdadfee9f934746fd84046b43506ba61 ] Today multipath offload is controlled by a single route and the route controlling is selected if it meets one of the following criteria: 1. No controlling route is set. 2. New route destination is the same as old one. 3. New route metric is lower than old route metric. This can cause unwanted behaviour in case a new route is added with a smaller network prefix which should get the priority. Fix this by adding a new criteria to give priority to new route with a smaller network prefix. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20251225132717.358820-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnetdev: preserve NETIF_F_ALL_FOR_ALL across TSO updatesDi Zhu
[ Upstream commit 02d1e1a3f9239cdb3ecf2c6d365fb959d1bf39df ] Directly increment the TSO features incurs a side effect: it will also directly clear the flags in NETIF_F_ALL_FOR_ALL on the master device, which can cause issues such as the inability to enable the nocache copy feature on the bonding driver. The fix is to include NETIF_F_ALL_FOR_ALL in the update mask, thereby preventing it from being cleared. Fixes: b0ce3508b25e ("bonding: allow TSO being set on bonding master") Signed-off-by: Di Zhu <zhud@hygon.cn> Link: https://patch.msgid.link/20251224012224.56185-1-zhud@hygon.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: sock: fix hardened usercopy panic in sock_recv_errqueueWeiming Shi
[ Upstream commit 2a71a1a8d0ed718b1c7a9ac61f07e5755c47ae20 ] skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg(). The crash occurs when: 1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1] 2. The skb is cloned via skb_clone() using the pre-allocated fclone [3] 3. The cloned skb is queued to sk_error_queue for timestamp reporting 4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE) 5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4] 6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5] When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation: [ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)! [ 5.382796] kernel BUG at mm/usercopy.c:102! [ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI [ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7 [ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80 [ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490 [ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246 [ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74 [ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0 [ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74 [ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001 [ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00 [ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000 [ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0 [ 5.384903] PKRU: 55555554 [ 5.384903] Call Trace: [ 5.384903] <TASK> [ 5.384903] __check_heap_object+0x9a/0xd0 [ 5.384903] __check_object_size+0x46c/0x690 [ 5.384903] put_cmsg+0x129/0x5e0 [ 5.384903] sock_recv_errqueue+0x22f/0x380 [ 5.384903] tls_sw_recvmsg+0x7ed/0x1960 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? schedule+0x6d/0x270 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? mutex_unlock+0x81/0xd0 [ 5.384903] ? __pfx_mutex_unlock+0x10/0x10 [ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10 [ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0 [ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 The crash offset 296 corresponds to skb2->cb within skbuff_fclones: - sizeof(struct sk_buff) = 232 - offsetof(struct sk_buff, cb) = 40 - offset of skb2.cb in fclones = 232 + 40 = 272 - crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee) This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure. [1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885 [2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104 [3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566 [4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491 [5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719 Fixes: 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251223203534.1392218-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: phy: mxl-86110: Add power management and soft reset supportStefano Radaelli
[ Upstream commit 62f7edd59964eb588e96fce1ad35a2327ea54424 ] Implement soft_reset, suspend, and resume callbacks using genphy_soft_reset(), genphy_suspend(), and genphy_resume() to fix PHY initialization and power management issues. The soft_reset callback is needed to properly recover the PHY after an ifconfig down/up cycle. Without it, the PHY can remain in power-down state, causing MDIO register access failures during config_init(). The soft reset ensures the PHY is operational before configuration. The suspend/resume callbacks enable proper power management during system suspend/resume cycles. Fixes: b2908a989c59 ("net: phy: add driver for MaxLinear MxL86110 PHY") Signed-off-by: Stefano Radaelli <stefano.r@variscite.com> Link: https://patch.msgid.link/20251223120940.407195-1-stefano.r@variscite.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursinet: ping: Fix icmp out countingyuan.gao
[ Upstream commit 4c0856c225b39b1def6c9a6bc56faca79550da13 ] When the ping program uses an IPPROTO_ICMP socket to send ICMP_ECHO messages, ICMP_MIB_OUTMSGS is counted twice. ping_v4_sendmsg ping_v4_push_pending_frames ip_push_pending_frames ip_finish_skb __ip_make_skb icmp_out_count(net, icmp_type); // first count icmp_out_count(sock_net(sk), user_icmph.type); // second count However, when the ping program uses an IPPROTO_RAW socket, ICMP_MIB_OUTMSGS is counted correctly only once. Therefore, the first count should be removed. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: yuan.gao <yuan.gao@ucloud.cn> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251224063145.3615282-1-yuan.gao@ucloud.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: mscc: ocelot: Fix crash when adding interface under a lagJerry Wu
[ Upstream commit 34f3ff52cb9fa7dbf04f5c734fcc4cb6ed5d1a95 ] Commit 15faa1f67ab4 ("lan966x: Fix crash when adding interface under a lag") fixed a similar issue in the lan966x driver caused by a NULL pointer dereference. The ocelot_set_aggr_pgids() function in the ocelot driver has similar logic and is susceptible to the same crash. This issue specifically affects the ocelot_vsc7514.c frontend, which leaves unused ports as NULL pointers. The felix_vsc9959.c frontend is unaffected as it uses the DSA framework which registers all ports. Fix this by checking if the port pointer is valid before accessing it. Fixes: 528d3f190c98 ("net: mscc: ocelot: drop the use of the "lags" array") Signed-off-by: Jerry Wu <w.7erry@foxmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/tencent_75EF812B305E26B0869C673DD1160866C90A@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursbridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egressAlexandre Knecht
[ Upstream commit 3128df6be147768fe536986fbb85db1d37806a9f ] When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is incorrectly stripped from frames during egress processing. br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also moves any "next" VLAN from the payload into hwaccel: /* move next vlan tag to hw accel tag */ __skb_vlan_pop(skb, &vlan_tci); __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci); For QinQ frames where the C-VLAN sits in the payload, this moves it to hwaccel where it gets lost during VXLAN encapsulation. Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only the hwaccel S-VLAN and leaves the payload untouched. This path is only taken when vlan_tunnel is enabled and tunnel_info is configured, so 802.1Q bridges are unaffected. Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN preserved in VXLAN payload via tcpdump. Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20251228020057.2788865-1-knecht.alexandre@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnet: marvell: prestera: fix NULL dereference on devlink_alloc() failureAlok Tiwari
[ Upstream commit a428e0da1248c353557970848994f35fd3f005e2 ] devlink_alloc() may return NULL on allocation failure, but prestera_devlink_alloc() unconditionally calls devlink_priv() on the returned pointer. This leads to a NULL pointer dereference if devlink allocation fails. Add a check for a NULL devlink pointer and return NULL early to avoid the crash. Fixes: 34dd1710f5a3 ("net: marvell: prestera: Add basic devlink support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Acked-by: Elad Nachman <enachman@marvell.com> Link: https://patch.msgid.link/20251230052124.897012-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnetfilter: nf_conncount: update last_gc only when GC has been performedFernando Fernandez Mancera
[ Upstream commit 7811ba452402d58628e68faedf38745b3d485e3c ] Currently last_gc is being updated everytime a new connection is tracked, that means that it is updated even if a GC wasn't performed. With a sufficiently high packet rate, it is possible to always bypass the GC, causing the list to grow infinitely. Update the last_gc value only when a GC has been actually performed. Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnetfilter: nf_tables: fix memory leak in nf_tables_newrule()Zilin Guan
[ Upstream commit d077e8119ddbb4fca67540f1a52453631a47f221 ] In nf_tables_newrule(), if nft_use_inc() fails, the function jumps to the err_release_rule label without freeing the allocated flow, leading to a memory leak. Fix this by adding a new label err_destroy_flow and jumping to it when nft_use_inc() fails. This ensures that the flow is properly released in this error case. Fixes: 1689f25924ada ("netfilter: nf_tables: report use refcount overflow") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursgpio: pca953x: handle short interrupt pulses on PCAL devicesErnest Van Hoecke
[ Upstream commit 014a17deb41201449f76df2b20c857a9c3294a7c ] GPIO drivers with latch input support may miss short pulses on input pins even when input latching is enabled. The generic interrupt logic in the pca953x driver reports interrupts by comparing the current input value against the previously sampled one and only signals an event when a level change is observed between two reads. For short pulses, the first edge is captured when the input register is read, but if the signal returns to its previous level before the read, the second edge is not observed. As a result, successive pulses can produce identical input values at read time and no level change is detected, causing interrupts to be missed. Below timing diagram shows this situation where the top signal is the input pin level and the bottom signal indicates the latched value. ─────┐ ┌──*───────────────┐ ┌──*─────────────────┐ ┌──*─── │ │ . │ │ . │ │ . │ │ │ │ │ │ │ │ │ └──*──┘ │ └──*──┘ │ └──*──┘ │ Input │ │ │ │ │ │ ▼ │ ▼ │ ▼ │ IRQ │ IRQ │ IRQ │ . . . ─────┐ .┌──────────────┐ .┌────────────────┐ .┌── │ │ │ │ │ │ │ │ │ │ │ │ └────────*┘ └────────*┘ └────────*┘ Latched │ │ │ ▼ ▼ ▼ READ 0 READ 0 READ 0 NO CHANGE NO CHANGE PCAL variants provide an interrupt status register that records which pins triggered an interrupt, but the status and input registers cannot be read atomically. The interrupt status is only cleared when the input port is read, and the input value must also be read to determine the triggering edge. If another interrupt occurs on a different line after the status register has been read but before the input register is sampled, that event will not be reflected in the earlier status snapshot, so relying solely on the interrupt status register is also insufficient. Support for input latching and interrupt status handling was previously added by [1], but the interrupt status-based logic was reverted by [2] due to these issues. This patch addresses the original problem by combining both sources of information. Events indicated by the interrupt status register are merged with events detected through the existing level-change logic. As a result: * short pulses, whose second edges are invisible, are detected via the interrupt status register, and * interrupts that occur between the status and input reads are still caught by the generic level-change logic. This significantly improves robustness on devices that signal interrupts as short pulses, while avoiding the issues that led to the earlier reversion. In practice, even if only the first edge of a pulse is observable, the interrupt is reliably detected. This fixes missed interrupts from an Ilitek touch controller with its interrupt line connected to a PCAL6416A, where active-low pulses are approximately 200 us long. [1] commit 44896beae605 ("gpio: pca953x: add PCAL9535 interrupt support for Galileo Gen2") [2] commit d6179f6c6204 ("gpio: pca953x: Improve interrupt support") Fixes: d6179f6c6204 ("gpio: pca953x: Improve interrupt support") Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20251217153050.142057-1-ernestvanhoecke@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursgpiolib: fix race condition for gdev->srcuPaweł Narewski
[ Upstream commit a7ac22d53d0990152b108c3f4fe30df45fcb0181 ] If two drivers were calling gpiochip_add_data_with_key(), one may be traversing the srcu-protected list in gpio_name_to_desc(), meanwhile other has just added its gdev in gpiodev_add_to_list_unlocked(). This creates a non-mutexed and non-protected timeframe, when one instance is dereferencing and using &gdev->srcu, before the other has initialized it, resulting in crash: [ 4.935481] Unable to handle kernel paging request at virtual address ffff800272bcc000 [ 4.943396] Mem abort info: [ 4.943400] ESR = 0x0000000096000005 [ 4.943403] EC = 0x25: DABT (current EL), IL = 32 bits [ 4.943407] SET = 0, FnV = 0 [ 4.943410] EA = 0, S1PTW = 0 [ 4.943413] FSC = 0x05: level 1 translation fault [ 4.943416] Data abort info: [ 4.943418] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 4.946220] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 4.955261] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 4.955268] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000038e6c000 [ 4.961449] [ffff800272bcc000] pgd=0000000000000000 [ 4.969203] , p4d=1000000039739003 [ 4.979730] , pud=0000000000000000 [ 4.980210] phandle (CPU): 0x0000005e, phandle (BE): 0x5e000000 for node "reset" [ 4.991736] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP ... [ 5.121359] pc : __srcu_read_lock+0x44/0x98 [ 5.131091] lr : gpio_name_to_desc+0x60/0x1a0 [ 5.153671] sp : ffff8000833bb430 [ 5.298440] [ 5.298443] Call trace: [ 5.298445] __srcu_read_lock+0x44/0x98 [ 5.309484] gpio_name_to_desc+0x60/0x1a0 [ 5.320692] gpiochip_add_data_with_key+0x488/0xf00 5.946419] ---[ end trace 0000000000000000 ]--- Move initialization code for gdev fields before it is added to gpio_devices, with adjacent initialization code. Adjust goto statements to reflect modified order of operations Fixes: 47d8b4c1d868 ("gpio: add SRCU infrastructure to struct gpio_device") Reviewed-by: Jakub Lewalski <jakub.lewalski@nokia.com> Signed-off-by: Paweł Narewski <pawel.narewski@nokia.com> [Bartosz: fixed a build issue, removed stray newline] Link: https://lore.kernel.org/r/20251224082641.10769-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursgpiolib: rename GPIO chip printk macrosBartosz Golaszewski
[ Upstream commit d4f335b410ddbe3e99f48f8b5ea78a25041274f1 ] The chip_$level() macros take struct gpio_chip as argument so make it follow the convention of using the 'gpiochip_' prefix. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Stable-dep-of: a7ac22d53d09 ("gpiolib: fix race condition for gdev->srcu") Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursgpiolib: remove unnecessary 'out of memory' messagesBartosz Golaszewski
[ Upstream commit 0ba6f1ed3808b1f095fbdb490006f0ecd00f52bd ] We don't need to add additional logs when returning -ENOMEM so remove unnecessary error messages. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Stable-dep-of: a7ac22d53d09 ("gpiolib: fix race condition for gdev->srcu") Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnetfilter: nft_synproxy: avoid possible data-race on update operationFernando Fernandez Mancera
[ Upstream commit 36a3200575642846a96436d503d46544533bb943 ] During nft_synproxy eval we are reading nf_synproxy_info struct which can be modified on update operation concurrently. As nf_synproxy_info struct fits in 32 bits, use READ_ONCE/WRITE_ONCE annotations. Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursnetfilter: nft_set_pipapo: fix range overlap detectionFlorian Westphal
[ Upstream commit 7711f4bb4b360d9c0ff84db1c0ec91e385625047 ] set->klen has to be used, not sizeof(). The latter only compares a single register but a full check of the entire key is needed. Example: table ip t { map s { typeof iifname . ip saddr : verdict flags interval } } nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected nft add element t s '{ "lo" . 10.0.0.0/8 : drop }' # bug: no error The 3rd 'add element' should be rejected via -ENOTEMPTY, not -EEXIST, so userspace / nft can report an error to the user. The latter is only correct for the 2nd case (re-add of existing element). As-is, userspace is told that the command was successful, but no elements were added. After this patch, 3rd command gives: Error: Could not process rule: File exists add element t s { "lo" . 127.0.0.0/8 . "lo" : drop } ^^^^^^^^^^^^^^^^^^^^^^^^^ Fixes: 0eb4b5ee33f2 ("netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursarm64: dts: mba8mx: Fix Ethernet PHY IRQ supportAlexander Stein
[ Upstream commit 89e87d0dc87eb3654c9ae01afc4a18c1c6d1e523 ] Ethernet PHY interrupt mode is level triggered. Adjust the mode accordingly. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Fixes: 70cf622bb16e ("arm64: dts: mba8mx: Add Ethernet PHY IRQ support") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursarm64: dts: imx8qm-ss-dma: correct the dma channels of lpuartSherry Sun
[ Upstream commit a988caeed9d918452aa0a68de2c6e94d86aa43ba ] The commit 616effc0272b5 ("arm64: dts: imx8: Fix lpuart DMA channel order") swap uart rx and tx channel at common imx8-ss-dma.dtsi. But miss update imx8qm-ss-dma.dtsi. The commit 5a8e9b022e569 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") just simple add dma-names as binding doc requirement. Correct lpuart0 - lpuart3 dma rx and tx channels, and use defines for the FSL_EDMA_RX flag. Fixes: 5a8e9b022e56 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursarm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics ↵Marek Vasut
i.MX8M Plus DHCOM [ Upstream commit c63749a7ddc59ac6ec0b05abfa0a21af9f2c1d38 ] Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link bouncing caused by interruptions on the PHY reference clock, by letting the PHY driver manage the reference clock and assure there are no interruptions. This follows the matching PHY driver recommendation described in commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support") Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Tested-by: Christoph Niedermaier <cniedermaier@dh-electronics.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
45 hoursarm64: dts: freescale: tx8p-ml81: fix eqos nvmem-cellsMaud Spierings
[ Upstream commit cdf4e631eec5ddd49bb625df9fb144d6ecdd6f15 ] On this SoM eqos is the primary ethernet interface, Ka-Ro fuses the address for it in eth_mac1, eth_mac2 seems to be left unfused. In their downstream u-boot they fetch it from eth_mac1 [1][2], by setting alias of eqos to ethernet0, the driver then fetches the mac address based on the alias number. Set eqos to read from eth_mac1 instead of eth_mac2. Also set fec to point at eth_mac2 as it may be fused later even though it is disabled by default. With this changed barebox is now capable of loading the correct address. Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/drivers/net/dwc_eth_qos.c#L1167 [1] Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/arch/arm/dts/imx8mp-karo.dtsi#L12 [2] Fixes: bac63d7c5f46 ("arm64: dts: freescale: add Ka-Ro Electronics tx8p-ml81 COM") Signed-off-by: Maud Spierings <maudspierings@gocontroll.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>