| Age | Commit message (Collapse) | Author |
|
[ Upstream commit 445b49d13787da2fe8d51891ee196e5077feef44 ]
RSS LUT provisioning and queries on a down interface currently return
silently without effect. Users should be able to configure RSS settings
even when the interface is down.
Fix by maintaining RSS configuration changes in the driver's soft copy and
deferring HW programming until the interface comes up.
Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks")
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 83f38f210b85676f40ba8586b5a8edae19b56995 ]
The RSS LUT is not initialized until the interface comes up, causing
the following NULL pointer crash when ethtool operations like rxhash on/off
are performed before the interface is brought up for the first time.
Move RSS LUT initialization from ndo_open to vport creation to ensure LUT
is always available. This enables RSS configuration via ethtool before
bringing the interface up. Simplify LUT management by maintaining all
changes in the driver's soft copy and programming zeros to the indirection
table when rxhash is disabled. Defer HW programming until the interface
comes up if it is down during rxhash and LUT configuration changes.
Steps to reproduce:
** Load idpf driver; interfaces will be created
modprobe idpf
** Before bringing the interfaces up, turn rxhash off
ethtool -K eth2 rxhash off
[89408.371875] BUG: kernel NULL pointer dereference, address: 0000000000000000
[89408.371908] #PF: supervisor read access in kernel mode
[89408.371924] #PF: error_code(0x0000) - not-present page
[89408.371940] PGD 0 P4D 0
[89408.371953] Oops: Oops: 0000 [#1] SMP NOPTI
<snip>
[89408.372052] RIP: 0010:memcpy_orig+0x16/0x130
[89408.372310] Call Trace:
[89408.372317] <TASK>
[89408.372326] ? idpf_set_features+0xfc/0x180 [idpf]
[89408.372363] __netdev_update_features+0x295/0xde0
[89408.372384] ethnl_set_features+0x15e/0x460
[89408.372406] genl_family_rcv_msg_doit+0x11f/0x180
[89408.372429] genl_rcv_msg+0x1ad/0x2b0
[89408.372446] ? __pfx_ethnl_set_features+0x10/0x10
[89408.372465] ? __pfx_genl_rcv_msg+0x10/0x10
[89408.372482] netlink_rcv_skb+0x58/0x100
[89408.372502] genl_rcv+0x2c/0x50
[89408.372516] netlink_unicast+0x289/0x3e0
[89408.372533] netlink_sendmsg+0x215/0x440
[89408.372551] __sys_sendto+0x234/0x240
[89408.372571] __x64_sys_sendto+0x28/0x30
[89408.372585] x64_sys_call+0x1909/0x1da0
[89408.372604] do_syscall_64+0x7a/0xfa0
[89408.373140] ? clear_bhb_loop+0x60/0xb0
[89408.373647] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[89408.378887] </TASK>
<snip>
Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops")
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reviewed-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 36aae2ea6bd76b8246caa50e34a4f4824f0a3be8 ]
When ethtool -n is executed on an interface to display the flow steering
rules, "rxclass: Unknown flow type" error is generated.
The flow steering list maintained in the driver currently stores only the
location and q_index but other fields of the ethtool_rx_flow_spec are not
stored. This may be enough for the virtchnl command to delete the entry.
However, when the ethtool -n command is used to query the flow steering
rules, the ethtool_rx_flow_spec returned is not complete causing the
error below.
Resolve this by storing the flow spec (fsp) when rules are added and
returning the complete flow spec when rules are queried.
Also, change the return value from EINVAL to ENOENT when flow steering
entry is not found during query by location or when deleting an entry.
Add logic to detect and reject duplicate filter entries at the same
location and change logic to perform upfront validation of all error
conditions before adding flow rules through virtchnl. This avoids the
need for additional virtchnl delete messages when subsequent operations
fail, which was missing in the original upstream code.
Example:
Before the fix:
ethtool -n eth1
2 RX rings available
Total 2 rules
rxclass: Unknown flow type
rxclass: Unknown flow type
After the fix:
ethtool -n eth1
2 RX rings available
Total 2 rules
Filter: 0
Rule Type: TCP over IPv4
Src IP addr: 10.0.0.1 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 0 mask: 0xffff
Action: Direct to queue 0
Filter: 1
Rule Type: UDP over IPv4
Src IP addr: 10.0.0.1 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 0 mask: 0xffff
Action: Direct to queue 0
Fixes: ada3e24b84a0 ("idpf: add flow steering support")
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Co-developed-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f9841bd28b600526ca4f6713b0ca49bf7bb98452 ]
The flow steering list maintains entries that are added and removed as
ethtool creates and deletes flow steering rules. Module removal with active
entries causes memory leak as the list is not properly cleaned up.
Prevent this by iterating through the remaining entries in the list and
freeing the associated memory during module removal. Add a spinlock
(flow_steer_list_lock) to protect the list access from multiple threads.
Fixes: ada3e24b84a0 ("idpf: add flow steering support")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4d792219fe6f891b5b557a607ac8a0a14eda6e38 ]
If the init_task fails during a driver load, we end up without vports and
netdevs, effectively failing the entire process. In that state a
subsequent reset will result in a crash as the service task attempts to
access uninitialized resources. Following trace is from an error in the
init_task where the CREATE_VPORT (op 501) is rejected by the FW:
[40922.763136] idpf 0000:83:00.0: Device HW Reset initiated
[40924.449797] idpf 0000:83:00.0: Transaction failed (op 501)
[40958.148190] idpf 0000:83:00.0: HW reset detected
[40958.161202] BUG: kernel NULL pointer dereference, address: 00000000000000a8
...
[40958.168094] Workqueue: idpf-0000:83:00.0-vc_event idpf_vc_event_task [idpf]
[40958.168865] RIP: 0010:idpf_vc_event_task+0x9b/0x350 [idpf]
...
[40958.177932] Call Trace:
[40958.178491] <TASK>
[40958.179040] process_one_work+0x226/0x6d0
[40958.179609] worker_thread+0x19e/0x340
[40958.180158] ? __pfx_worker_thread+0x10/0x10
[40958.180702] kthread+0x10f/0x250
[40958.181238] ? __pfx_kthread+0x10/0x10
[40958.181774] ret_from_fork+0x251/0x2b0
[40958.182307] ? __pfx_kthread+0x10/0x10
[40958.182834] ret_from_fork_asm+0x1a/0x30
[40958.183370] </TASK>
Fix the error handling in the init_task to make sure the service and
mailbox tasks are disabled if the error happens during load. These are
started in idpf_vc_core_init(), which spawns the init_task and has no way
of knowing if it failed. If the error happens on reset, following
successful driver load, the tasks can still run, as that will allow the
netdevs to attempt recovery through another reset. Stop the PTP callbacks
either way as those will be restarted by the call to idpf_vc_core_init()
during a successful reset.
Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration")
Reported-by: Vivek Kumar <iamvivekkumar@google.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e111cbc4adf9f9974eed040aeece7e17460f6bff ]
Make sure to free hw->lan_regs. Reported by kmemleak during reset:
unreferenced object 0xff1b913d02a936c0 (size 96):
comm "kworker/u258:14", pid 2174, jiffies 4294958305
hex dump (first 32 bytes):
00 00 00 c0 a8 ba 2d ff 00 00 00 00 00 00 00 00 ......-.........
00 00 40 08 00 00 00 00 00 00 25 b3 a8 ba 2d ff ..@.......%...-.
backtrace (crc 36063c4f):
__kmalloc_noprof+0x48f/0x890
idpf_vc_core_init+0x6ce/0x9b0 [idpf]
idpf_vc_event_task+0x1fb/0x350 [idpf]
process_one_work+0x226/0x6d0
worker_thread+0x19e/0x340
kthread+0x10f/0x250
ret_from_fork+0x251/0x2b0
ret_from_fork_asm+0x1a/0x30
Fixes: 6aa53e861c1a ("idpf: implement get LAN MMIO memory regions")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Joshua Hay <joshua.a.hay@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f6242b354605faff263ca45882b148200915a3f6 ]
Free vport->rx_ptype_lkup in idpf_vport_rel() to avoid leaking memory
during a reset. Reported by kmemleak:
unreferenced object 0xff450acac838a000 (size 4096):
comm "kworker/u258:5", pid 7732, jiffies 4296830044
hex dump (first 32 bytes):
00 00 00 00 00 10 00 00 00 10 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 ................
backtrace (crc 3da81902):
__kmalloc_cache_noprof+0x469/0x7a0
idpf_send_get_rx_ptype_msg+0x90/0x570 [idpf]
idpf_init_task+0x1ec/0x8d0 [idpf]
process_one_work+0x226/0x6d0
worker_thread+0x19e/0x340
kthread+0x10f/0x250
ret_from_fork+0x251/0x2b0
ret_from_fork_asm+0x1a/0x30
Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2e281e1155fc476c571c0bd2ffbfe28ab829a5c3 ]
Protect the reset path from callbacks by setting the netdevs to detached
state and close any netdevs in UP state until the reset handling has
completed. During a reset, the driver will de-allocate resources for the
vport, and there is no guarantee that those will recover, which is why the
existing vport_ctrl_lock does not provide sufficient protection.
idpf_detach_and_close() is called right before reset handling. If the
reset handling succeeds, the netdevs state is recovered via call to
idpf_attach_and_open(). If the reset handling fails the netdevs remain
down. The detach/down calls are protected with RTNL lock to avoid racing
with callbacks. On the recovery side the attach can be done without
holding the RTNL lock as there are no callbacks expected at that point,
due to detach/close always being done first in that flow.
The previous logic restoring the netdevs state based on the
IDPF_VPORT_UP_REQUESTED flag in the init task is not needed anymore, hence
the removal of idpf_set_vport_state(). The IDPF_VPORT_UP_REQUESTED is
still being used to restore the state of the netdevs following the reset,
but has no use outside of the reset handling flow.
idpf_init_hard_reset() is converted to void, since it was used as such and
there is no error handling being done based on its return value.
Before this change, invoking hard and soft resets simultaneously will
cause the driver to lose the vport state:
ip -br a
<inf> UP
echo 1 > /sys/class/net/ens801f0/device/reset& \
ethtool -L ens801f0 combined 8
ip -br a
<inf> DOWN
ip link set <inf> up
ip -br a
<inf> DOWN
Also in case of a failure in the reset path, the netdev is left
exposed to external callbacks, while vport resources are not
initialized, leading to a crash on subsequent ifup/down:
[408471.398966] idpf 0000:83:00.0: HW reset detected
[408471.411744] idpf 0000:83:00.0: Device HW Reset initiated
[408472.277901] idpf 0000:83:00.0: The driver was unable to contact the device's firmware. Check that the FW is running. Driver state= 0x2
[408508.125551] BUG: kernel NULL pointer dereference, address: 0000000000000078
[408508.126112] #PF: supervisor read access in kernel mode
[408508.126687] #PF: error_code(0x0000) - not-present page
[408508.127256] PGD 2aae2f067 P4D 0
[408508.127824] Oops: Oops: 0000 [#1] SMP NOPTI
...
[408508.130871] RIP: 0010:idpf_stop+0x39/0x70 [idpf]
...
[408508.139193] Call Trace:
[408508.139637] <TASK>
[408508.140077] __dev_close_many+0xbb/0x260
[408508.140533] __dev_change_flags+0x1cf/0x280
[408508.140987] netif_change_flags+0x26/0x70
[408508.141434] dev_change_flags+0x3d/0xb0
[408508.141878] devinet_ioctl+0x460/0x890
[408508.142321] inet_ioctl+0x18e/0x1d0
[408508.142762] ? _copy_to_user+0x22/0x70
[408508.143207] sock_do_ioctl+0x3d/0xe0
[408508.143652] sock_ioctl+0x10e/0x330
[408508.144091] ? find_held_lock+0x2b/0x80
[408508.144537] __x64_sys_ioctl+0x96/0xe0
[408508.144979] do_syscall_64+0x79/0x3d0
[408508.145415] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[408508.145860] RIP: 0033:0x7f3e0bb4caff
Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8dd72ebc73f37b216410db17340f15e6fb2cdb7b ]
Convert vport state to a bitmap and remove the DOWN state which is
redundant in the existing logic. There are no functional changes aside
from the use of bitwise operations when setting and checking the states.
Removed the double underscore to be consistent with the naming of other
bitmaps in the header and renamed current_state to vport_is_up to match
the meaning of the new variable.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Chittim Madhu <madhu.chittim@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 2e281e1155fc ("idpf: detach and close netdevs while handling a reset")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 083029bd8b445595222a3cd14076b880781c1765 ]
During a successful reset the driver would re-allocate vport resources
while keeping the netdevs intact. However, in case of an error in the
init task, the netdev of the failing vport will be unregistered,
effectively removing the network interface:
[ 121.211076] idpf 0000:83:00.0: enabling device (0100 -> 0102)
[ 121.221976] idpf 0000:83:00.0: Device HW Reset initiated
[ 124.161229] idpf 0000:83:00.0 ens801f0: renamed from eth0
[ 124.163364] idpf 0000:83:00.0 ens801f0d1: renamed from eth1
[ 125.934656] idpf 0000:83:00.0 ens801f0d2: renamed from eth2
[ 128.218429] idpf 0000:83:00.0 ens801f0d3: renamed from eth3
ip -br a
ens801f0 UP
ens801f0d1 UP
ens801f0d2 UP
ens801f0d3 UP
echo 1 > /sys/class/net/ens801f0/device/reset
[ 145.885537] idpf 0000:83:00.0: resetting
[ 145.990280] idpf 0000:83:00.0: reset done
[ 146.284766] idpf 0000:83:00.0: HW reset detected
[ 146.296610] idpf 0000:83:00.0: Device HW Reset initiated
[ 211.556719] idpf 0000:83:00.0: Transaction timed-out (op:526 cookie:7700 vc_op:526 salt:77 timeout:60000ms)
[ 272.996705] idpf 0000:83:00.0: Transaction timed-out (op:502 cookie:7800 vc_op:502 salt:78 timeout:60000ms)
ip -br a
ens801f0d1 DOWN
ens801f0d2 DOWN
ens801f0d3 DOWN
Re-shuffle the logic in the error path of the init task to make sure the
netdevs remain intact. This will allow the driver to attempt recovery via
subsequent resets, provided the FW is still functional.
The main change is to make sure that idpf_decfg_netdev() is not called
should the init task fail during a reset. The error handling is
consolidated under unwind_vports, as the removed labels had the same
cleanup logic split depending on the point of failure.
Fixes: ce1b75d0635c ("idpf: add ptypes and MAC filter support")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fd390ff144513eb0310c350b1cf5fa8d6ddd0c53 ]
Some systems ship with multiple display class devices but not all
of them are VGA devices. If the "only" VGA device on the system is not
used for displaying the image on the screen marking it as `boot_vga`
because nothing was found is totally wrong.
This behavior actually leads to mistakes of the wrong device being
advertised to userspace and then userspace can make incorrect decisions.
As there is an accurate `boot_display` sysfs file stop lying about
`boot_vga` by assuming if nothing is found it's the right device.
Reported-by: Aaron Erhardt <aer@tuxedocomputers.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220712
Tested-by: Aaron Erhardt <aer@tuxedocomputers.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: ad90860bd10ee ("fbcon: Use screen info to find primary device")
Tested-by: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260106044638.52906-1-superm1@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 238e03d0466239410b72294b79494e43d4fabe77 ]
When skb_segment_list() is called during packet forwarding, it handles
packets that were aggregated by the GRO engine.
Historically, the segmentation logic in skb_segment_list assumes that
individual segments are split from a parent SKB and may need to carry
their own socket memory accounting. Accordingly, the code transfers
truesize from the parent to the newly created segments.
Prior to commit ed4cccef64c1 ("gro: fix ownership transfer"), this
truesize subtraction in skb_segment_list() was valid because fragments
still carry a reference to the original socket.
However, commit ed4cccef64c1 ("gro: fix ownership transfer") changed
this behavior by ensuring that fraglist entries are explicitly
orphaned (skb->sk = NULL) to prevent illegal orphaning later in the
stack. This change meant that the entire socket memory charge remained
with the head SKB, but the corresponding accounting logic in
skb_segment_list() was never updated.
As a result, the current code unconditionally adds each fragment's
truesize to delta_truesize and subtracts it from the parent SKB. Since
the fragments are no longer charged to the socket, this subtraction
results in an effective under-count of memory when the head is freed.
This causes sk_wmem_alloc to remain non-zero, preventing socket
destruction and leading to a persistent memory leak.
The leak can be observed via KMEMLEAK when tearing down the networking
environment:
unreferenced object 0xffff8881e6eb9100 (size 2048):
comm "ping", pid 6720, jiffies 4295492526
backtrace:
kmem_cache_alloc_noprof+0x5c6/0x800
sk_prot_alloc+0x5b/0x220
sk_alloc+0x35/0xa00
inet6_create.part.0+0x303/0x10d0
__sock_create+0x248/0x640
__sys_socket+0x11b/0x1d0
Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST
packets constructed by GRO, the truesize adjustment is removed.
The call to skb_release_head_state() must be preserved. As documented in
commit cf673ed0e057 ("net: fix fraglist segmentation reference count
leak"), it is still required to correctly drop references to SKB
extensions that may be overwritten during __copy_skb_header().
Fixes: ed4cccef64c1 ("gro: fix ownership transfer")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5e5be092ffadcab0093464ccd9e30f0c5cce16b9 ]
These marcos are not used after commit b5b4287accd7 ("riscv: mm: Use
hint address in mmap if available"). Cleanup VA_USER_XXX definitions
in asm/pgtable.h.
Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available")
Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8632180daf735074a746ce2b3808a8f2c079310e ]
The Zk extension is a bundle consisting of Zkn, Zkr, and Zkt. The Zkn
extension itself is a bundle consisting of Zbkb, Zbkc, Zbkx, Zknd, Zkne,
and Zknh.
The current implementation of riscv_zk_bundled_exts manually listed
the dependencies but missed RISCV_ISA_EXT_ZKNH.
Fix this by introducing a RISCV_ISA_EXT_ZKN macro that lists the Zkn
components and using it in both riscv_zk_bundled_exts and
riscv_zkn_bundled_exts.
This adds the missing Zknh extension to Zk and reduces code duplication.
Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto")
Link: https://patch.msgid.link/20231114141256.126749-4-cleger@rivosinc.com/
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20251223-zk-missing-zknh-v1-1-b627c990ee1a@riscstar.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a7fc8c641cab855824c45e5e8877e40fd528b5df ]
Fix typos in npu rx DMA descriptor definitions.
Fixes: b3ef7bdec66fb ("net: airoha: Add airoha_offload.h header")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260102-airoha-npu-dma-rx-def-fixes-v1-1-205fc6bf7d94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 530e3d4af566ca44807d79359b90794dea24c4f3 ]
Coverity reported a NULL pointer dereference issue (CID 1666756) in
do_abort_log_replay(). When btrfs_alloc_path() fails in
replay_one_buffer(), wc->subvol_path is NULL, but btrfs_abort_log_replay()
calls do_abort_log_replay() which unconditionally dereferences
wc->subvol_path when attempting to print debug information. Fix this by
adding a NULL check before dereferencing wc->subvol_path in
do_abort_log_replay().
Fixes: 2753e4917624 ("btrfs: dump detailed info and specific messages on log replay failures")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 30bcf4e824aa37d305502f52e1527c7b1eabef3d ]
[BUG]
Since the introduction of btrfs bs < ps support, v1 cache was never on
the plan due to its hard coded PAGE_SIZE usage, and the future plan to
properly deprecate it.
However for bs < ps cases, even if 'nospace_cache,clear_cache' mount
option is specified, it's never respected and free space tree is always
enabled:
mkfs.btrfs -f -O ^bgt,fst $dev
mount $dev $mnt -o clear_cache,nospace_cache
umount $mnt
btrfs ins dump-super $dev
...
compat_ro_flags 0x3
( FREE_SPACE_TREE |
FREE_SPACE_TREE_VALID )
...
This means a different behavior compared to bs >= ps cases.
[CAUSE]
The forcing usage of v2 space cache is done inside
btrfs_set_free_space_cache_settings(), however it never checks if we're
even using space cache but always enabling v2 cache.
[FIX]
Instead unconditionally enable v2 cache, only forcing v2 cache if the
old v1 cache is required.
Now v2 space cache can be properly disabled on bs < ps cases:
mkfs.btrfs -f -O ^bgt,fst $dev
mount $dev $mnt -o clear_cache,nospace_cache
umount $mnt
btrfs ins dump-super $dev
...
compat_ro_flags 0x0
...
Fixes: 9f73f1aef98b ("btrfs: force v2 space cache usage for subpage mount")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8731f2c50b0b1d2b58ed5b9671ef2c4bdc2f8347 ]
In btrfs_read_locked_inode() we are calling btrfs_init_file_extent_tree()
while holding a path with a read locked leaf from a subvolume tree, and
btrfs_init_file_extent_tree() may do a GFP_KERNEL allocation, which can
trigger reclaim.
This can create a circular lock dependency which lockdep warns about with
the following splat:
[6.1433] ======================================================
[6.1574] WARNING: possible circular locking dependency detected
[6.1583] 6.18.0+ #4 Tainted: G U
[6.1591] ------------------------------------------------------
[6.1599] kswapd0/117 is trying to acquire lock:
[6.1606] ffff8d9b6333c5b8 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1625]
but task is already holding lock:
[6.1633] ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60
[6.1646]
which lock already depends on the new lock.
[6.1657]
the existing dependency chain (in reverse order) is:
[6.1667]
-> #2 (fs_reclaim){+.+.}-{0:0}:
[6.1677] fs_reclaim_acquire+0x9d/0xd0
[6.1685] __kmalloc_cache_noprof+0x59/0x750
[6.1694] btrfs_init_file_extent_tree+0x90/0x100
[6.1702] btrfs_read_locked_inode+0xc3/0x6b0
[6.1710] btrfs_iget+0xbb/0xf0
[6.1716] btrfs_lookup_dentry+0x3c5/0x8e0
[6.1724] btrfs_lookup+0x12/0x30
[6.1731] lookup_open.isra.0+0x1aa/0x6a0
[6.1739] path_openat+0x5f7/0xc60
[6.1746] do_filp_open+0xd6/0x180
[6.1753] do_sys_openat2+0x8b/0xe0
[6.1760] __x64_sys_openat+0x54/0xa0
[6.1768] do_syscall_64+0x97/0x3e0
[6.1776] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[6.1784]
-> #1 (btrfs-tree-00){++++}-{3:3}:
[6.1794] lock_release+0x127/0x2a0
[6.1801] up_read+0x1b/0x30
[6.1808] btrfs_search_slot+0x8e0/0xff0
[6.1817] btrfs_lookup_inode+0x52/0xd0
[6.1825] __btrfs_update_delayed_inode+0x73/0x520
[6.1833] btrfs_commit_inode_delayed_inode+0x11a/0x120
[6.1842] btrfs_log_inode+0x608/0x1aa0
[6.1849] btrfs_log_inode_parent+0x249/0xf80
[6.1857] btrfs_log_dentry_safe+0x3e/0x60
[6.1865] btrfs_sync_file+0x431/0x690
[6.1872] do_fsync+0x39/0x80
[6.1879] __x64_sys_fsync+0x13/0x20
[6.1887] do_syscall_64+0x97/0x3e0
[6.1894] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[6.1903]
-> #0 (&delayed_node->mutex){+.+.}-{3:3}:
[6.1913] __lock_acquire+0x15e9/0x2820
[6.1920] lock_acquire+0xc9/0x2d0
[6.1927] __mutex_lock+0xcc/0x10a0
[6.1934] __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1944] btrfs_evict_inode+0x20b/0x4b0
[6.1952] evict+0x15a/0x2f0
[6.1958] prune_icache_sb+0x91/0xd0
[6.1966] super_cache_scan+0x150/0x1d0
[6.1974] do_shrink_slab+0x155/0x6f0
[6.1981] shrink_slab+0x48e/0x890
[6.1988] shrink_one+0x11a/0x1f0
[6.1995] shrink_node+0xbfd/0x1320
[6.1002] balance_pgdat+0x67f/0xc60
[6.1321] kswapd+0x1dc/0x3e0
[6.1643] kthread+0xff/0x240
[6.1965] ret_from_fork+0x223/0x280
[6.1287] ret_from_fork_asm+0x1a/0x30
[6.1616]
other info that might help us debug this:
[6.1561] Chain exists of:
&delayed_node->mutex --> btrfs-tree-00 --> fs_reclaim
[6.1503] Possible unsafe locking scenario:
[6.1110] CPU0 CPU1
[6.1411] ---- ----
[6.1707] lock(fs_reclaim);
[6.1998] lock(btrfs-tree-00);
[6.1291] lock(fs_reclaim);
[6.1581] lock(&delayed_node->mutex);
[6.1874]
*** DEADLOCK ***
[6.1716] 2 locks held by kswapd0/117:
[6.1999] #0: ffffffffa4ab8ce0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x195/0xc60
[6.1294] #1: ffff8d998344b0e0 (&type->s_umount_key#40){++++}- {3:3}, at: super_cache_scan+0x37/0x1d0
[6.1596]
stack backtrace:
[6.1183] CPU: 11 UID: 0 PID: 117 Comm: kswapd0 Tainted: G U 6.18.0+ #4 PREEMPT(lazy)
[6.1185] Tainted: [U]=USER
[6.1186] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023
[6.1187] Call Trace:
[6.1187] <TASK>
[6.1189] dump_stack_lvl+0x6e/0xa0
[6.1192] print_circular_bug.cold+0x17a/0x1c0
[6.1194] check_noncircular+0x175/0x190
[6.1197] __lock_acquire+0x15e9/0x2820
[6.1200] lock_acquire+0xc9/0x2d0
[6.1201] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1204] __mutex_lock+0xcc/0x10a0
[6.1206] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1208] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1211] ? __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1213] __btrfs_release_delayed_node.part.0+0x39/0x2f0
[6.1215] btrfs_evict_inode+0x20b/0x4b0
[6.1217] ? lock_acquire+0xc9/0x2d0
[6.1220] evict+0x15a/0x2f0
[6.1222] prune_icache_sb+0x91/0xd0
[6.1224] super_cache_scan+0x150/0x1d0
[6.1226] do_shrink_slab+0x155/0x6f0
[6.1228] shrink_slab+0x48e/0x890
[6.1229] ? shrink_slab+0x2d2/0x890
[6.1231] shrink_one+0x11a/0x1f0
[6.1234] shrink_node+0xbfd/0x1320
[6.1236] ? shrink_node+0xa2d/0x1320
[6.1236] ? shrink_node+0xbd3/0x1320
[6.1239] ? balance_pgdat+0x67f/0xc60
[6.1239] balance_pgdat+0x67f/0xc60
[6.1241] ? finish_task_switch.isra.0+0xc4/0x2a0
[6.1246] kswapd+0x1dc/0x3e0
[6.1247] ? __pfx_autoremove_wake_function+0x10/0x10
[6.1249] ? __pfx_kswapd+0x10/0x10
[6.1250] kthread+0xff/0x240
[6.1251] ? __pfx_kthread+0x10/0x10
[6.1253] ret_from_fork+0x223/0x280
[6.1255] ? __pfx_kthread+0x10/0x10
[6.1257] ret_from_fork_asm+0x1a/0x30
[6.1260] </TASK>
This is because:
1) The fsync task is holding an inode's delayed node mutex (for a
directory) while calling __btrfs_update_delayed_inode() and that needs
to do a search on the subvolume's btree (therefore read lock some
extent buffers);
2) The lookup task, at btrfs_lookup(), triggered reclaim with the
GFP_KERNEL allocation done by btrfs_init_file_extent_tree() while
holding a read lock on a subvolume leaf;
3) The reclaim triggered kswapd which is doing inode eviction for the
directory inode the fsync task is using as an argument to
btrfs_commit_inode_delayed_inode() - but in that call chain we are
trying to read lock the same leaf that the lookup task is holding
while calling btrfs_init_file_extent_tree() and doing the GFP_KERNEL
allocation.
Fix this by calling btrfs_init_file_extent_tree() after we don't need the
path anymore and release it in btrfs_read_locked_inode().
Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/linux-btrfs/6e55113a22347c3925458a5d840a18401a38b276.camel@linux.intel.com/
Fixes: 8679d2687c35 ("btrfs: initialize inode::file_extent_tree after i_mode has been set")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ce5e612dd411de096aa041b9e9325ba1bec5f9f4 ]
SO_ZEROCOPY handling in vsock_connectible_setsockopt() does not get called
on accept()ed sockets due to a missing flag. Flip it.
Fixes: e0718bd82e27 ("vsock: enable setting SO_ZEROCOPY")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-1-64778d6c4f88@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc8a887de1a7d397ab4131f45676e89565417aa8 ]
v1:
the PMFW didn't initialize the PCIe DPM parameters
and requires the KMD to actively provide these parameters.
v2:
clean & remove unused code logic (lijo)
Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b0dbd5db7cf1f81e4aaedd25cb5e72ce369387b2)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4f74c2dd970611d3ec3bb0d58215e73af5cd7214 ]
fix wrong pcie dpm parameter on navi1x
Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Co-developed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5c5189cf4b0cc0a22bac74a40743ee711cff07f8)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ff5860f5088e9076ebcccf05a6ca709d5935cfa9 ]
With the change to hrtimer_try_to_cancel() in
perf_swevent_cancel_hrtimer() it appears possible for the hrtimer to
still be active by the time the event gets freed.
Make sure the event does a full hrtimer_cancel() on the free path by
installing a perf_event::destroy handler.
Fixes: eb3182ef0405 ("perf/core: Fix system hang caused by cpu-clock usage")
Reported-by: CyberUnicorns <a101e_iotvul@163.com>
Tested-by: CyberUnicorns <a101e_iotvul@163.com>
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2ef02ac38d3c17f34a00c4b267d961a8d4b45d1a ]
Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
leaked skbs/conntrack references more obvious.
syzbot reports this as triggering, and I can also reproduce this via
ip_defrag.sh selftest:
conntrack cleanup blocked for 60s
WARNING: net/netfilter/nf_conntrack_core.c:2512
[..]
conntrack clenups gets stuck because there are skbs with still hold nf_conn
references via their frag_list.
net.core.skb_defer_max=0 makes the hang disappear.
Eric Dumazet points out that skb_release_head_state() doesn't follow the
fraglist.
ip_defrag.sh can only reproduce this problem since
commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this
problem could happen with TCP as well if pmtu discovery is off.
The relevant problem path for udp is:
1. netns emits fragmented packets
2. nf_defrag_v6_hook reassembles them (in output hook)
3. reassembled skb is tracked (skb owns nf_conn reference)
4. ip6_output refragments
5. refragmented packets also own nf_conn reference (ip6_fragment
calls ip6_copy_metadata())
6. on input path, nf_defrag_v6_hook skips defragmentation: the
fragments already have skb->nf_conn attached
7. skbs are reassembled via ipv6_frag_rcv()
8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up
in pcpu freelist, but still has nf_conn reference.
Possible solutions:
1 let defrag engine drop nf_conn entry, OR
2 export kick_defer_list_purge() and call it from the conntrack
netns exit callback, OR
3 add skb_has_frag_list() check to skb_attempt_defer_free()
2 & 3 also solve ip_defrag.sh hang but share same drawback:
Such reassembled skbs, queued to socket, can prevent conntrack module
removal until userspace has consumed the packet. While both tcp and udp
stack do call nf_reset_ct() before placing skb on socket queue, that
function doesn't iterate frag_list skbs.
Therefore drop nf_conn entries when they are placed in defrag queue.
Keep the nf_conn entry of the first (offset 0) skb so that reassembled
skb retains nf_conn entry for sake of TX path.
Note that fixes tag is incorrect; it points to the commit introducing the
'ip_defrag.sh reproducible problem': no need to backport this patch to
every stable kernel.
Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/
Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260102140030.32367-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit acb4bc6e1ba34ae1a34a9334a1ce8474c909466e ]
Initial rss_hdr allocation uses virtio_device->device,
but virtnet_set_queues() frees using net_device->device.
This device mismatch causing below devres warning
[ 3788.514041] ------------[ cut here ]------------
[ 3788.514044] WARNING: drivers/base/devres.c:1095 at devm_kfree+0x84/0x98, CPU#16: vdpa/1463
[ 3788.514054] Modules linked in: octep_vdpa virtio_net virtio_vdpa [last unloaded: virtio_vdpa]
[ 3788.514064] CPU: 16 UID: 0 PID: 1463 Comm: vdpa Tainted: G W 6.18.0 #10 PREEMPT
[ 3788.514067] Tainted: [W]=WARN
[ 3788.514069] Hardware name: Marvell CN106XX board (DT)
[ 3788.514071] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 3788.514074] pc : devm_kfree+0x84/0x98
[ 3788.514076] lr : devm_kfree+0x54/0x98
[ 3788.514079] sp : ffff800084e2f220
[ 3788.514080] x29: ffff800084e2f220 x28: ffff0003b2366000 x27: 000000000000003f
[ 3788.514085] x26: 000000000000003f x25: ffff000106f17c10 x24: 0000000000000080
[ 3788.514089] x23: ffff00045bb8ab08 x22: ffff00045bb8a000 x21: 0000000000000018
[ 3788.514093] x20: ffff0004355c3080 x19: ffff00045bb8aa00 x18: 0000000000080000
[ 3788.514098] x17: 0000000000000040 x16: 000000000000001f x15: 000000000007ffff
[ 3788.514102] x14: 0000000000000488 x13: 0000000000000005 x12: 00000000000fffff
[ 3788.514106] x11: ffffffffffffffff x10: 0000000000000005 x9 : ffff800080c8c05c
[ 3788.514110] x8 : ffff800084e2eeb8 x7 : 0000000000000000 x6 : 000000000000003f
[ 3788.514115] x5 : ffff8000831bafe0 x4 : ffff800080c8b010 x3 : ffff0004355c3080
[ 3788.514119] x2 : ffff0004355c3080 x1 : 0000000000000000 x0 : 0000000000000000
[ 3788.514123] Call trace:
[ 3788.514125] devm_kfree+0x84/0x98 (P)
[ 3788.514129] virtnet_set_queues+0x134/0x2e8 [virtio_net]
[ 3788.514135] virtnet_probe+0x9c0/0xe00 [virtio_net]
[ 3788.514139] virtio_dev_probe+0x1e0/0x338
[ 3788.514144] really_probe+0xc8/0x3a0
[ 3788.514149] __driver_probe_device+0x84/0x170
[ 3788.514152] driver_probe_device+0x44/0x120
[ 3788.514155] __device_attach_driver+0xc4/0x168
[ 3788.514158] bus_for_each_drv+0x8c/0xf0
[ 3788.514161] __device_attach+0xa4/0x1c0
[ 3788.514164] device_initial_probe+0x1c/0x30
[ 3788.514168] bus_probe_device+0xb4/0xc0
[ 3788.514170] device_add+0x614/0x828
[ 3788.514173] register_virtio_device+0x214/0x258
[ 3788.514175] virtio_vdpa_probe+0xa0/0x110 [virtio_vdpa]
[ 3788.514179] vdpa_dev_probe+0xa8/0xd8
[ 3788.514183] really_probe+0xc8/0x3a0
[ 3788.514186] __driver_probe_device+0x84/0x170
[ 3788.514189] driver_probe_device+0x44/0x120
[ 3788.514192] __device_attach_driver+0xc4/0x168
[ 3788.514195] bus_for_each_drv+0x8c/0xf0
[ 3788.514197] __device_attach+0xa4/0x1c0
[ 3788.514200] device_initial_probe+0x1c/0x30
[ 3788.514203] bus_probe_device+0xb4/0xc0
[ 3788.514206] device_add+0x614/0x828
[ 3788.514209] _vdpa_register_device+0x58/0x88
[ 3788.514211] octep_vdpa_dev_add+0x104/0x228 [octep_vdpa]
[ 3788.514215] vdpa_nl_cmd_dev_add_set_doit+0x2d0/0x3c0
[ 3788.514218] genl_family_rcv_msg_doit+0xe4/0x158
[ 3788.514222] genl_rcv_msg+0x218/0x298
[ 3788.514225] netlink_rcv_skb+0x64/0x138
[ 3788.514229] genl_rcv+0x40/0x60
[ 3788.514233] netlink_unicast+0x32c/0x3b0
[ 3788.514237] netlink_sendmsg+0x170/0x3b8
[ 3788.514241] __sys_sendto+0x12c/0x1c0
[ 3788.514246] __arm64_sys_sendto+0x30/0x48
[ 3788.514249] invoke_syscall.constprop.0+0x58/0xf8
[ 3788.514255] do_el0_svc+0x48/0xd0
[ 3788.514259] el0_svc+0x48/0x210
[ 3788.514264] el0t_64_sync_handler+0xa0/0xe8
[ 3788.514268] el0t_64_sync+0x198/0x1a0
[ 3788.514271] ---[ end trace 0000000000000000 ]---
Fix by using virtio_device->device consistently for
allocation and deallocation
Fixes: 4944be2f5ad8c ("virtio_net: Allocate rss_hdr with devres")
Signed-off-by: Kommula Shiva Shankar <kshankar@marvell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20260102101900.692770-1-kshankar@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ffeafa65b2b26df2f5b5a6118d3174f17bd12ec5 ]
Fix the max number of bits passed to find_first_zero_bit() in
bnxt_alloc_agg_idx(). We were incorrectly passing the number of
long words. find_first_zero_bit() may fail to find a zero bit and
cause a wrong ID to be used. If the wrong ID is already in use, this
can cause data corruption. Sometimes an error like this can also be
seen:
bnxt_en 0000:83:00.0 enp131s0np0: TPA end agg_buf 2 != expected agg_bufs 1
Fix it by passing the correct number of bits MAX_TPA_P5. Use
DECLARE_BITMAP() to more cleanly define the bitmap. Add a sanity
check to warn if a bit cannot be found and reset the ring [MChan].
Fixes: ec4d8e7cf024 ("bnxt_en: Add TPA ID mapping logic for 57500 chips.")
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Srijit Bose <srijit.bose@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251231083625.3911652-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 92e6e0a87f6860a4710f9494f8c704d498ae60f8 ]
Commit 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
allocated memory for pp_qlt in ipc_mux_init() but did not free it in
ipc_mux_deinit(). This results in a memory leak when the driver is
unloaded.
Free the allocated memory in ipc_mux_deinit() to fix the leak.
Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
Co-developed-by: Jianhao Xu <jianhao.xu@seu.edu.cn>
Signed-off-by: Jianhao Xu <jianhao.xu@seu.edu.cn>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Link: https://patch.msgid.link/20251230071853.1062223-1-zilin@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8da901ffe497a53fa4ecc3ceed0e6d771586f88e ]
Fix assert lock warning while calling devl_param_driverinit_value_set()
in ena.
WARNING: net/devlink/core.c:261 at devl_assert_locked+0x62/0x90, CPU#0: kworker/0:0/9
CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.19.0-rc2+ #1 PREEMPT(lazy)
Hardware name: Amazon EC2 m8i-flex.4xlarge/, BIOS 1.0 10/16/2017
Workqueue: events work_for_cpu_fn
RIP: 0010:devl_assert_locked+0x62/0x90
Call Trace:
<TASK>
devl_param_driverinit_value_set+0x15/0x1c0
ena_devlink_alloc+0x18c/0x220 [ena]
? __pfx_ena_devlink_alloc+0x10/0x10 [ena]
? trace_hardirqs_on+0x18/0x140
? lockdep_hardirqs_on+0x8c/0x130
? __raw_spin_unlock_irqrestore+0x5d/0x80
? __raw_spin_unlock_irqrestore+0x46/0x80
? devm_ioremap_wc+0x9a/0xd0
ena_probe+0x4d2/0x1b20 [ena]
? __lock_acquire+0x56a/0xbd0
? __pfx_ena_probe+0x10/0x10 [ena]
? local_clock+0x15/0x30
? __lock_release.isra.0+0x1c9/0x340
? mark_held_locks+0x40/0x70
? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
? trace_hardirqs_on+0x18/0x140
? lockdep_hardirqs_on+0x8c/0x130
? __raw_spin_unlock_irqrestore+0x5d/0x80
? __raw_spin_unlock_irqrestore+0x46/0x80
? __pfx_ena_probe+0x10/0x10 [ena]
......
</TASK>
Fixes: 816b52624cf6 ("net: ena: Control PHC enable through devlink")
Signed-off-by: Frank Liang <xiliang@redhat.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20251231145808.6103-1-xiliang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0462a15d2d1fafd3d48cf3c7c67393e42d03908c ]
The commit which added RX steering rules for PSP forgot to free a modify
header HW object on the cleanup path, which lead to health errors when
reloading the driver and uninitializing the device:
mlx5_core 0000:08:00.0: poll_health:803:(pid 3021): Fatal error 3 detected
Fix that by saving the modify header pointer in the PSP steering struct
and deallocating it after freeing the rule which references it.
Fixes: 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20251225132717.358820-6-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 144297e2a24e3e54aee1180ec21120ea38822b97 ]
Dumping module EEPROM on newer modules is supported through the netlink
interface only.
Querying with old userspace ethtool (or other tools, such as 'lshw')
which still uses the ioctl interface results in an error message that
could flood dmesg (in addition to the expected error return value).
The original message was added under the assumption that the driver
should be able to handle all module types, but now that such flows are
easily triggered from userspace, it doesn't serve its purpose.
Change the log level of the print in mlx5_query_module_eeprom() to
debug.
Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20251225132717.358820-5-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6c75dc9de40ff91ec2b621b78f6cd9031762067c ]
Currently, the ppcnt_statistical_group capability check
incorrectly gates access to FEC histogram statistics.
This capability applies only to statistical and physical
counter groups, not for histogram data.
Restrict the ppcnt_statistical_group check to the
Physical_Layer_Counters and Physical_Layer_Statistical_Counters
groups.
Histogram statistics access remains gated by the pphcr
capability.
The issue is harmless as of today, as it happens that
ppcnt_statistical_group is set on all existing devices that
have pphcr set.
Fixes: 6b81b8a0b197 ("net/mlx5e: Don't query FEC statistics when FEC is disabled")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20251225132717.358820-3-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 31057979cdadfee9f934746fd84046b43506ba61 ]
Today multipath offload is controlled by a single route and the route
controlling is selected if it meets one of the following criteria:
1. No controlling route is set.
2. New route destination is the same as old one.
3. New route metric is lower than old route metric.
This can cause unwanted behaviour in case a new route is added
with a smaller network prefix which should get the priority.
Fix this by adding a new criteria to give priority to new route with
a smaller network prefix.
Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20251225132717.358820-2-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 02d1e1a3f9239cdb3ecf2c6d365fb959d1bf39df ]
Directly increment the TSO features incurs a side effect: it will also
directly clear the flags in NETIF_F_ALL_FOR_ALL on the master device,
which can cause issues such as the inability to enable the nocache copy
feature on the bonding driver.
The fix is to include NETIF_F_ALL_FOR_ALL in the update mask, thereby
preventing it from being cleared.
Fixes: b0ce3508b25e ("bonding: allow TSO being set on bonding master")
Signed-off-by: Di Zhu <zhud@hygon.cn>
Link: https://patch.msgid.link/20251224012224.56185-1-zhud@hygon.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2a71a1a8d0ed718b1c7a9ac61f07e5755c47ae20 ]
skbuff_fclone_cache was created without defining a usercopy region,
[1] unlike skbuff_head_cache which properly whitelists the cb[] field.
[2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is
enabled and the kernel attempts to copy sk_buff.cb data to userspace
via sock_recv_errqueue() -> put_cmsg().
The crash occurs when: 1. TCP allocates an skb using alloc_skb_fclone()
(from skbuff_fclone_cache) [1]
2. The skb is cloned via skb_clone() using the pre-allocated fclone
[3] 3. The cloned skb is queued to sk_error_queue for timestamp
reporting 4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE)
5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb
[4] 6. __check_heap_object() fails because skbuff_fclone_cache has no
usercopy whitelist [5]
When cloned skbs allocated from skbuff_fclone_cache are used in the
socket error queue, accessing the sock_exterr_skb structure in skb->cb
via put_cmsg() triggers a usercopy hardening violation:
[ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)!
[ 5.382796] kernel BUG at mm/usercopy.c:102!
[ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7
[ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80
[ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490
[ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246
[ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74
[ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0
[ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74
[ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001
[ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00
[ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000
[ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0
[ 5.384903] PKRU: 55555554
[ 5.384903] Call Trace:
[ 5.384903] <TASK>
[ 5.384903] __check_heap_object+0x9a/0xd0
[ 5.384903] __check_object_size+0x46c/0x690
[ 5.384903] put_cmsg+0x129/0x5e0
[ 5.384903] sock_recv_errqueue+0x22f/0x380
[ 5.384903] tls_sw_recvmsg+0x7ed/0x1960
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? schedule+0x6d/0x270
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 5.384903] ? mutex_unlock+0x81/0xd0
[ 5.384903] ? __pfx_mutex_unlock+0x10/0x10
[ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10
[ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0
[ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40
[ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5
The crash offset 296 corresponds to skb2->cb within skbuff_fclones:
- sizeof(struct sk_buff) = 232 - offsetof(struct sk_buff, cb) = 40 -
offset of skb2.cb in fclones = 232 + 40 = 272 - crash offset 296 =
272 + 24 (inside sock_exterr_skb.ee)
This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure.
[1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885
[2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104
[3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566
[4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491
[5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719
Fixes: 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251223203534.1392218-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 62f7edd59964eb588e96fce1ad35a2327ea54424 ]
Implement soft_reset, suspend, and resume callbacks using
genphy_soft_reset(), genphy_suspend(), and genphy_resume()
to fix PHY initialization and power management issues.
The soft_reset callback is needed to properly recover the PHY after an
ifconfig down/up cycle. Without it, the PHY can remain in power-down
state, causing MDIO register access failures during config_init().
The soft reset ensures the PHY is operational before configuration.
The suspend/resume callbacks enable proper power management during
system suspend/resume cycles.
Fixes: b2908a989c59 ("net: phy: add driver for MaxLinear MxL86110 PHY")
Signed-off-by: Stefano Radaelli <stefano.r@variscite.com>
Link: https://patch.msgid.link/20251223120940.407195-1-stefano.r@variscite.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4c0856c225b39b1def6c9a6bc56faca79550da13 ]
When the ping program uses an IPPROTO_ICMP socket to send ICMP_ECHO
messages, ICMP_MIB_OUTMSGS is counted twice.
ping_v4_sendmsg
ping_v4_push_pending_frames
ip_push_pending_frames
ip_finish_skb
__ip_make_skb
icmp_out_count(net, icmp_type); // first count
icmp_out_count(sock_net(sk), user_icmph.type); // second count
However, when the ping program uses an IPPROTO_RAW socket,
ICMP_MIB_OUTMSGS is counted correctly only once.
Therefore, the first count should be removed.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: yuan.gao <yuan.gao@ucloud.cn>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251224063145.3615282-1-yuan.gao@ucloud.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 34f3ff52cb9fa7dbf04f5c734fcc4cb6ed5d1a95 ]
Commit 15faa1f67ab4 ("lan966x: Fix crash when adding interface under a lag")
fixed a similar issue in the lan966x driver caused by a NULL pointer dereference.
The ocelot_set_aggr_pgids() function in the ocelot driver has similar logic
and is susceptible to the same crash.
This issue specifically affects the ocelot_vsc7514.c frontend, which leaves
unused ports as NULL pointers. The felix_vsc9959.c frontend is unaffected as
it uses the DSA framework which registers all ports.
Fix this by checking if the port pointer is valid before accessing it.
Fixes: 528d3f190c98 ("net: mscc: ocelot: drop the use of the "lags" array")
Signed-off-by: Jerry Wu <w.7erry@foxmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/tencent_75EF812B305E26B0869C673DD1160866C90A@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3128df6be147768fe536986fbb85db1d37806a9f ]
When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is
incorrectly stripped from frames during egress processing.
br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN
from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also
moves any "next" VLAN from the payload into hwaccel:
/* move next vlan tag to hw accel tag */
__skb_vlan_pop(skb, &vlan_tci);
__vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
For QinQ frames where the C-VLAN sits in the payload, this moves it to
hwaccel where it gets lost during VXLAN encapsulation.
Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only
the hwaccel S-VLAN and leaves the payload untouched.
This path is only taken when vlan_tunnel is enabled and tunnel_info
is configured, so 802.1Q bridges are unaffected.
Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN
preserved in VXLAN payload via tcpdump.
Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20251228020057.2788865-1-knecht.alexandre@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a428e0da1248c353557970848994f35fd3f005e2 ]
devlink_alloc() may return NULL on allocation failure, but
prestera_devlink_alloc() unconditionally calls devlink_priv() on
the returned pointer.
This leads to a NULL pointer dereference if devlink allocation fails.
Add a check for a NULL devlink pointer and return NULL early to avoid
the crash.
Fixes: 34dd1710f5a3 ("net: marvell: prestera: Add basic devlink support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Acked-by: Elad Nachman <enachman@marvell.com>
Link: https://patch.msgid.link/20251230052124.897012-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7811ba452402d58628e68faedf38745b3d485e3c ]
Currently last_gc is being updated everytime a new connection is
tracked, that means that it is updated even if a GC wasn't performed.
With a sufficiently high packet rate, it is possible to always bypass
the GC, causing the list to grow infinitely.
Update the last_gc value only when a GC has been actually performed.
Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d077e8119ddbb4fca67540f1a52453631a47f221 ]
In nf_tables_newrule(), if nft_use_inc() fails, the function jumps to
the err_release_rule label without freeing the allocated flow, leading
to a memory leak.
Fix this by adding a new label err_destroy_flow and jumping to it when
nft_use_inc() fails. This ensures that the flow is properly released
in this error case.
Fixes: 1689f25924ada ("netfilter: nf_tables: report use refcount overflow")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 014a17deb41201449f76df2b20c857a9c3294a7c ]
GPIO drivers with latch input support may miss short pulses on input
pins even when input latching is enabled. The generic interrupt logic in
the pca953x driver reports interrupts by comparing the current input
value against the previously sampled one and only signals an event when
a level change is observed between two reads.
For short pulses, the first edge is captured when the input register is
read, but if the signal returns to its previous level before the read,
the second edge is not observed. As a result, successive pulses can
produce identical input values at read time and no level change is
detected, causing interrupts to be missed. Below timing diagram shows
this situation where the top signal is the input pin level and the
bottom signal indicates the latched value.
─────┐ ┌──*───────────────┐ ┌──*─────────────────┐ ┌──*───
│ │ . │ │ . │ │ .
│ │ │ │ │ │ │ │ │
└──*──┘ │ └──*──┘ │ └──*──┘ │
Input │ │ │ │ │ │
▼ │ ▼ │ ▼ │
IRQ │ IRQ │ IRQ │
. . .
─────┐ .┌──────────────┐ .┌────────────────┐ .┌──
│ │ │ │ │ │
│ │ │ │ │ │
└────────*┘ └────────*┘ └────────*┘
Latched │ │ │
▼ ▼ ▼
READ 0 READ 0 READ 0
NO CHANGE NO CHANGE
PCAL variants provide an interrupt status register that records which
pins triggered an interrupt, but the status and input registers cannot
be read atomically. The interrupt status is only cleared when the input
port is read, and the input value must also be read to determine the
triggering edge. If another interrupt occurs on a different line after
the status register has been read but before the input register is
sampled, that event will not be reflected in the earlier status
snapshot, so relying solely on the interrupt status register is also
insufficient.
Support for input latching and interrupt status handling was previously
added by [1], but the interrupt status-based logic was reverted by [2]
due to these issues. This patch addresses the original problem by
combining both sources of information. Events indicated by the interrupt
status register are merged with events detected through the existing
level-change logic. As a result:
* short pulses, whose second edges are invisible, are detected via the
interrupt status register, and
* interrupts that occur between the status and input reads are still
caught by the generic level-change logic.
This significantly improves robustness on devices that signal interrupts
as short pulses, while avoiding the issues that led to the earlier
reversion. In practice, even if only the first edge of a pulse is
observable, the interrupt is reliably detected.
This fixes missed interrupts from an Ilitek touch controller with its
interrupt line connected to a PCAL6416A, where active-low pulses are
approximately 200 us long.
[1] commit 44896beae605 ("gpio: pca953x: add PCAL9535 interrupt support for Galileo Gen2")
[2] commit d6179f6c6204 ("gpio: pca953x: Improve interrupt support")
Fixes: d6179f6c6204 ("gpio: pca953x: Improve interrupt support")
Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20251217153050.142057-1-ernestvanhoecke@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a7ac22d53d0990152b108c3f4fe30df45fcb0181 ]
If two drivers were calling gpiochip_add_data_with_key(), one may be
traversing the srcu-protected list in gpio_name_to_desc(), meanwhile
other has just added its gdev in gpiodev_add_to_list_unlocked().
This creates a non-mutexed and non-protected timeframe, when one
instance is dereferencing and using &gdev->srcu, before the other
has initialized it, resulting in crash:
[ 4.935481] Unable to handle kernel paging request at virtual address ffff800272bcc000
[ 4.943396] Mem abort info:
[ 4.943400] ESR = 0x0000000096000005
[ 4.943403] EC = 0x25: DABT (current EL), IL = 32 bits
[ 4.943407] SET = 0, FnV = 0
[ 4.943410] EA = 0, S1PTW = 0
[ 4.943413] FSC = 0x05: level 1 translation fault
[ 4.943416] Data abort info:
[ 4.943418] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 4.946220] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 4.955261] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 4.955268] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000038e6c000
[ 4.961449] [ffff800272bcc000] pgd=0000000000000000
[ 4.969203] , p4d=1000000039739003
[ 4.979730] , pud=0000000000000000
[ 4.980210] phandle (CPU): 0x0000005e, phandle (BE): 0x5e000000 for node "reset"
[ 4.991736] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
...
[ 5.121359] pc : __srcu_read_lock+0x44/0x98
[ 5.131091] lr : gpio_name_to_desc+0x60/0x1a0
[ 5.153671] sp : ffff8000833bb430
[ 5.298440]
[ 5.298443] Call trace:
[ 5.298445] __srcu_read_lock+0x44/0x98
[ 5.309484] gpio_name_to_desc+0x60/0x1a0
[ 5.320692] gpiochip_add_data_with_key+0x488/0xf00
5.946419] ---[ end trace 0000000000000000 ]---
Move initialization code for gdev fields before it is added to
gpio_devices, with adjacent initialization code.
Adjust goto statements to reflect modified order of operations
Fixes: 47d8b4c1d868 ("gpio: add SRCU infrastructure to struct gpio_device")
Reviewed-by: Jakub Lewalski <jakub.lewalski@nokia.com>
Signed-off-by: Paweł Narewski <pawel.narewski@nokia.com>
[Bartosz: fixed a build issue, removed stray newline]
Link: https://lore.kernel.org/r/20251224082641.10769-1-bartosz.golaszewski@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d4f335b410ddbe3e99f48f8b5ea78a25041274f1 ]
The chip_$level() macros take struct gpio_chip as argument so make it
follow the convention of using the 'gpiochip_' prefix.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Stable-dep-of: a7ac22d53d09 ("gpiolib: fix race condition for gdev->srcu")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0ba6f1ed3808b1f095fbdb490006f0ecd00f52bd ]
We don't need to add additional logs when returning -ENOMEM so remove
unnecessary error messages.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Stable-dep-of: a7ac22d53d09 ("gpiolib: fix race condition for gdev->srcu")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 36a3200575642846a96436d503d46544533bb943 ]
During nft_synproxy eval we are reading nf_synproxy_info struct which
can be modified on update operation concurrently. As nf_synproxy_info
struct fits in 32 bits, use READ_ONCE/WRITE_ONCE annotations.
Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7711f4bb4b360d9c0ff84db1c0ec91e385625047 ]
set->klen has to be used, not sizeof(). The latter only compares a
single register but a full check of the entire key is needed.
Example:
table ip t {
map s {
typeof iifname . ip saddr : verdict
flags interval
}
}
nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected
nft add element t s '{ "lo" . 10.0.0.0/24 : drop }' # no error, expected
nft add element t s '{ "lo" . 10.0.0.0/8 : drop }' # bug: no error
The 3rd 'add element' should be rejected via -ENOTEMPTY, not -EEXIST,
so userspace / nft can report an error to the user.
The latter is only correct for the 2nd case (re-add of existing element).
As-is, userspace is told that the command was successful, but no elements were
added.
After this patch, 3rd command gives:
Error: Could not process rule: File exists
add element t s { "lo" . 127.0.0.0/8 . "lo" : drop }
^^^^^^^^^^^^^^^^^^^^^^^^^
Fixes: 0eb4b5ee33f2 ("netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 89e87d0dc87eb3654c9ae01afc4a18c1c6d1e523 ]
Ethernet PHY interrupt mode is level triggered. Adjust the mode
accordingly.
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 70cf622bb16e ("arm64: dts: mba8mx: Add Ethernet PHY IRQ support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a988caeed9d918452aa0a68de2c6e94d86aa43ba ]
The commit 616effc0272b5 ("arm64: dts: imx8: Fix lpuart DMA channel
order") swap uart rx and tx channel at common imx8-ss-dma.dtsi. But miss
update imx8qm-ss-dma.dtsi.
The commit 5a8e9b022e569 ("arm64: dts: imx8qm-ss-dma: Pass lpuart
dma-names") just simple add dma-names as binding doc requirement.
Correct lpuart0 - lpuart3 dma rx and tx channels, and use defines for
the FSL_EDMA_RX flag.
Fixes: 5a8e9b022e56 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
i.MX8M Plus DHCOM
[ Upstream commit c63749a7ddc59ac6ec0b05abfa0a21af9f2c1d38 ]
Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver
to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link
bouncing caused by interruptions on the PHY reference clock, by letting the
PHY driver manage the reference clock and assure there are no interruptions.
This follows the matching PHY driver recommendation described in commit
bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support")
Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2")
Signed-off-by: Marek Vasut <marek.vasut@mailbox.org>
Tested-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cdf4e631eec5ddd49bb625df9fb144d6ecdd6f15 ]
On this SoM eqos is the primary ethernet interface, Ka-Ro fuses the
address for it in eth_mac1, eth_mac2 seems to be left unfused. In their
downstream u-boot they fetch it from eth_mac1 [1][2], by setting alias
of eqos to ethernet0, the driver then fetches the mac address based on
the alias number.
Set eqos to read from eth_mac1 instead of eth_mac2. Also set fec to
point at eth_mac2 as it may be fused later even though it is disabled
by default.
With this changed barebox is now capable of loading the correct address.
Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/drivers/net/dwc_eth_qos.c#L1167 [1]
Link: https://github.com/karo-electronics/karo-tx-uboot/blob/380543278410bbf04264d80a3bfbe340b8e62439/arch/arm/dts/imx8mp-karo.dtsi#L12 [2]
Fixes: bac63d7c5f46 ("arm64: dts: freescale: add Ka-Ro Electronics tx8p-ml81 COM")
Signed-off-by: Maud Spierings <maudspierings@gocontroll.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|