summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-11-08scsi: lpfc: Fix leaked ndlp krefs when in point-to-point topologyJustin Tee
In point-to-point topology, the driver sometimes defers the unsolicited FLOGI LS_ACC until it sends its FLOGI to the remote port. When this happens, lpfc neglects to release the ndlp allocated for the unsolicited FLOGI. This patch adds code to release the ndlp for the deferred unsolicited FLOGI LS_ACC. An NLP_FLOGI_DFR_ACC flag is introduced to facilitate identifying an ndlp with an expected deferred FLOGI LS_ACC completion. When lpfc_cmpl_els_rsp() detects the correct qualifiers, it releases the initial reference on the ndlp object. And when lpfc_cmpl_els_rsp() exits, the remaining put for the deferred action is executed and the ndlp is released. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: lpfc: Ensure unregistration of rpis for received PLOGIsJustin Tee
Unregistration of an rpi object should be done when a PLOGI is received as PLOGI receipt implies an implicit LOGO. Previously, the driver would continue using the same, already registered, rpi and ACC the received PLOGI. Replace the ACC and early return statement with break to execute the rest of the lpfc_rcv_plogi logic outside the switch case statement. This ensures unregistration and reregistration of an rpi after PLOGI_ACC completion. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: lpfc: Remove redundant NULL ptr assignment in lpfc_els_free_iocb()Justin Tee
Remove redundant cmd_dmabuf and bpl_dmabuf NULL ptr assignment as they are already initialized to NULL when handling a new unsolicited event. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: lpfc: Revise discovery related function headers and commentsJustin Tee
Correcting discovery related function headers, return status information, and comment descriptions. There are no functional changes. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: lpfc: Update various NPIV diagnostic log messagingJustin Tee
Update PRLI status log message to automatically warn when CQE status is non-zero. When issuing an RSCN, log ndlp's kref count to the debugfs trace buffer. Add the NPIV virtual port index to the FDMI registration log message with the fabric. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08Merge patch series "smartpqi updates"Martin K. Petersen
Don Brace <don.brace@microchip.com> says: These patches are based on Martin Petersen's 6.19/scsi-queue tree https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git 6.19/scsi-queue This patch series includes four patches, with two main functional changes: 1. smartpqi-Add-timeout-value-to-RAID-path-requests-to-physical-devices Sets a timeout value for requests sent to physical devices via the RAID path. This prevents the controller firmware from waiting indefinitely and allows it to time out requests a few seconds before the OS issues Target Management Function (TMF) commands. The timeout value sent to the firmware is set to 3 seconds less than the OS timeout, which significantly reduces TMF invocations. 2. smartpqi-fix-Device-resources-accessed-after-device-removal Fixes a race condition during device removal by: * Checking for device removal in the reset handler and canceling any pending reset work if the device is no longer present. * Canceling outstanding TMF work items in the .sdev_destroy handler. Together, these changes eliminate races between reset operations and device removal. The other two patches: 3. smartpqi-add-new-Hurray-Data-pci-device Adds support for new Hurray Data PCI device. No functional changes. 4. smartpqi-update-driver-version-to-2.1.36-026 Updates the driver version string. No functional changes. Link: https://patch.msgid.link/20251106163823.786828-1-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: smartpqi: Update version to 2.1.36-026Don Brace
Update driver version to 2.1.36-026 Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-5-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: smartpqi: Add support for Hurray Data new controller PCI deviceDavid Strahan
Add support for new Hurray Data controller. All entries are in HEX. Add PCI IDs for Hurray Data controllers: VID / DID / SVID / SDID ---- ---- ---- ---- 9005 028f 207d 4840 Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: David Strahan <David.Strahan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: smartpqi: Fix device resources accessed after device removalMike McGowen
Correct possible race conditions during device removal. Previously, a scheduled work item to reset a LUN could still execute after the device was removed, leading to use-after-free and other resource access issues. This race condition occurs because the abort handler may schedule a LUN reset concurrently with device removal via sdev_destroy(), leading to use-after-free and improper access to freed resources. - Check in the device reset handler if the device is still present in the controller's SCSI device list before running; if not, the reset is skipped. - Cancel any pending TMF work that has not started in sdev_destroy(). - Ensure device freeing in sdev_destroy() is done while holding the LUN reset mutex to avoid races with ongoing resets. Fixes: 2d80f4054f7f ("scsi: smartpqi: Update deleting a LUN via sysfs") Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: smartpqi: Add timeout value to RAID path requests to physical devicesMike McGowen
Add a timeout value to requests sent to physical devices via the RAID path. A timeout value of zero means wait indefinitely, which may cause the OS to issue Target Management Function (TMF) commands if the device does not respond. For input timeouts of 8 seconds or greater, the value sent to firmware is reduced by 3 seconds to provide an earlier firmware timeout and allow the OS additional time before timing out. This change improves timeout handling between the driver, firmware, and OS, helping to better manage device responsiveness and avoid indefinite waits. Reviewed-by: David Strahan <david.strahan@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-2-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: ufs: ti-j721e: Add suspend-resume supportThomas Richard (TI.com)
Restore the ctrl register to resume the TI UFS wrapper. Signed-off-by: Thomas Richard (TI.com) <thomas.richard@bootlin.com> Link: https://patch.msgid.link/20251106-scsi-ufs-ti-j721e-suspend-resume-support-v1-1-6f395f51219e@bootlin.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: target: tcm_loop: Fix segfault in tcm_loop_tpg_address_show()Hamza Mahfooz
If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we attempt to dereference it in tcm_loop_tpg_address_show() we will get a segfault, see below for an example. So, check tl_hba->sh before dereferencing it. Unable to allocate struct scsi_host BUG: kernel NULL pointer dereference, address: 0000000000000194 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024 RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop] ... Call Trace: <TASK> configfs_read_iter+0x12d/0x1d0 [configfs] vfs_read+0x1b5/0x300 ksys_read+0x6f/0xf0 ... Cc: stable@vger.kernel.org Fixes: 2628b352c3d4 ("tcm_loop: Show address of tpg in configfs") Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Allen Pais <apais@linux.microsoft.com> Link: https://patch.msgid.link/1762370746-6304-1-git-send-email-hamzamahfooz@linux.microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: fcoe: Add WQ_PERCPU to alloc_workqueue() usersMarco Crivellari
Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. Continue the effort to refactor workqueue APIs, which has begun with the change introducing new workqueues and a new alloc_workqueue flag: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") Adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-CPU when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://patch.msgid.link/20251105150336.244079-1-marco.crivellari@suse.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: st: Skip buffer flush for information ioctlsDavid Jeffery
With commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") some customer tape applications fail from being unable to complete ioctls to verify ID information for the device when there has been any type of reset event to their tape devices. The st driver currently will fail all standard SCSI ioctls if a call to flush_buffer() fails in st_ioctl(). This causes ioctls which otherwise have no effect on tape state to succeed or fail based on events unrelated to the requested ioctl. This makes SCSI information ioctls unreliable after a reset even if no buffering is in use. With a reset setting the pos_unknown field, flush_buffer() will report failure and fail all ioctls. So any application expecting to use ioctls to check the identify the device will be unable to do so in such a state. For SCSI information ioctls, avoid the need for a buffer flush and allow the ioctls to execute regardless of buffer state. Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Link: https://patch.msgid.link/20251104154709.6436-2-djeffery@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: st: Separate st-unique ioctl handling from SCSI common ioctl handlingDavid Jeffery
The st ioctl function currently interleaves code for handling various st specific ioctls with parts of code needed for handling ioctls common to all SCSI devices. Separate out st's code for the common ioctls into a more manageable, separate function. Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Link: https://patch.msgid.link/20251104154709.6436-1-djeffery@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08scsi: stex: Fix reboot_notifier leak in probe error pathHaotian Zhang
In stex_probe(), register_reboot_notifier() is called at the beginning, but if any subsequent initialization step fails, the function returns without unregistering the notifier, resulting in a resource leak. Add unregister_reboot_notifier() in the out_disable error path to ensure proper cleanup on all failure paths. Fixes: 61b745fa63db ("scsi: stex: Add S6 support") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251104094847.270-1-vulab@iscas.ac.cn Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-11-08nbd: defer config put in recv_workZheng Qixing
There is one uaf issue in recv_work when running NBD_CLEAR_SOCK and NBD_CMD_RECONFIGURE: nbd_genl_connect // conf_ref=2 (connect and recv_work A) nbd_open // conf_ref=3 recv_work A done // conf_ref=2 NBD_CLEAR_SOCK // conf_ref=1 nbd_genl_reconfigure // conf_ref=2 (trigger recv_work B) close nbd // conf_ref=1 recv_work B config_put // conf_ref=0 atomic_dec(&config->recv_threads); -> UAF Or only running NBD_CLEAR_SOCK: nbd_genl_connect // conf_ref=2 nbd_open // conf_ref=3 NBD_CLEAR_SOCK // conf_ref=2 close nbd nbd_release config_put // conf_ref=1 recv_work config_put // conf_ref=0 atomic_dec(&config->recv_threads); -> UAF Commit 87aac3a80af5 ("nbd: call nbd_config_put() before notifying the waiter") moved nbd_config_put() to run before waking up the waiter in recv_work, in order to ensure that nbd_start_device_ioctl() would not be woken up while nbd->task_recv was still uncleared. However, in nbd_start_device_ioctl(), after being woken up it explicitly calls flush_workqueue() to make sure all current works are finished. Therefore, there is no need to move the config put ahead of the wakeup. Move nbd_config_put() to the end of recv_work, so that the reference is held for the whole lifetime of the worker thread. This makes sure the config cannot be freed while recv_work is still running, even if clear + reconfigure interleave. In addition, we don't need to worry about recv_work dropping the last nbd_put (which causes deadlock): path A (netlink with NBD_CFLAG_DESTROY_ON_DISCONNECT): connect // nbd_refs=1 (trigger recv_work) open nbd // nbd_refs=2 NBD_CLEAR_SOCK close nbd nbd_release nbd_disconnect_and_put flush_workqueue // recv_work done nbd_config_put nbd_put // nbd_refs=1 nbd_put // nbd_refs=0 queue_work path B (netlink without NBD_CFLAG_DESTROY_ON_DISCONNECT): connect // nbd_refs=2 (trigger recv_work) open nbd // nbd_refs=3 NBD_CLEAR_SOCK // conf_refs=2 close nbd nbd_release nbd_config_put // conf_refs=1 nbd_put // nbd_refs=2 recv_work done // conf_refs=0, nbd_refs=1 rmmod // nbd_refs=0 Reported-by: syzbot+56fbf4c7ddf65e95c7cc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6907edce.a70a0220.37351b.0014.GAE@google.com/T/ Fixes: 87aac3a80af5 ("nbd: make the config put is called before the notifying the waiter") Depends-on: e2daec488c57 ("nbd: Fix hungtask when nbd_config_put") Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-08kbuild: Rename Makefile.extrawarn to Makefile.warnNathan Chancellor
Since commit e88ca24319e4 ("kbuild: consolidate warning flags in scripts/Makefile.extrawarn"), scripts/Makefile.extrawarn contains all warnings for the main kernel build, not just warnings enabled by the values for W=. Rename it to scripts/Makefile.warn to make it clearer that this Makefile is where all Kbuild warning handling should exist. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20251023-rename-scripts-makefile-extrawarn-v1-1-8f7531542169@kernel.org Signed-off-by: Nicolas Schier <nsc@kernel.org>
2025-11-08PM: wakeup: Delete timer before removing wakeup source from listKaushlendra Kumar
Replace timer_delete_sync() with timer_shutdown_sync() and move it before list_del_rcu() in wakeup_source_remove() to improve the cleanup ordering and code clarity. This ensures that the timer is stopped before removing the wakeup source from the events list, providing a more logical cleanup sequence. While the current ordering is functionally correct, stopping the timer first makes the cleanup flow more intuitive and follows the general pattern of disabling active components before removing data structures. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251027044127.2456365-1-kaushlendra.kumar@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-08Merge tag 'kbuild-ms-extensions-6.19' into kbuild-nextNicolas Schier
Shared branch between Kbuild and other trees for enabling '-fms-extensions' for 6.19 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nicolas Schier <nsc@kernel.org>
2025-11-08clk: lan966x: remove unused dt-bindings includeRobert Marko
In preparation for LAN969x support, all instances referring to defines in the LAN966x specific header were dropped, so its safe to drop its inclusion in the driver. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20250924202810.1641883-1-robert.marko@sartura.hr Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
2025-11-08md/raid5: remove redundant __GFP_NOWARNHuiwen He
The __GFP_NOWARN flag was included in GFP_NOWAIT since commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT"). So remove the redundant __GFP_NOWARN flag. Link: https://lore.kernel.org/linux-raid/20251102152540.871568-1-hehuiwen@kylinos.cn Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08md: avoid repeated calls to del_gendiskXiao Ni
There is a uaf problem which is found by case 23rdev-lifetime: Oops: general protection fault, probably for non-canonical address 0xdead000000000122 RIP: 0010:bdi_unregister+0x4b/0x170 Call Trace: <TASK> __del_gendisk+0x356/0x3e0 mddev_unlock+0x351/0x360 rdev_attr_store+0x217/0x280 kernfs_fop_write_iter+0x14a/0x210 vfs_write+0x29e/0x550 ksys_write+0x74/0xf0 do_syscall_64+0xbb/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff5250a177e The sequence is: 1. rdev remove path gets reconfig_mutex 2. rdev remove path release reconfig_mutex in mddev_unlock 3. md stop calls do_md_stop and sets MD_DELETED 4. rdev remove path calls del_gendisk because MD_DELETED is set 5. md stop path release reconfig_mutex and calls del_gendisk again So there is a race condition we should resolve. This patch adds a flag MD_DO_DELETE to avoid the race condition. Link: https://lore.kernel.org/linux-raid/20251029063419.21700-1-xni@redhat.com Fixes: 9e59d609763f ("md: call del_gendisk in control path") Signed-off-by: Xiao Ni <xni@redhat.com> Suggested-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Li Nan <linan122@huawei.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08Revert "drm/nouveau: set DMA mask before creating the flush page"Dave Airlie
This reverts commit ebe755605082eddff80eafe0c50915b1366ee98f. Tested the latest kernel on my GB203 and this seems to break it somehow. Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: GSP-FMC boot failed (mbox: 0x0000000b) Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: init failed, -5 Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: init failed with -5 Nov 09 04:16:14 bighp kernel: nouveau: drm:00000000:00000080: init failed with -5 Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: drm: Device allocation failed: -5 Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: probe with driver nouveau failed with error -5 Not sure why, I went over the patch and thought it should have worked, but there must be some 32-bit problem maybe in the FMC boot path. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-08md/md-llbitmap: Remove unneeded semicolonChen Ni
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Link: https://lore.kernel.org/linux-raid/20250910091912.25624-1-nichen@iscas.ac.cn Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08md/md-linear: Enable atomic writesJohn Garry
All the infrastructure has already been plumbed to support this for stacked devices, so just enable the request_queue limits features flag. A note about chunk sectors for linear arrays: While it is possible to set a chunk sectors param for building a linear array, this is for specifying the granularity at which data sectors from the device are used. It is not the same as a stripe size, like for RAID0. As such, it is not appropriate to set chunk_sectors request queue limit to the same value, as chunk_sectors request limit is a boundary for which requests cannot straddle. However, request_queue limit max_hw_sectors is set to chunk sectors, which almost has the same effect as setting chunk_sectors limit. Link: https://lore.kernel.org/linux-raid/20250903161052.3326176-1-john.g.garry@oracle.com Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Yu Kuai <yukuai3@fnnas.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08Factor out code into md_should_do_recovery()Wu Guanghao
In md_check_recovery(), use new helper to make code cleaner. Link: https://lore.kernel.org/linux-raid/e62894c8-d916-94bc-ef48-3c60e6e1fc5d@huawei.com Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Reviewed-by: Yu Kuai <yukuai3@fnnas.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08md: fix rcu protection in md_wakeup_threadYun Zhou
We attempted to use RCU to protect the pointer 'thread', but directly passed the value when calling md_wakeup_thread(). This means that the RCU pointer has been acquired before rcu_read_lock(), which renders rcu_read_lock() ineffective and could lead to a use-after-free. Link: https://lore.kernel.org/linux-raid/20251015083227.1079009-1-yun.zhou@windriver.com Fixes: 446931543982 ("md: protect md_thread with rcu") Signed-off-by: Yun Zhou <yun.zhou@windriver.com> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai@fnnas.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-08md: delete mddev kobj before deleting gendisk kobjXiao Ni
In sync del gendisk path, it deletes gendisk first and the directory /sys/block/md is removed. Then it releases mddev kobj in a delayed work. If we enable debug log in sysfs_remove_group, we can see the debug log 'sysfs group bitmap not found for kobject md'. It's the reason that the parent kobj has been deleted, so it can't find parent directory. In creating path, it allocs gendisk first, then adds mddev kobj. So it should delete mddev kobj before deleting gendisk. Before commit 9e59d609763f ("md: call del_gendisk in control path"), it releases mddev kobj first. If the kobj hasn't been deleted, it does clean job and deletes the kobj. Then it calls del_gendisk and releases gendisk kobj. So it doesn't need to call kobject_del to delete mddev kobj. After this patch, in sync del gendisk path, the sequence changes. So it needs to call kobject_del to delete mddev kobj. After this patch, the sequence is: 1. kobject del mddev kobj 2. del_gendisk deletes gendisk kobj 3. mddev_delayed_delete releases mddev kobj 4. md_kobj_release releases gendisk kobj Link: https://lore.kernel.org/linux-raid/20250928012424.61370-1-xni@redhat.com Fixes: 9e59d609763f ("md: call del_gendisk in control path") Signed-off-by: Xiao Ni <xni@redhat.com> Reviewed-by: Li Nan <linan122@huawei.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2025-11-07Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-11-06 (i40, ice, iavf) Mohammad Heib introduces a new devlink parameter, max_mac_per_vf, for controlling the maximum number of MAC address filters allowed by a VF. This allows administrators to control the VF behavior in a more nuanced manner. Aleksandr and Przemek add support for Receive Side Scaling of GTP to iAVF for VFs running on E800 series ice hardware. This improves performance and scalability for virtualized network functions in 5G and LTE deployments. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: add RSS support for GTP protocol via ethtool ice: Extend PTYPE bitmap coverage for GTP encapsulated flows ice: improve TCAM priority handling for RSS profiles ice: implement GTP RSS context tracking and configuration ice: add virtchnl definitions and static data for GTP RSS ice: add flow parsing for GTP and new protocol field support i40e: support generic devlink param "max_mac_per_vf" devlink: Add new "max_mac_per_vf" generic device param ==================== Link: https://patch.msgid.link/20251106225321.1609605-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: sti: use ->set_phy_intf_sel()Russell King (Oracle)
Rather than placing the phy_intf_sel() setup in the ->init() method, move it to the new ->set_phy_intf_sel() method. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5o-0000000DhQn-34JE@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: sti: use stmmac_get_phy_intf_sel()Russell King (Oracle)
Use stmmac_get_phy_intf_sel() to decode the PHY interface mode to the phy_intf_sel value, validate the result and use that to set the control register to select the operating mode for the DWMAC core. Note that when an unsupported interface mode is used, the array would decode this to PHY_INTF_SEL_GMII_MII, so preserve this behaviour. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5j-0000000DhQh-2e0x@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: sti: use PHY_INTF_SEL_x directlyRussell King (Oracle)
Use the PHY_INTF_SEL_x values directly rather than the driver private ETH_PHY_SEL_x values. Move the FIELD_PREP() into sti_dwmac_set_mode(). Use dwmac->interface directly. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5e-0000000DhQb-2B7I@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: sti: use PHY_INTF_SEL_x to select PHY interfaceRussell King (Oracle)
Use the common dwmac definitions for the PHY interface selection field, adding MII_PHY_SEL_VAL() temporarily to avoid line wrapping. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5Z-0000000DhQV-1e2l@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: lpc18xx: use ->set_phy_intf_sel()Russell King (Oracle)
Move the configuration of the dwmac PHY interface selection to the new ->set_phy_intf_sel() method. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5U-0000000DhQP-19Hd@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: lpc18xx: validate phy_intf_selRussell King (Oracle)
Validate the phy_intf_sel value rather than the PHY interface mode. This will allow us to transition to the ->set_phy_intf_sel() method. Note that this will allow GMII as well as MII as the phy_intf_sel value is the same for both. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5P-0000000DhQJ-0Oi3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: lpc18xx: use stmmac_get_phy_intf_sel()Russell King (Oracle)
Use stmmac_get_phy_intf_sel() to decode the PHY interface mode to the phy_intf_sel value, and use the result to program the ethernet mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5J-0000000DhQD-46Ob@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: lpc18xx: use PHY_INTF_SEL_x directlyRussell King (Oracle)
Use the PHY_INTF_SEL_x values directly rather than the driver private LPC18XX_CREG_CREG6_ETHMODE_x definitions, and convert LPC18XX_CREG_CREG6_ETHMODE_MASK to use GENMASK(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy5E-0000000DhQ7-3cuy@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: stmmac: lpc18xx: convert to PHY_INTF_SEL_xRussell King (Oracle)
Use the common dwmac definitions for the PHY interface selection field. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vGy59-0000000DhQ1-393H@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: phy: micrel: lan8814 fix reset of the QSGMII interfaceHoratiu Vultur
The lan8814 is a quad-phy and it is using QSGMII towards the MAC. The problem is that everytime when one of the ports is configured then the PCS is reseted for all the PHYs. Meaning that the other ports can loose traffic until the link is establish again. To fix this, do the reset one time for the entire PHY package. Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Divya Koppera <Divya.Koppera@microchip.com > Link: https://patch.msgid.link/20251106090637.2030625-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: fec: correct rx_bytes statistic for the case SHIFT16 is setWei Fang
Two additional bytes in front of each frame received into the RX FIFO if SHIFT16 is set, so we need to subtract the extra two bytes from pkt_len to correct the statistic of rx_bytes. Fixes: 3ac72b7b63d5 ("net: fec: align IP header in hardware") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251106021421.2096585-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07netdevsim: implement psp device statsDaniel Zahka
For now only tx/rx packets/bytes are reported. This is not compliant with the PSP Architecture Specification. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251106002608.1578518-6-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net/mlx5e: Add PSP stats support for Rx/Tx flowsJakub Kicinski
Add all statistics described under the "Implementation Requirements" section of the PSP Architecture Specification: Rx successfully decrypted PSP packets: psp_rx_pkts : Number of packets decrypted successfully psp_rx_bytes : Number of bytes decrypted successfully Rx PSP authentication failure statistics: psp_rx_pkts_auth_fail : Number of PSP packets that failed authentication psp_rx_bytes_auth_fail : Number of PSP bytes that failed authentication Rx PSP bad frame error statistics: psp_rx_pkts_frame_err; psp_rx_bytes_frame_err; Rx PSP drop statistics: psp_rx_pkts_drop : Number of PSP packets dropped psp_rx_bytes_drop : Number of PSP bytes dropped Tx successfully encrypted PSP packets: psp_tx_pkts : Number of packets encrypted successfully psp_tx_bytes : Number of bytes encrypted successfully Tx drops: tx_drop : Number of misc psp related drops The above can be seen using the ynl cli: ./pyynl/cli.py --spec netlink/specs/psp.yaml --dump get-stats Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251106002608.1578518-5-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: phy: fixed_phy: shrink size of struct fixed_phy_statusHeiner Kallweit
All three members are effectively of type bool, so make this explicit and shrink size of struct fixed_phy_status. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/9eca3d7e-fa64-4724-8fdc-f2c1a8f2ae8f@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: phy: microchip_t1s:: add cable diagnostic support for LAN867x Rev.D0Parthiban Veerasooran
Enable Open Alliance TC14 (OATC14) 10Base-T1S cable diagnostic feature for Microchip LAN867x Rev.D0 PHY by implementing `cable_test_start` and `cable_test_get_status` using the generic C45 functions. This allows the `ethtool` utility to perform cable diagnostic tests directly on the PHY, improving network troubleshooting and maintenance. Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20251105051213.50443-3-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: phy: phy-c45: add OATC14 10BASE-T1S PHY cable diagnostic supportParthiban Veerasooran
Add support for Open Alliance TC14 (OATC14) 10BASE-T1S PHYs cable diagnostic feature. This patch implements: - genphy_c45_oatc14_cable_test_start() to initiate a cable test - genphy_c45_oatc14_cable_test_get_status() to retrieve test results - Helper function to map PHY cable test status to ethtool result codes - Function declarations and exports for use by PHY drivers This enables ethtool to report ok, open, short, and undetectable cable conditions on OATC14 10Base-T1S PHYs. Open Alliance TC14 10BASE-T1S Advanced Diagnostic PHY Features Specification ref: https://opensig.org/wp-content/uploads/2025/06/OPEN_Alliance_10BASE-T1S_Advanced_PHY_features_for-automotive_Ethernet_V2.1b.pdf Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20251105051213.50443-2-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: mana: Fix incorrect speed reported by debugfsErni Sri Satya Vennela
Once the netshaper is created for MANA, the current bandwidth is reported in debugfs like this: $ sudo ./tools/net/ynl/pyynl/cli.py \ --spec Documentation/netlink/specs/net_shaper.yaml \ --do set \ --json '{"ifindex":'3', "handle":{ "scope": "netdev", "id":'1' }, "bw-max": 200000000 }' None $ sudo cat /sys/kernel/debug/mana/1/vport0/current_speed 200 After the shaper is deleted, it is expected to report the maximum speed supported by the SKU. But currently it is reporting 0, which is incorrect. Fix this inconsistency, by resetting apc->speed to apc->max_speed during deletion of the shaper object. This will improve readability and debuggability. Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1762369468-32570-1-git-send-email-ernis@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07net: airoha: Add the capability to consume out-of-order DMA tx descriptorsLorenzo Bianconi
EN7581 and AN7583 SoCs are capable of DMA mapping non-linear tx skbs on non-consecutive DMA descriptors. This feature is useful when multiple flows are queued on the same hw tx queue since it allows to fully utilize the available tx DMA descriptors and to avoid the starvation of high-priority flow we have in the current codebase due to head-of-line blocking introduced by low-priority flows. Tested-by: Xuegang Lu <xuegang.lu@airoha.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20251106-airoha-tx-linked-list-v2-1-0706d4a322bd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-08gpu: nova-core: justify remaining uses of `as`Alexandre Courbot
There are a few remaining cases where we *do* want to use `as`, because we specifically want to strip the data that does not fit into the destination type. Comment these uses to clear confusion about the intent. Acked-by: Danilo Krummrich <dakr@kernel.org> [acourbot@nvidia.com: fix merge conflicts after rebase.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251029-nova-as-v3-6-6a30c7333ad9@nvidia.com>
2025-11-08gpu: nova-core: replace use of `as` with functions from `num`Alexandre Courbot
Use the newly-introduced `num` module to replace the use of `as` wherever it is safe to do. This ensures that a given conversion cannot lose data if its source or destination type ever changes. Acked-by: Danilo Krummrich <dakr@kernel.org> [acourbot@nvidia.com: fix merge conflicts after rebase.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251029-nova-as-v3-5-6a30c7333ad9@nvidia.com>