linux-stable.git/drivers/scsi, branch v6.6.78

scsi: storvsc: Set correct data length for sending SCSI command without payload

2025-02-17T08:40:27+00:00

commit 87c4b5e8a6b65189abd9ea5010ab308941f964a4 upstream.

In StorVSC, payload->range.len is used to indicate if this SCSI command
carries payload. This data is allocated as part of the private driver data
by the upper layer and may get passed to lower driver uninitialized.

For example, the SCSI error handling mid layer may send TEST_UNIT_READY or
REQUEST_SENSE while reusing the buffer from a failed command. The private
data section may have stale data from the previous command.

If the SCSI command doesn't carry payload, the driver may use this value as
is for communicating with host, resulting in possible corruption.

Fix this by always initializing this value.

Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host")
Cc: stable@kernel.org
Tested-by: Roman Kisel 
Reviewed-by: Roman Kisel 
Reviewed-by: Michael Kelley 
Signed-off-by: Long Li 
Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: qla2xxx: Move FCE Trace buffer allocation to user control

2025-02-17T08:40:26+00:00

commit 841df27d619ee1f5ca6473e15227b39d6136562d upstream.

Currently FCE Tracing is enabled to log additional ELS events. Instead,
user will enable or disable this feature through debugfs.

Modify existing DFS knob to allow user to enable or disable this
feature.

echo [1 | 0] > /sys/kernel/debug/qla2xxx/qla2xxx_??/fce
cat  /sys/kernel/debug/qla2xxx/qla2xxx_??/fce

Cc: stable@vger.kernel.org
Fixes: df613b96077c ("[SCSI] qla2xxx: Add Fibre Channel Event (FCE) tracing support.")
Signed-off-by: Quinn Tran 
Signed-off-by: Nilesh Javali 
Link: https://lore.kernel.org/r/20241115130313.46826-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: st: Don't set pos_unknown just after device recognition

2025-02-17T08:40:26+00:00

commit 98b37881b7492ae9048ad48260cc8a6ee9eb39fd upstream.

Commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") in
v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA)
sense data. This was in addition to the existing method. When this Unit
Attention is received, the driver blocks attempts to read, write and some
other operations because the reset may have rewinded the tape. Because of
the added code, also the initial POR UA resulted in blocking operations,
including those that are used to set the driver options after the device is
recognized. Also, reading and writing are refused, whereas they succeeded
before this commit.

Add code to not set pos_unknown to block operations if the POR UA is
received from the first test_ready() call after the st device has been
created. This restores the behavior before v6.6.

Signed-off-by: Kai Mäkisara 
Link: https://lore.kernel.org/r/20241216113755.30415-1-Kai.Makisara@kolumbus.fi
Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
CC: stable@vger.kernel.org
Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kolumbus.fi/
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: mpt3sas: Set ioc->manu_pg11.EEDPTagMode directly to 1

2025-02-08T08:52:26+00:00

[ Upstream commit ad7c3c0cb8f61d6d5a48b83e62ca4a9fd2f26153 ]

Currently, the code does:

    if (x == 0) {
    	x &= ~0x3;
	x |= 0x1;
    }

Zeroing bits 0 and 1 of a variable that is 0 is not necessary. So directly
set the variable to 1.

Cc: Sreekanth Reddy 
Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
Signed-off-by: Paul Menzel 
Link: https://lore.kernel.org/r/20241212221817.78940-2-pmenzel@molgen.mpg.de
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin

scsi: storvsc: Ratelimit warning logs to prevent VM denial of service

2025-02-01T17:37:55+00:00

commit d2138eab8cde61e0e6f62d0713e45202e8457d6d upstream.

If there's a persistent error in the hypervisor, the SCSI warning for
failed I/O can flood the kernel log and max out CPU utilization,
preventing troubleshooting from the VM side. Ratelimit the warning so
it doesn't DoS the VM.

Closes: https://github.com/microsoft/WSL/issues/9173
Signed-off-by: Easwar Hariharan 
Link: https://lore.kernel.org/r/20250107-eahariha-ratelimit-storvsc-v1-1-7fc193d1f2b0@linux.microsoft.com
Reviewed-by: Michael Kelley 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request

2025-02-01T17:37:51+00:00

[ Upstream commit 63ca02221cc5aa0731fe2b0cc28158aaa4b84982 ]

The ISCSI_UEVENT_GET_HOST_STATS request is already handled in
iscsi_get_host_stats(). This fix ensures that redundant responses are
skipped in iscsi_if_rx().

 - On success: send reply and stats from iscsi_get_host_stats()
   within if_recv_msg().

 - On error: fall through.

Signed-off-by: Xiang Zhang 
Link: https://lore.kernel.org/r/20250107022432.65390-1-hawkxiang.cpp@gmail.com
Reviewed-by: Mike Christie 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin

scsi: hisi_sas: Remove redundant checks for automatic debugfs dump

2025-01-09T12:32:09+00:00

commit 3f030550476566b12091687c70071d05ad433e0d upstream.

In commit 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump
trigger"), the memory allocation time of the DFX is changed from device
initialization to dump occurs, so .debugfs_itct is not a valid address and
do not need to check.

The parameter hisi_sas_debugfs_enable is enough to check whether automatic
debugfs dump is triggered, so remove redunant checks.

Fixes: 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger")
Signed-off-by: Yihang Li 
Signed-off-by: Xiang Chen 
Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: hisi_sas: Fix a deadlock issue related to automatic dump

2025-01-09T12:31:53+00:00

[ Upstream commit 3c4f53b2c341ec6428b98cb51a89a09b025d0953 ]

If we issue a disabling PHY command, the device attached with it will go
offline, if a 2 bit ECC error occurs at the same time, a hung task may be
found:

[ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds.
[ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.674809] task:kworker/u256:0  state:D stack:    0 pid:165233 ppid:     2 flags:0x00000208
[ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas]
[ 4613.691518] Call trace:
[ 4613.694678]  __switch_to+0xf8/0x17c
[ 4613.698872]  __schedule+0x660/0xee0
[ 4613.703063]  schedule+0xac/0x240
[ 4613.706994]  schedule_timeout+0x500/0x610
[ 4613.711705]  __down+0x128/0x36c
[ 4613.715548]  down+0x240/0x2d0
[ 4613.719221]  hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main]
[ 4613.726618]  sas_execute_internal_abort+0x144/0x310 [libsas]
[ 4613.732976]  sas_execute_internal_abort_dev+0x44/0x60 [libsas]
[ 4613.739504]  hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main]
[ 4613.747499]  hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main]
[ 4613.753682]  sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas]
[ 4613.759781]  sas_unregister_common_dev+0x4c/0x7a0 [libsas]
[ 4613.765962]  sas_destruct_devices+0xb8/0x120 [libsas]
[ 4613.771709]  sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas]
[ 4613.778930]  sas_revalidate_domain+0x60/0xa4 [libsas]
[ 4613.784716]  process_one_work+0x248/0x950
[ 4613.789424]  worker_thread+0x318/0x934
[ 4613.793878]  kthread+0x190/0x200
[ 4613.797810]  ret_from_fork+0x10/0x18
[ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds.
[ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.824538] task:kworker/u256:4  state:D stack:    0 pid:316722 ppid:     2 flags:0x00000208
[ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main]
[ 4613.841491] Call trace:
[ 4613.844647]  __switch_to+0xf8/0x17c
[ 4613.848852]  __schedule+0x660/0xee0
[ 4613.853052]  schedule+0xac/0x240
[ 4613.856984]  schedule_timeout+0x500/0x610
[ 4613.861695]  __down+0x128/0x36c
[ 4613.865542]  down+0x240/0x2d0
[ 4613.869216]  hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main]
[ 4613.876324]  hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main]
[ 4613.883019]  process_one_work+0x248/0x950
[ 4613.887732]  worker_thread+0x318/0x934
[ 4613.892204]  kthread+0x190/0x200
[ 4613.896118]  ret_from_fork+0x10/0x18
[ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds.
[ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.922852] task:kworker/u256:1  state:D stack:    0 pid:348985 ppid:     2 flags:0x00000208
[ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
[ 4613.939549] Call trace:
[ 4613.942702]  __switch_to+0xf8/0x17c
[ 4613.946892]  __schedule+0x660/0xee0
[ 4613.951083]  schedule+0xac/0x240
[ 4613.955015]  schedule_timeout+0x500/0x610
[ 4613.959725]  wait_for_common+0x200/0x610
[ 4613.964349]  wait_for_completion+0x3c/0x5c
[ 4613.969146]  flush_workqueue+0x198/0x790
[ 4613.973776]  sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas]
[ 4613.979960]  sas_port_event_worker+0x54/0xa0 [libsas]
[ 4613.985708]  process_one_work+0x248/0x950
[ 4613.990420]  worker_thread+0x318/0x934
[ 4613.994868]  kthread+0x190/0x200
[ 4613.998800]  ret_from_fork+0x10/0x18

This is because when the device goes offline, we obtain the hisi_hba
semaphore and send the ABORT_DEV command to the device. However, the
internal abort timed out due to the 2 bit ECC error and triggers automatic
dump. In addition, since the hisi_hba semaphore has been obtained, the dump
cannot be executed and the controller cannot be reset.

Therefore, the deadlocks occur on the following circular dependencies:
hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ...
-> hisi_sas_internal_abort_timeout() -> down().

The deadlock is triggered only when the timeout occurs during device goes
offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario
where a device goes offline from other scenarios.

Fixes: 2ff07b5c6fe9 ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue")
Signed-off-by: Yihang Li 
Signed-off-by: Xiang Chen 
Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin

scsi: mpi3mr: Start controller indexing from 0

2025-01-09T12:31:51+00:00

[ Upstream commit 0d32014f1e3e7a7adf1583c45387f26b9bb3a49d ]

Instead of displaying the controller index starting from '1' make the
driver display the controller index starting from '0'.

Signed-off-by: Sumit Saxena 
Signed-off-by: Ranjan Kumar 
Link: https://lore.kernel.org/r/20241110194405.10108-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin

scsi: mpi3mr: Use ida to manage mrioc ID

2025-01-09T12:31:51+00:00

[ Upstream commit 29b75184f721b16c51ef6e67eec0e40ed88381c7 ]

To ensure that the same ID is not obtained during concurrent execution of
the probe, an ida is used to manage the mrioc's ID.

Signed-off-by: Guixin Liu 
Link: https://lore.kernel.org/r/20231229040331.52518-1-kanie@linux.alibaba.com
Reviewed-by: Lee Duncan 
Reviewed-by: Martin Wilck 
Signed-off-by: Martin K. Petersen 
Stable-dep-of: 0d32014f1e3e ("scsi: mpi3mr: Start controller indexing from 0")
Signed-off-by: Sasha Levin