linux-stable.git/drivers/scsi, branch v4.4.67

scsi: cxlflash: Improve EEH recovery time

2017-05-08T05:46:02+00:00

commit 05dab43230fdc0d14ca885b473a2740fe017ecb1 upstream.

When an EEH occurs during device initialization, the port timeout logic
can cause excessive delays as MMIO reads will fail. Depending on where
they are experienced, these delays can lead to a prolonged reset,
causing an unnecessary triggering of other timeout logic in the SCSI
stack or user applications.

To expedite recovery, the port timeout logic is updated to decay the
timeout at a much faster rate when in the presence of a likely EEH
frozen event.

Signed-off-by: Matthew R. Ochs 
Acked-by: Uma Krishnan 
Signed-off-by: Martin K. Petersen 
Cc: Sumit Semwal 
Signed-off-by: Greg Kroah-Hartman

scsi: cxlflash: Fix to avoid EEH and host reset collisions

2017-05-08T05:46:01+00:00

commit 1d3324c382b1a617eb567e3650dcb51f22dfec9a upstream.

The EEH reset handler is ignorant to the current state of the driver
when processing a frozen event and initiating a device reset. This can
be an issue if an EEH event occurs while a user or stack initiated reset
is executing. More specifically, if an EEH occurs while the SCSI host
reset handler is active, the reset initiated by the EEH thread will
likely collide with the host reset thread. This can leave the device in
an inconsistent state, or worse, cause a system crash.

As a remedy, the EEH handler is updated to evaluate the device state and
take appropriate action (proceed, wait, or disconnect host). The host
reset handler is also updated to handle situations where an EEH occurred
during a host reset. In such situations, the host reset handler will
delay reporting back a success to give the EEH reset an opportunity to
complete.

Signed-off-by: Matthew R. Ochs 
Acked-by: Uma Krishnan 
Signed-off-by: Martin K. Petersen 
Cc: Sumit Semwal 
Signed-off-by: Greg Kroah-Hartman

scsi: cxlflash: Scan host only after the port is ready for I/O

2017-05-08T05:46:01+00:00

commit bbbfae962b7c221237c0f92547ee0c83f7204747 upstream.

When a port link is established, the AFU sends a 'link up' interrupt.
After the link is up, corresponding initialization steps are performed
on the card. Following that, when the card is ready for I/O, the AFU
sends 'login succeeded' interrupt. Today, cxlflash invokes
scsi_scan_host() upon receipt of both interrupts.

SCSI commands sent to the port prior to the 'login succeeded' interrupt
will fail with 'port not available' error. This is not desirable.
Moreover, when async_scan is active for the host, subsequent scan calls
are terminated with error. Due to this, the scsi_scan_host() call
performed after 'login succeeded' interrupt could portentially return
error and the devices may not be scanned properly.

To avoid this problem, scsi_scan_host() should be called only after the
'login succeeded' interrupt.

Signed-off-by: Uma Krishnan 
Acked-by: Matthew R. Ochs 
Signed-off-by: Martin K. Petersen 
Cc: Sumit Semwal 
Signed-off-by: Greg Kroah-Hartman

scsi: sd: Fix capacity calculation with 32-bit sector_t

2017-04-21T07:30:05+00:00

commit 7c856152cb92f8eee2df29ef325a1b1f43161aff upstream.

We previously made sure that the reported disk capacity was less than
0xffffffff blocks when the kernel was not compiled with large sector_t
support (CONFIG_LBDAF). However, this check assumed that the capacity
was reported in units of 512 bytes.

Add a sanity check function to ensure that we only enable disks if the
entire reported capacity can be expressed in terms of sector_t.

Reported-by: Steve Magnani 
Cc: Bart Van Assche 
Reviewed-by: Bart Van Assche 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable

2017-04-21T07:30:05+00:00

commit 6780414519f91c2a84da9baa963a940ac916f803 upstream.

If device reports a small max_xfer_blocks and a zero opt_xfer_blocks, we
end up using BLK_DEF_MAX_SECTORS, which is wrong and r/w of that size
may get error.

[mkp: tweaked to avoid setting rw_max twice and added typecast]

Fixes: ca369d51b3e ("block/sd: Fix device-imposed transfer length limits")
Signed-off-by: Fam Zheng 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: sr: Sanity check returned mode data

2017-04-21T07:30:05+00:00

commit a00a7862513089f17209b732f230922f1942e0b9 upstream.

Kefeng Wang discovered that old versions of the QEMU CD driver would
return mangled mode data causing us to walk off the end of the buffer in
an attempt to parse it. Sanity check the returned mode sense data.

Reported-by: Kefeng Wang 
Tested-by: Kefeng Wang 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: libsas: fix ata xfer length

2017-04-08T07:53:31+00:00

commit 9702c67c6066f583b629cf037d2056245bb7a8e6 upstream.

The total ata xfer length may not be calculated properly, in that we do
not use the proper method to get an sg element dma length.

According to the code comment, sg_dma_len() should be used after
dma_map_sg() is called.

This issue was found by turning on the SMMUv3 in front of the hisi_sas
controller in hip07. Multiple sg elements were being combined into a
single element, but the original first element length was being use as
the total xfer length.

Fixes: ff2aeb1eb64c8a4770a6 ("libata: convert to chained sg")
Signed-off-by: John Garry 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: sg: check length passed to SG_NEXT_CMD_LEN

2017-04-08T07:53:31+00:00

commit bf33f87dd04c371ea33feb821b60d63d754e3124 upstream.

The user can control the size of the next command passed along, but the
value passed to the ioctl isn't checked against the usable max command
size.

Signed-off-by: Peter Chang 
Acked-by: Douglas Gilbert 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman

scsi: mpt3sas: fix hang on ata passthrough commands

2017-04-08T07:53:30+00:00

commit ffb58456589443ca572221fabbdef3db8483a779 upstream.

mpt3sas has a firmware failure where it can only handle one pass through
ATA command at a time.  If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to timeout).  The original fix was to block
the device when an ATA command came in, but this caused a regression
with

commit 669f044170d8933c3d66d231b69ea97cb8447338
Author: Bart Van Assche 
Date:   Tue Nov 22 16:17:13 2016 -0800

    scsi: srp_transport: Move queuecommand() wait code to SCSI core

So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends.  The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).

[mkp: addressed feedback wrt. test_bit and fixed whitespace]

Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley 
Acked-by: Sreekanth Reddy 
Reviewed-by: Christoph Hellwig 
Reported-by: Ingo Molnar 
Tested-by: Ingo Molnar 
Signed-off-by: Martin K. Petersen 
Cc: Joe Korty 
Signed-off-by: Greg Kroah-Hartman

scsi: libiscsi: add lock around task lists to fix list corruption regression

2017-03-26T10:13:19+00:00

commit 6f8830f5bbab16e54f261de187f3df4644a5b977 upstream.

There's a rather long standing regression from the commit "libiscsi:
Reduce locking contention in fast path"

Depending on iSCSI target behavior, it's possible to hit the case in
iscsi_complete_task where the task is still on a pending list
(!list_empty(&task->running)).  When that happens the task is removed
from the list while holding the session back_lock, but other task list
modification occur under the frwd_lock.  That leads to linked list
corruption and eventually a panicked system.

Rather than back out the session lock split entirely, in order to try
and keep some of the performance gains this patch adds another lock to
maintain the task lists integrity.

Major enterprise supported kernels have been backing out the lock split
for while now, thanks to the efforts at IBM where a lab setup has the
most reliable reproducer I've seen on this issue.  This patch has been
tested there successfully.

Signed-off-by: Chris Leech 
Fixes: 659743b02c41 ("[SCSI] libiscsi: Reduce locking contention in fast path")
Reported-by: Prashantha Subbarao 
Reviewed-by: Guilherme G. Piccoli 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Greg Kroah-Hartman