linux.git/drivers/nvme, branch v4.16-rc2

nvme-rdma: fix sysfs invoked reset_ctrl error flow

2018-02-14T13:44:22+00:00

When reset_controller that is invoked by sysfs fails,
it enters an error flow which practically removes the
nvme ctrl entirely (similar to delete_ctrl flow). It
causes the system to hang, since a sysfs attribute cannot
be unregistered by one of its own methods.

This can be fixed by calling delete_ctrl as a work rather
than sequential code. In addition, it should give the ctrl
a chance to recover using the reconnection mechanism (consistent
with the FC reset_ctrl error flow). Also, while we're here, return
a suitable errno in case the reset ended with a non-live ctrl.

Signed-off-by: Nitzan Carmi 
Reviewed-by: Max Gurtovoy 
Signed-off-by: Sagi Grimberg

nvmet: Change return code of discard command if not supported

2018-02-14T13:38:59+00:00

Executing a discard command on a block device that doesn't support it
should return success.
Returning an internal error while using multipath fails the path.

Reviewed-by: Max Gurtovoy 
Signed-off-by: Israel Rukshin 
Signed-off-by: Sagi Grimberg
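
The status mapping described in the commit above can be sketched in plain C. This is a minimal user-space illustration, not the patch itself: the function name discard_status() and the STATUS_* constants are hypothetical, and the only assumption about the block layer is that an unsupported discard is reported as -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
enum {
    STATUS_SUCCESS  = 0x0,
    STATUS_INTERNAL = 0x6,
};

/*
 * Map the result of issuing a discard to a completion status.
 * "Not supported" is treated as success so that a multipath
 * initiator does not fail the path over an optional command.
 */
static uint16_t discard_status(int ret)
{
    if (ret == 0 || ret == -EOPNOTSUPP)
        return STATUS_SUCCESS;
    return STATUS_INTERNAL;
}
```

Any other error still maps to an internal error, so genuine I/O failures remain visible to the host.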

nvme-pci: Fix timeouts in connecting state

2018-02-14T00:09:50+00:00

We need to halt the controller immediately if we haven't completed
initialization as indicated by the new "connecting" state.

Fixes: ad70062cdb ("nvme-pci: introduce RECONNECTING state to mark initializing procedure")
Signed-off-by: Keith Busch 
Reviewed-by: Christoph Hellwig

nvme-pci: Remap CMB SQ entries on every controller reset

2018-02-14T00:09:50+00:00

The controller memory buffer is remapped into a kernel address on each
reset, but the driver was setting the submission queue base address
only on the very first queue creation. The remapped address is likely to
change after a reset, so accessing the old address will hit a kernel bug.

This patch fixes that by setting the queue's CMB base address each time
the queue is created.

Fixes: f63572dff1421 ("nvme: unmap CMB and remove sysfs file in reset path")
Reported-by: Christian Black 
Cc: Jon Derrick 
Cc:  # 4.9+
Signed-off-by: Keith Busch 
Reviewed-by: Christoph Hellwig
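
The idea of recomputing the submission queue address on every queue creation can be sketched as follows. This is a simplified user-space model under stated assumptions: struct dev, struct queue, create_queue() and the fixed 64-byte entry layout are hypothetical and do not reproduce the driver's real offset calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SQ_ENTRY_SIZE 64 /* assumed entry size for this sketch */

struct dev {
    uint8_t *cmb; /* current CMB mapping; may change on each reset */
};

struct queue {
    struct dev *dev;
    uint16_t qid;     /* I/O queues start at 1 */
    uint16_t depth;
    uint8_t *sq_cmds; /* where submission entries are written */
};

/*
 * Called every time the queue is (re)created. The base address is
 * derived from the current mapping rather than cached from the
 * first creation, so a remap after reset is picked up.
 */
static void create_queue(struct queue *q)
{
    size_t offset = (size_t)(q->qid - 1) * q->depth * SQ_ENTRY_SIZE;

    q->sq_cmds = q->dev->cmb + offset;
}
```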

nvme: fix the deadlock in nvme_update_formats

2018-02-14T00:09:50+00:00

nvme_update_formats will invoke nvme_ns_remove under namespaces_mutex.
This will cause a deadlock because nvme_ns_remove also acquires
namespaces_mutex. Fix it by collecting the ns entries that should
be removed while holding namespaces_mutex and invoking nvme_ns_remove
after namespaces_mutex has been released.

Signed-off-by: Jianchao Wang 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Keith Busch 
Signed-off-by: Sagi Grimberg
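
The "collect under the lock, remove outside it" pattern from the commit above can be sketched in user space with a pthread mutex. All names here (struct ns, struct ctrl, ns_remove(), update_formats(), the stale flag) are hypothetical stand-ins; the kernel code uses its own list and locking primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct ns {
    int id;
    bool stale; /* should this namespace be removed? */
    struct ns *next;
};

struct ctrl {
    pthread_mutex_t namespaces_mutex; /* non-recursive */
    struct ns *namespaces;
    int removed;
};

/* Takes the mutex itself, so it must not be called with it held. */
static void ns_remove(struct ctrl *c, struct ns *ns)
{
    pthread_mutex_lock(&c->namespaces_mutex);
    c->removed++;
    pthread_mutex_unlock(&c->namespaces_mutex);
    ns->next = NULL;
}

static void update_formats(struct ctrl *c)
{
    struct ns *rm_list = NULL;
    struct ns **pp;

    /* Phase 1: under the lock, only move stale entries to a private list. */
    pthread_mutex_lock(&c->namespaces_mutex);
    pp = &c->namespaces;
    while (*pp) {
        struct ns *ns = *pp;

        if (ns->stale) {
            *pp = ns->next;     /* unlink from the controller list */
            ns->next = rm_list; /* push onto the private list */
            rm_list = ns;
        } else {
            pp = &ns->next;
        }
    }
    pthread_mutex_unlock(&c->namespaces_mutex);

    /* Phase 2: lock released, so ns_remove() can take it safely. */
    while (rm_list) {
        struct ns *ns = rm_list;

        rm_list = ns->next;
        ns_remove(c, ns);
    }
}
```

Calling ns_remove() inside phase 1 would self-deadlock on a non-recursive mutex, which is the situation the commit describes.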

nvme: Don't use a stack buffer for keep-alive command

2018-02-12T20:18:14+00:00

In nvme_keep_alive() we pass a request with a pointer to an NVMe command on
the stack into blk_execute_rq_nowait().  However, the block layer doesn't
guarantee that the request is fully queued before blk_execute_rq_nowait()
returns.  If not, and the request is queued after nvme_keep_alive() returns,
then we'll end up using stack memory that might have been overwritten to
form the NVMe command we pass to hardware.

Fix this by keeping a special command struct in the nvme_ctrl struct right
next to the delayed work struct used for keep-alives.

Signed-off-by: Roland Dreier 
Signed-off-by: Sagi Grimberg
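
The lifetime issue can be illustrated with a small user-space model: a request that only holds a pointer to its command, submitted asynchronously. The struct and function names below (struct cmd, struct request, keep_alive(), dispatch()) are hypothetical; the point is only that the command storage lives in the long-lived controller object rather than in the submitting function's stack frame.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OPC_KEEP_ALIVE 0x18 /* NVMe admin Keep Alive opcode */

struct cmd {
    uint8_t opcode;
};

/* The request references the command; it does not copy it. */
struct request {
    const struct cmd *cmd;
};

struct ctrl {
    struct cmd ka_cmd;      /* lives as long as the controller */
    struct request pending; /* stands in for the block layer queue */
};

/* Queue a keep-alive and return immediately, without waiting. */
static void keep_alive(struct ctrl *c)
{
    memset(&c->ka_cmd, 0, sizeof(c->ka_cmd));
    c->ka_cmd.opcode = OPC_KEEP_ALIVE;
    /* Safe: the pointer targets the controller, not this stack frame. */
    c->pending.cmd = &c->ka_cmd;
}

/* Runs later, after keep_alive() has returned. */
static uint8_t dispatch(const struct ctrl *c)
{
    return c->pending.cmd->opcode;
}
```

Had keep_alive() declared the command as a local variable, dispatch() would read through a dangling pointer.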

nvme_fc: cleanup io completion

2018-02-11T08:45:43+00:00

There was some old code that dealt with complete_rq being called
prior to the lldd returning the io completion. This is garbage code.
The complete_rq routine was being called after eh_timeouts were
called and it was due to eh_timeouts not being handled properly.
The timeouts were fixed in prior patches so that in general, a
timeout will initiate an abort and the reset timer restarted as
the abort operation will take care of completing things. Given the
reset timer restarted, the erroneous complete_rq calls were eliminated.

So remove the work that was synchronizing complete_rq with io
completion.

Reviewed-by: Johannes Thumshirn 
Signed-off-by: James Smart 
Signed-off-by: Sagi Grimberg

nvme_fc: correct abort race condition on resets

2018-02-11T08:45:34+00:00

During reset handling, there is live io completing while the reset
is taking place. The reset path attempts to abort all outstanding io,
counting the number of ios that were reset. It then waits for those
ios to be reclaimed from the lldd before continuing.

The transport's logic on io state and flag setting was poor, allowing
ios to complete simultaneously with the abort request. The completed ios
were counted, but as the completion had already occurred, the
completion never reduced the count. As the count never zeros, the
reset/delete never completes.

Tighten it up by unconditionally changing the op state to completed
when the io done handler is called.  The reset/abort path now changes
the op state to aborted, but the abort only continues if the op
state was live previously. If complete, the abort is backed out.
Thus proper counting of io aborts and their completions is working
again.

Also removed the TERMIO state on the op as it's redundant with the
op's aborted state.

Reviewed-by: Johannes Thumshirn 
Signed-off-by: James Smart 
Signed-off-by: Sagi Grimberg
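
The state handling described above can be sketched with C11 atomics. This is an illustration under stated assumptions, not the transport's code: the OP_* names, struct fcp_op, op_done() and op_try_abort() are hypothetical, and the sketch ignores any additional locking the real driver uses around these paths.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum op_state { OP_ACTIVE, OP_ABORTED, OP_COMPLETE };

struct fcp_op {
    _Atomic int state;
};

/* Done handler: unconditionally mark the op complete. */
static int op_done(struct fcp_op *op)
{
    return atomic_exchange(&op->state, OP_COMPLETE);
}

/*
 * Abort path: claim the op by switching it to aborted. The abort
 * only proceeds (and is only counted) if the op was still active;
 * otherwise the previous state is restored and the abort is skipped.
 */
static bool op_try_abort(struct fcp_op *op)
{
    int old = atomic_exchange(&op->state, OP_ABORTED);

    if (old != OP_ACTIVE) {
        atomic_store(&op->state, old); /* back out */
        return false;
    }
    return true;
}
```

A caller would increment its outstanding-abort counter only when op_try_abort() returns true, so an op that already completed never contributes to a count that nothing will decrement.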

nvme: Fix discard buffer overrun

2018-02-08T16:35:55+00:00

This patch checks the discard range array bounds before setting it in
case the driver gets a badly formed request.

Signed-off-by: Keith Busch 
Reviewed-by: Jens Axboe 
Signed-off-by: Sagi Grimberg
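
A bounds check of this kind can be sketched as below. The names (struct dsm_range, struct segment, fill_ranges()) and the -1 error return are hypothetical; the sketch assumes the range array was sized from a segment count declared by the request, which may disagree with the number of segments actually found when walking it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dsm_range {
    uint32_t nlb;  /* number of logical blocks */
    uint64_t slba; /* starting LBA */
};

struct segment {
    uint64_t start;
    uint32_t len;
};

/*
 * Copy segments into a range array of 'capacity' entries.
 * Never writes past the array; returns -1 if the number of
 * segments found does not match the declared capacity.
 */
static int fill_ranges(struct dsm_range *range, size_t capacity,
                       const struct segment *segs, size_t nsegs)
{
    size_t n = 0;

    for (size_t i = 0; i < nsegs; i++) {
        if (n < capacity) { /* guard every store */
            range[n].slba = segs[i].start;
            range[n].nlb = segs[i].len;
        }
        n++;
    }

    return n == capacity ? 0 : -1;
}
```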

nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition

2018-02-08T16:35:54+00:00

There is no logical reason to move from live state to connecting
state. In case of initial connection establishment, the transition
should be NVME_CTRL_NEW --> NVME_CTRL_CONNECTING --> NVME_CTRL_LIVE.
In case of error recovery or reset, the transition should be
NVME_CTRL_LIVE --> NVME_CTRL_RESETTING --> NVME_CTRL_CONNECTING -->
NVME_CTRL_LIVE.

Signed-off-by: Max Gurtovoy 
Reviewed-by: James Smart 
Signed-off-by: Sagi Grimberg
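
The two flows named above can be expressed as a small transition check. This sketch covers only the four states and the transitions mentioned in this commit message; the enum and function names are hypothetical, and the real driver defines further states and transitions that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced state set for illustration only. */
enum ctrl_state {
    CTRL_NEW,
    CTRL_LIVE,
    CTRL_RESETTING,
    CTRL_CONNECTING,
};

/* Returns true if moving from old_state to new_state is permitted. */
static bool ctrl_transition_allowed(enum ctrl_state old_state,
                                    enum ctrl_state new_state)
{
    switch (new_state) {
    case CTRL_LIVE:
        return old_state == CTRL_CONNECTING;
    case CTRL_RESETTING:
        return old_state == CTRL_LIVE;
    case CTRL_CONNECTING:
        /* CTRL_LIVE is deliberately absent from this list. */
        return old_state == CTRL_NEW || old_state == CTRL_RESETTING;
    default:
        return false;
    }
}
```

With this table, a live controller can reach the connecting state only by passing through resetting first.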