linux-stable.git/drivers/infiniband/ulp, branch v3.2.99

IB/srp: Avoid that a cable pull can trigger a kernel crash

2018-02-13T18:32:09+00:00

commit 8a0d18c62121d3c554a83eb96e2752861d84d937 upstream.

This patch fixes the following kernel crash:

general protection fault: 0000 [#1] PREEMPT SMP
Workqueue: ib_mad2 timeout_sends [ib_core]
Call Trace:
 ib_sa_path_rec_callback+0x1c4/0x1d0 [ib_core]
 send_handler+0xb2/0xd0 [ib_core]
 timeout_sends+0x14d/0x220 [ib_core]
 process_one_work+0x200/0x630
 worker_thread+0x4e/0x3b0
 kthread+0x113/0x150

Fixes: commit aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator")
Signed-off-by: Bart Van Assche 
Reviewed-by: Sagi Grimberg 
Signed-off-by: Doug Ledford 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

IB/ipoib: Remove double pointer assigning

2017-11-11T13:34:26+00:00

commit 1b355094b308f3377c8f574ce86135ee159c6285 upstream.

There is no need to assign "p" pointer twice.

This patch fixes the following smatch warning:
drivers/infiniband/ulp/ipoib/ipoib_cm.c:517 ipoib_cm_rx_handler() warn:
	missing break? reassigning 'p->id'

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Leon Romanovsky 
Signed-off-by: Ben Hutchings

IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp

2017-11-11T13:34:26+00:00

commit 11f74b40359b19f760964e71d04882a6caf530cc upstream.

Don't allow negative values for max_nonsrq_conn_qp. A negative value has
no functional impact, but it is logically incorrect.

Fixes: 68e995a29572 ("IPoIB/cm: Add connected mode support for devices without SRQs")
Signed-off-by: Alex Vesker 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Ben Hutchings

IB/ipoib: Change list_del to list_del_init in the tx object

2017-06-05T20:13:41+00:00

commit 27d41d29c7f093f6f77843624fbb080c1b4a8b9c upstream.

Since the ipoib_cm_tx_start and ipoib_cm_tx_reap functions
belong to different work queues, they can run in parallel.
In this case, if ipoib_cm_tx_reap calls list_del and releases the
lock, ipoib_cm_tx_start may acquire it and call list_del_init
on the already deleted object.
Changing list_del to list_del_init in ipoib_cm_tx_reap fixes the problem.
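
The difference between the two calls can be sketched in userspace C. This
is a minimal illustration, not the driver code: struct list_head here is a
simplified stand-in for the kernel type, and the *_entry helper names are
hypothetical. The point is that an entry removed with an init-style delete
points back at itself, so a second delete on it is a harmless no-op, whereas
the kernel's plain list_del leaves poisoned pointers behind.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and re-initialise it so that it points to itself. */
void list_del_init_entry(struct list_head *entry)
{
	assert(entry->next != NULL && entry->prev != NULL);
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Returns the number of entries left on the list after "b" is deleted twice. */
int entries_after_double_delete(void)
{
	struct list_head head, a, b, c;
	struct list_head *pos;
	int n = 0;

	init_list_head(&head);
	list_add_tail_entry(&a, &head);
	list_add_tail_entry(&b, &head);
	list_add_tail_entry(&c, &head);

	list_del_init_entry(&b);	/* first delete, as in the reap path */
	list_del_init_entry(&b);	/* second delete only touches b itself */

	for (pos = head.next; pos != &head; pos = pos->next)
		n++;
	return n;
}
```

In this sketch the list still holds the two untouched entries after the
double delete, which is the behaviour the fix relies on.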

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Feras Daoud 
Signed-off-by: Erez Shitrit 
Reviewed-by: Alex Vesker 
Signed-off-by: Leon Romanovsky 
Reviewed-by: Yuval Shaia 
Signed-off-by: Doug Ledford 
Signed-off-by: Ben Hutchings

IB/ipoib: Set device connection mode only when needed

2017-06-05T20:13:41+00:00

commit 80b5b35aba62232521b31440f0a3cf6caa033849 upstream.

When changing the connection mode, the ipoib_set_mode function
did not check whether the previous connection mode equals the
new one. This commit adds the required check and returns 0 if the new
mode equals the previous one.
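
A minimal sketch of such an early-return check, assuming a simplified device
structure. The names fake_dev, set_mode and reconfig_count are hypothetical,
and the mode strings are an assumption made for illustration; the real
ipoib_set_mode does considerably more work when it switches modes.

```c
#include <assert.h>
#include <string.h>

enum conn_mode { MODE_DATAGRAM, MODE_CONNECTED };

/* Hypothetical stand-in for the per-device state. */
struct fake_dev {
	enum conn_mode mode;
	int reconfig_count;	/* counts how often a real mode switch ran */
};

/* Returns 0 on success and -1 if the mode string is not recognised. */
int set_mode(struct fake_dev *dev, const char *buf)
{
	enum conn_mode want;

	assert(dev != NULL && buf != NULL);

	if (strcmp(buf, "connected\n") == 0)
		want = MODE_CONNECTED;
	else if (strcmp(buf, "datagram\n") == 0)
		want = MODE_DATAGRAM;
	else
		return -1;

	/* Nothing to do if the requested mode is already active. */
	if (dev->mode == want)
		return 0;

	dev->mode = want;
	dev->reconfig_count++;
	return 0;
}
```

Writing the same mode twice then triggers only one reconfiguration.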

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Feras Daoud 
Signed-off-by: Erez Shitrit 
Reviewed-by: Alex Vesker 
Reviewed-by: Yuval Shaia 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
[bwh: Backported to 3.2:
 - Adjust filename
 - Unlock RTNL lock before returning]
Signed-off-by: Ben Hutchings

IB/ipoib: Don't allow MC joins during light MC flush

2016-11-20T01:01:41+00:00

commit 344bacca8cd811809fc33a249f2738ab757d327f upstream.

This fix solves a race between a light flush and on-the-fly joins.
A light flush doesn't set the device down or clear the IPOIB_OPER_UP
flag. This means that if an MC join is in progress while flushing
and the QP was attached to the BC MGID, we can get a mismatch when
re-attaching the QP to the BC MGID.

The light flush would set the broadcast group to NULL causing an on
the fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411]  0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960]  0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547]  ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015]  dump_stack+0x63/0x8c
[18332.801831]  __warn+0xd1/0xf0
[18332.805403]  warn_slowpath_null+0x1d/0x20
[18332.809706]  ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384]  ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031]  ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220]  ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290]  ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911]  rollback_registered_many+0x1aa/0x2c0
[18332.839741]  rollback_registered+0x31/0x40
[18332.844091]  unregister_netdevice_queue+0x48/0x80
[18332.848880]  ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848]  delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474]  dev_attr_store+0x18/0x30
[18332.862510]  sysfs_kf_write+0x3a/0x50
[18332.866349]  kernfs_fop_write+0x120/0x170
[18332.870471]  __vfs_write+0x28/0xe0
[18332.874152]  ? percpu_down_read+0x1f/0x50
[18332.878274]  vfs_write+0xa2/0x1a0
[18332.881896]  SyS_write+0x46/0xa0
[18332.885632]  do_syscall_64+0x57/0xb0
[18332.889709]  entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

IB/ipoib: Fix memory corruption in ipoib cm mode connect flow

2016-11-20T01:01:36+00:00

commit 546481c2816ea3c061ee9d5658eb48070f69212e upstream.

When a new CM connection is requested, the ipoib driver copies data
from the path pointer in the CM/tx object. The path object might be
invalid at that point, and memory corruption will happen later when
the CM driver tries to use that data.

The following scenario demonstrates it:
	neigh_add_path --> ipoib_cm_create_tx -->
	queue_work (pointer to path is in the cm/tx struct)
	#while the work is still in the queue,
	#the port goes down and causes the ipoib_flush_paths:
	ipoib_flush_paths --> path_free --> kfree(path)
	#at this point the work scheduled starts.
	ipoib_cm_tx_start --> copy from the (invalid)path pointer:
	(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
	 -> memory corruption.

To fix this, the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected by the relevant locks, and uses the gid from
the neigh member in the CM/tx object, which is valid because of the
reference count taken by the CM/tx.
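
The idea of looking the path up again by GID, rather than trusting a stored
pointer, can be sketched as follows. This is a simplified userspace model
under stated assumptions: path_db, path_entry, path_find and tx_start are
hypothetical names, the path record is reduced to an int, and the locking
that the driver uses around the lookup is only indicated in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN   16
#define MAX_PATHS 4

/* Hypothetical stand-in for one entry of the path database. */
struct path_entry {
	int in_use;
	unsigned char gid[GID_LEN];
	int pathrec;	/* stand-in for the real path record */
};

struct path_db {
	struct path_entry slots[MAX_PATHS];
};

/* Look up a live path by GID; returns NULL if it has been flushed. */
struct path_entry *path_find(struct path_db *db, const unsigned char *gid)
{
	size_t i;

	for (i = 0; i < MAX_PATHS; i++)
		if (db->slots[i].in_use &&
		    memcmp(db->slots[i].gid, gid, GID_LEN) == 0)
			return &db->slots[i];
	return NULL;
}

/*
 * Start a connection only if the path still exists.
 * Returns 0 and copies the path record on success, -1 otherwise.
 */
int tx_start(struct path_db *db, const unsigned char *gid, int *pathrec_out)
{
	struct path_entry *p;

	assert(db != NULL && gid != NULL && pathrec_out != NULL);

	/* In a driver, the lookup and the copy would both run under the
	 * locks that guard the path database. */
	p = path_find(db, gid);
	if (p == NULL)
		return -1;

	*pathrec_out = p->pathrec;
	return 0;
}
```

Once the entry has been flushed, the lookup fails and nothing is copied from
stale memory.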

Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
[bwh: Backported to 3.2: s/neigh->daddr/neigh->neighbour->ha/]
Signed-off-by: Ben Hutchings

IB/srp: Fix a sporadic crash triggered by cable pulling

2014-07-11T12:33:35+00:00

commit 024ca90151f5e4296d30f72c13ff9a075e23c9ec upstream.

Avoid that the loops that iterate over the request ring can encounter
a pointer to a SCSI command in req->scmnd that is no longer associated
with that request. If the function srp_unmap_data() is invoked twice
for a SCSI command that is not in flight then that would cause
ib_fmr_pool_unmap() to be invoked with an invalid pointer as argument,
resulting in a kernel oops.
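
One way to guarantee that the unmap step runs at most once per command is to
"claim" the command pointer and clear it in the same step. The sketch below
illustrates that pattern only; fake_req, claim_req and finish_req are
hypothetical names, it is not taken from the patch, and in a driver the claim
would have to happen under the lock that protects the request ring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one slot of the request ring. */
struct fake_req {
	void *scmnd;		/* command owned by this request, or NULL */
	int unmap_calls;	/* how often the unmap step ran */
};

/* Take ownership of the command; later callers see NULL. */
void *claim_req(struct fake_req *req)
{
	void *scmnd;

	assert(req != NULL);
	scmnd = req->scmnd;
	req->scmnd = NULL;
	return scmnd;
}

/* Runs the unmap step only for the caller that won the claim. */
void finish_req(struct fake_req *req)
{
	if (claim_req(req) == NULL)
		return;
	req->unmap_calls++;
}
```

Calling finish_req() twice on the same slot runs the unmap step once.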

Reported-by: Sagi Grimberg 
Reference: http://thread.gmane.org/gmane.linux.drivers.rdma/19068/focus=19069
Signed-off-by: Bart Van Assche 
Reviewed-by: Sagi Grimberg 
Signed-off-by: Roland Dreier 
Signed-off-by: Ben Hutchings

IPoIB: Fix send lockup due to missed TX completion

2013-04-10T02:20:02+00:00

commit 1ee9e2aa7b31427303466776f455d43e5e3c9275 upstream.

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

That patch doesn't fully resolve the issue: if more than half of
the tx_outstanding entries are UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch passes IB_CQ_REPORT_MISSED_EVENTS to
ib_req_notify_cq().  If the return value is greater than 0, the UD send
CQ completion callback is called directly to re-arm the send completion
timer.
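
The arm-then-check pattern can be sketched with a mock completion queue. The
mock follows the return convention of ib_req_notify_cq() (0 on success, a
positive value when missed events are reported), but fake_cq,
fake_req_notify_cq, drain and arm_and_catch_up are hypothetical names. The
sketch also drains the queue directly instead of re-arming a timer as the
patch does.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_NEXT_COMP		0x1
#define NOTIFY_REPORT_MISSED_EVENTS	0x2

/* Hypothetical mock of a completion queue. */
struct fake_cq {
	int pending;	/* completions queued but not yet polled */
	int armed;	/* set once notification has been requested */
};

/*
 * Mock notify call: returns 0 on success, or a positive value if
 * missed events were requested and completions are already pending.
 */
int fake_req_notify_cq(struct fake_cq *cq, int flags)
{
	assert(cq != NULL);
	cq->armed = 1;
	if ((flags & NOTIFY_REPORT_MISSED_EVENTS) && cq->pending > 0)
		return 1;
	return 0;
}

/* Process everything that is pending; returns the number handled. */
int drain(struct fake_cq *cq)
{
	int n = cq->pending;

	cq->pending = 0;
	return n;
}

/* Arm the CQ and, if completions were missed, handle them right away. */
int arm_and_catch_up(struct fake_cq *cq)
{
	int rc = fake_req_notify_cq(cq, NOTIFY_NEXT_COMP |
					NOTIFY_REPORT_MISSED_EVENTS);

	if (rc > 0)
		return drain(cq);
	return 0;
}
```

Without the missed-events flag, completions that arrived before the CQ was
armed would sit unprocessed until some later event.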

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Reviewed-by: Dean Luick 
Signed-off-by: Mike Marciniszyn 
Signed-off-by: Roland Dreier 
Signed-off-by: Ben Hutchings

IB/srp: Avoid having aborted requests hang

2012-10-17T02:49:19+00:00

commit d8536670916a685df116b5c2cb256573fd25e4e3 upstream.

We need to call scsi_done() for commands after we abort them.

Signed-off-by: Bart Van Assche 
Acked-by: David Dillow 
Signed-off-by: Roland Dreier 
Signed-off-by: Ben Hutchings