<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband/ulp, branch linux-4.15.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>IB/ipoib: Fix for notify send CQ failure messages</title>
<updated>2018-04-12T10:31:05+00:00</updated>
<author>
<name>Alex Estrin</name>
<email>alex.estrin@intel.com</email>
</author>
<published>2018-01-03T12:39:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=902dae240b4a45971858c671b5361365a89af114'/>
<id>902dae240b4a45971858c671b5361365a89af114</id>
<content type='text'>
[ Upstream commit 809cb6955650d892c6ef95f1d55f28fceded0ce1 ]

If IB_CQ_REPORT_MISSED_EVENTS flag is passed in ib_req_notify_cq()
it may return positive value indicating non-empty CQ.
If return code not verified the log might be flooded with false
warning messages "request notify on send CQ failed".

Fixes: 8966e28d2e40 ("IB/ipoib: Use NAPI in UD/TX flows")
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Alex Estrin &lt;alex.estrin@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 809cb6955650d892c6ef95f1d55f28fceded0ce1 ]

If IB_CQ_REPORT_MISSED_EVENTS flag is passed in ib_req_notify_cq()
it may return positive value indicating non-empty CQ.
If return code not verified the log might be flooded with false
warning messages "request notify on send CQ failed".

Fixes: 8966e28d2e40 ("IB/ipoib: Use NAPI in UD/TX flows")
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Alex Estrin &lt;alex.estrin@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iser-target: avoid reinitializing rdma contexts for isert commands</title>
<updated>2018-03-24T10:02:48+00:00</updated>
<author>
<name>Bharat Potnuri</name>
<email>bharat@chelsio.com</email>
</author>
<published>2017-11-28T18:28:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3a23663bda899cc85d39023e0fafbf8c0ea2f045'/>
<id>3a23663bda899cc85d39023e0fafbf8c0ea2f045</id>
<content type='text'>
[ Upstream commit 66f53e6f5400578bae58db0c06d85a8820831f40 ]

isert commands that failed during isert_rdma_rw_ctx_post() are queued to
Queue-Full(QF) queue and are scheduled to be reposted during queue-full
queue processing. During this reposting, the rdma contexts are initialised
again in isert_rdma_rw_ctx_post(), which is leaking significant memory.

unreferenced object 0xffff8830201d9640 (size 64):
  comm "kworker/0:2", pid 195, jiffies 4295374851 (age 4528.436s)
  hex dump (first 32 bytes):
    00 60 8b cb 2e 00 00 00 00 10 00 00 00 00 00 00  .`..............
    00 90 e3 cb 2e 00 00 00 00 10 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff8170711e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811f8ba5&gt;] __kmalloc+0x125/0x2b0
    [&lt;ffffffffa046b24f&gt;] rdma_rw_ctx_init+0x15f/0x6f0 [ib_core]
    [&lt;ffffffffa07ab644&gt;] isert_rdma_rw_ctx_post+0xc4/0x3c0 [ib_isert]
    [&lt;ffffffffa07ad972&gt;] isert_put_datain+0x112/0x1c0 [ib_isert]
    [&lt;ffffffffa07dddce&gt;] lio_queue_data_in+0x2e/0x30 [iscsi_target_mod]
    [&lt;ffffffffa076c322&gt;] target_qf_do_work+0x2b2/0x4b0 [target_core_mod]
    [&lt;ffffffff81080c3b&gt;] process_one_work+0x1db/0x5d0
    [&lt;ffffffff8108107d&gt;] worker_thread+0x4d/0x3e0
    [&lt;ffffffff81088667&gt;] kthread+0x117/0x150
    [&lt;ffffffff81713fa7&gt;] ret_from_fork+0x27/0x40
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Here is patch to use the older rdma contexts while reposting
the isert commands intead of reinitialising them.

Signed-off-by: Potnuri Bharat Teja &lt;bharat@chelsio.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 66f53e6f5400578bae58db0c06d85a8820831f40 ]

isert commands that failed during isert_rdma_rw_ctx_post() are queued to
Queue-Full(QF) queue and are scheduled to be reposted during queue-full
queue processing. During this reposting, the rdma contexts are initialised
again in isert_rdma_rw_ctx_post(), which is leaking significant memory.

unreferenced object 0xffff8830201d9640 (size 64):
  comm "kworker/0:2", pid 195, jiffies 4295374851 (age 4528.436s)
  hex dump (first 32 bytes):
    00 60 8b cb 2e 00 00 00 00 10 00 00 00 00 00 00  .`..............
    00 90 e3 cb 2e 00 00 00 00 10 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff8170711e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811f8ba5&gt;] __kmalloc+0x125/0x2b0
    [&lt;ffffffffa046b24f&gt;] rdma_rw_ctx_init+0x15f/0x6f0 [ib_core]
    [&lt;ffffffffa07ab644&gt;] isert_rdma_rw_ctx_post+0xc4/0x3c0 [ib_isert]
    [&lt;ffffffffa07ad972&gt;] isert_put_datain+0x112/0x1c0 [ib_isert]
    [&lt;ffffffffa07dddce&gt;] lio_queue_data_in+0x2e/0x30 [iscsi_target_mod]
    [&lt;ffffffffa076c322&gt;] target_qf_do_work+0x2b2/0x4b0 [target_core_mod]
    [&lt;ffffffff81080c3b&gt;] process_one_work+0x1db/0x5d0
    [&lt;ffffffff8108107d&gt;] worker_thread+0x4d/0x3e0
    [&lt;ffffffff81088667&gt;] kthread+0x117/0x150
    [&lt;ffffffff81713fa7&gt;] ret_from_fork+0x27/0x40
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Here is patch to use the older rdma contexts while reposting
the isert commands intead of reinitialising them.

Signed-off-by: Potnuri Bharat Teja &lt;bharat@chelsio.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Avoid memory leak if the SA returns a different DGID</title>
<updated>2018-03-24T10:02:48+00:00</updated>
<author>
<name>Erez Shitrit</name>
<email>erezsh@mellanox.com</email>
</author>
<published>2017-11-14T12:51:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f333c18b543421233fa2fd55b8139eab452a6ce2'/>
<id>f333c18b543421233fa2fd55b8139eab452a6ce2</id>
<content type='text'>
[ Upstream commit 439000892ee17a9c92f1e4297818790ef8bb4ced ]

The ipoib path database is organized around DGIDs from the LLADDR, but the
SA is free to return a different GID when asked for path. This causes a
bug because the SA's modified DGID is copied into the database key, even
though it is no longer the correct lookup key, causing a memory leak and
other malfunctions.

Ensure the database key does not change after the SA query completes.

Demonstration of the bug is as  follows
ipoib wants to send to GID fe80:0000:0000:0000:0002:c903:00ef:5ee2, it
creates new record in the DB with that gid as a key, and issues a new
request to the SM.
Now, the SM from some reason returns path-record with other SGID (for
example, 2001:0000:0000:0000:0002:c903:00ef:5ee2 that contains the local
subnet prefix) now ipoib will overwrite the current entry with the new
one, and if new request to the original GID arrives ipoib  will not find
it in the DB (was overwritten) and will create new record that in its
turn will also be overwritten by the response from the SM, and so on
till the driver eats all the device memory.

Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 439000892ee17a9c92f1e4297818790ef8bb4ced ]

The ipoib path database is organized around DGIDs from the LLADDR, but the
SA is free to return a different GID when asked for path. This causes a
bug because the SA's modified DGID is copied into the database key, even
though it is no longer the correct lookup key, causing a memory leak and
other malfunctions.

Ensure the database key does not change after the SA query completes.

Demonstration of the bug is as  follows
ipoib wants to send to GID fe80:0000:0000:0000:0002:c903:00ef:5ee2, it
creates new record in the DB with that gid as a key, and issues a new
request to the SM.
Now, the SM from some reason returns path-record with other SGID (for
example, 2001:0000:0000:0000:0002:c903:00ef:5ee2 that contains the local
subnet prefix) now ipoib will overwrite the current entry with the new
one, and if new request to the original GID arrives ipoib  will not find
it in the DB (was overwritten) and will create new record that in its
turn will also be overwritten by the response from the SM, and so on
till the driver eats all the device memory.

Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Warn when one port fails to initialize</title>
<updated>2018-03-24T10:02:44+00:00</updated>
<author>
<name>Yuval Shaia</name>
<email>yuval.shaia@oracle.com</email>
</author>
<published>2017-11-29T06:34:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=be176a5c98e793d23f318c51dad043586f1dbeda'/>
<id>be176a5c98e793d23f318c51dad043586f1dbeda</id>
<content type='text'>
[ Upstream commit ac6dbf7fa4707c75a247b540cc0b5c881f3d0ba8 ]

If one port fails to initialize an error message should indicate the
reason and driver should continue serving the working port(s) and other
HCA(s).

Fixes: e4b2d06892c7 ("IB/ipoib: Remove device when one port fails to init").
Signed-off-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ac6dbf7fa4707c75a247b540cc0b5c881f3d0ba8 ]

If one port fails to initialize an error message should indicate the
reason and driver should continue serving the working port(s) and other
HCA(s).

Fixes: e4b2d06892c7 ("IB/ipoib: Remove device when one port fails to init").
Signed-off-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: don't call update_pmtu unconditionally</title>
<updated>2018-01-25T21:27:34+00:00</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2018-01-25T18:03:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f15ca723c1ebe6c1a06bc95fda6b62cd87b44559'/>
<id>f15ca723c1ebe6c1a06bc95fda6b62cd87b44559</id>
<content type='text'>
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl &lt;code@rkapl.cz&gt;
CC: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl &lt;code@rkapl.cz&gt;
CC: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iser-target: Fix possible use-after-free in connection establishment error</title>
<updated>2018-01-10T21:46:03+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2017-11-26T13:31:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cd52cb26e7ead5093635e98e07e221e4df482d34'/>
<id>cd52cb26e7ead5093635e98e07e221e4df482d34</id>
<content type='text'>
In case we fail to establish the connection we must drain our pre-posted
login recieve work request before continuing safely with connection
teardown.

Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API")
Cc: &lt;stable@vger.kernel.org&gt; # 4.7+
Reported-by: Amrani, Ram &lt;Ram.Amrani@cavium.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In case we fail to establish the connection we must drain our pre-posted
login recieve work request before continuing safely with connection
teardown.

Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API")
Cc: &lt;stable@vger.kernel.org&gt; # 4.7+
Reported-by: Amrani, Ram &lt;Ram.Amrani@cavium.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srpt: Fix ACL lookup during login</title>
<updated>2018-01-04T03:07:51+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bart.vanassche@wdc.com</email>
</author>
<published>2018-01-03T21:39:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a1ffa4670cb97ae3a4b3e8535d88be5f643f7c3b'/>
<id>a1ffa4670cb97ae3a4b3e8535d88be5f643f7c3b</id>
<content type='text'>
Make sure that the initiator port GUID is stored in ch-&gt;ini_guid.
Note: when initiating a connection sgid and dgid members in struct
sa_path_rec represent the source and destination GIDs. When accepting
a connection however sgid represents the destination GID and dgid the
source GID.

Fixes: commit 2bce1a6d2209 ("IB/srpt: Accept GUIDs as port names")
Signed-off-by: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make sure that the initiator port GUID is stored in ch-&gt;ini_guid.
Note: when initiating a connection sgid and dgid members in struct
sa_path_rec represent the source and destination GIDs. When accepting
a connection however sgid represents the destination GID and dgid the
source GID.

Fixes: commit 2bce1a6d2209 ("IB/srpt: Accept GUIDs as port names")
Signed-off-by: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srpt: Disable RDMA access by the initiator</title>
<updated>2018-01-04T03:07:21+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bart.vanassche@wdc.com</email>
</author>
<published>2018-01-03T21:39:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bec40c26041de61162f7be9d2ce548c756ce0f65'/>
<id>bec40c26041de61162f7be9d2ce548c756ce0f65</id>
<content type='text'>
With the SRP protocol all RDMA operations are initiated by the target.
Since no RDMA operations are initiated by the initiator, do not grant
the initiator permission to submit RDMA reads or writes to the target.

Signed-off-by: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the SRP protocol all RDMA operations are initiated by the target.
Since no RDMA operations are initiated by the initiator, do not grant
the initiator permission to submit RDMA reads or writes to the target.

Signed-off-by: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Fix race condition in neigh creation</title>
<updated>2018-01-02T18:09:05+00:00</updated>
<author>
<name>Erez Shitrit</name>
<email>erezsh@mellanox.com</email>
</author>
<published>2017-12-31T13:33:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16ba3defb8bd01a9464ba4820a487f5b196b455b'/>
<id>16ba3defb8bd01a9464ba4820a487f5b196b455b</id>
<content type='text'>
When using enhanced mode for IPoIB, two threads may execute xmit in
parallel to two different TX queues while the target is the same.
In this case, both of them will add the same neighbor to the path's
neigh link list and we might see the following message:

  list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
  WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70
  ipoib_start_xmit+0x477/0x680 [ib_ipoib]
  dev_hard_start_xmit+0xb9/0x3e0
  sch_direct_xmit+0xf9/0x250
  __qdisc_run+0x176/0x5d0
  __dev_queue_xmit+0x1f5/0xb10
  __dev_queue_xmit+0x55/0xb10

Analysis:
Two SKB are scheduled to be transmitted from two cores.
In ipoib_start_xmit, both gets NULL when calling ipoib_neigh_get.
Two calls to neigh_add_path are made. One thread takes the spin-lock
and calls ipoib_neigh_alloc which creates the neigh structure,
then (after the __path_find) the neigh is added to the path's neigh
link list. When the second thread enters the critical section it also
calls ipoib_neigh_alloc but in this case it gets the already allocated
ipoib_neigh structure, which is already linked to the path's neigh
link list and adds it again to the list. Which beside of triggering
the list, it creates a loop in the linked list. This loop leads to
endless loop inside path_rec_completion.

Solution:
Check list_empty(&amp;neigh-&gt;list) before adding to the list.
Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send"

Fixes: b63b70d87741 ('IPoIB: Use a private hash table for path lookup in xmit path')
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When using enhanced mode for IPoIB, two threads may execute xmit in
parallel to two different TX queues while the target is the same.
In this case, both of them will add the same neighbor to the path's
neigh link list and we might see the following message:

  list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
  WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70
  ipoib_start_xmit+0x477/0x680 [ib_ipoib]
  dev_hard_start_xmit+0xb9/0x3e0
  sch_direct_xmit+0xf9/0x250
  __qdisc_run+0x176/0x5d0
  __dev_queue_xmit+0x1f5/0xb10
  __dev_queue_xmit+0x55/0xb10

Analysis:
Two SKB are scheduled to be transmitted from two cores.
In ipoib_start_xmit, both gets NULL when calling ipoib_neigh_get.
Two calls to neigh_add_path are made. One thread takes the spin-lock
and calls ipoib_neigh_alloc which creates the neigh structure,
then (after the __path_find) the neigh is added to the path's neigh
link list. When the second thread enters the critical section it also
calls ipoib_neigh_alloc but in this case it gets the already allocated
ipoib_neigh structure, which is already linked to the path's neigh
link list and adds it again to the list. Which beside of triggering
the list, it creates a loop in the linked list. This loop leads to
endless loop inside path_rec_completion.

Solution:
Check list_empty(&amp;neigh-&gt;list) before adding to the list.
Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send"

Fixes: b63b70d87741 ('IPoIB: Use a private hash table for path lookup in xmit path')
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush</title>
<updated>2017-12-21T23:06:07+00:00</updated>
<author>
<name>Alex Vesker</name>
<email>valex@mellanox.com</email>
</author>
<published>2017-12-21T15:38:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1f80bd6a6cc8358b81194e1f5fc16449947396ec'/>
<id>1f80bd6a6cc8358b81194e1f5fc16449947396ec</id>
<content type='text'>
The locking order of vlan_rwsem (LOCK A) and then rtnl (LOCK B),
contradicts other flows such as ipoib_open possibly causing a deadlock.
To prevent this deadlock heavy flush is called with RTNL locked and
only then tries to acquire vlan_rwsem.
This deadlock is possible only when there are child interfaces.

[  140.941758] ======================================================
[  140.946276] WARNING: possible circular locking dependency detected
[  140.950950] 4.15.0-rc1+ #9 Tainted: G           O
[  140.954797] ------------------------------------------------------
[  140.959424] kworker/u32:1/146 is trying to acquire lock:
[  140.963450]  (rtnl_mutex){+.+.}, at: [&lt;ffffffffc083516a&gt;] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  140.970006]
but task is already holding lock:
[  140.975141]  (&amp;priv-&gt;vlan_rwsem){++++}, at: [&lt;ffffffffc0834ee1&gt;] __ipoib_ib_dev_flush+0x51/0x4e0 [ib_ipoib]
[  140.982105]
which lock already depends on the new lock.
[  140.990023]
the existing dependency chain (in reverse order) is:
[  140.998650]
-&gt; #1 (&amp;priv-&gt;vlan_rwsem){++++}:
[  141.005276]        down_read+0x4d/0xb0
[  141.009560]        ipoib_open+0xad/0x120 [ib_ipoib]
[  141.014400]        __dev_open+0xcb/0x140
[  141.017919]        __dev_change_flags+0x1a4/0x1e0
[  141.022133]        dev_change_flags+0x23/0x60
[  141.025695]        devinet_ioctl+0x704/0x7d0
[  141.029156]        sock_do_ioctl+0x20/0x50
[  141.032526]        sock_ioctl+0x221/0x300
[  141.036079]        do_vfs_ioctl+0xa6/0x6d0
[  141.039656]        SyS_ioctl+0x74/0x80
[  141.042811]        entry_SYSCALL_64_fastpath+0x1f/0x96
[  141.046891]
-&gt; #0 (rtnl_mutex){+.+.}:
[  141.051701]        lock_acquire+0xd4/0x220
[  141.055212]        __mutex_lock+0x88/0x970
[  141.058631]        __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  141.063160]        __ipoib_ib_dev_flush+0x71/0x4e0 [ib_ipoib]
[  141.067648]        process_one_work+0x1f5/0x610
[  141.071429]        worker_thread+0x4a/0x3f0
[  141.074890]        kthread+0x141/0x180
[  141.078085]        ret_from_fork+0x24/0x30
[  141.081559]

other info that might help us debug this:
[  141.088967]  Possible unsafe locking scenario:
[  141.094280]        CPU0                    CPU1
[  141.097953]        ----                    ----
[  141.101640]   lock(&amp;priv-&gt;vlan_rwsem);
[  141.104771]                                lock(rtnl_mutex);
[  141.109207]                                lock(&amp;priv-&gt;vlan_rwsem);
[  141.114032]   lock(rtnl_mutex);
[  141.116800]
 *** DEADLOCK ***

Fixes: b4b678b06f6e ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop")
Signed-off-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The locking order of vlan_rwsem (LOCK A) and then rtnl (LOCK B),
contradicts other flows such as ipoib_open possibly causing a deadlock.
To prevent this deadlock heavy flush is called with RTNL locked and
only then tries to acquire vlan_rwsem.
This deadlock is possible only when there are child interfaces.

[  140.941758] ======================================================
[  140.946276] WARNING: possible circular locking dependency detected
[  140.950950] 4.15.0-rc1+ #9 Tainted: G           O
[  140.954797] ------------------------------------------------------
[  140.959424] kworker/u32:1/146 is trying to acquire lock:
[  140.963450]  (rtnl_mutex){+.+.}, at: [&lt;ffffffffc083516a&gt;] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  140.970006]
but task is already holding lock:
[  140.975141]  (&amp;priv-&gt;vlan_rwsem){++++}, at: [&lt;ffffffffc0834ee1&gt;] __ipoib_ib_dev_flush+0x51/0x4e0 [ib_ipoib]
[  140.982105]
which lock already depends on the new lock.
[  140.990023]
the existing dependency chain (in reverse order) is:
[  140.998650]
-&gt; #1 (&amp;priv-&gt;vlan_rwsem){++++}:
[  141.005276]        down_read+0x4d/0xb0
[  141.009560]        ipoib_open+0xad/0x120 [ib_ipoib]
[  141.014400]        __dev_open+0xcb/0x140
[  141.017919]        __dev_change_flags+0x1a4/0x1e0
[  141.022133]        dev_change_flags+0x23/0x60
[  141.025695]        devinet_ioctl+0x704/0x7d0
[  141.029156]        sock_do_ioctl+0x20/0x50
[  141.032526]        sock_ioctl+0x221/0x300
[  141.036079]        do_vfs_ioctl+0xa6/0x6d0
[  141.039656]        SyS_ioctl+0x74/0x80
[  141.042811]        entry_SYSCALL_64_fastpath+0x1f/0x96
[  141.046891]
-&gt; #0 (rtnl_mutex){+.+.}:
[  141.051701]        lock_acquire+0xd4/0x220
[  141.055212]        __mutex_lock+0x88/0x970
[  141.058631]        __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  141.063160]        __ipoib_ib_dev_flush+0x71/0x4e0 [ib_ipoib]
[  141.067648]        process_one_work+0x1f5/0x610
[  141.071429]        worker_thread+0x4a/0x3f0
[  141.074890]        kthread+0x141/0x180
[  141.078085]        ret_from_fork+0x24/0x30
[  141.081559]

other info that might help us debug this:
[  141.088967]  Possible unsafe locking scenario:
[  141.094280]        CPU0                    CPU1
[  141.097953]        ----                    ----
[  141.101640]   lock(&amp;priv-&gt;vlan_rwsem);
[  141.104771]                                lock(rtnl_mutex);
[  141.109207]                                lock(&amp;priv-&gt;vlan_rwsem);
[  141.114032]   lock(rtnl_mutex);
[  141.116800]
 *** DEADLOCK ***

Fixes: b4b678b06f6e ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop")
Signed-off-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
