<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband/ulp/ipoib, branch linux-3.18.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler</title>
<updated>2018-09-26T06:33:57+00:00</updated>
<author>
<name>Aaron Knister</name>
<email>aaron.s.knister@nasa.gov</email>
</author>
<published>2018-08-24T12:42:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9778987cf346862c59d6b43666e1c6795682b79d'/>
<id>9778987cf346862c59d6b43666e1c6795682b79d</id>
<content type='text'>
commit 816e846c2eb9129a3e0afa5f920c8bbc71efecaa upstream.

Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh-&gt;queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv-&gt;lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&amp;neigh-&gt;queue) &lt; IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr-&gt;hwaddr);
               spin_lock_irqsave(&amp;priv-&gt;lock, flags);
               __skb_queue_tail(&amp;neigh-&gt;queue, skb);
               spin_unlock_irqrestore(&amp;priv-&gt;lock, flags);
       } else {
               ++dev-&gt;stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Knister &lt;aaron.s.knister@nasa.gov&gt;
Tested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 816e846c2eb9129a3e0afa5f920c8bbc71efecaa upstream.

Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh-&gt;queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv-&gt;lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&amp;neigh-&gt;queue) &lt; IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr-&gt;hwaddr);
               spin_lock_irqsave(&amp;priv-&gt;lock, flags);
               __skb_queue_tail(&amp;neigh-&gt;queue, skb);
               spin_unlock_irqrestore(&amp;priv-&gt;lock, flags);
       } else {
               ++dev-&gt;stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Knister &lt;aaron.s.knister@nasa.gov&gt;
Tested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Fix for potential no-carrier state</title>
<updated>2018-05-30T05:47:31+00:00</updated>
<author>
<name>Alex Estrin</name>
<email>alex.estrin@intel.com</email>
</author>
<published>2018-02-01T18:55:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2ab6382c53d4b15960004c96c469fe12f81db8ca'/>
<id>2ab6382c53d4b15960004c96c469fe12f81db8ca</id>
<content type='text'>
[ Upstream commit 1029361084d18cc270f64dfd39529fafa10cfe01 ]

On reboot SM can program port pkey table before ipoib registered its
event handler, which could result in missing pkey event and leave root
interface with initial pkey value from index 0.

Since OPA port starts with invalid pkey in index 0, root interface will
fail to initialize and stay down with no-carrier flag.

For IB ipoib interface may end up with pkey different from value
opensm put in pkey table idx 0, resulting in connectivity issues
(different mcast groups, for example).

Close the window by calling event handler after registration
to make sure ipoib pkey is in sync with port pkey table.

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Alex Estrin &lt;alex.estrin@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 1029361084d18cc270f64dfd39529fafa10cfe01 ]

On reboot SM can program port pkey table before ipoib registered its
event handler, which could result in missing pkey event and leave root
interface with initial pkey value from index 0.

Since OPA port starts with invalid pkey in index 0, root interface will
fail to initialize and stay down with no-carrier flag.

For IB ipoib interface may end up with pkey different from value
opensm put in pkey table idx 0, resulting in connectivity issues
(different mcast groups, for example).

Close the window by calling event handler after registration
to make sure ipoib pkey is in sync with port pkey table.

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Alex Estrin &lt;alex.estrin@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Avoid memory leak if the SA returns a different DGID</title>
<updated>2018-03-24T09:57:35+00:00</updated>
<author>
<name>Erez Shitrit</name>
<email>erezsh@mellanox.com</email>
</author>
<published>2017-11-14T12:51:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6bfd5a3a3c2599657a78583eb728afc1d8519b58'/>
<id>6bfd5a3a3c2599657a78583eb728afc1d8519b58</id>
<content type='text'>
[ Upstream commit 439000892ee17a9c92f1e4297818790ef8bb4ced ]

The ipoib path database is organized around DGIDs from the LLADDR, but the
SA is free to return a different GID when asked for path. This causes a
bug because the SA's modified DGID is copied into the database key, even
though it is no longer the correct lookup key, causing a memory leak and
other malfunctions.

Ensure the database key does not change after the SA query completes.

Demonstration of the bug is as  follows
ipoib wants to send to GID fe80:0000:0000:0000:0002:c903:00ef:5ee2, it
creates new record in the DB with that gid as a key, and issues a new
request to the SM.
Now, the SM from some reason returns path-record with other SGID (for
example, 2001:0000:0000:0000:0002:c903:00ef:5ee2 that contains the local
subnet prefix) now ipoib will overwrite the current entry with the new
one, and if new request to the original GID arrives ipoib  will not find
it in the DB (was overwritten) and will create new record that in its
turn will also be overwritten by the response from the SM, and so on
till the driver eats all the device memory.

Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 439000892ee17a9c92f1e4297818790ef8bb4ced ]

The ipoib path database is organized around DGIDs from the LLADDR, but the
SA is free to return a different GID when asked for path. This causes a
bug because the SA's modified DGID is copied into the database key, even
though it is no longer the correct lookup key, causing a memory leak and
other malfunctions.

Ensure the database key does not change after the SA query completes.

Demonstration of the bug is as  follows
ipoib wants to send to GID fe80:0000:0000:0000:0002:c903:00ef:5ee2, it
creates new record in the DB with that gid as a key, and issues a new
request to the SM.
Now, the SM from some reason returns path-record with other SGID (for
example, 2001:0000:0000:0000:0002:c903:00ef:5ee2 that contains the local
subnet prefix) now ipoib will overwrite the current entry with the new
one, and if new request to the original GID arrives ipoib  will not find
it in the DB (was overwritten) and will create new record that in its
turn will also be overwritten by the response from the SM, and so on
till the driver eats all the device memory.

Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Update broadcast object if PKey value was changed in index 0</title>
<updated>2018-03-24T09:57:33+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2017-03-19T09:18:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bff38ff05503e2161d6fe2d6e1979db4e9b80e46'/>
<id>bff38ff05503e2161d6fe2d6e1979db4e9b80e46</id>
<content type='text'>
[ Upstream commit 9a9b8112699d78e7f317019b37f377e90023f3ed ]

Update the broadcast address in the priv-&gt;broadcast object when the
Pkey value changes in index 0, otherwise the multicast GID value will
keep the previous value of the PKey, and will not be updated.
This leads to interface state down because the interface will keep the
old PKey value.

For example, in SR-IOV environment, if the PF changes the value of PKey
index 0 for one of the VFs, then the VF receives PKey change event that
triggers heavy flush. This flush calls update_parent_pkey that update the
broadcast object and its relevant members. If in this case the multicast
GID will not be updated, the interface state will be down.

Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9a9b8112699d78e7f317019b37f377e90023f3ed ]

Update the broadcast address in the priv-&gt;broadcast object when the
Pkey value changes in index 0, otherwise the multicast GID value will
keep the previous value of the PKey, and will not be updated.
This leads to interface state down because the interface will keep the
old PKey value.

For example, in SR-IOV environment, if the PF changes the value of PKey
index 0 for one of the VFs, then the VF receives PKey change event that
triggers heavy flush. This flush calls update_parent_pkey that update the
broadcast object and its relevant members. If in this case the multicast
GID will not be updated, the interface state will be down.

Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Change list_del to list_del_init in the tx object</title>
<updated>2017-11-15T09:04:12+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2016-12-28T12:47:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=76fc35d849a1464142af7d628e29eb6dd62570b0'/>
<id>76fc35d849a1464142af7d628e29eb6dd62570b0</id>
<content type='text'>
[ Upstream commit 27d41d29c7f093f6f77843624fbb080c1b4a8b9c ]

Since ipoib_cm_tx_start function and ipoib_cm_tx_reap function
belong to different work queues, they can run in parallel.
In this case if ipoib_cm_tx_reap calls list_del and release the
lock, ipoib_cm_tx_start may acquire it and call list_del_init
on the already deleted object.
Changing list_del to list_del_init in ipoib_cm_tx_reap fixes the problem.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 27d41d29c7f093f6f77843624fbb080c1b4a8b9c ]

Since ipoib_cm_tx_start function and ipoib_cm_tx_reap function
belong to different work queues, they can run in parallel.
In this case if ipoib_cm_tx_reap calls list_del and release the
lock, ipoib_cm_tx_start may acquire it and call list_del_init
on the already deleted object.
Changing list_del to list_del_init in ipoib_cm_tx_reap fixes the problem.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Replace list_del of the neigh-&gt;list with list_del_init</title>
<updated>2017-10-08T08:11:19+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2016-12-28T12:47:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7e4194e7e7f4d744b0de96843a3a645c455d60a9'/>
<id>7e4194e7e7f4d744b0de96843a3a645c455d60a9</id>
<content type='text'>
[ Upstream commit c586071d1dc8227a7182179b8e50ee92cc43f6d2 ]

In order to resolve a situation where a few process delete
the same list element in sequence and cause panic, list_del
is replaced with list_del_init. In this case if the first
process that calls list_del releases the lock before acquiring
it again, other processes who can acquire the lock will call
list_del_init.

Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c586071d1dc8227a7182179b8e50ee92cc43f6d2 ]

In order to resolve a situation where a few process delete
the same list element in sequence and cause panic, list_del
is replaced with list_del_init. In this case if the first
process that calls list_del releases the lock before acquiring
it again, other processes who can acquire the lock will call
list_del_init.

Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: rtnl_unlock can not come after free_netdev</title>
<updated>2017-10-08T08:11:19+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2016-12-28T12:47:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8d8a5bbc3976097e472ab4fbc2ccb5c0ff655b7f'/>
<id>8d8a5bbc3976097e472ab4fbc2ccb5c0ff655b7f</id>
<content type='text'>
[ Upstream commit 89a3987ab7a923c047c6dec008e60ad6f41fac22 ]

The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
The latter function browses the net_todo_list array and completes the
unregistration of all its net_device instances. If we call free_netdev
before rtnl_unlock, then netdev_run_todo call over the freed device causes
panic.
To fix, move rtnl_unlock call before free_netdev call.

Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 89a3987ab7a923c047c6dec008e60ad6f41fac22 ]

The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
The latter function browses the net_todo_list array and completes the
unregistration of all its net_device instances. If we call free_netdev
before rtnl_unlock, then netdev_run_todo call over the freed device causes
panic.
To fix, move rtnl_unlock call before free_netdev call.

Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Yuval Shaia &lt;yuval.shaia@oracle.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Fix deadlock over vlan_mutex</title>
<updated>2017-10-08T08:11:19+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2016-12-28T12:47:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6fc261c1e77d6b3cce0a2bcfa95734b908c30615'/>
<id>6fc261c1e77d6b3cce0a2bcfa95734b908c30615</id>
<content type='text'>
[ Upstream commit 1c3098cdb05207e740715857df7b0998e372f527 ]

This patch fixes Deadlock while executing ipoib_vlan_delete.

The function takes the vlan_rwsem semaphore and calls
unregister_netdevice. The later function calls
ipoib_mcast_stop_thread that cause workqueue flush.

When the queue has one of the ipoib_ib_dev_flush_xxx events,
a deadlock occur because these events also tries to catch the
same vlan_rwsem semaphore.

To fix, unregister_netdevice should be called after releasing
the semaphore.

Fixes: cbbe1efa4972 ("IPoIB: Fix deadlock between ipoib_open() and child interface create")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 1c3098cdb05207e740715857df7b0998e372f527 ]

This patch fixes Deadlock while executing ipoib_vlan_delete.

The function takes the vlan_rwsem semaphore and calls
unregister_netdevice. The later function calls
ipoib_mcast_stop_thread that cause workqueue flush.

When the queue has one of the ipoib_ib_dev_flush_xxx events,
a deadlock occur because these events also tries to catch the
same vlan_rwsem semaphore.

To fix, unregister_netdevice should be called after releasing
the semaphore.

Fixes: cbbe1efa4972 ("IPoIB: Fix deadlock between ipoib_open() and child interface create")
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Reviewed-by: Alex Vesker &lt;valex@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/IPoIB: ibX: failed to create mcg debug file</title>
<updated>2017-05-20T12:18:41+00:00</updated>
<author>
<name>Shamir Rabinovitch</name>
<email>shamir.rabinovitch@oracle.com</email>
</author>
<published>2017-03-29T10:21:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f88282f6669777027e45ff1417967914e3965d82'/>
<id>f88282f6669777027e45ff1417967914e3965d82</id>
<content type='text'>
commit 771a52584096c45e4565e8aabb596eece9d73d61 upstream.

When udev renames the netdev devices, ipoib debugfs entries does not
get renamed. As a result, if subsequent probe of ipoib device reuse the
name then creating a debugfs entry for the new device would fail.

Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part
of ipoib event handling in order to avoid any race condition between these.

Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs)
Signed-off-by: Vijay Kumar &lt;vijay.ac.kumar@oracle.com&gt;
Signed-off-by: Shamir Rabinovitch &lt;shamir.rabinovitch@oracle.com&gt;
Reviewed-by: Mark Bloch &lt;markb@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 771a52584096c45e4565e8aabb596eece9d73d61 upstream.

When udev renames the netdev devices, ipoib debugfs entries does not
get renamed. As a result, if subsequent probe of ipoib device reuse the
name then creating a debugfs entry for the new device would fail.

Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part
of ipoib event handling in order to avoid any race condition between these.

Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs)
Signed-off-by: Vijay Kumar &lt;vijay.ac.kumar@oracle.com&gt;
Signed-off-by: Shamir Rabinovitch &lt;shamir.rabinovitch@oracle.com&gt;
Reviewed-by: Mark Bloch &lt;markb@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Fix deadlock between rmmod and set_mode</title>
<updated>2017-04-18T05:55:49+00:00</updated>
<author>
<name>Feras Daoud</name>
<email>ferasda@mellanox.com</email>
</author>
<published>2016-12-28T12:47:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4161529ef017c8f5f577845a17a2bc64cba93402'/>
<id>4161529ef017c8f5f577845a17a2bc64cba93402</id>
<content type='text'>
commit 0a0007f28304cb9fc87809c86abb80ec71317f20 upstream.

When calling set_mode from sys/fs, the call flow locks the sys/fs lock
first and then tries to lock rtnl_lock (when calling ipoib_set_mod).
On the other hand, the rmmod call flow takes the rtnl_lock first
(when calling unregister_netdev) and then tries to take the sys/fs
lock. Deadlock a-&gt;b, b-&gt;a.

The problem starts when ipoib_set_mod frees it's rtnl_lck and tries
to get it after that.

    set_mod:
    [&lt;ffffffff8104f2bd&gt;] ? check_preempt_curr+0x6d/0x90
    [&lt;ffffffff814fee8e&gt;] __mutex_lock_slowpath+0x13e/0x180
    [&lt;ffffffff81448655&gt;] ? __rtnl_unlock+0x15/0x20
    [&lt;ffffffff814fed2b&gt;] mutex_lock+0x2b/0x50
    [&lt;ffffffff81448675&gt;] rtnl_lock+0x15/0x20
    [&lt;ffffffffa02ad807&gt;] ipoib_set_mode+0x97/0x160 [ib_ipoib]
    [&lt;ffffffffa02b5f5b&gt;] set_mode+0x3b/0x80 [ib_ipoib]
    [&lt;ffffffff8134b840&gt;] dev_attr_store+0x20/0x30
    [&lt;ffffffff811f0fe5&gt;] sysfs_write_file+0xe5/0x170
    [&lt;ffffffff8117b068&gt;] vfs_write+0xb8/0x1a0
    [&lt;ffffffff8117ba81&gt;] sys_write+0x51/0x90
    [&lt;ffffffff8100b0f2&gt;] system_call_fastpath+0x16/0x1b

    rmmod:
    [&lt;ffffffff81279ffc&gt;] ? put_dec+0x10c/0x110
    [&lt;ffffffff8127a2ee&gt;] ? number+0x2ee/0x320
    [&lt;ffffffff814fe6a5&gt;] schedule_timeout+0x215/0x2e0
    [&lt;ffffffff8127cc04&gt;] ? vsnprintf+0x484/0x5f0
    [&lt;ffffffff8127b550&gt;] ? string+0x40/0x100
    [&lt;ffffffff814fe323&gt;] wait_for_common+0x123/0x180
    [&lt;ffffffff81060250&gt;] ? default_wake_function+0x0/0x20
    [&lt;ffffffff8119661e&gt;] ? ifind_fast+0x5e/0xb0
    [&lt;ffffffff814fe43d&gt;] wait_for_completion+0x1d/0x20
    [&lt;ffffffff811f2e68&gt;] sysfs_addrm_finish+0x228/0x270
    [&lt;ffffffff811f2fb3&gt;] sysfs_remove_dir+0xa3/0xf0
    [&lt;ffffffff81273f66&gt;] kobject_del+0x16/0x40
    [&lt;ffffffff8134cd14&gt;] device_del+0x184/0x1e0
    [&lt;ffffffff8144e59b&gt;] netdev_unregister_kobject+0xab/0xc0
    [&lt;ffffffff8143c05e&gt;] rollback_registered+0xae/0x130
    [&lt;ffffffff8143c102&gt;] unregister_netdevice+0x22/0x70
    [&lt;ffffffff8143c16e&gt;] unregister_netdev+0x1e/0x30
    [&lt;ffffffffa02a91b0&gt;] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
    [&lt;ffffffffa01ed95f&gt;] ib_unregister_device+0x4f/0x100 [ib_core]
    [&lt;ffffffffa021f5e1&gt;] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
    [&lt;ffffffffa01ab771&gt;] mlx4_remove_device+0x71/0x90 [mlx4_core]

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0a0007f28304cb9fc87809c86abb80ec71317f20 upstream.

When calling set_mode from sys/fs, the call flow locks the sys/fs lock
first and then tries to lock rtnl_lock (when calling ipoib_set_mod).
On the other hand, the rmmod call flow takes the rtnl_lock first
(when calling unregister_netdev) and then tries to take the sys/fs
lock. Deadlock a-&gt;b, b-&gt;a.

The problem starts when ipoib_set_mod frees it's rtnl_lck and tries
to get it after that.

    set_mod:
    [&lt;ffffffff8104f2bd&gt;] ? check_preempt_curr+0x6d/0x90
    [&lt;ffffffff814fee8e&gt;] __mutex_lock_slowpath+0x13e/0x180
    [&lt;ffffffff81448655&gt;] ? __rtnl_unlock+0x15/0x20
    [&lt;ffffffff814fed2b&gt;] mutex_lock+0x2b/0x50
    [&lt;ffffffff81448675&gt;] rtnl_lock+0x15/0x20
    [&lt;ffffffffa02ad807&gt;] ipoib_set_mode+0x97/0x160 [ib_ipoib]
    [&lt;ffffffffa02b5f5b&gt;] set_mode+0x3b/0x80 [ib_ipoib]
    [&lt;ffffffff8134b840&gt;] dev_attr_store+0x20/0x30
    [&lt;ffffffff811f0fe5&gt;] sysfs_write_file+0xe5/0x170
    [&lt;ffffffff8117b068&gt;] vfs_write+0xb8/0x1a0
    [&lt;ffffffff8117ba81&gt;] sys_write+0x51/0x90
    [&lt;ffffffff8100b0f2&gt;] system_call_fastpath+0x16/0x1b

    rmmod:
    [&lt;ffffffff81279ffc&gt;] ? put_dec+0x10c/0x110
    [&lt;ffffffff8127a2ee&gt;] ? number+0x2ee/0x320
    [&lt;ffffffff814fe6a5&gt;] schedule_timeout+0x215/0x2e0
    [&lt;ffffffff8127cc04&gt;] ? vsnprintf+0x484/0x5f0
    [&lt;ffffffff8127b550&gt;] ? string+0x40/0x100
    [&lt;ffffffff814fe323&gt;] wait_for_common+0x123/0x180
    [&lt;ffffffff81060250&gt;] ? default_wake_function+0x0/0x20
    [&lt;ffffffff8119661e&gt;] ? ifind_fast+0x5e/0xb0
    [&lt;ffffffff814fe43d&gt;] wait_for_completion+0x1d/0x20
    [&lt;ffffffff811f2e68&gt;] sysfs_addrm_finish+0x228/0x270
    [&lt;ffffffff811f2fb3&gt;] sysfs_remove_dir+0xa3/0xf0
    [&lt;ffffffff81273f66&gt;] kobject_del+0x16/0x40
    [&lt;ffffffff8134cd14&gt;] device_del+0x184/0x1e0
    [&lt;ffffffff8144e59b&gt;] netdev_unregister_kobject+0xab/0xc0
    [&lt;ffffffff8143c05e&gt;] rollback_registered+0xae/0x130
    [&lt;ffffffff8143c102&gt;] unregister_netdevice+0x22/0x70
    [&lt;ffffffff8143c16e&gt;] unregister_netdev+0x1e/0x30
    [&lt;ffffffffa02a91b0&gt;] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
    [&lt;ffffffffa01ed95f&gt;] ib_unregister_device+0x4f/0x100 [ib_core]
    [&lt;ffffffffa021f5e1&gt;] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
    [&lt;ffffffffa01ab771&gt;] mlx4_remove_device+0x71/0x90 [mlx4_core]

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
