<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/infiniband/ulp, branch v5.9-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq</title>
<updated>2020-07-29T17:26:53+00:00</updated>
<author>
<name>Jack Wang</name>
<email>jinpu.wang@cloud.ionos.com</email>
</author>
<published>2020-07-24T11:15:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=03ed5a8cda659e3c71d106b0dd4ce6520e4dcd6e'/>
<id>03ed5a8cda659e3c71d106b0dd4ce6520e4dcd6e</id>
<content type='text'>
lockdep triggers a warning from time to time when running a regression
test:

 rnbd_client L685: &lt;/dev/nullb0@bla&gt; Device disconnected.
 rnbd_client L1756: Unloading module

 workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
 WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130

The root cause is workqueue core expect flushing should not be done for a
!WQ_MEM_RECLAIM wq from a WQ_MEM_RECLAIM workqueue.

In above case ib_addr workqueue without WQ_MEM_RECLAIM, but rtrs_wq
WQ_MEM_RECLAIM.

To avoid the warning, remove the WQ_MEM_RECLAIM flag.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com
Signed-off-by: Jack Wang &lt;jinpu.wang@cloud.ionos.com&gt;
Signed-off-by: Md Haris Iqbal &lt;haris.iqbal@cloud.ionos.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
lockdep triggers a warning from time to time when running a regression
test:

 rnbd_client L685: &lt;/dev/nullb0@bla&gt; Device disconnected.
 rnbd_client L1756: Unloading module

 workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
 WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130

The root cause is workqueue core expect flushing should not be done for a
!WQ_MEM_RECLAIM wq from a WQ_MEM_RECLAIM workqueue.

In above case ib_addr workqueue without WQ_MEM_RECLAIM, but rtrs_wq
WQ_MEM_RECLAIM.

To avoid the warning, remove the WQ_MEM_RECLAIM flag.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com
Signed-off-by: Jack Wang &lt;jinpu.wang@cloud.ionos.com&gt;
Signed-off-by: Md Haris Iqbal &lt;haris.iqbal@cloud.ionos.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting</title>
<updated>2020-07-29T17:26:53+00:00</updated>
<author>
<name>Danil Kipnis</name>
<email>danil.kipnis@cloud.ionos.com</email>
</author>
<published>2020-07-24T11:15:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09e0dbbeed82e35ce2cd21e086a6fac934163e2a'/>
<id>09e0dbbeed82e35ce2cd21e086a6fac934163e2a</id>
<content type='text'>
In order to avoid all the clients to start reconnecting at the same time
schedule the reconnect dwork with a random jitter of +[0,8] seconds.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com
Signed-off-by: Danil Kipnis &lt;danil.kipnis@cloud.ionos.com&gt;
Signed-off-by: Md Haris Iqbal &lt;haris.iqbal@cloud.ionos.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to avoid all the clients to start reconnecting at the same time
schedule the reconnect dwork with a random jitter of +[0,8] seconds.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com
Signed-off-by: Danil Kipnis &lt;danil.kipnis@cloud.ionos.com&gt;
Signed-off-by: Md Haris Iqbal &lt;haris.iqbal@cloud.ionos.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srpt: use new shared CQ mechanism</title>
<updated>2020-07-29T12:10:32+00:00</updated>
<author>
<name>Yamin Friedman</name>
<email>yaminf@mellanox.com</email>
</author>
<published>2020-07-22T13:56:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c804af2c1d3152c0cf877eeb50d60c2d49ac0cf0'/>
<id>c804af2c1d3152c0cf877eeb50d60c2d49ac0cf0</id>
<content type='text'>
Have the driver use shared CQs provided by the rdma core driver.  This
provides the advantage of improved efficiency handling interrupts.

Link: https://lore.kernel.org/r/20200722135629.49467-3-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Reviewed-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Have the driver use shared CQs provided by the rdma core driver.  This
provides the advantage of improved efficiency handling interrupts.

Link: https://lore.kernel.org/r/20200722135629.49467-3-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Reviewed-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/isert: use new shared CQ mechanism</title>
<updated>2020-07-29T12:10:31+00:00</updated>
<author>
<name>Yamin Friedman</name>
<email>yaminf@mellanox.com</email>
</author>
<published>2020-07-22T13:56:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6e663072333581364a9833a8bc895c699fb85ec'/>
<id>c6e663072333581364a9833a8bc895c699fb85ec</id>
<content type='text'>
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed.

Now there is no reason to allocate very large CQs when the driver is
loaded while gaining the advantage of shared CQs. Previously when a single
connection was opened a CQ was opened for every core with enough space for
eight connections, this is a very large overhead that in most cases will
not be utilized.

Link: https://lore.kernel.org/r/20200722135629.49467-2-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed.

Now there is no reason to allocate very large CQs when the driver is
loaded while gaining the advantage of shared CQs. Previously when a single
connection was opened a CQ was opened for every core with enough space for
eight connections, this is a very large overhead that in most cases will
not be utilized.

Link: https://lore.kernel.org/r/20200722135629.49467-2-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/iser: use new shared CQ mechanism</title>
<updated>2020-07-29T12:10:31+00:00</updated>
<author>
<name>Yamin Friedman</name>
<email>yaminf@mellanox.com</email>
</author>
<published>2020-07-22T13:56:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d56a7852ec4d194aa07e45adf5586343a4ed0b59'/>
<id>d56a7852ec4d194aa07e45adf5586343a4ed0b59</id>
<content type='text'>
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed. Now
there is no reason to allocate very large CQs when the driver is loaded
while gaining the advantage of shared CQs.

Link: https://lore.kernel.org/r/20200722135629.49467-1-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Acked-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed. Now
there is no reason to allocate very large CQs when the driver is loaded
while gaining the advantage of shared CQs.

Link: https://lore.kernel.org/r/20200722135629.49467-1-maxg@mellanox.com
Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Acked-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/isert: allocate RW ctxs according to max IO size</title>
<updated>2020-07-16T17:23:22+00:00</updated>
<author>
<name>Max Gurtovoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2020-07-08T09:19:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=317000b926b07c07e7f6c4032bd3344639f5eb4c'/>
<id>317000b926b07c07e7f6c4032bd3344639f5eb4c</id>
<content type='text'>
Current iSER target code allocates MR pool budget based on queue size.
Since there is no handshake between iSER initiator and target on max IO
size, we'll set the iSER target to support upto 16MiB IO operations and
allocate the correct number of RDMA ctxs according to the factor of MR's
per IO operation. This would guarantee sufficient size of the MR pool for
the required IO queue depth and IO size.

Link: https://lore.kernel.org/r/20200708091908.162263-1-maxg@mellanox.com
Reported-by: Krishnamraju Eraparaju &lt;krishna2@chelsio.com&gt;
Tested-by: Krishnamraju Eraparaju &lt;krishna2@chelsio.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Acked-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current iSER target code allocates MR pool budget based on queue size.
Since there is no handshake between iSER initiator and target on max IO
size, we'll set the iSER target to support upto 16MiB IO operations and
allocate the correct number of RDMA ctxs according to the factor of MR's
per IO operation. This would guarantee sufficient size of the MR pool for
the required IO queue depth and IO size.

Link: https://lore.kernel.org/r/20200708091908.162263-1-maxg@mellanox.com
Reported-by: Krishnamraju Eraparaju &lt;krishna2@chelsio.com&gt;
Tested-by: Krishnamraju Eraparaju &lt;krishna2@chelsio.com&gt;
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Acked-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'mlx5_ipoib_qpn' into rdma.git for-next</title>
<updated>2020-07-06T17:29:58+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2020-07-06T17:29:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f11f3f76c7264ca6a0b7c495800879e9f73d0bf7'/>
<id>f11f3f76c7264ca6a0b7c495800879e9f73d0bf7</id>
<content type='text'>
Michael Guralnik says:

====================
This series handles IPoIB child interface creation with setting
interface's HW address.

In current implementation, lladdr requested by user is ignored and
overwritten. Child interface gets the same GID as the parent interface and
a QP number which is assigned by the underlying drivers.

In this series we fix this behavior so that user's requested address is
assigned to the newly created interface.

As specific QP number request is not supported for all vendors, QP number
requested by user will still be overwritten when this is not supported.

Behavior of creation of child interfaces through the sysfs mechanism or
without specifying a requested address, stays the same.
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_ipoib_qpn':
  RDMA/ipoib: Handle user-supplied address when creating child
  net/mlx5: Enable QP number request when creating IPoIB underlay QP

Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Michael Guralnik says:

====================
This series handles IPoIB child interface creation with setting
interface's HW address.

In current implementation, lladdr requested by user is ignored and
overwritten. Child interface gets the same GID as the parent interface and
a QP number which is assigned by the underlying drivers.

In this series we fix this behavior so that user's requested address is
assigned to the newly created interface.

As specific QP number request is not supported for all vendors, QP number
requested by user will still be overwritten when this is not supported.

Behavior of creation of child interfaces through the sysfs mechanism or
without specifying a requested address, stays the same.
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_ipoib_qpn':
  RDMA/ipoib: Handle user-supplied address when creating child
  net/mlx5: Enable QP number request when creating IPoIB underlay QP

Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ipoib: Handle user-supplied address when creating child</title>
<updated>2020-07-06T15:57:58+00:00</updated>
<author>
<name>Michael Guralnik</name>
<email>michaelgur@mellanox.com</email>
</author>
<published>2020-06-23T11:01:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=87fb5c1ccb856838f7f75b1cc85ed0691ee8eb79'/>
<id>87fb5c1ccb856838f7f75b1cc85ed0691ee8eb79</id>
<content type='text'>
Use the address supplied by user when creating a child interface.

Previously, the address requested by the user was ignored and overridden
with parent's GID and the random QP number assigned to the child.

Link: https://lore.kernel.org/r/20200623110105.1225750-3-leon@kernel.org
Signed-off-by: Michael Guralnik &lt;michaelgur@mellanox.com&gt;
Reviewed-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the address supplied by user when creating a child interface.

Previously, the address requested by the user was ignored and overridden
with parent's GID and the random QP number assigned to the child.

Link: https://lore.kernel.org/r/20200623110105.1225750-3-leon@kernel.org
Signed-off-by: Michael Guralnik &lt;michaelgur@mellanox.com&gt;
Reviewed-by: Feras Daoud &lt;ferasda@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()</title>
<updated>2020-07-02T13:46:06+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2020-06-25T17:42:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=65936bf25f90fe440bb2d11624c7d10fab266639'/>
<id>65936bf25f90fe440bb2d11624c7d10fab266639</id>
<content type='text'>
ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that
the only time flush_workqueue() can be called is if it also clears
IPOIB_FLAG_OPER_UP.

Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky
enough, and lockdep doesn't help us to find it early:

          CPU0               CPU1          CPU2
   __ipoib_ib_dev_flush()
      down_read(vlan_rwsem)

                         ipoib_vlan_add()
                           rtnl_trylock()
                           down_write(vlan_rwsem)

				      ipoib_mcast_carrier_on_task()
					 while (!rtnl_trylock())
					      msleep(20);

      ipoib_flush_ah()
	flush_workqueue(priv-&gt;wq)

Clean up the ah_reaper related functions and lifecycle to make sense:

 - Start/Stop of the reaper should only be done in open/stop NDOs, not in
   any other places

 - cancel and flush of the reaper should only happen in the stop NDO.
   cancel is only functional when combined with IPOIB_STOP_REAPER.

 - Non-stop places were flushing the AH's just need to flush out dead AH's
   synchronously and ignore the background task completely. It is fully
   locked and harmless to leave running.

Which ultimately fixes the ABBA deadlock by removing the unnecessary
flush_workqueue() from the problematic place under the vlan_rwsem.

Fixes: efc82eeeae4e ("IB/ipoib: No longer use flush as a parameter")
Link: https://lore.kernel.org/r/20200625174219.290842-1-kamalheib1@gmail.com
Reported-by: Kamal Heib &lt;kheib@redhat.com&gt;
Tested-by: Kamal Heib &lt;kheib@redhat.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that
the only time flush_workqueue() can be called is if it also clears
IPOIB_FLAG_OPER_UP.

Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky
enough, and lockdep doesn't help us to find it early:

          CPU0               CPU1          CPU2
   __ipoib_ib_dev_flush()
      down_read(vlan_rwsem)

                         ipoib_vlan_add()
                           rtnl_trylock()
                           down_write(vlan_rwsem)

				      ipoib_mcast_carrier_on_task()
					 while (!rtnl_trylock())
					      msleep(20);

      ipoib_flush_ah()
	flush_workqueue(priv-&gt;wq)

Clean up the ah_reaper related functions and lifecycle to make sense:

 - Start/Stop of the reaper should only be done in open/stop NDOs, not in
   any other places

 - cancel and flush of the reaper should only happen in the stop NDO.
   cancel is only functional when combined with IPOIB_STOP_REAPER.

 - Non-stop places were flushing the AH's just need to flush out dead AH's
   synchronously and ignore the background task completely. It is fully
   locked and harmless to leave running.

Which ultimately fixes the ABBA deadlock by removing the unnecessary
flush_workqueue() from the problematic place under the vlan_rwsem.

Fixes: efc82eeeae4e ("IB/ipoib: No longer use flush as a parameter")
Link: https://lore.kernel.org/r/20200625174219.290842-1-kamalheib1@gmail.com
Reported-by: Kamal Heib &lt;kheib@redhat.com&gt;
Tested-by: Kamal Heib &lt;kheib@redhat.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ipoib: Return void from ipoib_ib_dev_stop()</title>
<updated>2020-06-24T11:52:31+00:00</updated>
<author>
<name>Kamal Heib</name>
<email>kamalheib1@gmail.com</email>
</author>
<published>2020-06-23T10:52:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=95a5631f6c9f3045f26245e6045244652204dfdb'/>
<id>95a5631f6c9f3045f26245e6045244652204dfdb</id>
<content type='text'>
The return value from ipoib_ib_dev_stop() is always 0 - change it to be
void.

Link: https://lore.kernel.org/r/20200623105236.18683-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib &lt;kamalheib1@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The return value from ipoib_ib_dev_stop() is always 0 - change it to be
void.

Link: https://lore.kernel.org/r/20200623105236.18683-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib &lt;kamalheib1@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
