<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband, branch v4.4.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net/mlx5_core: Fix trimming down IRQ number</title>
<updated>2016-01-31T19:29:01+00:00</updated>
<author>
<name>Doron Tsur</name>
<email>doront@mellanox.com</email>
</author>
<published>2016-01-17T09:25:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ef3ceb4e78d12e1f1a81f6ebb88986e6344c7c8'/>
<id>9ef3ceb4e78d12e1f1a81f6ebb88986e6344c7c8</id>
<content type='text'>
[ Upstream commit 0b6e26ce89391327d955a756a7823272238eb867 ]

With several ConnectX-4 cards installed on a server, one may receive
irqn &gt; 255 from the kernel API, which we mistakenly trim to 8bit.

This causes EQ creation failure with the following stack trace:
[&lt;ffffffff812a11f4&gt;] dump_stack+0x48/0x64
[&lt;ffffffff810ace21&gt;] __setup_irq+0x3a1/0x4f0
[&lt;ffffffff810ad7e0&gt;] request_threaded_irq+0x120/0x180
[&lt;ffffffffa0923660&gt;] ? mlx5_eq_int+0x450/0x450 [mlx5_core]
[&lt;ffffffffa0922f64&gt;] mlx5_create_map_eq+0x1e4/0x2b0 [mlx5_core]
[&lt;ffffffffa091de01&gt;] alloc_comp_eqs+0xb1/0x180 [mlx5_core]
[&lt;ffffffffa091ea99&gt;] mlx5_dev_init+0x5e9/0x6e0 [mlx5_core]
[&lt;ffffffffa091ec29&gt;] init_one+0x99/0x1c0 [mlx5_core]
[&lt;ffffffff812e2afc&gt;] local_pci_probe+0x4c/0xa0

Fixing it by changing of the irqn type from u8 to unsigned int to
support values &gt; 255

Fixes: 61d0e73e0a5a ('net/mlx5_core: Use the the real irqn in eq-&gt;irqn')
Reported-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Doron Tsur &lt;doront@mellanox.com&gt;
Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0b6e26ce89391327d955a756a7823272238eb867 ]

With several ConnectX-4 cards installed on a server, one may receive
irqn &gt; 255 from the kernel API, which we mistakenly trim to 8bit.

This causes EQ creation failure with the following stack trace:
[&lt;ffffffff812a11f4&gt;] dump_stack+0x48/0x64
[&lt;ffffffff810ace21&gt;] __setup_irq+0x3a1/0x4f0
[&lt;ffffffff810ad7e0&gt;] request_threaded_irq+0x120/0x180
[&lt;ffffffffa0923660&gt;] ? mlx5_eq_int+0x450/0x450 [mlx5_core]
[&lt;ffffffffa0922f64&gt;] mlx5_create_map_eq+0x1e4/0x2b0 [mlx5_core]
[&lt;ffffffffa091de01&gt;] alloc_comp_eqs+0xb1/0x180 [mlx5_core]
[&lt;ffffffffa091ea99&gt;] mlx5_dev_init+0x5e9/0x6e0 [mlx5_core]
[&lt;ffffffffa091ec29&gt;] init_one+0x99/0x1c0 [mlx5_core]
[&lt;ffffffff812e2afc&gt;] local_pci_probe+0x4c/0xa0

Fixing it by changing of the irqn type from u8 to unsigned int to
support values &gt; 255

Fixes: 61d0e73e0a5a ('net/mlx5_core: Use the the real irqn in eq-&gt;irqn')
Reported-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Doron Tsur &lt;doront@mellanox.com&gt;
Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ocrdma: Depend on async link events from CNA</title>
<updated>2015-12-28T16:45:54+00:00</updated>
<author>
<name>Devesh Sharma</name>
<email>devesh.sharma@avagotech.com</email>
</author>
<published>2015-12-24T18:14:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=10a214dc996236e6547b84fb5ca007316b30c2e6'/>
<id>10a214dc996236e6547b84fb5ca007316b30c2e6</id>
<content type='text'>
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.

The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.

A. be2net is sending administrative open/close event to ocrdma holding
   device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
   So sequence of locks is rtnl_lock---&gt; device_list lock

B.  When new ocrdma roce device gets registered, infiniband stack now
    takes rtnl_lock in ib_register_device() in GID initialization routines.
    So sequence of locks in this path is device_list lock ---&gt; rtnl_lock.

This improper locking sequence causes deadlock.

With this patch we stop using administrative open and close events
injected by be2net driver. These events were used to dispatch PORT_ACTIVE
and PORT_ERROR events to the IB-stack. This patch implements a logic
to receive async-link-events generated from CNA whenever link-state-change
is detected. Now on, these async-events will be used to dispatch
PORT_ACTIVE and PORT_ERROR events to IB-stack.

Depending on async-events from CNA removes the need to hold device-list-mutex
and thus breaks the busy-wait scenario.

Reported-by: Doug Ledford &lt;dledford@redhat.com&gt;
CC: Sathya Perla &lt;sathya.perla@avagotech.com&gt;
Signed-off-by: Padmanabh Ratnakar &lt;padmanabh.ratnakar@avagotech.com&gt;
Signed-off-by: Selvin Xavier &lt;selvin.xavier@avagotech.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.

The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.

A. be2net is sending administrative open/close event to ocrdma holding
   device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
   So sequence of locks is rtnl_lock---&gt; device_list lock

B.  When new ocrdma roce device gets registered, infiniband stack now
    takes rtnl_lock in ib_register_device() in GID initialization routines.
    So sequence of locks in this path is device_list lock ---&gt; rtnl_lock.

This improper locking sequence causes deadlock.

With this patch we stop using administrative open and close events
injected by be2net driver. These events were used to dispatch PORT_ACTIVE
and PORT_ERROR events to the IB-stack. This patch implements a logic
to receive async-link-events generated from CNA whenever link-state-change
is detected. Now on, these async-events will be used to dispatch
PORT_ACTIVE and PORT_ERROR events to IB-stack.

Depending on async-events from CNA removes the need to hold device-list-mutex
and thus breaks the busy-wait scenario.

Reported-by: Doug Ledford &lt;dledford@redhat.com&gt;
CC: Sathya Perla &lt;sathya.perla@avagotech.com&gt;
Signed-off-by: Padmanabh Ratnakar &lt;padmanabh.ratnakar@avagotech.com&gt;
Signed-off-by: Selvin Xavier &lt;selvin.xavier@avagotech.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ocrdma: Dispatch only port event when port state changes</title>
<updated>2015-12-28T16:45:54+00:00</updated>
<author>
<name>Devesh Sharma</name>
<email>devesh.sharma@avagotech.com</email>
</author>
<published>2015-12-24T18:14:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=36ac0db0dbf7081afe4137d444ef85614213b8eb'/>
<id>36ac0db0dbf7081afe4137d444ef85614213b8eb</id>
<content type='text'>
Dispatch only port event to IB stack when port state changes.
Don't explicitly modify qps to error. Let application listen to
port events on async event queue or let QP fail with retry-exceeded
completion error.

Signed-off-by: Padmanabh Ratnakar &lt;padmanabh.ratnakar@avagotech.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dispatch only port event to IB stack when port state changes.
Don't explicitly modify qps to error. Let application listen to
port events on async event queue or let QP fail with retry-exceeded
completion error.

Signed-off-by: Padmanabh Ratnakar &lt;padmanabh.ratnakar@avagotech.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/ocrdma: Fix vlan-id assignment in qp parameters</title>
<updated>2015-12-28T16:45:54+00:00</updated>
<author>
<name>Devesh Sharma</name>
<email>devesh.sharma@avagotech.com</email>
</author>
<published>2015-12-24T18:14:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c6002d5602ab36c19ef4fe0e20ecfa28aaabf028'/>
<id>c6002d5602ab36c19ef4fe0e20ecfa28aaabf028</id>
<content type='text'>
vlan-id is wrongly getting as 0 when PFC is enabled.
Set vlan-id configured by user in QP parameters.
In case vlan interface is not used, flash a warning to
user to configure vlan and assign vlan-id as 0 in qp params.

Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution')
Cc: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vlan-id is wrongly getting as 0 when PFC is enabled.
Set vlan-id configured by user in QP parameters.
In case vlan interface is not used, flash a warning to
user to configure vlan and assign vlan-id as 0 in qp params.

Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution')
Cc: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Devesh Sharma &lt;devesh.sharma@avagotech.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq</title>
<updated>2015-12-23T04:23:34+00:00</updated>
<author>
<name>Wengang Wang</name>
<email>wen.gang.wang@oracle.com</email>
</author>
<published>2015-12-17T02:54:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=df4176677fdf7ac0c5748083eb7a5b269fb1e156'/>
<id>df4176677fdf7ac0c5748083eb7a5b269fb1e156</id>
<content type='text'>
Commit 0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2 uses vmalloc for WR buffers
when needed and uses kvfree to free the buffers. It missed changing kfree
to kvfree in mlx4_ib_destroy_srq().

Reported-by: Matthew Finaly &lt;matt@Mellanox.com&gt;
Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2 uses vmalloc for WR buffers
when needed and uses kvfree to free the buffers. It missed changing kfree
to kvfree in mlx4_ib_destroy_srq().

Reported-by: Matthew Finaly &lt;matt@Mellanox.com&gt;
Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/cma: cma_match_net_dev needs to take into account port_num</title>
<updated>2015-12-23T04:22:50+00:00</updated>
<author>
<name>Matan Barak</name>
<email>matanb@mellanox.com</email>
</author>
<published>2015-12-21T15:01:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fac51590c1a077809984139e9bb9e06ed366f219'/>
<id>fac51590c1a077809984139e9bb9e06ed366f219</id>
<content type='text'>
Previously, cma_match_net_dev called cma_protocol_roce which
tried to verify that the IB device uses RoCE protocol. However,
if rdma_id wasn't bound to a port, then the check would occur
against the first port of the device without regard to whether
that port was even of the same type as the type of port the
incoming packet was received on.

Fix this by passing the port of the request and only checking
against the same port of the device.

Reported-by: Or Gerlitz &lt;gerlitz.or@gmail.com&gt;
Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')
Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, cma_match_net_dev called cma_protocol_roce which
tried to verify that the IB device uses RoCE protocol. However,
if rdma_id wasn't bound to a port, then the check would occur
against the first port of the device without regard to whether
that port was even of the same type as the type of port the
incoming packet was received on.

Fix this by passing the port of the request and only checking
against the same port of the device.

Reported-by: Or Gerlitz &lt;gerlitz.or@gmail.com&gt;
Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')
Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Postpone remove_keys under knowledge of coming preemption</title>
<updated>2015-12-08T21:55:31+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@mellanox.com</email>
</author>
<published>2015-10-21T06:21:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ab5cdc31630c7596d81ca8fbe7d695f10666f39b'/>
<id>ab5cdc31630c7596d81ca8fbe7d695f10666f39b</id>
<content type='text'>
The remove_keys() logic is performed as garbage collection task. Such
task is intended to be run when no other active processes are running.

The need_resched() will return TRUE if there are user tasks to be
activated in near future.

In such case, we don't execute remove_keys() and postpone
the garbage collection work to try to run in next cycle,
in order to free CPU resources to other tasks.

The possible pseudo-code to trigger such scenario:
1. Allocate a lot of MR to fill the cache above the limit.
2. Wait a small amount of time "to calm" the system.
3. Start CPU extensive operations on multi-node cluster.
4. Expect performance degradation during MR cache shrink operation.

Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Eli Cohen &lt;eli@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The remove_keys() logic is performed as garbage collection task. Such
task is intended to be run when no other active processes are running.

The need_resched() will return TRUE if there are user tasks to be
activated in near future.

In such case, we don't execute remove_keys() and postpone
the garbage collection work to try to run in next cycle,
in order to free CPU resources to other tasks.

The possible pseudo-code to trigger such scenario:
1. Allocate a lot of MR to fill the cache above the limit.
2. Wait a small amount of time "to calm" the system.
3. Start CPU extensive operations on multi-node cluster.
4. Expect performance degradation during MR cache shrink operation.

Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Eli Cohen &lt;eli@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx4: Use vmalloc for WR buffers when needed</title>
<updated>2015-12-08T21:48:10+00:00</updated>
<author>
<name>Wengang Wang</name>
<email>wen.gang.wang@oracle.com</email>
</author>
<published>2015-10-08T05:27:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2'/>
<id>0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2</id>
<content type='text'>
There are several hits that WR buffer allocation(kmalloc) failed.
It failed at order 3 and/or 4 contigous pages allocation. At the same time
there are actually 100MB+ free memory but well fragmented.
So try vmalloc when kmalloc failed.

Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Acked-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are several hits that WR buffer allocation(kmalloc) failed.
It failed at order 3 and/or 4 contigous pages allocation. At the same time
there are actually 100MB+ free memory but well fragmented.
So try vmalloc when kmalloc failed.

Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Acked-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iser-target: Remove explicit mlx4 work-around</title>
<updated>2015-12-08T18:01:11+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagig@mellanox.com</email>
</author>
<published>2015-10-28T11:28:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f1a47d37fbc65bc49d32f23ab61c3e09fefeab73'/>
<id>f1a47d37fbc65bc49d32f23ab61c3e09fefeab73</id>
<content type='text'>
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.

Signed-off-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Reviewed-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.

Signed-off-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Reviewed-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mlx4: Expose correct max_sge_rd limit</title>
<updated>2015-12-08T17:42:44+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagig@mellanox.com</email>
</author>
<published>2015-10-28T11:28:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a5e14ba334e202c58e45ef47414ec94c585c1a8c'/>
<id>a5e14ba334e202c58e45ef47414ec94c585c1a8c</id>
<content type='text'>
mlx4 devices (ConnectX-2, ConnectX-3) has a limitation
where rdma read work queue entries cannot exceed 512 bytes.
A rdma_read wqe needs to fit in 512 bytes:
- wqe control segment (16 bytes)
- rdma segment (16 bytes)
- scatter elements (16 bytes each)

So max_sge_rd should be: (512 - 16 - 16) / 16 = 30.

Signed-off-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Reviewed-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
mlx4 devices (ConnectX-2, ConnectX-3) has a limitation
where rdma read work queue entries cannot exceed 512 bytes.
A rdma_read wqe needs to fit in 512 bytes:
- wqe control segment (16 bytes)
- rdma segment (16 bytes)
- scatter elements (16 bytes each)

So max_sge_rd should be: (512 - 16 - 16) / 16 = 30.

Signed-off-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Reviewed-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
