<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband, branch v5.2.1</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>RDMA/efa: Handle mmap insertions overflow</title>
<updated>2019-06-18T20:27:24+00:00</updated>
<author>
<name>Gal Pressman</name>
<email>galpress@amazon.com</email>
</author>
<published>2019-06-18T13:07:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7a5834e456f7fb3eca9b63af2a6bc7f460ae482f'/>
<id>7a5834e456f7fb3eca9b63af2a6bc7f460ae482f</id>
<content type='text'>
When inserting a new mmap entry to the xarray we should check for
'mmap_page' overflow as it is limited to 32 bits.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When inserting a new mmap entry to the xarray we should check for
'mmap_page' overflow as it is limited to 32 bits.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/efa: Fix success return value in case of error</title>
<updated>2019-06-18T01:35:21+00:00</updated>
<author>
<name>Gal Pressman</name>
<email>galpress@amazon.com</email>
</author>
<published>2019-06-12T07:28:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=529254340c7f16d59b928e36568597c603bae917'/>
<id>529254340c7f16d59b928e36568597c603bae917</id>
<content type='text'>
Existing code would mistakenly return success in case of error instead
of a proper return value.

Fixes: e9c6c5373088 ("RDMA/efa: Add common command handlers")
Reviewed-by: Firas JahJah &lt;firasj@amazon.com&gt;
Reviewed-by: Yossi Leybovich &lt;sleybo@amazon.com&gt;
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Existing code would mistakenly return success in case of error instead
of a proper return value.

Fixes: e9c6c5373088 ("RDMA/efa: Add common command handlers")
Reviewed-by: Firas JahJah &lt;firasj@amazon.com&gt;
Reviewed-by: Yossi Leybovich &lt;sleybo@amazon.com&gt;
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Handle port down properly in pio</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:33:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=942a899335707fc9cfc97cb382a60734b2ff4e03'/>
<id>942a899335707fc9cfc97cb382a60734b2ff4e03</id>
<content type='text'>
The call to sc_buffer_alloc currently returns NULL (no buffer) or
a buffer descriptor.

There is a third case when the port is down.  Currently that
returns NULL and this prevents the caller from properly handling the
sc_buffer_alloc() failure.  A verbs code link test after the call is
racy so the indication needs to come from the state check inside the allocation
routine to be valid.

Fix by encoding the ECOMM failure like SDMA.  IS_ERR_OR_NULL() tests
are added at all call sites.  For verbs send, this needs to treat any
error by returning a completion without any MMIO copy.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The call to sc_buffer_alloc currently returns NULL (no buffer) or
a buffer descriptor.

There is a third case when the port is down.  Currently that
returns NULL and this prevents the caller from properly handling the
sc_buffer_alloc() failure.  A verbs code link test after the call is
racy so the indication needs to come from the state check inside the allocation
routine to be valid.

Fix by encoding the ECOMM failure like SDMA.  IS_ERR_OR_NULL() tests
are added at all call sites.  For verbs send, this needs to treat any
error by returning a completion without any MMIO copy.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Handle wakeup of orphaned QPs for pio</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:33:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=099a884ba4c00145cef283d36e050726311c2e95'/>
<id>099a884ba4c00145cef283d36e050726311c2e95</id>
<content type='text'>
Once a send context is taken down due to a link failure, any QPs waiting
for pio credits will stay on the waitlist indefinitely.

Fix by waking up all QPs linked to the piowait list.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once a send context is taken down due to a link failure, any QPs waiting
for pio credits will stay on the waitlist indefinitely.

Fix by waking up all QPs linked to the piowait list.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Wakeup QPs orphaned on wait list after flush</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:32:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f972775b1cc0441ae22c9f8d06dd16b118463632'/>
<id>f972775b1cc0441ae22c9f8d06dd16b118463632</id>
<content type='text'>
Once an SDMA engine is taken down due to a link failure, any waiting QPs
that do not have outstanding descriptors in the ring will stay
on the dmawait list as long as the port is down.

Since there is no timer running, they will stay there for a long time.

The fix is to wake up all iowaits linked to dmawait. The send engine
will build and post packets that get flushed back.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once an SDMA engine is taken down due to a link failure, any waiting QPs
that do not have outstanding descriptors in the ring will stay
on the dmawait list as long as the port is down.

Since there is no timer running, they will stay there for a long time.

The fix is to wake up all iowaits linked to dmawait. The send engine
will build and post packets that get flushed back.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Use aborts to trigger RC throttling</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4bb02e9572af1383038d83ad196d7166c515f2ee'/>
<id>4bb02e9572af1383038d83ad196d7166c515f2ee</id>
<content type='text'>
SDMA and pio flushes will cause a lot of packets to be transmitted
after a link has gone down, using a lot of CPU to retransmit
packets.

Fix for RC QPs by recognizing the flush status and:
- Forcing a timer start
- Putting the QP into a "send one" mode

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SDMA and pio flushes will cause a lot of packets to be transmitted
after a link has gone down, using a lot of CPU to retransmit
packets.

Fix for RC QPs by recognizing the flush status and:
- Forcing a timer start
- Putting the QP into a "send one" mode

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Create inline to get extended headers</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:32:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9755f72496664eec70bc804104118b5797b6bf63'/>
<id>9755f72496664eec70bc804104118b5797b6bf63</id>
<content type='text'>
This paves the way for another patch that reacts to a
flush sdma completion for RC.

Fixes: 81cd3891f021 ("IB/hfi1: Add support for 16B Management Packets")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This paves the way for another patch that reacts to a
flush sdma completion for RC.

Fixes: 81cd3891f021 ("IB/hfi1: Add support for 16B Management Packets")
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Silence txreq allocation warnings</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:32:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3230f4a8d44e4a0bb7afea814b280b5129521f52'/>
<id>3230f4a8d44e4a0bb7afea814b280b5129521f52</id>
<content type='text'>
The following warning can happen when a memory shortage
occurs during txreq allocation:

[10220.939246] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
[10220.939246] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
[10220.939247]   cache: mnt_cache, object size: 384, buffer size: 384, default order: 2, min order: 0
[10220.939260] Workqueue: hfi0_0 _hfi1_do_send [hfi1]
[10220.939261]   node 0: slabs: 1026568, objs: 43115856, free: 0
[10220.939262] Call Trace:
[10220.939262]   node 1: slabs: 820872, objs: 34476624, free: 0
[10220.939263]  dump_stack+0x5a/0x73
[10220.939265]  warn_alloc+0x103/0x190
[10220.939267]  ? wake_all_kswapds+0x54/0x8b
[10220.939268]  __alloc_pages_slowpath+0x86c/0xa2e
[10220.939270]  ? __alloc_pages_nodemask+0x2fe/0x320
[10220.939271]  __alloc_pages_nodemask+0x2fe/0x320
[10220.939273]  new_slab+0x475/0x550
[10220.939275]  ___slab_alloc+0x36c/0x520
[10220.939287]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939299]  ? __get_txreq+0x54/0x160 [hfi1]
[10220.939310]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939312]  __slab_alloc+0x40/0x61
[10220.939323]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939325]  kmem_cache_alloc+0x181/0x1b0
[10220.939336]  hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939348]  ? hfi1_verbs_send_dma+0x386/0xa10 [hfi1]
[10220.939359]  ? find_prev_entry+0xb0/0xb0 [hfi1]
[10220.939371]  hfi1_do_send+0x1d9/0x3f0 [hfi1]
[10220.939372]  process_one_work+0x171/0x380
[10220.939374]  worker_thread+0x49/0x3f0
[10220.939375]  kthread+0xf8/0x130
[10220.939377]  ? max_active_store+0x80/0x80
[10220.939378]  ? kthread_bind+0x10/0x10
[10220.939379]  ret_from_fork+0x35/0x40
[10220.939381] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)

The shortage is handled properly so the message isn't needed. Silence by
adding the no warn option to the slab allocation.

Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The following warning can happen when a memory shortage
occurs during txreq allocation:

[10220.939246] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
[10220.939246] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
[10220.939247]   cache: mnt_cache, object size: 384, buffer size: 384, default order: 2, min order: 0
[10220.939260] Workqueue: hfi0_0 _hfi1_do_send [hfi1]
[10220.939261]   node 0: slabs: 1026568, objs: 43115856, free: 0
[10220.939262] Call Trace:
[10220.939262]   node 1: slabs: 820872, objs: 34476624, free: 0
[10220.939263]  dump_stack+0x5a/0x73
[10220.939265]  warn_alloc+0x103/0x190
[10220.939267]  ? wake_all_kswapds+0x54/0x8b
[10220.939268]  __alloc_pages_slowpath+0x86c/0xa2e
[10220.939270]  ? __alloc_pages_nodemask+0x2fe/0x320
[10220.939271]  __alloc_pages_nodemask+0x2fe/0x320
[10220.939273]  new_slab+0x475/0x550
[10220.939275]  ___slab_alloc+0x36c/0x520
[10220.939287]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939299]  ? __get_txreq+0x54/0x160 [hfi1]
[10220.939310]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939312]  __slab_alloc+0x40/0x61
[10220.939323]  ? hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939325]  kmem_cache_alloc+0x181/0x1b0
[10220.939336]  hfi1_make_rc_req+0x90/0x18b0 [hfi1]
[10220.939348]  ? hfi1_verbs_send_dma+0x386/0xa10 [hfi1]
[10220.939359]  ? find_prev_entry+0xb0/0xb0 [hfi1]
[10220.939371]  hfi1_do_send+0x1d9/0x3f0 [hfi1]
[10220.939372]  process_one_work+0x171/0x380
[10220.939374]  worker_thread+0x49/0x3f0
[10220.939375]  kthread+0xf8/0x130
[10220.939377]  ? max_active_store+0x80/0x80
[10220.939378]  ? kthread_bind+0x10/0x10
[10220.939379]  ret_from_fork+0x35/0x40
[10220.939381] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)

The shortage is handled properly so the message isn't needed. Silence by
adding the no warn option to the slab allocation.

Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Avoid hardlockup with flushlist_lock</title>
<updated>2019-06-18T01:15:40+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-14T16:32:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cf131a81967583ae737df6383a0893b9fee75b4e'/>
<id>cf131a81967583ae737df6383a0893b9fee75b4e</id>
<content type='text'>
Heavy contention of the sde flushlist_lock can cause hard lockups at
extreme scale when the flushing logic is under stress.

Mitigate by replacing the item-at-a-time copy to the local list with
an O(1) list_splice_init() and using the high priority work queue to
do the flushes.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Heavy contention of the sde flushlist_lock can cause hard lockups at
extreme scale when the flushing logic is under stress.

Mitigate by replacing the item-at-a-time copy to the local list with
an O(1) list_splice_init() and using the high priority work queue to
do the flushes.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Correct tid qp rcd to match verbs context</title>
<updated>2019-06-11T20:06:45+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-10T16:28:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cc78076af14e1478c1a8fb18997674b5f8cbe3c8'/>
<id>cc78076af14e1478c1a8fb18997674b5f8cbe3c8</id>
<content type='text'>
The qp priv rcd pointer doesn't match the context being used for verbs
causing issues when 9B and kdeth packets are processed by different
receive contexts and hence different CPUs.

When running on different CPUs the following panic can occur:

 WARNING: CPU: 3 PID: 2584 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
 list_del corruption. prev-&gt;next should be ffff9a7ac31f7a30, but was ffff9a7c3bc89230
 CPU: 3 PID: 2584 Comm: z_wr_iss Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.2.3.el7_lustre.x86_64 #1
 Call Trace:
  &lt;IRQ&gt;  [&lt;ffffffffb7b0d78e&gt;] dump_stack+0x19/0x1b
  [&lt;ffffffffb74916d8&gt;] __warn+0xd8/0x100
  [&lt;ffffffffb749175f&gt;] warn_slowpath_fmt+0x5f/0x80
  [&lt;ffffffffb7768671&gt;] __list_del_entry+0xa1/0xd0
  [&lt;ffffffffc0c7a945&gt;] process_rcv_qp_work+0xb5/0x160 [hfi1]
  [&lt;ffffffffc0c7bc2b&gt;] handle_receive_interrupt_nodma_rtail+0x20b/0x2b0 [hfi1]
  [&lt;ffffffffc0c70683&gt;] receive_context_interrupt+0x23/0x40 [hfi1]
  [&lt;ffffffffb7540a94&gt;] __handle_irq_event_percpu+0x44/0x1c0
  [&lt;ffffffffb7540c42&gt;] handle_irq_event_percpu+0x32/0x80
  [&lt;ffffffffb7540ccc&gt;] handle_irq_event+0x3c/0x60
  [&lt;ffffffffb7543a1f&gt;] handle_edge_irq+0x7f/0x150
  [&lt;ffffffffb742d504&gt;] handle_irq+0xe4/0x1a0
  [&lt;ffffffffb7b23f7d&gt;] do_IRQ+0x4d/0xf0
  [&lt;ffffffffb7b16362&gt;] common_interrupt+0x162/0x162
  &lt;EOI&gt;  [&lt;ffffffffb775a326&gt;] ? memcpy+0x6/0x110
  [&lt;ffffffffc109210d&gt;] ? abd_copy_from_buf_off_cb+0x1d/0x30 [zfs]
  [&lt;ffffffffc10920f0&gt;] ? abd_copy_to_buf_off_cb+0x30/0x30 [zfs]
  [&lt;ffffffffc1093257&gt;] abd_iterate_func+0x97/0x120 [zfs]
  [&lt;ffffffffc10934d9&gt;] abd_copy_from_buf_off+0x39/0x60 [zfs]
  [&lt;ffffffffc109b828&gt;] arc_write_ready+0x178/0x300 [zfs]
  [&lt;ffffffffb7b11032&gt;] ? mutex_lock+0x12/0x2f
  [&lt;ffffffffb7b11032&gt;] ? mutex_lock+0x12/0x2f
  [&lt;ffffffffc1164d05&gt;] zio_ready+0x65/0x3d0 [zfs]
  [&lt;ffffffffc04d725e&gt;] ? tsd_get_by_thread+0x2e/0x50 [spl]
  [&lt;ffffffffc04d1318&gt;] ? taskq_member+0x18/0x30 [spl]
  [&lt;ffffffffc115ef22&gt;] zio_execute+0xa2/0x100 [zfs]
  [&lt;ffffffffc04d1d2c&gt;] taskq_thread+0x2ac/0x4f0 [spl]
  [&lt;ffffffffb74cee80&gt;] ? wake_up_state+0x20/0x20
  [&lt;ffffffffc115ee80&gt;] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
  [&lt;ffffffffc04d1a80&gt;] ? taskq_thread_spawn+0x60/0x60 [spl]
  [&lt;ffffffffb74bae31&gt;] kthread+0xd1/0xe0
  [&lt;ffffffffb74bad60&gt;] ? insert_kthread_work+0x40/0x40
  [&lt;ffffffffb7b1f5f7&gt;] ret_from_fork_nospec_begin+0x21/0x21
  [&lt;ffffffffb74bad60&gt;] ? insert_kthread_work+0x40/0x40

Fix by reading the map entry in the same manner as the hardware so that
the kdeth and verbs contexts match.

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 5190f052a365 ("IB/hfi1: Allow the driver to initialize QP priv struct")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The qp priv rcd pointer doesn't match the context being used for verbs
causing issues when 9B and kdeth packets are processed by different
receive contexts and hence different CPUs.

When running on different CPUs the following panic can occur:

 WARNING: CPU: 3 PID: 2584 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
 list_del corruption. prev-&gt;next should be ffff9a7ac31f7a30, but was ffff9a7c3bc89230
 CPU: 3 PID: 2584 Comm: z_wr_iss Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.2.3.el7_lustre.x86_64 #1
 Call Trace:
  &lt;IRQ&gt;  [&lt;ffffffffb7b0d78e&gt;] dump_stack+0x19/0x1b
  [&lt;ffffffffb74916d8&gt;] __warn+0xd8/0x100
  [&lt;ffffffffb749175f&gt;] warn_slowpath_fmt+0x5f/0x80
  [&lt;ffffffffb7768671&gt;] __list_del_entry+0xa1/0xd0
  [&lt;ffffffffc0c7a945&gt;] process_rcv_qp_work+0xb5/0x160 [hfi1]
  [&lt;ffffffffc0c7bc2b&gt;] handle_receive_interrupt_nodma_rtail+0x20b/0x2b0 [hfi1]
  [&lt;ffffffffc0c70683&gt;] receive_context_interrupt+0x23/0x40 [hfi1]
  [&lt;ffffffffb7540a94&gt;] __handle_irq_event_percpu+0x44/0x1c0
  [&lt;ffffffffb7540c42&gt;] handle_irq_event_percpu+0x32/0x80
  [&lt;ffffffffb7540ccc&gt;] handle_irq_event+0x3c/0x60
  [&lt;ffffffffb7543a1f&gt;] handle_edge_irq+0x7f/0x150
  [&lt;ffffffffb742d504&gt;] handle_irq+0xe4/0x1a0
  [&lt;ffffffffb7b23f7d&gt;] do_IRQ+0x4d/0xf0
  [&lt;ffffffffb7b16362&gt;] common_interrupt+0x162/0x162
  &lt;EOI&gt;  [&lt;ffffffffb775a326&gt;] ? memcpy+0x6/0x110
  [&lt;ffffffffc109210d&gt;] ? abd_copy_from_buf_off_cb+0x1d/0x30 [zfs]
  [&lt;ffffffffc10920f0&gt;] ? abd_copy_to_buf_off_cb+0x30/0x30 [zfs]
  [&lt;ffffffffc1093257&gt;] abd_iterate_func+0x97/0x120 [zfs]
  [&lt;ffffffffc10934d9&gt;] abd_copy_from_buf_off+0x39/0x60 [zfs]
  [&lt;ffffffffc109b828&gt;] arc_write_ready+0x178/0x300 [zfs]
  [&lt;ffffffffb7b11032&gt;] ? mutex_lock+0x12/0x2f
  [&lt;ffffffffb7b11032&gt;] ? mutex_lock+0x12/0x2f
  [&lt;ffffffffc1164d05&gt;] zio_ready+0x65/0x3d0 [zfs]
  [&lt;ffffffffc04d725e&gt;] ? tsd_get_by_thread+0x2e/0x50 [spl]
  [&lt;ffffffffc04d1318&gt;] ? taskq_member+0x18/0x30 [spl]
  [&lt;ffffffffc115ef22&gt;] zio_execute+0xa2/0x100 [zfs]
  [&lt;ffffffffc04d1d2c&gt;] taskq_thread+0x2ac/0x4f0 [spl]
  [&lt;ffffffffb74cee80&gt;] ? wake_up_state+0x20/0x20
  [&lt;ffffffffc115ee80&gt;] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
  [&lt;ffffffffc04d1a80&gt;] ? taskq_thread_spawn+0x60/0x60 [spl]
  [&lt;ffffffffb74bae31&gt;] kthread+0xd1/0xe0
  [&lt;ffffffffb74bad60&gt;] ? insert_kthread_work+0x40/0x40
  [&lt;ffffffffb7b1f5f7&gt;] ret_from_fork_nospec_begin+0x21/0x21
  [&lt;ffffffffb74bad60&gt;] ? insert_kthread_work+0x40/0x40

Fix by reading the map entry in the same manner as the hardware so that
the kdeth and verbs contexts match.

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 5190f052a365 ("IB/hfi1: Allow the driver to initialize QP priv struct")
Reviewed-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
