<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/infiniband, branch v6.2-rc7</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>IB/hfi1: Remove user expected buffer invalidate race</title>
<updated>2023-01-10T10:21:50+00:00</updated>
<author>
<name>Dean Luick</name>
<email>dean.luick@cornelisnetworks.com</email>
</author>
<published>2023-01-09T17:31:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b3deec25847bda34e34d5d7be02f633caf000bd8'/>
<id>b3deec25847bda34e34d5d7be02f633caf000bd8</id>
<content type='text'>
During setup, there is a possible race between a page invalidate
and hardware programming.  Add a covering invalidate over the user
target range during setup.  If anything within that range is
invalidated during setup, fail the setup.  Once set up, each
TID will have its own invalidate callback and invalidate.

Fixes: 3889551db212 ("RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328549178.1472310.9867497376936699488.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During setup, there is a possible race between a page invalidate
and hardware programming.  Add a covering invalidate over the user
target range during setup.  If anything within that range is
invalidated during setup, fail the setup.  Once set up, each
TID will have its own invalidate callback and invalidate.

Fixes: 3889551db212 ("RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328549178.1472310.9867497376936699488.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Immediately remove invalid memory from hardware</title>
<updated>2023-01-10T10:21:50+00:00</updated>
<author>
<name>Dean Luick</name>
<email>dean.luick@cornelisnetworks.com</email>
</author>
<published>2023-01-09T17:31:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1c7edde1b5720ddb0aff5ca8c7f605a0f92526eb'/>
<id>1c7edde1b5720ddb0aff5ca8c7f605a0f92526eb</id>
<content type='text'>
When a user expected receive page is unmapped, it should be
immediately removed from hardware rather than depend on a
reaction from user space.

Fixes: 2677a7680e77 ("IB/hfi1: Fix memory leak during unexpected shutdown")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328548663.1472310.7871808081861622659.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a user expected receive page is unmapped, it should be
immediately removed from hardware rather than depend on a
reaction from user space.

Fixes: 2677a7680e77 ("IB/hfi1: Fix memory leak during unexpected shutdown")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328548663.1472310.7871808081861622659.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Fix expected receive setup error exit issues</title>
<updated>2023-01-10T10:21:50+00:00</updated>
<author>
<name>Dean Luick</name>
<email>dean.luick@cornelisnetworks.com</email>
</author>
<published>2023-01-09T17:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e0c4a422f5246abefbf7c178ef99a1f2dc3c5f62'/>
<id>e0c4a422f5246abefbf7c178ef99a1f2dc3c5f62</id>
<content type='text'>
Fix three error exit issues in expected receive setup.
Re-arrange error exits to increase readability.

Issues and fixes:
1. Possible missed page unpin if tidlist copyout fails and
   not all pinned pages where made part of a TID.
   Fix: Unpin the unused pages.

2. Return success with unset return values tidcnt and length
   when no pages were pinned.
   Fix: Return -ENOSPC if no pages were pinned.

3. Return success with unset return values tidcnt and length when
   no rcvarray entries available.
   Fix: Return -ENOSPC if no rcvarray entries are available.

Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body")
Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres")
Fixes: f404ca4c7ea8 ("IB/hfi1: Refactor hfi_user_exp_rcv_setup() IOCTL")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328548150.1472310.1492305874804187634.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix three error exit issues in expected receive setup.
Re-arrange error exits to increase readability.

Issues and fixes:
1. Possible missed page unpin if tidlist copyout fails and
   not all pinned pages where made part of a TID.
   Fix: Unpin the unused pages.

2. Return success with unset return values tidcnt and length
   when no pages were pinned.
   Fix: Return -ENOSPC if no pages were pinned.

3. Return success with unset return values tidcnt and length when
   no rcvarray entries available.
   Fix: Return -ENOSPC if no rcvarray entries are available.

Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body")
Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres")
Fixes: f404ca4c7ea8 ("IB/hfi1: Refactor hfi_user_exp_rcv_setup() IOCTL")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328548150.1472310.1492305874804187634.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Reserve user expected TIDs</title>
<updated>2023-01-10T10:21:49+00:00</updated>
<author>
<name>Dean Luick</name>
<email>dean.luick@cornelisnetworks.com</email>
</author>
<published>2023-01-09T17:31:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ecf91551cdd2925ed6d9a9d99074fa5f67b90596'/>
<id>ecf91551cdd2925ed6d9a9d99074fa5f67b90596</id>
<content type='text'>
To avoid a race, reserve the number of user expected
TIDs before setup.

Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328547636.1472310.7419712824785353905.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To avoid a race, reserve the number of user expected
TIDs before setup.

Fixes: 7e7a436ecb6e ("staging/hfi1: Add TID entry program function body")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328547636.1472310.7419712824785353905.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Reject a zero-length user expected buffer</title>
<updated>2023-01-10T10:21:49+00:00</updated>
<author>
<name>Dean Luick</name>
<email>dean.luick@cornelisnetworks.com</email>
</author>
<published>2023-01-09T17:31:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0a0a6e80472c98947d73c3d13bcd7d101895f55d'/>
<id>0a0a6e80472c98947d73c3d13bcd7d101895f55d</id>
<content type='text'>
A zero length user buffer makes no sense and the code
does not handle it correctly.  Instead, reject a
zero length as invalid.

Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328547120.1472310.6362802432127399257.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A zero length user buffer makes no sense and the code
does not handle it correctly.  Instead, reject a
zero length as invalid.

Fixes: 97736f36dbeb ("IB/hfi1: Validate page aligned for a given virtual addres")
Signed-off-by: Dean Luick &lt;dean.luick@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Link: https://lore.kernel.org/r/167328547120.1472310.6362802432127399257.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/core: Fix ib block iterator counter overflow</title>
<updated>2023-01-10T10:21:39+00:00</updated>
<author>
<name>Yonatan Nachum</name>
<email>ynachum@amazon.com</email>
</author>
<published>2023-01-09T13:37:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0afec5e9cea732cb47014655685a2a47fb180c31'/>
<id>0afec5e9cea732cb47014655685a2a47fb180c31</id>
<content type='text'>
When registering a new DMA MR after selecting the best aligned page size
for it, we iterate over the given sglist to split each entry to smaller,
aligned to the selected page size, DMA blocks.

In given circumstances where the sg entry and page size fit certain
sizes and the sg entry is not aligned to the selected page size, the
total size of the aligned pages we need to cover the sg entry is &gt;= 4GB.
Under this circumstances, while iterating page aligned blocks, the
counter responsible for counting how much we advanced from the start of
the sg entry is overflowed because its type is u32 and we pass 4GB in
size. This can lead to an infinite loop inside the iterator function
because the overflow prevents the counter to be larger
than the size of the sg entry.

Fix the presented problem by changing the advancement condition to
eliminate overflow.

Backtrace:
[  192.374329] efa_reg_user_mr_dmabuf
[  192.376783] efa_register_mr
[  192.382579] pgsz_bitmap 0xfffff000 rounddown 0x80000000
[  192.386423] pg_sz [0x80000000] umem_length[0xc0000000]
[  192.392657] start 0x0 length 0xc0000000 params.page_shift 31 params.page_num 3
[  192.399559] hp_cnt[3], pages_in_hp[524288]
[  192.403690] umem-&gt;sgt_append.sgt.nents[1]
[  192.407905] number entries: [1], pg_bit: [31]
[  192.411397] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.415601] biter-&gt;__sg_advance [665837568] sg_dma_len[3221225472]
[  192.419823] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.423976] biter-&gt;__sg_advance [2813321216] sg_dma_len[3221225472]
[  192.428243] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.432397] biter-&gt;__sg_advance [665837568] sg_dma_len[3221225472]

Fixes: a808273a495c ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks")
Signed-off-by: Yonatan Nachum &lt;ynachum@amazon.com&gt;
Link: https://lore.kernel.org/r/20230109133711.13678-1-ynachum@amazon.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When registering a new DMA MR after selecting the best aligned page size
for it, we iterate over the given sglist to split each entry to smaller,
aligned to the selected page size, DMA blocks.

In given circumstances where the sg entry and page size fit certain
sizes and the sg entry is not aligned to the selected page size, the
total size of the aligned pages we need to cover the sg entry is &gt;= 4GB.
Under this circumstances, while iterating page aligned blocks, the
counter responsible for counting how much we advanced from the start of
the sg entry is overflowed because its type is u32 and we pass 4GB in
size. This can lead to an infinite loop inside the iterator function
because the overflow prevents the counter to be larger
than the size of the sg entry.

Fix the presented problem by changing the advancement condition to
eliminate overflow.

Backtrace:
[  192.374329] efa_reg_user_mr_dmabuf
[  192.376783] efa_register_mr
[  192.382579] pgsz_bitmap 0xfffff000 rounddown 0x80000000
[  192.386423] pg_sz [0x80000000] umem_length[0xc0000000]
[  192.392657] start 0x0 length 0xc0000000 params.page_shift 31 params.page_num 3
[  192.399559] hp_cnt[3], pages_in_hp[524288]
[  192.403690] umem-&gt;sgt_append.sgt.nents[1]
[  192.407905] number entries: [1], pg_bit: [31]
[  192.411397] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.415601] biter-&gt;__sg_advance [665837568] sg_dma_len[3221225472]
[  192.419823] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.423976] biter-&gt;__sg_advance [2813321216] sg_dma_len[3221225472]
[  192.428243] biter-&gt;__sg_nents [1] biter-&gt;__sg [0000000008b0c5d8]
[  192.432397] biter-&gt;__sg_advance [665837568] sg_dma_len[3221225472]

Fixes: a808273a495c ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks")
Signed-off-by: Yonatan Nachum &lt;ynachum@amazon.com&gt;
Link: https://lore.kernel.org/r/20230109133711.13678-1-ynachum@amazon.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Prevent faulty rkey generation</title>
<updated>2023-01-09T14:48:16+00:00</updated>
<author>
<name>Daisuke Matsuda</name>
<email>matsuda-daisuke@fujitsu.com</email>
</author>
<published>2022-12-20T08:08:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1aefe5c177c1922119afb4ee443ddd6ac3140b37'/>
<id>1aefe5c177c1922119afb4ee443ddd6ac3140b37</id>
<content type='text'>
If you create MRs more than 0x10000 times after loading the module,
responder starts to reply NAKs for RDMA/Atomic operations because of rkey
violation detected in check_rkey(). The root cause is that rkeys are
incremented each time a new MR is created and the value overflows into the
range reserved for MWs.

This commit also increases the value of RXE_MAX_MW that has been limited
unlike other parameters.

Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda &lt;matsuda-daisuke@fujitsu.com&gt;
Tested-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Reviewed-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If you create MRs more than 0x10000 times after loading the module,
responder starts to reply NAKs for RDMA/Atomic operations because of rkey
violation detected in check_rkey(). The root cause is that rkeys are
incremented each time a new MR is created and the value overflows into the
range reserved for MWs.

This commit also increases the value of RXE_MAX_MW that has been limited
unlike other parameters.

Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda &lt;matsuda-daisuke@fujitsu.com&gt;
Tested-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Reviewed-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix inaccurate constants in rxe_type_info</title>
<updated>2023-01-09T14:48:16+00:00</updated>
<author>
<name>Daisuke Matsuda</name>
<email>matsuda-daisuke@fujitsu.com</email>
</author>
<published>2022-12-20T08:08:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3a73746b267e5c6a87c9ad26f8c6a48e44da609c'/>
<id>3a73746b267e5c6a87c9ad26f8c6a48e44da609c</id>
<content type='text'>
ibv_query_device() has reported incorrect device attributes, which are
actually not used by the device. Make the constants correspond with the
attributes shown to users.

Fixes: 3ccffe8abf2f ("RDMA/rxe: Move max_elem into rxe_type_info")
Fixes: 3225717f6dfa ("RDMA/rxe: Replace red-black trees by xarrays")
Link: https://lore.kernel.org/r/20221220080848.253785-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda &lt;matsuda-daisuke@fujitsu.com&gt;
Reviewed-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ibv_query_device() has reported incorrect device attributes, which are
actually not used by the device. Make the constants correspond with the
attributes shown to users.

Fixes: 3ccffe8abf2f ("RDMA/rxe: Move max_elem into rxe_type_info")
Fixes: 3225717f6dfa ("RDMA/rxe: Replace red-black trees by xarrays")
Link: https://lore.kernel.org/r/20221220080848.253785-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda &lt;matsuda-daisuke@fujitsu.com&gt;
Reviewed-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mlx5: Fix validation of max_rd_atomic caps for DC</title>
<updated>2023-01-01T07:40:35+00:00</updated>
<author>
<name>Maor Gottlieb</name>
<email>maorg@nvidia.com</email>
</author>
<published>2022-12-28T12:56:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8de8482fe5732fbef4f5af82bc0c0362c804cd1f'/>
<id>8de8482fe5732fbef4f5af82bc0c0362c804cd1f</id>
<content type='text'>
Currently, when modifying DC, we validate max_rd_atomic user attribute
against the RC cap, validate against DC. RC and DC QP types have different
device limitations.

This can cause userspace created DC QPs to malfunction.

Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP")
Link: https://lore.kernel.org/r/0c5aee72cea188c3bb770f4207cce7abc9b6fc74.1672231736.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, when modifying DC, we validate max_rd_atomic user attribute
against the RC cap, validate against DC. RC and DC QP types have different
device limitations.

This can cause userspace created DC QPs to malfunction.

Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP")
Link: https://lore.kernel.org/r/0c5aee72cea188c3bb770f4207cce7abc9b6fc74.1672231736.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device</title>
<updated>2023-01-01T07:40:01+00:00</updated>
<author>
<name>Shay Drory</name>
<email>shayd@nvidia.com</email>
</author>
<published>2022-12-28T12:56:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=38b50aa44495d5eb4218f0b82fc2da76505cec53'/>
<id>38b50aa44495d5eb4218f0b82fc2da76505cec53</id>
<content type='text'>
Currently, when mlx5_ib_get_hw_stats() is used for device (port_num = 0),
there is a special handling in order to use the correct counters, but,
port_num is being passed down the stack without any change.  Also, some
functions assume that port_num &gt;=1. As a result, the following oops can
occur.

 BUG: unable to handle page fault for address: ffff89510294f1a8
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP
 CPU: 8 PID: 1382 Comm: devlink Tainted: G W          6.1.0-rc4_for_upstream_base_2022_11_10_16_12 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:_raw_spin_lock+0xc/0x20
 Call Trace:
  &lt;TASK&gt;
  mlx5_ib_get_native_port_mdev+0x73/0xe0 [mlx5_ib]
  do_get_hw_stats.constprop.0+0x109/0x160 [mlx5_ib]
  mlx5_ib_get_hw_stats+0xad/0x180 [mlx5_ib]
  ib_setup_device_attrs+0xf0/0x290 [ib_core]
  ib_register_device+0x3bb/0x510 [ib_core]
  ? atomic_notifier_chain_register+0x67/0x80
  __mlx5_ib_add+0x2b/0x80 [mlx5_ib]
  mlx5r_probe+0xb8/0x150 [mlx5_ib]
  ? auxiliary_match_id+0x6a/0x90
  auxiliary_bus_probe+0x3c/0x70
  ? driver_sysfs_add+0x6b/0x90
  really_probe+0xcd/0x380
  __driver_probe_device+0x80/0x170
  driver_probe_device+0x1e/0x90
  __device_attach_driver+0x7d/0x100
  ? driver_allows_async_probing+0x60/0x60
  ? driver_allows_async_probing+0x60/0x60
  bus_for_each_drv+0x7b/0xc0
  __device_attach+0xbc/0x200
  bus_probe_device+0x87/0xa0
  device_add+0x404/0x940
  ? dev_set_name+0x53/0x70
  __auxiliary_device_add+0x43/0x60
  add_adev+0x99/0xe0 [mlx5_core]
  mlx5_attach_device+0xc8/0x120 [mlx5_core]
  mlx5_load_one_devl_locked+0xb2/0xe0 [mlx5_core]
  devlink_reload+0x133/0x250
  devlink_nl_cmd_reload+0x480/0x570
  ? devlink_nl_pre_doit+0x44/0x2b0
  genl_family_rcv_msg_doit.isra.0+0xc2/0x110
  genl_rcv_msg+0x180/0x2b0
  ? devlink_nl_cmd_region_read_dumpit+0x540/0x540
  ? devlink_reload+0x250/0x250
  ? devlink_put+0x50/0x50
  ? genl_family_rcv_msg_doit.isra.0+0x110/0x110
  netlink_rcv_skb+0x54/0x100
  genl_rcv+0x24/0x40
  netlink_unicast+0x1f6/0x2c0
  netlink_sendmsg+0x237/0x490
  sock_sendmsg+0x33/0x40
  __sys_sendto+0x103/0x160
  ? handle_mm_fault+0x10e/0x290
  ? do_user_addr_fault+0x1c0/0x5f0
  __x64_sys_sendto+0x25/0x30
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fix it by setting port_num to 1 in order to get device status and remove
unused variable.

Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE")
Link: https://lore.kernel.org/r/98b82994c3cd3fa593b8a75ed3f3901e208beb0f.1672231736.git.leonro@nvidia.com
Signed-off-by: Shay Drory &lt;shayd@nvidia.com&gt;
Reviewed-by: Patrisious Haddad &lt;phaddad@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, when mlx5_ib_get_hw_stats() is used for device (port_num = 0),
there is a special handling in order to use the correct counters, but,
port_num is being passed down the stack without any change.  Also, some
functions assume that port_num &gt;=1. As a result, the following oops can
occur.

 BUG: unable to handle page fault for address: ffff89510294f1a8
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP
 CPU: 8 PID: 1382 Comm: devlink Tainted: G W          6.1.0-rc4_for_upstream_base_2022_11_10_16_12 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:_raw_spin_lock+0xc/0x20
 Call Trace:
  &lt;TASK&gt;
  mlx5_ib_get_native_port_mdev+0x73/0xe0 [mlx5_ib]
  do_get_hw_stats.constprop.0+0x109/0x160 [mlx5_ib]
  mlx5_ib_get_hw_stats+0xad/0x180 [mlx5_ib]
  ib_setup_device_attrs+0xf0/0x290 [ib_core]
  ib_register_device+0x3bb/0x510 [ib_core]
  ? atomic_notifier_chain_register+0x67/0x80
  __mlx5_ib_add+0x2b/0x80 [mlx5_ib]
  mlx5r_probe+0xb8/0x150 [mlx5_ib]
  ? auxiliary_match_id+0x6a/0x90
  auxiliary_bus_probe+0x3c/0x70
  ? driver_sysfs_add+0x6b/0x90
  really_probe+0xcd/0x380
  __driver_probe_device+0x80/0x170
  driver_probe_device+0x1e/0x90
  __device_attach_driver+0x7d/0x100
  ? driver_allows_async_probing+0x60/0x60
  ? driver_allows_async_probing+0x60/0x60
  bus_for_each_drv+0x7b/0xc0
  __device_attach+0xbc/0x200
  bus_probe_device+0x87/0xa0
  device_add+0x404/0x940
  ? dev_set_name+0x53/0x70
  __auxiliary_device_add+0x43/0x60
  add_adev+0x99/0xe0 [mlx5_core]
  mlx5_attach_device+0xc8/0x120 [mlx5_core]
  mlx5_load_one_devl_locked+0xb2/0xe0 [mlx5_core]
  devlink_reload+0x133/0x250
  devlink_nl_cmd_reload+0x480/0x570
  ? devlink_nl_pre_doit+0x44/0x2b0
  genl_family_rcv_msg_doit.isra.0+0xc2/0x110
  genl_rcv_msg+0x180/0x2b0
  ? devlink_nl_cmd_region_read_dumpit+0x540/0x540
  ? devlink_reload+0x250/0x250
  ? devlink_put+0x50/0x50
  ? genl_family_rcv_msg_doit.isra.0+0x110/0x110
  netlink_rcv_skb+0x54/0x100
  genl_rcv+0x24/0x40
  netlink_unicast+0x1f6/0x2c0
  netlink_sendmsg+0x237/0x490
  sock_sendmsg+0x33/0x40
  __sys_sendto+0x103/0x160
  ? handle_mm_fault+0x10e/0x290
  ? do_user_addr_fault+0x1c0/0x5f0
  __x64_sys_sendto+0x25/0x30
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fix it by setting port_num to 1 in order to get device status and remove
unused variable.

Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE")
Link: https://lore.kernel.org/r/98b82994c3cd3fa593b8a75ed3f3901e208beb0f.1672231736.git.leonro@nvidia.com
Signed-off-by: Shay Drory &lt;shayd@nvidia.com&gt;
Reviewed-by: Patrisious Haddad &lt;phaddad@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
