<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband, branch linux-6.8.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw</title>
<updated>2024-05-30T07:49:49+00:00</updated>
<author>
<name>Zhu Yanjun</name>
<email>yanjun.zhu@linux.dev</email>
</author>
<published>2024-05-10T21:12:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6564fc1818404254d1c9f7d75b403b4941516d26'/>
<id>6564fc1818404254d1c9f7d75b403b4941516d26</id>
<content type='text'>
[ Upstream commit 9c0731832d3b7420cbadba6a7f334363bc8dfb15 ]

When running blktests nvme/rdma, the following kmemleak issue will appear.

kmemleak: Kernel memory leak detector initialized (mempool available:36041)
kmemleak: Automatic memory scanning thread started
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

unreferenced object 0xffff88855da53400 (size 192):
  comm "rdma", pid 10630, jiffies 4296575922
  hex dump (first 32 bytes):
    37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00  7...............
    10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff  .4.].....4.]....
  backtrace (crc 47f66721):
    [&lt;ffffffff911251bd&gt;] kmalloc_trace+0x30d/0x3b0
    [&lt;ffffffffc2640ff7&gt;] alloc_gid_entry+0x47/0x380 [ib_core]
    [&lt;ffffffffc2642206&gt;] add_modify_gid+0x166/0x930 [ib_core]
    [&lt;ffffffffc2643468&gt;] ib_cache_update.part.0+0x6d8/0x910 [ib_core]
    [&lt;ffffffffc2644e1a&gt;] ib_cache_setup_one+0x24a/0x350 [ib_core]
    [&lt;ffffffffc263949e&gt;] ib_register_device+0x9e/0x3a0 [ib_core]
    [&lt;ffffffffc2a3d389&gt;] 0xffffffffc2a3d389
    [&lt;ffffffffc2688cd8&gt;] nldev_newlink+0x2b8/0x520 [ib_core]
    [&lt;ffffffffc2645fe3&gt;] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core]
    [&lt;ffffffffc264648c&gt;]
rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core]
    [&lt;ffffffff9270e7b5&gt;] netlink_unicast+0x445/0x710
    [&lt;ffffffff9270f1f1&gt;] netlink_sendmsg+0x761/0xc40
    [&lt;ffffffff9249db29&gt;] __sys_sendto+0x3a9/0x420
    [&lt;ffffffff9249dc8c&gt;] __x64_sys_sendto+0xdc/0x1b0
    [&lt;ffffffff92db0ad3&gt;] do_syscall_64+0x93/0x180
    [&lt;ffffffff92e00126&gt;] entry_SYSCALL_64_after_hwframe+0x71/0x79

The root cause: rdma_put_gid_attr is not called when sgid_attr is set
to ERR_PTR(-ENODEV).

Reported-and-tested-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Closes: https://lore.kernel.org/all/19bf5745-1b3b-4b8a-81c2-20d945943aaf@linux.dev/T/
Fixes: f8ef1be816bf ("RDMA/cma: Avoid GID lookups on iWARP devices")
Reviewed-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Zhu Yanjun &lt;yanjun.zhu@linux.dev&gt;
Link: https://lore.kernel.org/r/20240510211247.31345-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9c0731832d3b7420cbadba6a7f334363bc8dfb15 ]

When running blktests nvme/rdma, the following kmemleak issue will appear.

kmemleak: Kernel memory leak detector initialized (mempool available:36041)
kmemleak: Automatic memory scanning thread started
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

unreferenced object 0xffff88855da53400 (size 192):
  comm "rdma", pid 10630, jiffies 4296575922
  hex dump (first 32 bytes):
    37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00  7...............
    10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff  .4.].....4.]....
  backtrace (crc 47f66721):
    [&lt;ffffffff911251bd&gt;] kmalloc_trace+0x30d/0x3b0
    [&lt;ffffffffc2640ff7&gt;] alloc_gid_entry+0x47/0x380 [ib_core]
    [&lt;ffffffffc2642206&gt;] add_modify_gid+0x166/0x930 [ib_core]
    [&lt;ffffffffc2643468&gt;] ib_cache_update.part.0+0x6d8/0x910 [ib_core]
    [&lt;ffffffffc2644e1a&gt;] ib_cache_setup_one+0x24a/0x350 [ib_core]
    [&lt;ffffffffc263949e&gt;] ib_register_device+0x9e/0x3a0 [ib_core]
    [&lt;ffffffffc2a3d389&gt;] 0xffffffffc2a3d389
    [&lt;ffffffffc2688cd8&gt;] nldev_newlink+0x2b8/0x520 [ib_core]
    [&lt;ffffffffc2645fe3&gt;] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core]
    [&lt;ffffffffc264648c&gt;]
rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core]
    [&lt;ffffffff9270e7b5&gt;] netlink_unicast+0x445/0x710
    [&lt;ffffffff9270f1f1&gt;] netlink_sendmsg+0x761/0xc40
    [&lt;ffffffff9249db29&gt;] __sys_sendto+0x3a9/0x420
    [&lt;ffffffff9249dc8c&gt;] __x64_sys_sendto+0xdc/0x1b0
    [&lt;ffffffff92db0ad3&gt;] do_syscall_64+0x93/0x180
    [&lt;ffffffff92e00126&gt;] entry_SYSCALL_64_after_hwframe+0x71/0x79

The root cause: rdma_put_gid_attr is not called when sgid_attr is set
to ERR_PTR(-ENODEV).

Reported-and-tested-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Closes: https://lore.kernel.org/all/19bf5745-1b3b-4b8a-81c2-20d945943aaf@linux.dev/T/
Fixes: f8ef1be816bf ("RDMA/cma: Avoid GID lookups on iWARP devices")
Reviewed-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Zhu Yanjun &lt;yanjun.zhu@linux.dev&gt;
Link: https://lore.kernel.org/r/20240510211247.31345-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/IPoIB: Fix format truncation compilation errors</title>
<updated>2024-05-30T07:49:49+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2024-05-09T07:39:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6c4171a1dc444902b660fd9c7fe1a1d1c6087cf3'/>
<id>6c4171a1dc444902b660fd9c7fe1a1d1c6087cf3</id>
<content type='text'>
[ Upstream commit 49ca2b2ef3d003402584c68ae7b3055ba72e750a ]

Truncate the device name to store IPoIB VLAN name.

[leonro@5b4e8fba4ddd kernel]$ make -s -j 20 allmodconfig
[leonro@5b4e8fba4ddd kernel]$ make -s -j 20 W=1 drivers/infiniband/ulp/ipoib/
drivers/infiniband/ulp/ipoib/ipoib_vlan.c: In function ‘ipoib_vlan_add’:
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:52: error: ‘%04x’
directive output may be truncated writing 4 bytes into a region of size
between 0 and 15 [-Werror=format-truncation=]
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |                                                    ^~~~
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:48: note: directive
argument in the range [0, 65535]
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |                                                ^~~~~~~~~
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:9: note: ‘snprintf’ output
between 6 and 21 bytes into a destination of size 16
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  188 |                  ppriv-&gt;dev-&gt;name, pkey);
      |                  ~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:244: drivers/infiniband/ulp/ipoib/ipoib_vlan.o] Error 1
make[6]: *** Waiting for unfinished jobs....

Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Link: https://lore.kernel.org/r/e9d3e1fef69df4c9beaf402cc3ac342bad680791.1715240029.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 49ca2b2ef3d003402584c68ae7b3055ba72e750a ]

Truncate the device name to store IPoIB VLAN name.

[leonro@5b4e8fba4ddd kernel]$ make -s -j 20 allmodconfig
[leonro@5b4e8fba4ddd kernel]$ make -s -j 20 W=1 drivers/infiniband/ulp/ipoib/
drivers/infiniband/ulp/ipoib/ipoib_vlan.c: In function ‘ipoib_vlan_add’:
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:52: error: ‘%04x’
directive output may be truncated writing 4 bytes into a region of size
between 0 and 15 [-Werror=format-truncation=]
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |                                                    ^~~~
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:48: note: directive
argument in the range [0, 65535]
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |                                                ^~~~~~~~~
drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:9: note: ‘snprintf’ output
between 6 and 21 bytes into a destination of size 16
  187 |         snprintf(intf_name, sizeof(intf_name), "%s.%04x",
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  188 |                  ppriv-&gt;dev-&gt;name, pkey);
      |                  ~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:244: drivers/infiniband/ulp/ipoib/ipoib_vlan.o] Error 1
make[6]: *** Waiting for unfinished jobs....

Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Link: https://lore.kernel.org/r/e9d3e1fef69df4c9beaf402cc3ac342bad680791.1715240029.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwq</title>
<updated>2024-05-30T07:49:48+00:00</updated>
<author>
<name>Michal Schmidt</name>
<email>mschmidt@redhat.com</email>
</author>
<published>2024-05-07T10:39:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=627493443f3a8458cb55cdae1da254a7001123bc'/>
<id>627493443f3a8458cb55cdae1da254a7001123bc</id>
<content type='text'>
[ Upstream commit 78cfd17142ef70599d6409cbd709d94b3da58659 ]

Undefined behavior is triggered when bnxt_qplib_alloc_init_hwq is called
with hwq_attr-&gt;aux_depth != 0 and hwq_attr-&gt;aux_stride == 0.
In that case, "roundup_pow_of_two(hwq_attr-&gt;aux_stride)" gets called.
roundup_pow_of_two is documented as undefined for 0.

Fix it in the one caller that had this combination.

The undefined behavior was detected by UBSAN:
  UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
  shift exponent 64 is too large for 64-bit type 'long unsigned int'
  CPU: 24 PID: 1075 Comm: (udev-worker) Not tainted 6.9.0-rc6+ #4
  Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.7 10/25/2023
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x5d/0x80
   ubsan_epilogue+0x5/0x30
   __ubsan_handle_shift_out_of_bounds.cold+0x61/0xec
   __roundup_pow_of_two+0x25/0x35 [bnxt_re]
   bnxt_qplib_alloc_init_hwq+0xa1/0x470 [bnxt_re]
   bnxt_qplib_create_qp+0x19e/0x840 [bnxt_re]
   bnxt_re_create_qp+0x9b1/0xcd0 [bnxt_re]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kmalloc+0x1b6/0x4f0
   ? create_qp.part.0+0x128/0x1c0 [ib_core]
   ? __pfx_bnxt_re_create_qp+0x10/0x10 [bnxt_re]
   create_qp.part.0+0x128/0x1c0 [ib_core]
   ib_create_qp_kernel+0x50/0xd0 [ib_core]
   create_mad_qp+0x8e/0xe0 [ib_core]
   ? __pfx_qp_event_handler+0x10/0x10 [ib_core]
   ib_mad_init_device+0x2be/0x680 [ib_core]
   add_client_context+0x10d/0x1a0 [ib_core]
   enable_device_and_get+0xe0/0x1d0 [ib_core]
   ib_register_device+0x53c/0x630 [ib_core]
   ? srso_alias_return_thunk+0x5/0xfbef5
   bnxt_re_probe+0xbd8/0xe50 [bnxt_re]
   ? __pfx_bnxt_re_probe+0x10/0x10 [bnxt_re]
   auxiliary_bus_probe+0x49/0x80
   ? driver_sysfs_add+0x57/0xc0
   really_probe+0xde/0x340
   ? pm_runtime_barrier+0x54/0x90
   ? __pfx___driver_attach+0x10/0x10
   __driver_probe_device+0x78/0x110
   driver_probe_device+0x1f/0xa0
   __driver_attach+0xba/0x1c0
   bus_for_each_dev+0x8f/0xe0
   bus_add_driver+0x146/0x220
   driver_register+0x72/0xd0
   __auxiliary_driver_register+0x6e/0xd0
   ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re]
   bnxt_re_mod_init+0x3e/0xff0 [bnxt_re]
   ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re]
   do_one_initcall+0x5b/0x310
   do_init_module+0x90/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x121/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x82/0x160
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? syscall_exit_to_user_mode_prepare+0x149/0x170
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? syscall_exit_to_user_mode+0x75/0x230
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? do_syscall_64+0x8e/0x160
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __count_memcg_events+0x69/0x100
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? count_memcg_events.constprop.0+0x1a/0x30
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? handle_mm_fault+0x1f0/0x300
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? do_user_addr_fault+0x34e/0x640
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f4e5132821d
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 db 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffca9c906a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 0000563ec8a8f130 RCX: 00007f4e5132821d
  RDX: 0000000000000000 RSI: 00007f4e518fa07d RDI: 000000000000003b
  RBP: 00007ffca9c90760 R08: 00007f4e513f6b20 R09: 00007ffca9c906f0
  R10: 0000563ec8a8faa0 R11: 0000000000000246 R12: 00007f4e518fa07d
  R13: 0000000000020000 R14: 0000563ec8409e90 R15: 0000563ec8a8fa60
   &lt;/TASK&gt;
  ---[ end trace ]---

Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Link: https://lore.kernel.org/r/20240507103929.30003-1-mschmidt@redhat.com
Acked-by: Selvin Xavier &lt;selvin.xavier@broadcom.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 78cfd17142ef70599d6409cbd709d94b3da58659 ]

Undefined behavior is triggered when bnxt_qplib_alloc_init_hwq is called
with hwq_attr-&gt;aux_depth != 0 and hwq_attr-&gt;aux_stride == 0.
In that case, "roundup_pow_of_two(hwq_attr-&gt;aux_stride)" gets called.
roundup_pow_of_two is documented as undefined for 0.

Fix it in the one caller that had this combination.

The undefined behavior was detected by UBSAN:
  UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
  shift exponent 64 is too large for 64-bit type 'long unsigned int'
  CPU: 24 PID: 1075 Comm: (udev-worker) Not tainted 6.9.0-rc6+ #4
  Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.7 10/25/2023
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x5d/0x80
   ubsan_epilogue+0x5/0x30
   __ubsan_handle_shift_out_of_bounds.cold+0x61/0xec
   __roundup_pow_of_two+0x25/0x35 [bnxt_re]
   bnxt_qplib_alloc_init_hwq+0xa1/0x470 [bnxt_re]
   bnxt_qplib_create_qp+0x19e/0x840 [bnxt_re]
   bnxt_re_create_qp+0x9b1/0xcd0 [bnxt_re]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kmalloc+0x1b6/0x4f0
   ? create_qp.part.0+0x128/0x1c0 [ib_core]
   ? __pfx_bnxt_re_create_qp+0x10/0x10 [bnxt_re]
   create_qp.part.0+0x128/0x1c0 [ib_core]
   ib_create_qp_kernel+0x50/0xd0 [ib_core]
   create_mad_qp+0x8e/0xe0 [ib_core]
   ? __pfx_qp_event_handler+0x10/0x10 [ib_core]
   ib_mad_init_device+0x2be/0x680 [ib_core]
   add_client_context+0x10d/0x1a0 [ib_core]
   enable_device_and_get+0xe0/0x1d0 [ib_core]
   ib_register_device+0x53c/0x630 [ib_core]
   ? srso_alias_return_thunk+0x5/0xfbef5
   bnxt_re_probe+0xbd8/0xe50 [bnxt_re]
   ? __pfx_bnxt_re_probe+0x10/0x10 [bnxt_re]
   auxiliary_bus_probe+0x49/0x80
   ? driver_sysfs_add+0x57/0xc0
   really_probe+0xde/0x340
   ? pm_runtime_barrier+0x54/0x90
   ? __pfx___driver_attach+0x10/0x10
   __driver_probe_device+0x78/0x110
   driver_probe_device+0x1f/0xa0
   __driver_attach+0xba/0x1c0
   bus_for_each_dev+0x8f/0xe0
   bus_add_driver+0x146/0x220
   driver_register+0x72/0xd0
   __auxiliary_driver_register+0x6e/0xd0
   ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re]
   bnxt_re_mod_init+0x3e/0xff0 [bnxt_re]
   ? __pfx_bnxt_re_mod_init+0x10/0x10 [bnxt_re]
   do_one_initcall+0x5b/0x310
   do_init_module+0x90/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x121/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x82/0x160
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? syscall_exit_to_user_mode_prepare+0x149/0x170
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? syscall_exit_to_user_mode+0x75/0x230
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? do_syscall_64+0x8e/0x160
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __count_memcg_events+0x69/0x100
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? count_memcg_events.constprop.0+0x1a/0x30
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? handle_mm_fault+0x1f0/0x300
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? do_user_addr_fault+0x34e/0x640
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f4e5132821d
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 db 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffca9c906a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 0000563ec8a8f130 RCX: 00007f4e5132821d
  RDX: 0000000000000000 RSI: 00007f4e518fa07d RDI: 000000000000003b
  RBP: 00007ffca9c90760 R08: 00007f4e513f6b20 R09: 00007ffca9c906f0
  R10: 0000563ec8a8faa0 R11: 0000000000000246 R12: 00007f4e518fa07d
  R13: 0000000000020000 R14: 0000563ec8409e90 R15: 0000563ec8a8fa60
   &lt;/TASK&gt;
  ---[ end trace ]---

Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Link: https://lore.kernel.org/r/20240507103929.30003-1-mschmidt@redhat.com
Acked-by: Selvin Xavier &lt;selvin.xavier@broadcom.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: boundary check before installing cq callbacks</title>
<updated>2024-05-30T07:49:46+00:00</updated>
<author>
<name>Konstantin Taranov</name>
<email>kotaranov@microsoft.com</email>
</author>
<published>2024-04-26T13:12:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f12afddfb142587d786df9e3cc4862190d3e2ec8'/>
<id>f12afddfb142587d786df9e3cc4862190d3e2ec8</id>
<content type='text'>
[ Upstream commit f79edef79b6a2161f4124112f9b0c46891bb0b74 ]

Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow.

Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function")
Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1714137160-5222-5-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f79edef79b6a2161f4124112f9b0c46891bb0b74 ]

Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow.

Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function")
Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1714137160-5222-5-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: Use struct mana_ib_queue for CQs</title>
<updated>2024-05-30T07:49:46+00:00</updated>
<author>
<name>Konstantin Taranov</name>
<email>kotaranov@microsoft.com</email>
</author>
<published>2024-03-26T20:08:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=19dbfaa0d127671f8ce505371eede383be195f39'/>
<id>19dbfaa0d127671f8ce505371eede383be195f39</id>
<content type='text'>
[ Upstream commit 60a7ac0b8bec5df9764b7460ffee91fc981e8a31 ]

Use struct mana_ib_queue and its helpers for CQs

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1711483688-24358-3-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Stable-dep-of: f79edef79b6a ("RDMA/mana_ib: boundary check before installing cq callbacks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 60a7ac0b8bec5df9764b7460ffee91fc981e8a31 ]

Use struct mana_ib_queue and its helpers for CQs

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1711483688-24358-3-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Stable-dep-of: f79edef79b6a ("RDMA/mana_ib: boundary check before installing cq callbacks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: Introduce helpers to create and destroy mana queues</title>
<updated>2024-05-30T07:49:46+00:00</updated>
<author>
<name>Konstantin Taranov</name>
<email>kotaranov@microsoft.com</email>
</author>
<published>2024-03-26T20:08:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c48ddf1268f11be45720e605f0e0077c5ce88a0a'/>
<id>c48ddf1268f11be45720e605f0e0077c5ce88a0a</id>
<content type='text'>
[ Upstream commit 46f5be7cd4bceb3a503c544b3dab7b75fe4bb96b ]

Intoduce helpers to work with mana ib queues (struct mana_ib_queue).
A queue always consists of umem, gdma_region, and id.
A queue can become a WQ or a CQ.

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1711483688-24358-2-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Stable-dep-of: f79edef79b6a ("RDMA/mana_ib: boundary check before installing cq callbacks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 46f5be7cd4bceb3a503c544b3dab7b75fe4bb96b ]

Intoduce helpers to work with mana ib queues (struct mana_ib_queue).
A queue always consists of umem, gdma_region, and id.
A queue can become a WQ or a CQ.

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://lore.kernel.org/r/1711483688-24358-2-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Stable-dep-of: f79edef79b6a ("RDMA/mana_ib: boundary check before installing cq callbacks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Use __iowrite64_copy() for write combining stores</title>
<updated>2024-05-30T07:49:44+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2024-04-11T16:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2712af01abc26beaf86632fcd6268e61f6ab7896'/>
<id>2712af01abc26beaf86632fcd6268e61f6ab7896</id>
<content type='text'>
[ Upstream commit ef302283ddfceaba2657923af3f90fd58e6dff06 ]

mlx5 has a built in self-test at driver startup to evaluate if the
platform supports write combining to generate a 64 byte PCIe TLP or
not. This has proven necessary because a lot of common scenarios end up
with broken write combining (especially inside virtual machines) and there
is other way to learn this information.

This self test has been consistently failing on new ARM64 CPU
designs (specifically with NVIDIA Grace's implementation of Neoverse
V2). The C loop around writeq() generates some pretty terrible ARM64
assembly, but historically this has worked on a lot of existing ARM64 CPUs
till now.

We see it succeed about 1 time in 10,000 on the worst effected
systems. The CPU architects speculate that the load instructions
interspersed with the stores makes the WC buffers statistically flush too
often and thus the generation of large TLPs becomes infrequent. This makes
the boot up test unreliable in that it indicates no write-combining,
however userspace would be fine since it uses a ST4 instruction.

Further, S390 has similar issues where only the special zpci_memcpy_toio()
will actually generate large TLPs, and the open coded loop does not
trigger it at all.

Fix both ARM64 and S390 by switching to __iowrite64_copy() which now
provides architecture specific variants that have a high change of
generating a large TLP with write combining. x86 continues to use a
similar writeq loop in the generate __iowrite64_copy().

Fixes: 11f552e21755 ("IB/mlx5: Test write combining support")
Link: https://lore.kernel.org/r/6-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Tested-by: Niklas Schnelle &lt;schnelle@linux.ibm.com&gt;
Acked-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ef302283ddfceaba2657923af3f90fd58e6dff06 ]

mlx5 has a built in self-test at driver startup to evaluate if the
platform supports write combining to generate a 64 byte PCIe TLP or
not. This has proven necessary because a lot of common scenarios end up
with broken write combining (especially inside virtual machines) and there
is other way to learn this information.

This self test has been consistently failing on new ARM64 CPU
designs (specifically with NVIDIA Grace's implementation of Neoverse
V2). The C loop around writeq() generates some pretty terrible ARM64
assembly, but historically this has worked on a lot of existing ARM64 CPUs
till now.

We see it succeed about 1 time in 10,000 on the worst effected
systems. The CPU architects speculate that the load instructions
interspersed with the stores makes the WC buffers statistically flush too
often and thus the generation of large TLPs becomes infrequent. This makes
the boot up test unreliable in that it indicates no write-combining,
however userspace would be fine since it uses a ST4 instruction.

Further, S390 has similar issues where only the special zpci_memcpy_toio()
will actually generate large TLPs, and the open coded loop does not
trigger it at all.

Fix both ARM64 and S390 by switching to __iowrite64_copy() which now
provides architecture specific variants that have a high change of
generating a large TLP with write combining. x86 continues to use a
similar writeq loop in the generate __iowrite64_copy().

Fixes: 11f552e21755 ("IB/mlx5: Test write combining support")
Link: https://lore.kernel.org/r/6-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Tested-by: Niklas Schnelle &lt;schnelle@linux.ibm.com&gt;
Acked-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix incorrect rxe_put in error path</title>
<updated>2024-05-30T07:49:44+00:00</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2024-03-29T14:55:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d337bfdd665bc48e6d1dd4824ed2203011b0dd19'/>
<id>d337bfdd665bc48e6d1dd4824ed2203011b0dd19</id>
<content type='text'>
[ Upstream commit 8776618dbbd1b6f210b31509507e1aad461d6435 ]

In rxe_send() a ref is taken on the qp to keep it alive until the
kfree_skb() has a chance to call the skb destructor rxe_skb_tx_dtor()
which drops the reference. If the packet has an incorrect protocol the
error path just calls kfree_skb() which will call the destructor which
will drop the ref. Currently the driver also calls rxe_put() which is
incorrect. Additionally since the packets sent to rxe_send() are under the
control of the driver and it only ever produces IPV4 or IPV6 packets the
simplest fix is to remove all the code in this block.

Link: https://lore.kernel.org/r/20240329145513.35381-12-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 9eb7f8e44d13 ("IB/rxe: Move refcounting earlier in rxe_send()")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 8776618dbbd1b6f210b31509507e1aad461d6435 ]

In rxe_send() a ref is taken on the qp to keep it alive until the
kfree_skb() has a chance to call the skb destructor rxe_skb_tx_dtor()
which drops the reference. If the packet has an incorrect protocol the
error path just calls kfree_skb() which will call the destructor which
will drop the ref. Currently the driver also calls rxe_put() which is
incorrect. Additionally since the packets sent to rxe_send() are under the
control of the driver and it only ever produces IPV4 or IPV6 packets the
simplest fix is to remove all the code in this block.

Link: https://lore.kernel.org/r/20240329145513.35381-12-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 9eb7f8e44d13 ("IB/rxe: Move refcounting earlier in rxe_send()")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Allow good work requests to be executed</title>
<updated>2024-05-30T07:49:44+00:00</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2024-03-29T14:55:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ba82921a886183d0a8085130dbf64ac1c51d363e'/>
<id>ba82921a886183d0a8085130dbf64ac1c51d363e</id>
<content type='text'>
[ Upstream commit b703374837a8f8422fa3f1edcf65505421a65a6a ]

A previous commit incorrectly added an 'if(!err)' before scheduling the
requester task in rxe_post_send_kernel(). But if there were send wrs
successfully added to the send queue before a bad wr they might never get
executed.

This commit fixes this by scheduling the requester task if any wqes were
successfully posted in rxe_post_send_kernel() in rxe_verbs.c.

Link: https://lore.kernel.org/r/20240329145513.35381-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 5bf944f24129 ("RDMA/rxe: Add error messages")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b703374837a8f8422fa3f1edcf65505421a65a6a ]

A previous commit incorrectly added an 'if(!err)' before scheduling the
requester task in rxe_post_send_kernel(). But if there were send wrs
successfully added to the send queue before a bad wr they might never get
executed.

This commit fixes this by scheduling the requester task if any wqes were
successfully posted in rxe_post_send_kernel() in rxe_verbs.c.

Link: https://lore.kernel.org/r/20240329145513.35381-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 5bf944f24129 ("RDMA/rxe: Add error messages")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt</title>
<updated>2024-05-30T07:49:44+00:00</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2024-03-29T14:55:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bbad88f111a1829f366c189aa48e7e58e57553fc'/>
<id>bbad88f111a1829f366c189aa48e7e58e57553fc</id>
<content type='text'>
[ Upstream commit 2b23b6097303ed0ba5f4bc036a1c07b6027af5c6 ]

In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the
resp_pkts queue and then a decision is made whether to run the completer
task inline or schedule it. Finally the skb is dereferenced to bump a 'hw'
performance counter. This is wrong because if the completer task is
already running in a separate thread it may have already processed the skb
and freed it which can cause a seg fault.  This has been observed
infrequently in testing at high scale.

This patch fixes this by changing the order of enqueuing the packet until
after the counter is accessed.

Link: https://lore.kernel.org/r/20240329145513.35381-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2b23b6097303ed0ba5f4bc036a1c07b6027af5c6 ]

In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the
resp_pkts queue and then a decision is made whether to run the completer
task inline or schedule it. Finally the skb is dereferenced to bump a 'hw'
performance counter. This is wrong because if the completer task is
already running in a separate thread it may have already processed the skb
and freed it which can cause a seg fault.  This has been observed
infrequently in testing at high scale.

This patch fixes this by changing the order of enqueuing the packet until
after the counter is accessed.

Link: https://lore.kernel.org/r/20240329145513.35381-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats")
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
