<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/infiniband, branch v5.4.151</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>RDMA/efa: Remove double QP type assignment</title>
<updated>2021-09-22T10:26:23+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2021-07-23T11:39:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25ed0498915a02d2a7f14dbb19ba4197c5f58cf7'/>
<id>25ed0498915a02d2a7f14dbb19ba4197c5f58cf7</id>
<content type='text'>
[ Upstream commit f9193d266347fe9bed5c173e7a1bf96268142a79 ]

The QP type is set by the IB/core and shouldn't be set in the driver.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Link: https://lore.kernel.org/r/838c40134c1590167b888ca06ad51071139ff2ae.1627040189.git.leonro@nvidia.com
Acked-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f9193d266347fe9bed5c173e7a1bf96268142a79 ]

The QP type is set by the IB/core and shouldn't be set in the driver.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Link: https://lore.kernel.org/r/838c40134c1590167b888ca06ad51071139ff2ae.1627040189.git.leonro@nvidia.com
Acked-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/iwcm: Release resources if iw_cm module initialization fails</title>
<updated>2021-09-22T10:26:23+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2021-07-23T14:08:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb42b9801e0ad772d1e19608ed86c754d0ba722f'/>
<id>fb42b9801e0ad772d1e19608ed86c754d0ba722f</id>
<content type='text'>
[ Upstream commit e677b72a0647249370f2635862bf0241c86f66ad ]

The failure during iw_cm module initialization partially left the system
with unreleased memory and other resources. Rewrite the module init/exit
routines in such way that netlink commands will be opened only after
successful initialization.

Fixes: b493d91d333e ("iwcm: common code for port mapper")
Link: https://lore.kernel.org/r/b01239f99cb1a3e6d2b0694c242d89e6410bcd93.1627048781.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e677b72a0647249370f2635862bf0241c86f66ad ]

The failure during iw_cm module initialization partially left the system
with unreleased memory and other resources. Rewrite the module init/exit
routines in such way that netlink commands will be opened only after
successful initialization.

Fixes: b493d91d333e ("iwcm: common code for port mapper")
Link: https://lore.kernel.org/r/b01239f99cb1a3e6d2b0694c242d89e6410bcd93.1627048781.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Adjust pkey entry in index 0</title>
<updated>2021-09-22T10:26:23+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@cornelisnetworks.com</email>
</author>
<published>2021-07-15T16:04:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7930b1f98dd884cfa54d2749e2c194fe8d2abe9b'/>
<id>7930b1f98dd884cfa54d2749e2c194fe8d2abe9b</id>
<content type='text'>
[ Upstream commit 62004871e1fa7f9a60797595c03477af5b5ec36f ]

It is possible for the primary IPoIB network device associated with any
RDMA device to fail to join certain multicast groups preventing IPv6
neighbor discovery and possibly other network ULPs from working
correctly. The IPv4 broadcast group is not affected as the IPoIB network
device handles joining that multicast group directly.

This is because the primary IPoIB network device uses the pkey at ndex 0
in the associated RDMA device's pkey table. Anytime the pkey value of
index 0 changes, the primary IPoIB network device automatically modifies
it's broadcast address (i.e. /sys/class/net/[ib0]/broadcast), since the
broadcast address includes the pkey value, and then bounces carrier. This
includes initial pkey assignment, such as when the pkey at index 0
transitions from the opa default of invalid (0x0000) to some value such as
the OPA default pkey for Virtual Fabric 0: 0x8001 or when the fabric
manager is restarted with a configuration change causing the pkey at index
0 to change. Many network ULPs are not sensitive to the carrier bounce and
are not expecting the broadcast address to change including the linux IPv6
stack.  This problem does not affect IPoIB child network devices as their
pkey value is constant for all time.

To mitigate this issue, change the default pkey in at index 0 to 0x8001 to
cover the predominant case and avoid issues as ipoib comes up and the FM
sweeps.

At some point, ipoib multicast support should automatically fix
non-broadcast addresses as it does with the primary broadcast address.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210715160445.142451.47651.stgit@awfm-01.cornelisnetworks.com
Suggested-by: Josh Collier &lt;josh.d.collier@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 62004871e1fa7f9a60797595c03477af5b5ec36f ]

It is possible for the primary IPoIB network device associated with any
RDMA device to fail to join certain multicast groups preventing IPv6
neighbor discovery and possibly other network ULPs from working
correctly. The IPv4 broadcast group is not affected as the IPoIB network
device handles joining that multicast group directly.

This is because the primary IPoIB network device uses the pkey at ndex 0
in the associated RDMA device's pkey table. Anytime the pkey value of
index 0 changes, the primary IPoIB network device automatically modifies
it's broadcast address (i.e. /sys/class/net/[ib0]/broadcast), since the
broadcast address includes the pkey value, and then bounces carrier. This
includes initial pkey assignment, such as when the pkey at index 0
transitions from the opa default of invalid (0x0000) to some value such as
the OPA default pkey for Virtual Fabric 0: 0x8001 or when the fabric
manager is restarted with a configuration change causing the pkey at index
0 to change. Many network ULPs are not sensitive to the carrier bounce and
are not expecting the broadcast address to change including the linux IPv6
stack.  This problem does not affect IPoIB child network devices as their
pkey value is constant for all time.

To mitigate this issue, change the default pkey in at index 0 to 0x8001 to
cover the predominant case and avoid issues as ipoib comes up and the FM
sweeps.

At some point, ipoib multicast support should automatically fix
non-broadcast addresses as it does with the primary broadcast address.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210715160445.142451.47651.stgit@awfm-01.cornelisnetworks.com
Suggested-by: Josh Collier &lt;josh.d.collier@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/efa: Free IRQ vectors on error flow</title>
<updated>2021-09-03T08:08:13+00:00</updated>
<author>
<name>Gal Pressman</name>
<email>galpress@amazon.com</email>
</author>
<published>2021-08-11T15:11:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bb8ca7e2e67e1dce66bd4e605b35f9ee01e286b9'/>
<id>bb8ca7e2e67e1dce66bd4e605b35f9ee01e286b9</id>
<content type='text'>
[ Upstream commit dbe986bdfd6dfe6ef24b833767fff4151e024357 ]

Make sure to free the IRQ vectors in case the allocation doesn't return
the expected number of IRQs.

Fixes: b7f5e880f377 ("RDMA/efa: Add the efa module")
Link: https://lore.kernel.org/r/20210811151131.39138-2-galpress@amazon.com
Reviewed-by: Firas JahJah &lt;firasj@amazon.com&gt;
Reviewed-by: Yossi Leybovich &lt;sleybo@amazon.com&gt;
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit dbe986bdfd6dfe6ef24b833767fff4151e024357 ]

Make sure to free the IRQ vectors in case the allocation doesn't return
the expected number of IRQs.

Fixes: b7f5e880f377 ("RDMA/efa: Add the efa module")
Link: https://lore.kernel.org/r/20210811151131.39138-2-galpress@amazon.com
Reviewed-by: Firas JahJah &lt;firasj@amazon.com&gt;
Reviewed-by: Yossi Leybovich &lt;sleybo@amazon.com&gt;
Signed-off-by: Gal Pressman &lt;galpress@amazon.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()</title>
<updated>2021-09-03T08:08:13+00:00</updated>
<author>
<name>Tuo Li</name>
<email>islituo@gmail.com</email>
</author>
<published>2021-08-06T13:30:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8a21e84334ec338ac3476ef8bc368984aa5cc083'/>
<id>8a21e84334ec338ac3476ef8bc368984aa5cc083</id>
<content type='text'>
[ Upstream commit cbe71c61992c38f72c2b625b2ef25916b9f0d060 ]

kmalloc_array() is called to allocate memory for tx-&gt;descp. If it fails,
the function __sdma_txclean() is called:
  __sdma_txclean(dd, tx);

However, in the function __sdma_txclean(), tx-descp is dereferenced if
tx-&gt;num_desc is not zero:
  sdma_unmap_desc(dd, &amp;tx-&gt;descp[0]);

To fix this possible null-pointer dereference, assign the return value of
kmalloc_array() to a local variable descp, and then assign it to tx-&gt;descp
if it is not NULL. Otherwise, go to enomem.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com
Reported-by: TOTE Robot &lt;oslab@tsinghua.edu.cn&gt;
Signed-off-by: Tuo Li &lt;islituo@gmail.com&gt;
Tested-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Acked-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cbe71c61992c38f72c2b625b2ef25916b9f0d060 ]

kmalloc_array() is called to allocate memory for tx-&gt;descp. If it fails,
the function __sdma_txclean() is called:
  __sdma_txclean(dd, tx);

However, in the function __sdma_txclean(), tx-descp is dereferenced if
tx-&gt;num_desc is not zero:
  sdma_unmap_desc(dd, &amp;tx-&gt;descp[0]);

To fix this possible null-pointer dereference, assign the return value of
kmalloc_array() to a local variable descp, and then assign it to tx-&gt;descp
if it is not NULL. Otherwise, go to enomem.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com
Reported-by: TOTE Robot &lt;oslab@tsinghua.edu.cn&gt;
Signed-off-by: Tuo Li &lt;islituo@gmail.com&gt;
Tested-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Acked-by: Mike Marciniszyn &lt;mike.marciniszyn@cornelisnetworks.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/bnxt_re: Add missing spin lock initialization</title>
<updated>2021-09-03T08:08:13+00:00</updated>
<author>
<name>Naresh Kumar PBS</name>
<email>nareshkumar.pbs@broadcom.com</email>
</author>
<published>2021-08-19T03:25:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=944a50f56f1bb45dc529419fc9fea0880ac33d5c'/>
<id>944a50f56f1bb45dc529419fc9fea0880ac33d5c</id>
<content type='text'>
[ Upstream commit 17f2569dce1848080825b8336e6b7c6900193b44 ]

Add the missing initialization of srq lock.

Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1629343553-5843-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS &lt;nareshkumar.pbs@broadcom.com&gt;
Signed-off-by: Selvin Xavier &lt;selvin.xavier@broadcom.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 17f2569dce1848080825b8336e6b7c6900193b44 ]

Add the missing initialization of srq lock.

Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1629343553-5843-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS &lt;nareshkumar.pbs@broadcom.com&gt;
Signed-off-by: Selvin Xavier &lt;selvin.xavier@broadcom.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cma: Fix rdma_resolve_route() memory leak</title>
<updated>2021-07-19T06:53:13+00:00</updated>
<author>
<name>Gerd Rausch</name>
<email>gerd.rausch@oracle.com</email>
</author>
<published>2021-06-24T18:55:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=032c68b4f5be128a2167f35b558b7cec88fe4972'/>
<id>032c68b4f5be128a2167f35b558b7cec88fe4972</id>
<content type='text'>
[ Upstream commit 74f160ead74bfe5f2b38afb4fcf86189f9ff40c9 ]

Fix a memory leak when "mda_resolve_route() is called more than once on
the same "rdma_cm_id".

This is possible if cma_query_handler() triggers the
RDMA_CM_EVENT_ROUTE_ERROR flow which puts the state machine back and
allows rdma_resolve_route() to be called again.

Link: https://lore.kernel.org/r/f6662b7b-bdb7-2706-1e12-47c61d3474b6@oracle.com
Signed-off-by: Gerd Rausch &lt;gerd.rausch@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 74f160ead74bfe5f2b38afb4fcf86189f9ff40c9 ]

Fix a memory leak when "mda_resolve_route() is called more than once on
the same "rdma_cm_id".

This is possible if cma_query_handler() triggers the
RDMA_CM_EVENT_ROUTE_ERROR flow which puts the state machine back and
allows rdma_resolve_route() to be called again.

Link: https://lore.kernel.org/r/f6662b7b-bdb7-2706-1e12-47c61d3474b6@oracle.com
Signed-off-by: Gerd Rausch &lt;gerd.rausch@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/rxe: Don't overwrite errno from ib_umem_get()</title>
<updated>2021-07-19T06:53:12+00:00</updated>
<author>
<name>Xiao Yang</name>
<email>yangx.jy@fujitsu.com</email>
</author>
<published>2021-06-21T07:14:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9fd9734e573931c9df1083a6d494a760a61e769e'/>
<id>9fd9734e573931c9df1083a6d494a760a61e769e</id>
<content type='text'>
[ Upstream commit 20ec0a6d6016aa28b9b3299be18baef1a0f91cd2 ]

rxe_mr_init_user() always returns the fixed -EINVAL when ib_umem_get()
fails so it's hard for user to know which actual error happens in
ib_umem_get(). For example, ib_umem_get() will return -EOPNOTSUPP when
trying to pin pages on a DAX file.

Return actual error as mlx4/mlx5 does.

Link: https://lore.kernel.org/r/20210621071456.4259-1-ice_yangxiao@163.com
Signed-off-by: Xiao Yang &lt;yangx.jy@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 20ec0a6d6016aa28b9b3299be18baef1a0f91cd2 ]

rxe_mr_init_user() always returns the fixed -EINVAL when ib_umem_get()
fails so it's hard for user to know which actual error happens in
ib_umem_get(). For example, ib_umem_get() will return -EOPNOTSUPP when
trying to pin pages on a DAX file.

Return actual error as mlx4/mlx5 does.

Link: https://lore.kernel.org/r/20210621071456.4259-1-ice_yangxiao@163.com
Signed-off-by: Xiao Yang &lt;yangx.jy@fujitsu.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb4: Fix missing error code in create_qp()</title>
<updated>2021-07-19T06:53:09+00:00</updated>
<author>
<name>Jiapeng Chong</name>
<email>jiapeng.chong@linux.alibaba.com</email>
</author>
<published>2021-06-01T11:07:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=57ef44f357254464a87d0fbd448707cade47d04f'/>
<id>57ef44f357254464a87d0fbd448707cade47d04f</id>
<content type='text'>
[ Upstream commit aeb27bb76ad8197eb47890b1ff470d5faf8ec9a5 ]

The error code is missing in this code scenario so 0 will be returned. Add
the error code '-EINVAL' to the return value 'ret'.

Eliminates the follow smatch warning:

drivers/infiniband/hw/cxgb4/qp.c:298 create_qp() warn: missing error code 'ret'.

Link: https://lore.kernel.org/r/1622545669-20625-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot &lt;abaci@linux.alibaba.com&gt;
Signed-off-by: Jiapeng Chong &lt;jiapeng.chong@linux.alibaba.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit aeb27bb76ad8197eb47890b1ff470d5faf8ec9a5 ]

The error code is missing in this code scenario so 0 will be returned. Add
the error code '-EINVAL' to the return value 'ret'.

Eliminates the follow smatch warning:

drivers/infiniband/hw/cxgb4/qp.c:298 create_qp() warn: missing error code 'ret'.

Link: https://lore.kernel.org/r/1622545669-20625-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot &lt;abaci@linux.alibaba.com&gt;
Signed-off-by: Jiapeng Chong &lt;jiapeng.chong@linux.alibaba.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mlx5: Don't access NULL-cleared mpi pointer</title>
<updated>2021-07-14T14:53:35+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2021-06-29T08:51:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=97704efb93b52c4b2b66a533bc0487e96378f7b1'/>
<id>97704efb93b52c4b2b66a533bc0487e96378f7b1</id>
<content type='text'>
[ Upstream commit 4a754d7637026b42b0c9ba5787ad5ee3bc2ff77f ]

The "dev-&gt;port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port()
execution, however that field is needed to add device to unaffiliated list.

Such flow causes to the following kernel panic while unloading mlx5_ib
module in multi-port mode, hence the device should be added to the list
prior to unbind call.

 RPC: Unregistered rdma transport module.
 RPC: Unregistered rdma backchannel transport module.
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP NOPTI
 CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib]
 Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 &lt;48&gt; c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86
 RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296
 RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080
 RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0
 RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000
 R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000
 R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000
 FS:  00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib]
  __mlx5_ib_remove+0x33/0x90 [mlx5_ib]
  mlx5r_remove+0x22/0x30 [mlx5_ib]
  auxiliary_bus_remove+0x18/0x30
  __device_release_driver+0x177/0x220
  driver_detach+0xc4/0x100
  bus_remove_driver+0x58/0xd0
  auxiliary_driver_unregister+0x12/0x20
  mlx5_ib_cleanup+0x13/0x897 [mlx5_ib]
  __x64_sys_delete_module+0x154/0x230
  ? exit_to_user_mode_prepare+0x104/0x140
  do_syscall_64+0x3f/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f8842e095c7
 Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7
 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28
 RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000
 R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28
 R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40
 Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma]
 CR2: 0000000000000000
 ---[ end trace a0bb7e20804e9e9b ]---

Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list")
Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com
Reviewed-by: Itay Aveksis &lt;itayav@nvidia.com&gt;
Reviewed-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 4a754d7637026b42b0c9ba5787ad5ee3bc2ff77f ]

The "dev-&gt;port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port()
execution, however that field is needed to add device to unaffiliated list.

Such flow causes to the following kernel panic while unloading mlx5_ib
module in multi-port mode, hence the device should be added to the list
prior to unbind call.

 RPC: Unregistered rdma transport module.
 RPC: Unregistered rdma backchannel transport module.
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP NOPTI
 CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib]
 Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 &lt;48&gt; c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86
 RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296
 RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080
 RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0
 RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000
 R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000
 R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000
 FS:  00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib]
  __mlx5_ib_remove+0x33/0x90 [mlx5_ib]
  mlx5r_remove+0x22/0x30 [mlx5_ib]
  auxiliary_bus_remove+0x18/0x30
  __device_release_driver+0x177/0x220
  driver_detach+0xc4/0x100
  bus_remove_driver+0x58/0xd0
  auxiliary_driver_unregister+0x12/0x20
  mlx5_ib_cleanup+0x13/0x897 [mlx5_ib]
  __x64_sys_delete_module+0x154/0x230
  ? exit_to_user_mode_prepare+0x104/0x140
  do_syscall_64+0x3f/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f8842e095c7
 Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7
 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28
 RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000
 R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28
 R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40
 Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma]
 CR2: 0000000000000000
 ---[ end trace a0bb7e20804e9e9b ]---

Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list")
Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com
Reviewed-by: Itay Aveksis &lt;itayav@nvidia.com&gt;
Reviewed-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
