<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/net/ethernet, branch linux-5.6.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net/mlx5: Disable reload while removing the device</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Parav Pandit</name>
<email>parav@mellanox.com</email>
</author>
<published>2020-05-14T10:12:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dd7eb4c0e0e91e3d818f39ce6170d719569746c1'/>
<id>dd7eb4c0e0e91e3d818f39ce6170d719569746c1</id>
<content type='text'>
[ Upstream commit 60904cd349abc98cb888fc28d1ca55a8e2cf87b3 ]

While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.

Hence, disable the devlink reloading first when removing the device.

     CPU0                                   CPU1
     ----                                   ----
local_pci_remove()                  devlink_mutex
  remove_one()                       devlink_nl_cmd_reload()
    mlx5_unregister_device()           devlink_reload()
                                       ops-&gt;reload_down()
                                         mlx5_unload_one()

Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Parav Pandit &lt;parav@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 60904cd349abc98cb888fc28d1ca55a8e2cf87b3 ]

While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.

Hence, disable the devlink reloading first when removing the device.

     CPU0                                   CPU1
     ----                                   ----
local_pci_remove()                  devlink_mutex
  remove_one()                       devlink_nl_cmd_reload()
    mlx5_unregister_device()           devlink_reload()
                                       ops-&gt;reload_down()
                                         mlx5_unload_one()

Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Parav Pandit &lt;parav@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: macb: Only disable NAPI on the actual error path</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Charles Keepax</name>
<email>ckeepax@opensource.cirrus.com</email>
</author>
<published>2020-06-15T13:18:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=13f6e6b4de818242f974e51126720250df51a85b'/>
<id>13f6e6b4de818242f974e51126720250df51a85b</id>
<content type='text'>
[ Upstream commit 939a5bf7c9b7a1ad9c5d3481c93766a522773531 ]

A recent change added a disable to NAPI into macb_open, this was
intended to only happen on the error path but accidentally applies
to all paths. This causes NAPI to be disabled on the success path, which
leads to the network to no longer functioning.

Fixes: 014406babc1f ("net: cadence: macb: disable NAPI on error")
Signed-off-by: Charles Keepax &lt;ckeepax@opensource.cirrus.com&gt;
Tested-by: Corentin Labbe &lt;clabbe@baylibre.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 939a5bf7c9b7a1ad9c5d3481c93766a522773531 ]

A recent change added a disable to NAPI into macb_open, this was
intended to only happen on the error path but accidentally applies
to all paths. This causes NAPI to be disabled on the success path, which
leads to the network to no longer functioning.

Fixes: 014406babc1f ("net: cadence: macb: disable NAPI on error")
Signed-off-by: Charles Keepax &lt;ckeepax@opensource.cirrus.com&gt;
Tested-by: Corentin Labbe &lt;clabbe@baylibre.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: cadence: macb: disable NAPI on error</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Corentin Labbe</name>
<email>clabbe@baylibre.com</email>
</author>
<published>2020-06-10T09:53:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=76fa3dea81366c5ebdc3cbf5e27c560b1aee21bc'/>
<id>76fa3dea81366c5ebdc3cbf5e27c560b1aee21bc</id>
<content type='text'>
[ Upstream commit 014406babc1f5f887a08737566b5b356c7018242 ]

When the PHY is not working, the macb driver crash on a second try to
setup it.
[   78.545994] macb e000b000.ethernet eth0: Could not attach PHY (-19)
ifconfig: SIOCSIFFLAGS: No such device
[   78.655457] ------------[ cut here ]------------
[   78.656014] kernel BUG at /linux-next/include/linux/netdevice.h:521!
[   78.656504] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   78.657079] Modules linked in:
[   78.657795] CPU: 0 PID: 122 Comm: ifconfig Not tainted 5.7.0-next-20200609 #1
[   78.658202] Hardware name: Xilinx Zynq Platform
[   78.659632] PC is at macb_open+0x220/0x294
[   78.660160] LR is at 0x0
[   78.660373] pc : [&lt;c0b0a634&gt;]    lr : [&lt;00000000&gt;]    psr: 60000013
[   78.660716] sp : c89ffd70  ip : c8a28800  fp : c199bac0
[   78.661040] r10: 00000000  r9 : c8838540  r8 : c8838568
[   78.661362] r7 : 00000001  r6 : c8838000  r5 : c883c000  r4 : 00000000
[   78.661724] r3 : 00000010  r2 : 00000000  r1 : 00000000  r0 : 00000000
[   78.662187] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   78.662635] Control: 10c5387d  Table: 08b64059  DAC: 00000051
[   78.663035] Process ifconfig (pid: 122, stack limit = 0x(ptrval))
[   78.663476] Stack: (0xc89ffd70 to 0xc8a00000)
[   78.664121] fd60:                                     00000000 c89fe000 c8838000 c89fe000
[   78.664866] fd80: 00000000 c11ff9ac c8838028 00000000 00000000 c0de6f2c 00000001 c1804eec
[   78.665579] fda0: c19b8178 c8838000 00000000 ca760866 c8838000 00000001 00001043 c89fe000
[   78.666355] fdc0: 00001002 c0de72f4 c89fe000 c0de8dc0 00008914 c89fe000 c199bac0 ca760866
[   78.667111] fde0: c89ffddc c8838000 00001002 00000000 c8838138 c881010c 00008914 c0de7364
[   78.667862] fe00: 00000000 c89ffe70 c89fe000 ffffffff c881010c c0e8bd48 00000003 00000000
[   78.668601] fe20: c8838000 c8810100 39c1118f 00039c11 c89a0960 00001043 00000000 000a26d0
[   78.669343] fe40: b6f43000 ca760866 c89a0960 00000051 befe6c50 00008914 c8b2a3c0 befe6c50
[   78.670086] fe60: 00000003 ee610500 00000000 c0e8ef58 30687465 00000000 00000000 00000000
[   78.670865] fe80: 00001043 00000000 000a26d0 b6f43000 c89a0600 ee40ae7c c8870d00 c0ddabf4
[   78.671593] fea0: c89ffeec c0ddabf4 c89ffeec c199bac0 00008913 c0ddac48 c89ffeec c89fe000
[   78.672324] fec0: befe6c50 ca760866 befe6c50 00008914 c89fe000 befe6c50 c8b2a3c0 c0dc00e4
[   78.673088] fee0: c89a0480 00000201 00000cc0 30687465 00000000 00000000 00000000 00001002
[   78.673822] ff00: 00000000 000a26d0 b6f43000 ca760866 00008914 c8b2a3c0 000a0ec4 c8b2a3c0
[   78.674576] ff20: befe6c50 c04b21bc 000d5004 00000817 c89a0480 c0315f94 00000000 00000003
[   78.675415] ff40: c19a2bc8 c8a3cc00 c89fe000 00000255 00000000 00000000 00000000 000d5000
[   78.676182] ff60: 000f6000 c180b2a0 00000817 c0315e64 000d5004 c89fffb0 b6ec0c30 ca760866
[   78.676928] ff80: 00000000 000b609b befe6c50 000a0ec4 00000036 c03002c4 c89fe000 00000036
[   78.677673] ffa0: 00000000 c03000c0 000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.678415] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.679154] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c 20000010 00000003 00000000 00000000
[   78.681059] [&lt;c0b0a634&gt;] (macb_open) from [&lt;c0de6f2c&gt;] (__dev_open+0xd0/0x154)
[   78.681571] [&lt;c0de6f2c&gt;] (__dev_open) from [&lt;c0de72f4&gt;] (__dev_change_flags+0x16c/0x1c4)
[   78.682015] [&lt;c0de72f4&gt;] (__dev_change_flags) from [&lt;c0de7364&gt;] (dev_change_flags+0x18/0x48)
[   78.682493] [&lt;c0de7364&gt;] (dev_change_flags) from [&lt;c0e8bd48&gt;] (devinet_ioctl+0x5e4/0x75c)
[   78.682945] [&lt;c0e8bd48&gt;] (devinet_ioctl) from [&lt;c0e8ef58&gt;] (inet_ioctl+0x1f0/0x3b4)
[   78.683381] [&lt;c0e8ef58&gt;] (inet_ioctl) from [&lt;c0dc00e4&gt;] (sock_ioctl+0x39c/0x664)
[   78.683818] [&lt;c0dc00e4&gt;] (sock_ioctl) from [&lt;c04b21bc&gt;] (ksys_ioctl+0x2d8/0x9c0)
[   78.684343] [&lt;c04b21bc&gt;] (ksys_ioctl) from [&lt;c03000c0&gt;] (ret_fast_syscall+0x0/0x54)
[   78.684789] Exception stack(0xc89fffa8 to 0xc89ffff0)
[   78.685346] ffa0:                   000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.686106] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.686710] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c
[   78.687582] Code: 9a000003 e5983078 e3130001 1affffef (e7f001f2)
[   78.688788] ---[ end trace e3f2f6ab69754eae ]---

This is due to NAPI left enabled if macb_phylink_connect() fail.

Fixes: 7897b071ac3b ("net: macb: convert to phylink")
Signed-off-by: Corentin Labbe &lt;clabbe@baylibre.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 014406babc1f5f887a08737566b5b356c7018242 ]

When the PHY is not working, the macb driver crash on a second try to
setup it.
[   78.545994] macb e000b000.ethernet eth0: Could not attach PHY (-19)
ifconfig: SIOCSIFFLAGS: No such device
[   78.655457] ------------[ cut here ]------------
[   78.656014] kernel BUG at /linux-next/include/linux/netdevice.h:521!
[   78.656504] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   78.657079] Modules linked in:
[   78.657795] CPU: 0 PID: 122 Comm: ifconfig Not tainted 5.7.0-next-20200609 #1
[   78.658202] Hardware name: Xilinx Zynq Platform
[   78.659632] PC is at macb_open+0x220/0x294
[   78.660160] LR is at 0x0
[   78.660373] pc : [&lt;c0b0a634&gt;]    lr : [&lt;00000000&gt;]    psr: 60000013
[   78.660716] sp : c89ffd70  ip : c8a28800  fp : c199bac0
[   78.661040] r10: 00000000  r9 : c8838540  r8 : c8838568
[   78.661362] r7 : 00000001  r6 : c8838000  r5 : c883c000  r4 : 00000000
[   78.661724] r3 : 00000010  r2 : 00000000  r1 : 00000000  r0 : 00000000
[   78.662187] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   78.662635] Control: 10c5387d  Table: 08b64059  DAC: 00000051
[   78.663035] Process ifconfig (pid: 122, stack limit = 0x(ptrval))
[   78.663476] Stack: (0xc89ffd70 to 0xc8a00000)
[   78.664121] fd60:                                     00000000 c89fe000 c8838000 c89fe000
[   78.664866] fd80: 00000000 c11ff9ac c8838028 00000000 00000000 c0de6f2c 00000001 c1804eec
[   78.665579] fda0: c19b8178 c8838000 00000000 ca760866 c8838000 00000001 00001043 c89fe000
[   78.666355] fdc0: 00001002 c0de72f4 c89fe000 c0de8dc0 00008914 c89fe000 c199bac0 ca760866
[   78.667111] fde0: c89ffddc c8838000 00001002 00000000 c8838138 c881010c 00008914 c0de7364
[   78.667862] fe00: 00000000 c89ffe70 c89fe000 ffffffff c881010c c0e8bd48 00000003 00000000
[   78.668601] fe20: c8838000 c8810100 39c1118f 00039c11 c89a0960 00001043 00000000 000a26d0
[   78.669343] fe40: b6f43000 ca760866 c89a0960 00000051 befe6c50 00008914 c8b2a3c0 befe6c50
[   78.670086] fe60: 00000003 ee610500 00000000 c0e8ef58 30687465 00000000 00000000 00000000
[   78.670865] fe80: 00001043 00000000 000a26d0 b6f43000 c89a0600 ee40ae7c c8870d00 c0ddabf4
[   78.671593] fea0: c89ffeec c0ddabf4 c89ffeec c199bac0 00008913 c0ddac48 c89ffeec c89fe000
[   78.672324] fec0: befe6c50 ca760866 befe6c50 00008914 c89fe000 befe6c50 c8b2a3c0 c0dc00e4
[   78.673088] fee0: c89a0480 00000201 00000cc0 30687465 00000000 00000000 00000000 00001002
[   78.673822] ff00: 00000000 000a26d0 b6f43000 ca760866 00008914 c8b2a3c0 000a0ec4 c8b2a3c0
[   78.674576] ff20: befe6c50 c04b21bc 000d5004 00000817 c89a0480 c0315f94 00000000 00000003
[   78.675415] ff40: c19a2bc8 c8a3cc00 c89fe000 00000255 00000000 00000000 00000000 000d5000
[   78.676182] ff60: 000f6000 c180b2a0 00000817 c0315e64 000d5004 c89fffb0 b6ec0c30 ca760866
[   78.676928] ff80: 00000000 000b609b befe6c50 000a0ec4 00000036 c03002c4 c89fe000 00000036
[   78.677673] ffa0: 00000000 c03000c0 000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.678415] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.679154] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c 20000010 00000003 00000000 00000000
[   78.681059] [&lt;c0b0a634&gt;] (macb_open) from [&lt;c0de6f2c&gt;] (__dev_open+0xd0/0x154)
[   78.681571] [&lt;c0de6f2c&gt;] (__dev_open) from [&lt;c0de72f4&gt;] (__dev_change_flags+0x16c/0x1c4)
[   78.682015] [&lt;c0de72f4&gt;] (__dev_change_flags) from [&lt;c0de7364&gt;] (dev_change_flags+0x18/0x48)
[   78.682493] [&lt;c0de7364&gt;] (dev_change_flags) from [&lt;c0e8bd48&gt;] (devinet_ioctl+0x5e4/0x75c)
[   78.682945] [&lt;c0e8bd48&gt;] (devinet_ioctl) from [&lt;c0e8ef58&gt;] (inet_ioctl+0x1f0/0x3b4)
[   78.683381] [&lt;c0e8ef58&gt;] (inet_ioctl) from [&lt;c0dc00e4&gt;] (sock_ioctl+0x39c/0x664)
[   78.683818] [&lt;c0dc00e4&gt;] (sock_ioctl) from [&lt;c04b21bc&gt;] (ksys_ioctl+0x2d8/0x9c0)
[   78.684343] [&lt;c04b21bc&gt;] (ksys_ioctl) from [&lt;c03000c0&gt;] (ret_fast_syscall+0x0/0x54)
[   78.684789] Exception stack(0xc89fffa8 to 0xc89ffff0)
[   78.685346] ffa0:                   000b609b befe6c50 00000003 00008914 befe6c50 000b609b
[   78.686106] ffc0: 000b609b befe6c50 000a0ec4 00000036 befe6e0c befe6f1a 000d5150 00000000
[   78.686710] ffe0: 000d41e4 befe6bf4 00019648 b6e4509c
[   78.687582] Code: 9a000003 e5983078 e3130001 1affffef (e7f001f2)
[   78.688788] ---[ end trace e3f2f6ab69754eae ]---

This is due to NAPI left enabled if macb_phylink_connect() fail.

Fixes: 7897b071ac3b ("net: macb: convert to phylink")
Signed-off-by: Corentin Labbe &lt;clabbe@baylibre.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5e: Fix repeated XSK usage on one channel</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Maxim Mikityanskiy</name>
<email>maximmi@mellanox.com</email>
</author>
<published>2020-06-01T13:03:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e81a5958fab4e6bc1170d91699994849a1d3d46b'/>
<id>e81a5958fab4e6bc1170d91699994849a1d3d46b</id>
<content type='text'>
[ Upstream commit 36d45fb9d2fdf348d778bfe73f0427db1c6f9bc7 ]

After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.

This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 36d45fb9d2fdf348d778bfe73f0427db1c6f9bc7 ]

After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.

This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5: Fix fatal error handling during device load</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Shay Drory</name>
<email>shayd@mellanox.com</email>
</author>
<published>2020-05-07T06:32:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dd037573069839ff381313cbbc04912458834388'/>
<id>dd037573069839ff381313cbbc04912458834388</id>
<content type='text'>
[ Upstream commit b6e0b6bebe0732d5cac51f0791f269d2413b8980 ]

Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev-&gt;state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory &lt;shayd@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b6e0b6bebe0732d5cac51f0791f269d2413b8980 ]

Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev-&gt;state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory &lt;shayd@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5: drain health workqueue in case of driver load error</title>
<updated>2020-06-17T14:42:02+00:00</updated>
<author>
<name>Shay Drory</name>
<email>shayd@mellanox.com</email>
</author>
<published>2020-05-06T12:59:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0de7f1f2ddccc4f5fcfc16e73380adad88dbe292'/>
<id>0de7f1f2ddccc4f5fcfc16e73380adad88dbe292</id>
<content type='text'>
[ Upstream commit 42ea9f1b5c625fad225d4ac96a7e757dd4199d9c ]

In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev-&gt;iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 &lt;8b&gt; 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 &lt;8b&gt; 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc3da7 ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory &lt;shayd@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 42ea9f1b5c625fad225d4ac96a7e757dd4199d9c ]

In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev-&gt;iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 &lt;8b&gt; 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 &lt;8b&gt; 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc3da7 ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory &lt;shayd@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: mvneta: do not redirect frames during reconfiguration</title>
<updated>2020-06-17T14:42:01+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2020-06-08T22:02:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=810805615c9735750aac945e0b6327bcd27efa4d'/>
<id>810805615c9735750aac945e0b6327bcd27efa4d</id>
<content type='text'>
[ Upstream commit 62a502cc91f97e3ffd312d9b42e8d01a137c63ff ]

Disable frames injection in mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs

Fixes: b0a43db9087a ("net: mvneta: add XDP_TX support")
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 62a502cc91f97e3ffd312d9b42e8d01a137c63ff ]

Disable frames injection in mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs

Fixes: b0a43db9087a ("net: mvneta: add XDP_TX support")
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers/net/ibmvnic: Update VNIC protocol version reporting</title>
<updated>2020-06-17T14:41:50+00:00</updated>
<author>
<name>Thomas Falcon</name>
<email>tlfalcon@linux.ibm.com</email>
</author>
<published>2020-05-28T16:19:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5cbc404409d1bfe5da3da8582fdad34e93f4393d'/>
<id>5cbc404409d1bfe5da3da8582fdad34e93f4393d</id>
<content type='text'>
[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon &lt;tlfalcon@linux.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon &lt;tlfalcon@linux.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ena: xdp: update napi budget for DROP and ABORTED</title>
<updated>2020-06-17T14:41:48+00:00</updated>
<author>
<name>Sameeh Jubran</name>
<email>sameehj@amazon.com</email>
</author>
<published>2020-06-03T08:50:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=14a14e330bf36a9c8792ebfb79b0676409069e96'/>
<id>14a14e330bf36a9c8792ebfb79b0676409069e96</id>
<content type='text'>
[ Upstream commit 3921a81c31df6057183aeb7f7d204003bf699d6f ]

This patch fixes two issues with XDP:

1. If the XDP verdict is XDP_ABORTED we break the loop, which results in
   us handling one buffer per napi cycle instead of the total budget
   (usually 64). To overcome this simply change the xdp_verdict check to
   != XDP_PASS. When the verdict is XDP_PASS, the skb is not expected to
   be NULL.

2. Update the residual budget for XDP_DROP and XDP_ABORTED, since
   packets are handled in these cases.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran &lt;sameehj@amazon.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 3921a81c31df6057183aeb7f7d204003bf699d6f ]

This patch fixes two issues with XDP:

1. If the XDP verdict is XDP_ABORTED we break the loop, which results in
   us handling one buffer per napi cycle instead of the total budget
   (usually 64). To overcome this simply change the xdp_verdict check to
   != XDP_PASS. When the verdict is XDP_PASS, the skb is not expected to
   be NULL.

2. Update the residual budget for XDP_DROP and XDP_ABORTED, since
   packets are handled in these cases.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran &lt;sameehj@amazon.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ena: xdp: XDP_TX: fix memory leak</title>
<updated>2020-06-17T14:41:48+00:00</updated>
<author>
<name>Sameeh Jubran</name>
<email>sameehj@amazon.com</email>
</author>
<published>2020-06-03T08:50:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=caf2689675433028b65cc1abd00ce9b3c7d9d391'/>
<id>caf2689675433028b65cc1abd00ce9b3c7d9d391</id>
<content type='text'>
[ Upstream commit cd07ecccba13b8bd5023ffe7be57363d07e3105f ]

When sending very high packet rate, the XDP tx queues can get full and
start dropping packets. In this case we don't free the pages which
results in ena driver draining the system memory.

Fix:
Simply free the pages when necessary.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran &lt;sameehj@amazon.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cd07ecccba13b8bd5023ffe7be57363d07e3105f ]

When sending very high packet rate, the XDP tx queues can get full and
start dropping packets. In this case we don't free the pages which
results in ena driver draining the system memory.

Fix:
Simply free the pages when necessary.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Sameeh Jubran &lt;sameehj@amazon.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
