<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/core/net-sysfs.c, branch linux-5.14.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net-sysfs: try not to restart the syscall if it will fail eventually</title>
<updated>2021-11-17T10:03:49+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-10-07T14:00:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=84aa6d0e076764c475b7f1e7e3ddc39898e470b2'/>
<id>84aa6d0e076764c475b7f1e7e3ddc39898e470b2</id>
<content type='text'>
[ Upstream commit 146e5e733310379f51924111068f08a3af0db830 ]

Due to deadlocks in the networking subsystem spotted 12 years ago[1],
a workaround was put in place[2]: instead of blocking on the rtnl lock
when it is not available, the syscall is restarted (back to VFS, letting
userspace spin). The following construction is found a lot in the net
sysfs and sysctl code:

  if (!rtnl_trylock())
          return restart_syscall();

This can be problematic when multiple userspace threads use such
interfaces in a short period, making them spin a lot. This happens
for example when adding and moving virtual interfaces: userspace
programs listening on events, such as systemd-udevd and NetworkManager,
trigger actions that read files in sysfs. It gets worse when a lot of
virtual interfaces are created concurrently, say when creating
containers at boot time.

Returning early without hitting the above pattern when the syscall will
fail eventually does make things better. While it is not a fix for the
issue, it does ease things.

[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/
    https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
    and https://lore.kernel.org/netdev/20090226084924.16cb3e08@nehalam/
[2] Rightfully, those deadlocks are *hard* to solve.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Reviewed-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 146e5e733310379f51924111068f08a3af0db830 ]

Due to deadlocks in the networking subsystem spotted 12 years ago[1],
a workaround was put in place[2]: instead of blocking on the rtnl lock
when it is not available, the syscall is restarted (back to VFS, letting
userspace spin). The following construction is found a lot in the net
sysfs and sysctl code:

  if (!rtnl_trylock())
          return restart_syscall();

This can be problematic when multiple userspace threads use such
interfaces in a short period, making them spin a lot. This happens
for example when adding and moving virtual interfaces: userspace
programs listening on events, such as systemd-udevd and NetworkManager,
trigger actions that read files in sysfs. It gets worse when a lot of
virtual interfaces are created concurrently, say when creating
containers at boot time.

Returning early without hitting the above pattern when the syscall will
fail eventually does make things better. While it is not a fix for the
issue, it does ease things.

[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/
    https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
    and https://lore.kernel.org/netdev/20090226084924.16cb3e08@nehalam/
[2] Rightfully, those deadlocks are *hard* to solve.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Reviewed-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-sysfs: initialize uid and gid before calling net_ns_get_ownership</title>
<updated>2021-11-02T18:50:57+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2021-10-25T06:31:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=da279dac227a034595c99bb4294f951a08359640'/>
<id>da279dac227a034595c99bb4294f951a08359640</id>
<content type='text'>
commit f7a1e76d0f608961cc2fc681f867a834f2746bce upstream.

Currently, net_ns_get_ownership() may not set uid or gid if make_kuid
or make_kgid returns an invalid value, which can trigger an
uninit-value issue.

Fix this by initializing the uid and gid before calling
net_ns_get_ownership(), as is done in kobject_get_ownership().

Fixes: e6dee9f3893c ("net-sysfs: add netdev_change_owner()")
Reported-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f7a1e76d0f608961cc2fc681f867a834f2746bce upstream.

Currently, net_ns_get_ownership() may not set uid or gid if make_kuid
or make_kgid returns an invalid value, which can trigger an
uninit-value issue.

Fix this by initializing the uid and gid before calling
net_ns_get_ownership(), as is done in kobject_get_ownership().

Fixes: e6dee9f3893c ("net-sysfs: add netdev_change_owner()")
Reported-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-sysfs: remove possible sleep from an RCU read-side critical section</title>
<updated>2021-03-22T20:28:13+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-22T15:43:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7f08ec6e04269ce53b664761c9108b44ed2f54ab'/>
<id>7f08ec6e04269ce53b664761c9108b44ed2f54ab</id>
<content type='text'>
xps_queue_show is mostly made of an RCU read-side critical section and
calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not
allowed, as this call may sleep and sleeping is not permitted in RCU
read-side critical sections. Fix this by using GFP_NOWAIT instead.

Fixes: 5478fcd0f483 ("net: embed nr_ids in the xps maps")
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Suggested-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
xps_queue_show is mostly made of an RCU read-side critical section and
calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not
allowed, as this call may sleep and sleeping is not permitted in RCU
read-side critical sections. Fix this by using GFP_NOWAIT instead.

Fixes: 5478fcd0f483 ("net: embed nr_ids in the xps maps")
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Suggested-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-sysfs: move the xps cpus/rxqs retrieval in a common function</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2db6cdaebac83c13acb165594b09282fa03cec89'/>
<id>2db6cdaebac83c13acb165594b09282fa03cec89</id>
<content type='text'>
Most of the xps_cpus_show and xps_rxqs_show functions share the same
logic. Having it in two different functions does not help maintenance.
This patch moves their common logic into a new function, xps_queue_show,
to improve this.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Most of the xps_cpus_show and xps_rxqs_show functions share the same
logic. Having it in two different functions does not help maintenance.
This patch moves their common logic into a new function, xps_queue_show,
to improve this.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-sysfs: move the rtnl unlock up in the xps show helpers</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d7be87a687cc261d663dcf97c01056f71398f9f9'/>
<id>d7be87a687cc261d663dcf97c01056f71398f9f9</id>
<content type='text'>
Now that nr_ids and num_tc are stored in the xps dev_maps, which are RCU
protected, we no longer need to protect the maps with the rtnl lock.
Move the rtnl unlock up so we reduce the rtnl locking section.

We also increase the reference count on the subordinate device if any,
as we don't want this device to be freed while we use it (now that the
rtnl lock isn't protecting it in the whole function).

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that nr_ids and num_tc are stored in the xps dev_maps, which are RCU
protected, we no longer need to protect the maps with the rtnl lock.
Move the rtnl unlock up so we reduce the rtnl locking section.

We also increase the reference count on the subordinate device if any,
as we don't want this device to be freed while we use it (now that the
rtnl lock isn't protecting it in the whole function).

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: move the xps maps to an array</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=044ab86d431b59b88966457dbb62679f274ec442'/>
<id>044ab86d431b59b88966457dbb62679f274ec442</id>
<content type='text'>
Move the xps maps (xps_cpus_map and xps_rxqs_map) to an array in
net_device. That will simplify the code a lot, removing the need for
lots of if/else conditionals as the correct map will be available using
its offset in the array.

This should not modify the xps maps behaviour in any way.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the xps maps (xps_cpus_map and xps_rxqs_map) to an array in
net_device. That will simplify the code a lot, removing the need for
lots of if/else conditionals as the correct map will be available using
its offset in the array.

This should not modify the xps maps behaviour in any way.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove the xps possible_mask</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6f36158e058409ec5ceb4290541e77ae2648fc86'/>
<id>6f36158e058409ec5ceb4290541e77ae2648fc86</id>
<content type='text'>
Remove the xps possible_mask. It was an optimization but we can just
loop from 0 to nr_ids now that it is embedded in the xps dev_maps. That
simplifies the code a bit.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the xps possible_mask. It was an optimization but we can just
loop from 0 to nr_ids now that it is embedded in the xps dev_maps. That
simplifies the code a bit.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: embed nr_ids in the xps maps</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5478fcd0f48322e04ae6c173ad3a1959e066dc83'/>
<id>5478fcd0f48322e04ae6c173ad3a1959e066dc83</id>
<content type='text'>
Embed nr_ids (the number of CPUs for the xps cpus map, and the number of
rxqs for the xps rxqs map) in dev_maps. That will help avoid
out-of-bounds memory accesses if those values change after dev_maps was
allocated.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Embed nr_ids (the number of CPUs for the xps cpus map, and the number of
rxqs for the xps rxqs map) in dev_maps. That will help avoid
out-of-bounds memory accesses if those values change after dev_maps was
allocated.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: embed num_tc in the xps maps</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=255c04a87f4381849fce9ed81e5efabf78a71a30'/>
<id>255c04a87f4381849fce9ed81e5efabf78a71a30</id>
<content type='text'>
The xps cpus/rxqs map is accessed using dev-&gt;num_tc, which is used when
allocating the map. But later updates of dev-&gt;num_tc can lead to having
a mismatch between the maps and how they're accessed. In such cases the
map values do not make any sense and out-of-bounds accesses can occur
(that can be easily seen using KASAN).

This patch aims at fixing this by embedding num_tc into the maps, using
the value at the time the map is created. This brings two improvements:
- The maps can be accessed using the embedded num_tc, so we know for
  sure we won't have out-of-bounds accesses.
- Checks can be made before accessing the maps so we know the values
  retrieved will make sense.

We also update __netif_set_xps_queue to copy old maps from dev_maps
into the new one only if the number of traffic classes of both maps
matches.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The xps cpus/rxqs map is accessed using dev-&gt;num_tc, which is used when
allocating the map. But later updates of dev-&gt;num_tc can lead to having
a mismatch between the maps and how they're accessed. In such cases the
map values do not make any sense and out-of-bounds accesses can occur
(that can be easily seen using KASAN).

This patch aims at fixing this by embedding num_tc into the maps, using
the value at the time the map is created. This brings two improvements:
- The maps can be accessed using the embedded num_tc, so we know for
  sure we won't have out-of-bounds accesses.
- Checks can be made before accessing the maps so we know the values
  retrieved will make sense.

We also update __netif_set_xps_queue to copy old maps from dev_maps
into the new one only if the number of traffic classes of both maps
matches.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-sysfs: make xps_cpus_show and xps_rxqs_show consistent</title>
<updated>2021-03-18T21:56:22+00:00</updated>
<author>
<name>Antoine Tenart</name>
<email>atenart@kernel.org</email>
</author>
<published>2021-03-18T18:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=73f5e52b15e3aa4ef641264228cd9069b1948149'/>
<id>73f5e52b15e3aa4ef641264228cd9069b1948149</id>
<content type='text'>
Make the implementations of xps_cpus_show and xps_rxqs_show converge,
as the two share the same logic but diverged over time. This should not
modify their behaviour but will help future changes and improve
maintenance.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make the implementations of xps_cpus_show and xps_rxqs_show converge,
as the two share the same logic but diverged over time. This should not
modify their behaviour but will help future changes and improve
maintenance.

Signed-off-by: Antoine Tenart &lt;atenart@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
