<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/sock.c, branch v6.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>sock: Fix misuse of sk_under_memory_pressure()</title>
<updated>2023-08-17T18:34:36+00:00</updated>
<author>
<name>Abel Wu</name>
<email>wuyun.abel@bytedance.com</email>
</author>
<published>2023-08-16T09:12:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2d0c88e84e483982067a82073f6125490ddf3614'/>
<id>2d0c88e84e483982067a82073f6125490ddf3614</id>
<content type='text'>
The status of global socket memory pressure is updated when:

  a) __sk_mem_raise_allocated():

	enter: sk_memory_allocated(sk) &gt;  sysctl_mem[1]
	leave: sk_memory_allocated(sk) &lt;= sysctl_mem[0]

  b) __sk_mem_reduce_allocated():

	leave: sk_under_memory_pressure(sk) &amp;&amp;
		sk_memory_allocated(sk) &lt; sysctl_mem[0]

So the conditions of leaving global pressure are inconstant, which
may lead to the situation that one pressured net-memcg prevents the
global pressure from being cleared when there is indeed no global
pressure, thus the global constrains are still in effect unexpectedly
on the other sockets.

This patch fixes this by ignoring the net-memcg's pressure when
deciding whether should leave global memory pressure.

Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu &lt;wuyun.abel@bytedance.com&gt;
Acked-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Link: https://lore.kernel.org/r/20230816091226.1542-1-wuyun.abel@bytedance.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The status of global socket memory pressure is updated when:

  a) __sk_mem_raise_allocated():

	enter: sk_memory_allocated(sk) &gt;  sysctl_mem[1]
	leave: sk_memory_allocated(sk) &lt;= sysctl_mem[0]

  b) __sk_mem_reduce_allocated():

	leave: sk_under_memory_pressure(sk) &amp;&amp;
		sk_memory_allocated(sk) &lt; sysctl_mem[0]

So the conditions of leaving global pressure are inconstant, which
may lead to the situation that one pressured net-memcg prevents the
global pressure from being cleared when there is indeed no global
pressure, thus the global constrains are still in effect unexpectedly
on the other sockets.

This patch fixes this by ignoring the net-memcg's pressure when
deciding whether should leave global memory pressure.

Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu &lt;wuyun.abel@bytedance.com&gt;
Acked-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Link: https://lore.kernel.org/r/20230816091226.1542-1-wuyun.abel@bytedance.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/unix: use consistent error code in SO_PEERPIDFD</title>
<updated>2023-08-08T22:56:48+00:00</updated>
<author>
<name>David Rheinsberg</name>
<email>david@readahead.eu</email>
</author>
<published>2023-08-07T08:12:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b6f79e826fbd26e91e2fb28070484634cacdeb26'/>
<id>b6f79e826fbd26e91e2fb28070484634cacdeb26</id>
<content type='text'>
Change the new (unreleased) SO_PEERPIDFD sockopt to return ENODATA
rather than ESRCH if a socket type does not support remote peer-PID
queries.

Currently, SO_PEERPIDFD returns ESRCH when the socket in question is
not an AF_UNIX socket. This is quite unexpected, given that one would
assume ESRCH means the peer process already exited and thus cannot be
found. However, in that case the sockopt actually returns EINVAL (via
pidfd_prepare()). This is rather inconsistent with other syscalls, which
usually return ESRCH if a given PID refers to a non-existant process.

This changes SO_PEERPIDFD to return ENODATA instead. This is also what
SO_PEERGROUPS returns, and thus keeps a consistent behavior across
sockopts.

Note that this code is returned in 2 cases: First, if the socket type is
not AF_UNIX, and secondly if the socket was not yet connected. In both
cases ENODATA seems suitable.

Signed-off-by: David Rheinsberg &lt;david@readahead.eu&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Acked-by: Luca Boccassi &lt;bluca@debian.org&gt;
Fixes: 7b26952a91cf ("net: core: add getsockopt SO_PEERPIDFD")
Link: https://lore.kernel.org/r/20230807081225.816199-1-david@readahead.eu
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change the new (unreleased) SO_PEERPIDFD sockopt to return ENODATA
rather than ESRCH if a socket type does not support remote peer-PID
queries.

Currently, SO_PEERPIDFD returns ESRCH when the socket in question is
not an AF_UNIX socket. This is quite unexpected, given that one would
assume ESRCH means the peer process already exited and thus cannot be
found. However, in that case the sockopt actually returns EINVAL (via
pidfd_prepare()). This is rather inconsistent with other syscalls, which
usually return ESRCH if a given PID refers to a non-existant process.

This changes SO_PEERPIDFD to return ENODATA instead. This is also what
SO_PEERGROUPS returns, and thus keeps a consistent behavior across
sockopts.

Note that this code is returned in 2 cases: First, if the socket type is
not AF_UNIX, and secondly if the socket was not yet connected. In both
cases ENODATA seems suitable.

Signed-off-by: David Rheinsberg &lt;david@readahead.eu&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Acked-by: Luca Boccassi &lt;bluca@debian.org&gt;
Fixes: 7b26952a91cf ("net: core: add getsockopt SO_PEERPIDFD")
Link: https://lore.kernel.org/r/20230807081225.816199-1-david@readahead.eu
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: annotate data-races around sk-&gt;sk_priority</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8bf43be799d4b242ea552a14db10456446be843e'/>
<id>8bf43be799d4b242ea552a14db10456446be843e</id>
<content type='text'>
sk_getsockopt() runs locklessly. This means sk-&gt;sk_priority
can be read while other threads are changing its value.

Other reads also happen without socket lock being held.

Add missing annotations where needed.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sk_getsockopt() runs locklessly. This means sk-&gt;sk_priority
can be read while other threads are changing its value.

Other reads also happen without socket lock being held.

Add missing annotations where needed.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add missing data-race annotation for sk_ll_usec</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e5f0d2dd3c2faa671711dac6d3ff3cef307bcfe3'/>
<id>e5f0d2dd3c2faa671711dac6d3ff3cef307bcfe3</id>
<content type='text'>
In a prior commit I forgot that sk_getsockopt() reads
sk-&gt;sk_ll_usec without holding a lock.

Fixes: 0dbffbb5335a ("net: annotate data race around sk_ll_usec")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a prior commit I forgot that sk_getsockopt() reads
sk-&gt;sk_ll_usec without holding a lock.

Fixes: 0dbffbb5335a ("net: annotate data race around sk_ll_usec")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add missing data-race annotations around sk-&gt;sk_peek_off</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=11695c6e966b0ec7ed1d16777d294cef865a5c91'/>
<id>11695c6e966b0ec7ed1d16777d294cef865a5c91</id>
<content type='text'>
sk_getsockopt() runs locklessly, thus we need to annotate the read
of sk-&gt;sk_peek_off.

While we are at it, add corresponding annotations to sk_set_peek_off()
and unix_set_peek_off().

Fixes: b9bb53f3836f ("sock: convert sk_peek_offset functions to WRITE_ONCE")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sk_getsockopt() runs locklessly, thus we need to annotate the read
of sk-&gt;sk_peek_off.

While we are at it, add corresponding annotations to sk_set_peek_off()
and unix_set_peek_off().

Fixes: b9bb53f3836f ("sock: convert sk_peek_offset functions to WRITE_ONCE")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: annotate data-races around sk-&gt;sk_mark</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c5b4d69c358a9275a8de98f87caf6eda644b086'/>
<id>3c5b4d69c358a9275a8de98f87caf6eda644b086</id>
<content type='text'>
sk-&gt;sk_mark is often read while another thread could change the value.

Fixes: 4a19ec5800fc ("[NET]: Introducing socket mark socket option.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sk-&gt;sk_mark is often read while another thread could change the value.

Fixes: 4a19ec5800fc ("[NET]: Introducing socket mark socket option.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add missing READ_ONCE(sk-&gt;sk_rcvbuf) annotation</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b4b553253091cafe9ec38994acf42795e073bef5'/>
<id>b4b553253091cafe9ec38994acf42795e073bef5</id>
<content type='text'>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_rcvbuf locklessly.

Fixes: ebb3b78db7bf ("tcp: annotate sk-&gt;sk_rcvbuf lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_rcvbuf locklessly.

Fixes: ebb3b78db7bf ("tcp: annotate sk-&gt;sk_rcvbuf lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add missing READ_ONCE(sk-&gt;sk_sndbuf) annotation</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=74bc084327c643499474ba75df485607da37dd6e'/>
<id>74bc084327c643499474ba75df485607da37dd6e</id>
<content type='text'>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_sndbuf locklessly.

Fixes: e292f05e0df7 ("tcp: annotate sk-&gt;sk_sndbuf lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_sndbuf locklessly.

Fixes: e292f05e0df7 ("tcp: annotate sk-&gt;sk_sndbuf lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: annotate data-races around sk-&gt;sk_{rcv|snd}timeo</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=285975dd674258ccb33e77a1803e8f2015e67105'/>
<id>285975dd674258ccb33e77a1803e8f2015e67105</id>
<content type='text'>
sk_getsockopt() runs without locks, we must add annotations
to sk-&gt;sk_rcvtimeo and sk-&gt;sk_sndtimeo.

In the future we might allow fetching these fields before
we lock the socket in TCP fast path.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sk_getsockopt() runs without locks, we must add annotations
to sk-&gt;sk_rcvtimeo and sk-&gt;sk_sndtimeo.

In the future we might allow fetching these fields before
we lock the socket in TCP fast path.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add missing READ_ONCE(sk-&gt;sk_rcvlowat) annotation</title>
<updated>2023-07-29T17:13:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-07-28T15:03:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6d12bdb435d23ff6c1890c852d85408a2f496ee'/>
<id>e6d12bdb435d23ff6c1890c852d85408a2f496ee</id>
<content type='text'>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_rcvlowat locklessly.

Fixes: eac66402d1c3 ("net: annotate sk-&gt;sk_rcvlowat lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a prior commit, I forgot to change sk_getsockopt()
when reading sk-&gt;sk_rcvlowat locklessly.

Fixes: eac66402d1c3 ("net: annotate sk-&gt;sk_rcvlowat lockless reads")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
