<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/skbuff.c, branch v6.9</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net: core: reject skb_copy(_expand) for fraglist GSO skbs</title>
<updated>2024-05-01T10:44:10+00:00</updated>
<author>
<name>Felix Fietkau</name>
<email>nbd@nbd.name</email>
</author>
<published>2024-04-27T18:24:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d091e579b864fa790dd6a0cd537a22c383126681'/>
<id>d091e579b864fa790dd6a0cd537a22c383126681</id>
<content type='text'>
SKB_GSO_FRAGLIST skbs must not be linearized, otherwise they become
invalid. Return NULL if such an skb is passed to skb_copy or
skb_copy_expand, in order to prevent a crash on a potential later
call to skb_gso_segment.

Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
Signed-off-by: Felix Fietkau &lt;nbd@nbd.name&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SKB_GSO_FRAGLIST skbs must not be linearized, otherwise they become
invalid. Return NULL if such an skb is passed to skb_copy or
skb_copy_expand, in order to prevent a crash on a potential later
call to skb_gso_segment.

Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
Signed-off-by: Felix Fietkau &lt;nbd@nbd.name&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add skb_data_unref() helper</title>
<updated>2024-03-08T19:38:45+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2024-03-07T12:34:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1cface552a5b5f6e53a855de1a503ff958e2e253'/>
<id>1cface552a5b5f6e53a855de1a503ff958e2e253</id>
<content type='text'>
Similar to skb_unref(), add skb_data_unref() to save an expensive
atomic operation (and cache line dirtying) when last reference
on shinfo-&gt;dataref is released.

I saw this opportunity on hosts with RAW sockets accidentally
bound to UDP protocol, forcing an skb_clone() on all received packets.

These RAW sockets had their receive queue full, so all clone
packets were immediately dropped.

When UDP recvmsg() consumes later the original skb, skb_release_data()
is hitting atomic_sub_return() quite badly, because skb-&gt;clone
has been set permanently.

Note that this patch helps TCP TX performance, because
TCP stack also use (fast) clones.

This means that at least one of the two packets (the main skb or
its clone) will no longer have to perform this atomic operation
in skb_release_data().

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20240307123446.2302230-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to skb_unref(), add skb_data_unref() to save an expensive
atomic operation (and cache line dirtying) when last reference
on shinfo-&gt;dataref is released.

I saw this opportunity on hosts with RAW sockets accidentally
bound to UDP protocol, forcing an skb_clone() on all received packets.

These RAW sockets had their receive queue full, so all clone
packets were immediately dropped.

When UDP recvmsg() consumes later the original skb, skb_release_data()
is hitting atomic_sub_return() quite badly, because skb-&gt;clone
has been set permanently.

Note that this patch helps TCP TX performance, because
TCP stack also use (fast) clones.

This means that at least one of the two packets (the main skb or
its clone) will no longer have to perform this atomic operation
in skb_release_data().

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20240307123446.2302230-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: move skbuff_cache(s) to net_hotdata</title>
<updated>2024-03-08T05:12:42+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2024-03-06T16:00:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aa70d2d16f280efe8aa52afc25a33b2ec8d346b6'/>
<id>aa70d2d16f280efe8aa52afc25a33b2ec8d346b6</id>
<content type='text'>
skbuff_cache, skbuff_fclone_cache and skb_small_head_cache
are used in rx/tx fast paths.

Move them to net_hotdata for better cache locality.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20240306160031.874438-11-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
skbuff_cache, skbuff_fclone_cache and skb_small_head_cache
are used in rx/tx fast paths.

Move them to net_hotdata for better cache locality.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20240306160031.874438-11-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_alloc: modify page_frag_alloc_align() to accept align as an argument</title>
<updated>2024-03-05T10:38:14+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2024-02-28T09:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=411c5f36805c02c7c412f1ad6bfa4459a1148011'/>
<id>411c5f36805c02c7c412f1ad6bfa4459a1148011</id>
<content type='text'>
napi_alloc_frag_align() and netdev_alloc_frag_align() accept
align as an argument, and they are thin wrappers around the
__napi_alloc_frag_align() and __netdev_alloc_frag_align() APIs
doing the alignment checking and align mask conversion, in order
to call page_frag_alloc_align() directly. The intention here is
to keep the alignment checking and the alignmask conversion in
in-line wrapper to avoid those kind of operations during execution
time since it can usually be handled during compile time.

We are going to use page_frag_alloc_align() in vhost_net.c, it
need the same kind of alignment checking and alignmask conversion,
so split up page_frag_alloc_align into an inline wrapper doing the
above operation, and add __page_frag_alloc_align() which is passed
with the align mask the original function expected as suggested by
Alexander.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
CC: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
napi_alloc_frag_align() and netdev_alloc_frag_align() accept
align as an argument, and they are thin wrappers around the
__napi_alloc_frag_align() and __netdev_alloc_frag_align() APIs
doing the alignment checking and align mask conversion, in order
to call page_frag_alloc_align() directly. The intention here is
to keep the alignment checking and the alignmask conversion in
in-line wrapper to avoid those kind of operations during execution
time since it can usually be handled during compile time.

We are going to use page_frag_alloc_align() in vhost_net.c, it
need the same kind of alignment checking and alignmask conversion,
so split up page_frag_alloc_align into an inline wrapper doing the
above operation, and add __page_frag_alloc_align() which is passed
with the align mask the original function expected as suggested by
Alexander.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
CC: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: mctp: copy skb ext data when fragmenting</title>
<updated>2024-02-22T12:32:55+00:00</updated>
<author>
<name>Jeremy Kerr</name>
<email>jk@codeconstruct.com.au</email>
</author>
<published>2024-02-19T09:51:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1394c1dec1c619a46867ed32791a29695372bff8'/>
<id>1394c1dec1c619a46867ed32791a29695372bff8</id>
<content type='text'>
If we're fragmenting on local output, the original packet may contain
ext data for the MCTP flows. We'll want this in the resulting fragment
skbs too.

So, do a skb_ext_copy() in the fragmentation path, and implement the
MCTP-specific parts of an ext copy operation.

Fixes: 67737c457281 ("mctp: Pass flow data &amp; flow release events to drivers")
Reported-by: Jian Zhang &lt;zhangjian.3032@bytedance.com&gt;
Signed-off-by: Jeremy Kerr &lt;jk@codeconstruct.com.au&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we're fragmenting on local output, the original packet may contain
ext data for the MCTP flows. We'll want this in the resulting fragment
skbs too.

So, do a skb_ext_copy() in the fragmentation path, and implement the
MCTP-specific parts of an ext copy operation.

Fixes: 67737c457281 ("mctp: Pass flow data &amp; flow release events to drivers")
Reported-by: Jian Zhang &lt;zhangjian.3032@bytedance.com&gt;
Signed-off-by: Jeremy Kerr &lt;jk@codeconstruct.com.au&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix pointer check in skb_pp_cow_data routine</title>
<updated>2024-02-21T02:26:25+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2024-02-17T11:12:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6a28acb1a27eb42970b959ff7af3a8a077f8cce'/>
<id>c6a28acb1a27eb42970b959ff7af3a8a077f8cce</id>
<content type='text'>
Properly check page pointer returned by page_pool_dev_alloc routine in
skb_pp_cow_data() for non-linear part of the original skb.

Reported-by: Julian Wiedmann &lt;jwiedmann.dev@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/cover.1707729884.git.lorenzo@kernel.org/T/#m7d189b0015a7281ed9221903902490c03ed19a7a
Fixes: e6d5dbdd20aa ("xdp: add multi-buff support for xdp running in generic mode")
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Link: https://lore.kernel.org/r/25512af3e09befa9dcb2cf3632bdc45b807cf330.1708167716.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Properly check page pointer returned by page_pool_dev_alloc routine in
skb_pp_cow_data() for non-linear part of the original skb.

Reported-by: Julian Wiedmann &lt;jwiedmann.dev@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/cover.1707729884.git.lorenzo@kernel.org/T/#m7d189b0015a7281ed9221903902490c03ed19a7a
Fixes: e6d5dbdd20aa ("xdp: add multi-buff support for xdp running in generic mode")
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Link: https://lore.kernel.org/r/25512af3e09befa9dcb2cf3632bdc45b807cf330.1708167716.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add netmem to skb_frag_t</title>
<updated>2024-02-20T08:22:58+00:00</updated>
<author>
<name>Mina Almasry</name>
<email>almasrymina@google.com</email>
</author>
<published>2024-02-14T22:34:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=21d2e6737c9789aa9b23c8a4131cbca8260139fd'/>
<id>21d2e6737c9789aa9b23c8a4131cbca8260139fd</id>
<content type='text'>
Use struct netmem* instead of page in skb_frag_t. Currently struct
netmem* is always a struct page underneath, but the abstraction
allows efforts to add support for skb frags not backed by pages.

There is unfortunately 1 instance where the skb_frag_t is assumed to be
a exactly a bio_vec in kcm. For this case, WARN_ON_ONCE and return error
before doing a cast.

Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
that the API can be used to create netmem skbs.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use struct netmem* instead of page in skb_frag_t. Currently struct
netmem* is always a struct page underneath, but the abstraction
allows efforts to add support for skb frags not backed by pages.

There is unfortunately 1 instance where the skb_frag_t is assumed to be
a exactly a bio_vec in kcm. For this case, WARN_ON_ONCE and return error
before doing a cast.

Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
that the API can be used to create netmem skbs.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>page_pool: disable direct recycling based on pool-&gt;cpuid on destroy</title>
<updated>2024-02-19T19:48:00+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>aleksander.lobakin@intel.com</email>
</author>
<published>2024-02-15T11:39:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56ef27e3abe6d6453b1f4f6127041f3a65d7cbc9'/>
<id>56ef27e3abe6d6453b1f4f6127041f3a65d7cbc9</id>
<content type='text'>
Now that direct recycling is performed basing on pool-&gt;cpuid when set,
memory leaks are possible:

1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool-&gt;cpuid is still set.
4. napi_pp_put_page() does direct recycling basing on pool-&gt;cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.

In order to avoid that, rewrite pool-&gt;cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool-&gt;cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -&gt; page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.

Signed-off-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/r/20240215113905.96817-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that direct recycling is performed basing on pool-&gt;cpuid when set,
memory leaks are possible:

1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool-&gt;cpuid is still set.
4. napi_pp_put_page() does direct recycling basing on pool-&gt;cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.

In order to avoid that, rewrite pool-&gt;cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool-&gt;cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -&gt; page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.

Signed-off-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/r/20240215113905.96817-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>veth: rely on skb_pp_cow_data utility routine</title>
<updated>2024-02-14T03:22:30+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2024-02-12T09:50:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=27accb3cc08a0ec4e348356774042d5fa5f30cce'/>
<id>27accb3cc08a0ec4e348356774042d5fa5f30cce</id>
<content type='text'>
Rely on skb_pp_cow_data utility routine and remove duplicated code.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/029cc14cce41cb242ee7efdcf32acc81f1ce4e9f.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rely on skb_pp_cow_data utility routine and remove duplicated code.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/029cc14cce41cb242ee7efdcf32acc81f1ce4e9f.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xdp: add multi-buff support for xdp running in generic mode</title>
<updated>2024-02-14T03:22:30+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2024-02-12T09:50:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6d5dbdd20aa6a86974af51deb9414cd2e7794cb'/>
<id>e6d5dbdd20aa6a86974af51deb9414cd2e7794cb</id>
<content type='text'>
Similar to native xdp, do not always linearize the skb in
netif_receive_generic_xdp routine but create a non-linear xdp_buff to be
processed by the eBPF program. This allow to add multi-buffer support
for xdp running in generic mode.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/1044d6412b1c3e95b40d34993fd5f37cd2f319fd.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to native xdp, do not always linearize the skb in
netif_receive_generic_xdp routine but create a non-linear xdp_buff to be
processed by the eBPF program. This allow to add multi-buffer support
for xdp running in generic mode.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/1044d6412b1c3e95b40d34993fd5f37cd2f319fd.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
