<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/page_pool.c, branch v6.9</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net: page_pool: factor out page_pool recycle check</title>
<updated>2024-03-11T20:01:15+00:00</updated>
<author>
<name>Mina Almasry</name>
<email>almasrymina@google.com</email>
</author>
<published>2024-03-08T20:44:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=46f40172b68154106cae660c90c7801b61080892'/>
<id>46f40172b68154106cae660c90c7801b61080892</id>
<content type='text'>
The check is duplicated in 2 places, factor it out into a common helper.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Link: https://lore.kernel.org/r/20240308204500.1112858-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The check is duplicated in 2 places, factor it out into a common helper.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Link: https://lore.kernel.org/r/20240308204500.1112858-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: page_pool: fix recycle stats for system page_pool allocator</title>
<updated>2024-02-19T20:30:27+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2024-02-16T09:25:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f853fa5c54e7a0364a52125074dedeaf2c7ddace'/>
<id>f853fa5c54e7a0364a52125074dedeaf2c7ddace</id>
<content type='text'>
Use global percpu page_pool_recycle_stats counter for system page_pool
allocator instead of allocating a separate percpu variable for each
(also percpu) page pool instance.

Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Reviewed-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Link: https://lore.kernel.org/r/87f572425e98faea3da45f76c3c68815c01a20ee.1708075412.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use global percpu page_pool_recycle_stats counter for system page_pool
allocator instead of allocating a separate percpu variable for each
(also percpu) page pool instance.

Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Reviewed-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Link: https://lore.kernel.org/r/87f572425e98faea3da45f76c3c68815c01a20ee.1708075412.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>page_pool: disable direct recycling based on pool-&gt;cpuid on destroy</title>
<updated>2024-02-19T19:48:00+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>aleksander.lobakin@intel.com</email>
</author>
<published>2024-02-15T11:39:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56ef27e3abe6d6453b1f4f6127041f3a65d7cbc9'/>
<id>56ef27e3abe6d6453b1f4f6127041f3a65d7cbc9</id>
<content type='text'>
Now that direct recycling is performed basing on pool-&gt;cpuid when set,
memory leaks are possible:

1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool-&gt;cpuid is still set.
4. napi_pp_put_page() does direct recycling basing on pool-&gt;cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.

In order to avoid that, rewrite pool-&gt;cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool-&gt;cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -&gt; page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.

Signed-off-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/r/20240215113905.96817-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that direct recycling is performed basing on pool-&gt;cpuid when set,
memory leaks are possible:

1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool-&gt;cpuid is still set.
4. napi_pp_put_page() does direct recycling basing on pool-&gt;cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.

In order to avoid that, rewrite pool-&gt;cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool-&gt;cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -&gt; page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.

Signed-off-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/r/20240215113905.96817-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add generic percpu page_pool allocator</title>
<updated>2024-02-14T03:22:30+00:00</updated>
<author>
<name>Lorenzo Bianconi</name>
<email>lorenzo@kernel.org</email>
</author>
<published>2024-02-12T09:50:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b0cfa6e49566c8fa6759734cf821aa6e8271a9e'/>
<id>2b0cfa6e49566c8fa6759734cf821aa6e8271a9e</id>
<content type='text'>
Introduce generic percpu page_pools allocator.
Moreover add page_pool_create_percpu() and cpuid filed in page_pool struct
in order to recycle the page in the page_pool "hot" cache if
napi_pp_put_page() is running on the same cpu.
This is a preliminary patch to add xdp multi-buff support for xdp running
in generic mode.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/80bc4285228b6f4220cd03de1999d86e46e3fcbd.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce generic percpu page_pools allocator.
Moreover add page_pool_create_percpu() and cpuid filed in page_pool struct
in order to recycle the page in the page_pool "hot" cache if
napi_pp_put_page() is running on the same cpu.
This is a preliminary patch to add xdp multi-buff support for xdp running
in generic mode.

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Reviewed-by: Toke Hoiland-Jorgensen &lt;toke@redhat.com&gt;
Signed-off-by: Lorenzo Bianconi &lt;lorenzo@kernel.org&gt;
Link: https://lore.kernel.org/r/80bc4285228b6f4220cd03de1999d86e46e3fcbd.1707729884.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>page_pool: halve BIAS_MAX for multiple user references of a fragment</title>
<updated>2023-12-17T10:56:33+00:00</updated>
<author>
<name>Liang Chen</name>
<email>liangchen.linux@gmail.com</email>
</author>
<published>2023-12-15T03:30:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aaf153aecef1d1831d0d6d371d5c11cf02f0337e'/>
<id>aaf153aecef1d1831d0d6d371d5c11cf02f0337e</id>
<content type='text'>
Up to now, we were only subtracting from the number of used page fragments
to figure out when a page could be freed or recycled. A following patch
introduces support for multiple users referencing the same fragment. So
reduce the initial page fragments value to half to avoid overflowing.

Signed-off-by: Liang Chen &lt;liangchen.linux@gmail.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Mina Almasry &lt;almarsymina@google.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Up to now, we were only subtracting from the number of used page fragments
to figure out when a page could be freed or recycled. A following patch
introduces support for multiple users referencing the same fragment. So
reduce the initial page fragments value to half to avoid overflowing.

Signed-off-by: Liang Chen &lt;liangchen.linux@gmail.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Mina Almasry &lt;almarsymina@google.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: page_pool: factor out releasing DMA from releasing the page</title>
<updated>2023-12-14T05:59:04+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-12-08T00:52:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c3f687d8dfeb33cffbb8f47c30002babfc4895d2'/>
<id>c3f687d8dfeb33cffbb8f47c30002babfc4895d2</id>
<content type='text'>
Releasing the DMA mapping will be useful for other types
of pages, so factor it out. Make sure compiler inlines it,
to avoid any regressions.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Releasing the DMA mapping will be useful for other types
of pages, so factor it out. Make sure compiler inlines it,
to avoid any regressions.

Signed-off-by: Mina Almasry &lt;almasrymina@google.com&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>page_pool: transition to reference count management after page draining</title>
<updated>2023-12-14T02:35:16+00:00</updated>
<author>
<name>Liang Chen</name>
<email>liangchen.linux@gmail.com</email>
</author>
<published>2023-12-12T04:46:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0a149ab78ee220c75eef797abea7a29f4490e226'/>
<id>0a149ab78ee220c75eef797abea7a29f4490e226</id>
<content type='text'>
To support multiple users referencing the same fragment,
'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages
from fragment management to reference count management after draining
based on the suggestion from [1].

The idea is that the concept of fragmenting exists before the page is
drained, and all related functions retain their current names.
However, once the page is drained, its management shifts to being
governed by 'pp_ref_count'. Therefore, all functions associated with
that lifecycle stage of a pp page are renamed.

[1]
http://lore.kernel.org/netdev/f71d9448-70c8-8793-dc9a-0eb48a570300@huawei.com

Signed-off-by: Liang Chen &lt;liangchen.linux@gmail.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Reviewed-by: Mina Almasry &lt;almasrymina@google.com&gt;
Link: https://lore.kernel.org/r/20231212044614.42733-2-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To support multiple users referencing the same fragment,
'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages
from fragment management to reference count management after draining
based on the suggestion from [1].

The idea is that the concept of fragmenting exists before the page is
drained, and all related functions retain their current names.
However, once the page is drained, its management shifts to being
governed by 'pp_ref_count'. Therefore, all functions associated with
that lifecycle stage of a pp page are renamed.

[1]
http://lore.kernel.org/netdev/f71d9448-70c8-8793-dc9a-0eb48a570300@huawei.com

Signed-off-by: Liang Chen &lt;liangchen.linux@gmail.com&gt;
Reviewed-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Reviewed-by: Mina Almasry &lt;almasrymina@google.com&gt;
Link: https://lore.kernel.org/r/20231212044614.42733-2-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: page_pool: mute the periodic warning for visible page pools</title>
<updated>2023-11-28T14:48:39+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-11-26T23:07:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=be0096676e230b43730b8936ac393d155b4e3262'/>
<id>be0096676e230b43730b8936ac393d155b4e3262</id>
<content type='text'>
Mute the periodic "stalled pool shutdown" warning if the page pool
is visible to user space. Rolling out a driver using page pools
to just a few hundred hosts at Meta surfaces applications which
fail to reap their broken sockets. Obviously it's best if the
applications are fixed, but we don't generally print warnings
for application resource leaks. Admins can now depend on the
netlink interface for getting page pool info to detect buggy
apps.

While at it throw in the ID of the pool into the message,
in rare cases (pools from destroyed netns) this will make
finding the pool with a debugger easier.

Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mute the periodic "stalled pool shutdown" warning if the page pool
is visible to user space. Rolling out a driver using page pools
to just a few hundred hosts at Meta surfaces applications which
fail to reap their broken sockets. Obviously it's best if the
applications are fixed, but we don't generally print warnings
for application resource leaks. Admins can now depend on the
netlink interface for getting page pool info to detect buggy
apps.

While at it throw in the ID of the pool into the message,
in rare cases (pools from destroyed netns) this will make
finding the pool with a debugger easier.

Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: page_pool: expose page pool stats via netlink</title>
<updated>2023-11-28T14:48:39+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-11-26T23:07:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d49010adae737638447369a4eff8f1aab736b076'/>
<id>d49010adae737638447369a4eff8f1aab736b076</id>
<content type='text'>
Dump the stats into netlink. More clever approaches
like dumping the stats per-CPU for each CPU individually
to see where the packets get consumed can be implemented
in the future.

A trimmed example from a real (but recently booted system):

$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
           --dump page-pool-stats-get
[{'info': {'id': 19, 'ifindex': 2},
  'alloc-empty': 48,
  'alloc-fast': 3024,
  'alloc-refill': 0,
  'alloc-slow': 48,
  'alloc-slow-high-order': 0,
  'alloc-waive': 0,
  'recycle-cache-full': 0,
  'recycle-cached': 0,
  'recycle-released-refcnt': 0,
  'recycle-ring': 0,
  'recycle-ring-full': 0},
 {'info': {'id': 18, 'ifindex': 2},
  'alloc-empty': 66,
  'alloc-fast': 11811,
  'alloc-refill': 35,
  'alloc-slow': 66,
  'alloc-slow-high-order': 0,
  'alloc-waive': 0,
  'recycle-cache-full': 1145,
  'recycle-cached': 6541,
  'recycle-released-refcnt': 0,
  'recycle-ring': 1275,
  'recycle-ring-full': 0},
 {'info': {'id': 17, 'ifindex': 2},
  'alloc-empty': 73,
  'alloc-fast': 62099,
  'alloc-refill': 413,
...

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dump the stats into netlink. More clever approaches
like dumping the stats per-CPU for each CPU individually
to see where the packets get consumed can be implemented
in the future.

A trimmed example from a real (but recently booted system):

$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
           --dump page-pool-stats-get
[{'info': {'id': 19, 'ifindex': 2},
  'alloc-empty': 48,
  'alloc-fast': 3024,
  'alloc-refill': 0,
  'alloc-slow': 48,
  'alloc-slow-high-order': 0,
  'alloc-waive': 0,
  'recycle-cache-full': 0,
  'recycle-cached': 0,
  'recycle-released-refcnt': 0,
  'recycle-ring': 0,
  'recycle-ring-full': 0},
 {'info': {'id': 18, 'ifindex': 2},
  'alloc-empty': 66,
  'alloc-fast': 11811,
  'alloc-refill': 35,
  'alloc-slow': 66,
  'alloc-slow-high-order': 0,
  'alloc-waive': 0,
  'recycle-cache-full': 1145,
  'recycle-cached': 6541,
  'recycle-released-refcnt': 0,
  'recycle-ring': 1275,
  'recycle-ring-full': 0},
 {'info': {'id': 17, 'ifindex': 2},
  'alloc-empty': 73,
  'alloc-fast': 62099,
  'alloc-refill': 413,
...

Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: page_pool: report when page pool was destroyed</title>
<updated>2023-11-28T14:48:39+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-11-26T23:07:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=69cb4952b6f6a226c1c0a7ca400398aaa8f75cf2'/>
<id>69cb4952b6f6a226c1c0a7ca400398aaa8f75cf2</id>
<content type='text'>
Report when page pool was destroyed. Together with the inflight
/ memory use reporting this can serve as a replacement for the
warning about leaked page pools we currently print to dmesg.

Example output for a fake leaked page pool using some hacks
in netdevsim (one "live" pool, and one "leaked" on the same dev):

$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
           --dump page-pool-get
[{'id': 2, 'ifindex': 3},
 {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}]

Tested-by: Dragos Tatulea &lt;dtatulea@nvidia.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Report when page pool was destroyed. Together with the inflight
/ memory use reporting this can serve as a replacement for the
warning about leaked page pools we currently print to dmesg.

Example output for a fake leaked page pool using some hacks
in netdevsim (one "live" pool, and one "leaked" on the same dev):

$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
           --dump page-pool-get
[{'id': 2, 'ifindex': 3},
 {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}]

Tested-by: Dragos Tatulea &lt;dtatulea@nvidia.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
