<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/xdp, branch v5.6</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>xsk: Publish global consumer pointers when NAPI is finished</title>
<updated>2020-02-11T14:51:11+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2020-02-10T15:27:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=30744a68626db6a0029aca9c646831c869c16d83'/>
<id>30744a68626db6a0029aca9c646831c869c16d83</id>
<content type='text'>
The commit 4b638f13bab4 ("xsk: Eliminate the RX batch size")
introduced a much more lazy way of updating the global consumer
pointers from the kernel side, by only doing so when running out of
entries in the fill or Tx rings (the rings consumed by the
kernel). This can result in a deadlock with the user application if
the kernel requires more than one entry to proceed and the application
cannot put these entries in the fill ring because the kernel has not
updated the global consumer pointer since the ring is not empty.

Fix this by publishing the local kernel side consumer pointer whenever
we have completed Rx or Tx processing in the kernel. This way, user
space will have an up-to-date view of the consumer pointers whenever it
gets to execute in the one core case (application and driver on the
same core), or after a certain number of packets have been processed
in the two core case (application and driver on different cores).

A side effect of this patch is that the one core case gets better
performance, but the two core case gets worse. The reason that the one
core case improves is that updating the global consumer pointer is
relatively cheap since the application by definition is not running
when the kernel is (they are on the same core) and it is beneficial
for the application, once it gets to run, to have pointers that are
as up to date as possible since it then can operate on more packets
and buffers. In the two core case, the most important performance
aspect is to minimize the number of accesses to the global pointers
since they are shared between two cores and bounces between the caches
of those cores. This patch results in more updates to global state,
which means lower performance in the two core case.

Fixes: 4b638f13bab4 ("xsk: Eliminate the RX batch size")
Reported-by: Ryan Goodfellow &lt;rgoodfel@isi.edu&gt;
Reported-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Acked-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Link: https://lore.kernel.org/bpf/1581348432-6747-1-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The commit 4b638f13bab4 ("xsk: Eliminate the RX batch size")
introduced a much more lazy way of updating the global consumer
pointers from the kernel side, by only doing so when running out of
entries in the fill or Tx rings (the rings consumed by the
kernel). This can result in a deadlock with the user application if
the kernel requires more than one entry to proceed and the application
cannot put these entries in the fill ring because the kernel has not
updated the global consumer pointer since the ring is not empty.

Fix this by publishing the local kernel side consumer pointer whenever
we have completed Rx or Tx processing in the kernel. This way, user
space will have an up-to-date view of the consumer pointers whenever it
gets to execute in the one core case (application and driver on the
same core), or after a certain number of packets have been processed
in the two core case (application and driver on different cores).

A side effect of this patch is that the one core case gets better
performance, but the two core case gets worse. The reason that the one
core case improves is that updating the global consumer pointer is
relatively cheap since the application by definition is not running
when the kernel is (they are on the same core) and it is beneficial
for the application, once it gets to run, to have pointers that are
as up to date as possible since it then can operate on more packets
and buffers. In the two core case, the most important performance
aspect is to minimize the number of accesses to the global pointers
since they are shared between two cores and bounces between the caches
of those cores. This patch results in more updates to global state,
which means lower performance in the two core case.

Fixes: 4b638f13bab4 ("xsk: Eliminate the RX batch size")
Reported-by: Ryan Goodfellow &lt;rgoodfel@isi.edu&gt;
Reported-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Acked-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Link: https://lore.kernel.org/bpf/1581348432-6747-1-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, tree-wide: rename put_user_page*() to unpin_user_page*()</title>
<updated>2020-01-31T18:30:38+00:00</updated>
<author>
<name>John Hubbard</name>
<email>jhubbard@nvidia.com</email>
</author>
<published>2020-01-31T06:13:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f1f6a7dd9b53aafd81b696b9017036e7b08e57ea'/>
<id>f1f6a7dd9b53aafd81b696b9017036e7b08e57ea</id>
<content type='text'>
In order to provide a clearer, more symmetric API for pinning and
unpinning DMA pages.  This way, pin_user_pages*() calls match up with
unpin_user_pages*() calls, and the API is a lot closer to being
self-explanatory.

Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to provide a clearer, more symmetric API for pinning and
unpinning DMA pages.  This way, pin_user_pages*() calls match up with
unpin_user_pages*() calls, and the API is a lot closer to being
self-explanatory.

Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/xdp: set FOLL_PIN via pin_user_pages()</title>
<updated>2020-01-31T18:30:37+00:00</updated>
<author>
<name>John Hubbard</name>
<email>jhubbard@nvidia.com</email>
</author>
<published>2020-01-31T06:13:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb48b4746a5d77f352b8c1f793e2a6f4211c8ec2'/>
<id>fb48b4746a5d77f352b8c1f793e2a6f4211c8ec2</id>
<content type='text'>
Convert net/xdp to use the new pin_longterm_pages() call, which sets
FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
tracking of pinned pages.

In partial anticipation of this work, the net/xdp code was already calling
put_user_page() instead of put_page().  Therefore, in order to convert
from the get_user_pages()/put_page() model, to the
pin_user_pages()/put_user_page() model, the only change required here is
to change get_user_pages() to pin_user_pages().

Link: http://lkml.kernel.org/r/20200107224558.2362728-18-jhubbard@nvidia.com
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Acked-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert net/xdp to use the new pin_longterm_pages() call, which sets
FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
tracking of pinned pages.

In partial anticipation of this work, the net/xdp code was already calling
put_user_page() instead of put_page().  Therefore, in order to convert
from the get_user_pages()/put_page() model, to the
pin_user_pages()/put_user_page() model, the only change required here is
to change get_user_pages() to pin_user_pages().

Link: http://lkml.kernel.org/r/20200107224558.2362728-18-jhubbard@nvidia.com
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Acked-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk, net: Make sock_def_readable() have external linkage</title>
<updated>2020-01-21T23:08:52+00:00</updated>
<author>
<name>Björn Töpel</name>
<email>bjorn.topel@intel.com</email>
</author>
<published>2020-01-20T09:29:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=43a825afc91e2b06af1e8e7422198e759c2c5e20'/>
<id>43a825afc91e2b06af1e8e7422198e759c2c5e20</id>
<content type='text'>
XDP sockets use the default implementation of struct sock's
sk_data_ready callback, which is sock_def_readable(). This function
is called in the XDP socket fast-path, and involves a retpoline. By
letting sock_def_readable() have external linkage, and being called
directly, the retpoline can be avoided.

Signed-off-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20200120092917.13949-1-bjorn.topel@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
XDP sockets use the default implementation of struct sock's
sk_data_ready callback, which is sock_def_readable(). This function
is called in the XDP socket fast-path, and involves a retpoline. By
letting sock_def_readable() have external linkage, and being called
directly, the retpoline can be avoided.

Signed-off-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20200120092917.13949-1-bjorn.topel@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk: Support allocations of large umems</title>
<updated>2020-01-15T19:41:52+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2020-01-14T09:49:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d3a56931f9c8644a52e7a6735954c84ea895e8c3'/>
<id>d3a56931f9c8644a52e7a6735954c84ea895e8c3</id>
<content type='text'>
When registering a umem area that is sufficiently large (&gt;1G on an
x86), kmalloc cannot be used to allocate one of the internal data
structures, as the size requested gets too large. Use kvmalloc instead
that falls back on vmalloc if the allocation is too large for kmalloc.

Also add accounting for this structure as it is triggered by a user
space action (the XDP_UMEM_REG setsockopt) and it is by far the
largest structure of kernel allocated memory in xsk.

Reported-by: Ryan Goodfellow &lt;rgoodfel@isi.edu&gt;
Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Link: https://lore.kernel.org/bpf/1578995365-7050-1-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When registering a umem area that is sufficiently large (&gt;1G on an
x86), kmalloc cannot be used to allocate one of the internal data
structures, as the size requested gets too large. Use kvmalloc instead
that falls back on vmalloc if the allocation is too large for kmalloc.

Also add accounting for this structure as it is triggered by a user
space action (the XDP_UMEM_REG setsockopt) and it is by far the
largest structure of kernel allocated memory in xsk.

Reported-by: Ryan Goodfellow &lt;rgoodfel@isi.edu&gt;
Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Link: https://lore.kernel.org/bpf/1578995365-7050-1-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2019-12-27T22:20:10+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-12-27T22:20:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2bbc078f812d45b8decb55935dab21199bd21489'/>
<id>2bbc078f812d45b8decb55935dab21199bd21489</id>
<content type='text'>
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-12-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 127 non-merge commits during the last 17 day(s) which contain
a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

There are three merge conflicts. Conflicts and resolution looks as follows:

1) Merge conflict in net/bpf/test_run.c:

There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro")
which gets in the way with b590cb5f802d ("bpf: Switch to offsetofend in
BPF_PROG_TEST_RUN"):

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
                             sizeof_field(struct __sk_buff, priority),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

There are a few occasions that look similar to this. Always take the chunk with
offsetofend(). Note that there is one where the fields differ in here:

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
                             sizeof_field(struct __sk_buff, tstamp),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to
850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

(I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (is_13b_check(off, insn))
                  return -1;
          emit(rv_blt(tcc, RV_REG_ZERO, off &gt;&gt; 1), ctx);
  =======
          emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Result should look like:

          emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

3) Merge conflict in arch/riscv/include/asm/pgtable.h:

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
  =======
  #define VMALLOC_SIZE     (KERN_VIRT_SIZE &gt;&gt; 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  #define vmemmap         ((struct page *)VMEMMAP_START)

  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Only take the BPF_* defines from there and move them higher up in the
same file. Remove the rest from the chunk. The VMALLOC_* etc defines
got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page
calls"). Result:

  [...]
  #define __S101  PAGE_READ_EXEC
  #define __S110  PAGE_SHARED_EXEC
  #define __S111  PAGE_SHARED_EXEC

  #define VMALLOC_SIZE     (KERN_VIRT_SIZE &gt;&gt; 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  [...]

Let me know if there are any other issues.

Anyway, the main changes are:

1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific
   to a provided BPF object file. This provides an alternative, simplified API
   compared to standard libbpf interaction. Also, add libbpf extern variable
   resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by
   generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also,
   add various BPF riscv JIT improvements, from Björn Töpel.

3) Extend bpftool to allow matching BPF programs and maps by name,
   from Paul Chaignon.

4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI
   flag for allowing updates without service interruption, from Andrey Ignatov.

5) Cleanup and simplification of ring access functions for AF_XDP with a
   bonus of 0-5% performance improvement, from Magnus Karlsson.

6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of
   audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa.

7) Move and extend test_select_reuseport into BPF program tests under
   BPF selftests, from Jakub Sitnicki.

8) Various BPF sample improvements for xdpsock for customizing parameters
   to set up and benchmark AF_XDP, from Jay Jayatheerthan.

9) Improve libbpf to provide a ulimit hint on permission denied errors.
   Also change XDP sample programs to attach in driver mode by default,
   from Toke Høiland-Jørgensen.

10) Extend BPF test infrastructure to allow changing skb mark from tc BPF
    programs, from Nikita V. Shirokov.

11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after
    libbpf conversion, from Jesper Dangaard Brouer.

13) Minor misc improvements from various others.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-12-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 127 non-merge commits during the last 17 day(s) which contain
a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

There are three merge conflicts. Conflicts and resolution looks as follows:

1) Merge conflict in net/bpf/test_run.c:

There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro")
which gets in the way with b590cb5f802d ("bpf: Switch to offsetofend in
BPF_PROG_TEST_RUN"):

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
                             sizeof_field(struct __sk_buff, priority),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

There are a few occasions that look similar to this. Always take the chunk with
offsetofend(). Note that there is one where the fields differ in here:

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
                             sizeof_field(struct __sk_buff, tstamp),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to
850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

(I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
          if (is_13b_check(off, insn))
                  return -1;
          emit(rv_blt(tcc, RV_REG_ZERO, off &gt;&gt; 1), ctx);
  =======
          emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Result should look like:

          emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

3) Merge conflict in arch/riscv/include/asm/pgtable.h:

  &lt;&lt;&lt;&lt;&lt;&lt;&lt; HEAD
  =======
  #define VMALLOC_SIZE     (KERN_VIRT_SIZE &gt;&gt; 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  #define vmemmap         ((struct page *)VMEMMAP_START)

  &gt;&gt;&gt;&gt;&gt;&gt;&gt; 7c8dce4b166113743adad131b5a24c4acc12f92c

Only take the BPF_* defines from there and move them higher up in the
same file. Remove the rest from the chunk. The VMALLOC_* etc defines
got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page
calls"). Result:

  [...]
  #define __S101  PAGE_READ_EXEC
  #define __S110  PAGE_SHARED_EXEC
  #define __S111  PAGE_SHARED_EXEC

  #define VMALLOC_SIZE     (KERN_VIRT_SIZE &gt;&gt; 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  [...]

Let me know if there are any other issues.

Anyway, the main changes are:

1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific
   to a provided BPF object file. This provides an alternative, simplified API
   compared to standard libbpf interaction. Also, add libbpf extern variable
   resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by
   generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also,
   add various BPF riscv JIT improvements, from Björn Töpel.

3) Extend bpftool to allow matching BPF programs and maps by name,
   from Paul Chaignon.

4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI
   flag for allowing updates without service interruption, from Andrey Ignatov.

5) Cleanup and simplification of ring access functions for AF_XDP with a
   bonus of 0-5% performance improvement, from Magnus Karlsson.

6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of
   audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa.

7) Move and extend test_select_reuseport into BPF program tests under
   BPF selftests, from Jakub Sitnicki.

8) Various BPF sample improvements for xdpsock for customizing parameters
   to set up and benchmark AF_XDP, from Jay Jayatheerthan.

9) Improve libbpf to provide a ulimit hint on permission denied errors.
   Also change XDP sample programs to attach in driver mode by default,
   from Toke Høiland-Jørgensen.

10) Extend BPF test infrastructure to allow changing skb mark from tc BPF
    programs, from Nikita V. Shirokov.

11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after
    libbpf conversion, from Jesper Dangaard Brouer.

13) Minor misc improvements from various others.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk: Use struct_size() helper</title>
<updated>2019-12-21T00:00:09+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2019-12-19T12:39:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1d9cb1f381860b529edec57cf7a08133f40366eb'/>
<id>1d9cb1f381860b529edec57cf7a08133f40366eb</id>
<content type='text'>
Improve readability and maintainability by using the struct_size()
helper when allocating the AF_XDP rings.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-13-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Improve readability and maintainability by using the struct_size()
helper when allocating the AF_XDP rings.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-13-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk: Add function naming comments and reorder functions</title>
<updated>2019-12-21T00:00:09+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2019-12-19T12:39:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=15d8c9162ced1789d25261859a7b37db8426e409'/>
<id>15d8c9162ced1789d25261859a7b37db8426e409</id>
<content type='text'>
Add comments on how the ring access functions are named and how they
are supposed to be used for producers and consumers. The functions are
also reordered so that the consumer functions are in the beginning and
the producer functions in the end, for easier reference. Put this in a
separate patch as the diff might look a little odd, but no
functionality has changed in this patch.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-12-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add comments on how the ring access functions are named and how they
are supposed to be used for producers and consumers. The functions are
also reordered so that the consumer functions are in the beginning and
the producer functions in the end, for easier reference. Put this in a
separate patch as the diff might look a little odd, but no
functionality has changed in this patch.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-12-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk: Remove unnecessary READ_ONCE of data</title>
<updated>2019-12-21T00:00:09+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2019-12-19T12:39:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c34787fcc90f0732ff00754057f11780007002e4'/>
<id>c34787fcc90f0732ff00754057f11780007002e4</id>
<content type='text'>
There are two unnecessary READ_ONCE of descriptor data. These are not
needed since the data is written by the producer before it signals
that the data is available by incrementing the producer pointer. As the
access to this producer pointer is serialized and the consumer always
reads the descriptor after it has read and synchronized with the
producer counter, the write of the descriptor will have fully
completed and it does not matter if the consumer has any read tearing.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-11-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are two unnecessary READ_ONCE of descriptor data. These are not
needed since the data is written by the producer before it signals
that the data is available by incrementing the producer pointer. As the
access to this producer pointer is serialized and the consumer always
reads the descriptor after it has read and synchronized with the
producer counter, the write of the descriptor will have fully
completed and it does not matter if the consumer has any read tearing.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-11-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>xsk: ixgbe: i40e: ice: mlx5: Xsk_umem_discard_addr to xsk_umem_release_addr</title>
<updated>2019-12-21T00:00:09+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2019-12-19T12:39:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f8509aa078de0842ec1817e8026e58620cd05d3b'/>
<id>f8509aa078de0842ec1817e8026e58620cd05d3b</id>
<content type='text'>
Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to
better reflect the new naming of the AF_XDP queue manipulation
functions. As this functions is used by drivers implementing support
for AF_XDP zero-copy, it requires a name change to these drivers. The
function xsk_umem_release_addr_rq has also changed name in the same
fashion.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to
better reflect the new naming of the AF_XDP queue manipulation
functions. As this functions is used by drivers implementing support
for AF_XDP zero-copy, it requires a name change to these drivers. The
function xsk_umem_release_addr_rq has also changed name in the same
fashion.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com
</pre>
</div>
</content>
</entry>
</feed>
