<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/bpf/hashtab.c, branch v6.13.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bpf: Call free_htab_elem() after htab_unlock_bucket()</title>
<updated>2024-11-11T16:18:30+00:00</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2024-11-06T06:35:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b9e9ed90b10c82a4e9d4d70a2890f06bfcdd3b78'/>
<id>b9e9ed90b10c82a4e9d4d70a2890f06bfcdd3b78</id>
<content type='text'>
For htab of maps, when the map is removed from the htab, it may hold the
last reference of the map. bpf_map_fd_put_ptr() will invoke
bpf_map_free_id() to free the id of the removed map element. However,
bpf_map_fd_put_ptr() is invoked while holding a bucket lock
(raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock
(spinlock_t), triggering the following lockdep warning:

  =============================
  [ BUG: Invalid wait context ]
  6.11.0-rc4+ #49 Not tainted
  -----------------------------
  test_maps/4881 is trying to lock:
  ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70
  other info that might help us debug this:
  context-{5:5}
  2 locks held by test_maps/4881:
   #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270
   #1: ffff888149ced148 (&amp;htab-&gt;lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80
  stack backtrace:
  CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x6e/0xb0
   dump_stack+0x10/0x20
   __lock_acquire+0x73e/0x36c0
   lock_acquire+0x182/0x450
   _raw_spin_lock_irqsave+0x43/0x70
   bpf_map_free_id.part.0+0x21/0x70
   bpf_map_put+0xcf/0x110
   bpf_map_fd_put_ptr+0x9a/0xb0
   free_htab_elem+0x69/0xe0
   htab_map_update_elem+0x50f/0xa80
   bpf_fd_htab_map_update_elem+0x131/0x270
   htab_map_update_elem+0x50f/0xa80
   bpf_fd_htab_map_update_elem+0x131/0x270
   bpf_map_update_value+0x266/0x380
   __sys_bpf+0x21bb/0x36b0
   __x64_sys_bpf+0x45/0x60
   x64_sys_call+0x1b2a/0x20d0
   do_syscall_64+0x5d/0x100
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

One way to fix the lockdep warning is using raw_spinlock_t for
map_idr_lock as well. However, bpf_map_alloc_id() invokes
idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a
similar lockdep warning because the slab's lock (s-&gt;cpu_slab-&gt;lock) is
still a spinlock.

Instead of changing map_idr_lock's type, fix the issue by invoking
htab_put_fd_value() after htab_unlock_bucket(). However, only deferring
the invocation of htab_put_fd_value() is not enough, because the old map
pointers in htab of maps can not be saved during batched deletion.
Therefore, also defer the invocation of free_htab_elem(), so these
to-be-freed elements could be linked together similar to lru map.

There are four callers for -&gt;map_fd_put_ptr:

(1) alloc_htab_elem() (through htab_put_fd_value())
It invokes -&gt;map_fd_put_ptr() under a raw_spinlock_t. The invocation of
htab_put_fd_value() can not simply move after htab_unlock_bucket(),
because the old element has already been stashed in htab-&gt;extra_elems.
It may be reused immediately after htab_unlock_bucket() and the
invocation of htab_put_fd_value() after htab_unlock_bucket() may release
the newly-added element incorrectly. Therefore, saving the map pointer
of the old element for htab of maps before unlocking the bucket and
releasing the map_ptr after unlock. Beside the map pointer in the old
element, should do the same thing for the special fields in the old
element as well.

(2) free_htab_elem() (through htab_put_fd_value())
Its caller includes __htab_map_lookup_and_delete_elem(),
htab_map_delete_elem() and __htab_map_lookup_and_delete_batch().

For htab_map_delete_elem(), simply invoke free_htab_elem() after
htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just
like lru map, linking the to-be-freed element into node_to_free list
and invoking free_htab_elem() for these element after unlock. It is safe
to reuse batch_flink as the link for node_to_free, because these
elements have been removed from the hash llist.

Because htab of maps doesn't support lookup_and_delete operation,
__htab_map_lookup_and_delete_elem() doesn't have the problem, so kept
it as is.

(3) fd_htab_map_free()
It invokes -&gt;map_fd_put_ptr without raw_spinlock_t.

(4) bpf_fd_htab_map_update_elem()
It invokes -&gt;map_fd_put_ptr without raw_spinlock_t.

After moving free_htab_elem() outside htab bucket lock scope, using
pcpu_freelist_push() instead of __pcpu_freelist_push() to disable
the irq before freeing elements, and protecting the invocations of
bpf_mem_cache_free() with migrate_{disable|enable} pair.

Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20241106063542.357743-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For htab of maps, when the map is removed from the htab, it may hold the
last reference of the map. bpf_map_fd_put_ptr() will invoke
bpf_map_free_id() to free the id of the removed map element. However,
bpf_map_fd_put_ptr() is invoked while holding a bucket lock
(raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock
(spinlock_t), triggering the following lockdep warning:

  =============================
  [ BUG: Invalid wait context ]
  6.11.0-rc4+ #49 Not tainted
  -----------------------------
  test_maps/4881 is trying to lock:
  ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70
  other info that might help us debug this:
  context-{5:5}
  2 locks held by test_maps/4881:
   #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270
   #1: ffff888149ced148 (&amp;htab-&gt;lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80
  stack backtrace:
  CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x6e/0xb0
   dump_stack+0x10/0x20
   __lock_acquire+0x73e/0x36c0
   lock_acquire+0x182/0x450
   _raw_spin_lock_irqsave+0x43/0x70
   bpf_map_free_id.part.0+0x21/0x70
   bpf_map_put+0xcf/0x110
   bpf_map_fd_put_ptr+0x9a/0xb0
   free_htab_elem+0x69/0xe0
   htab_map_update_elem+0x50f/0xa80
   bpf_fd_htab_map_update_elem+0x131/0x270
   htab_map_update_elem+0x50f/0xa80
   bpf_fd_htab_map_update_elem+0x131/0x270
   bpf_map_update_value+0x266/0x380
   __sys_bpf+0x21bb/0x36b0
   __x64_sys_bpf+0x45/0x60
   x64_sys_call+0x1b2a/0x20d0
   do_syscall_64+0x5d/0x100
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

One way to fix the lockdep warning is using raw_spinlock_t for
map_idr_lock as well. However, bpf_map_alloc_id() invokes
idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a
similar lockdep warning because the slab's lock (s-&gt;cpu_slab-&gt;lock) is
still a spinlock.

Instead of changing map_idr_lock's type, fix the issue by invoking
htab_put_fd_value() after htab_unlock_bucket(). However, only deferring
the invocation of htab_put_fd_value() is not enough, because the old map
pointers in htab of maps can not be saved during batched deletion.
Therefore, also defer the invocation of free_htab_elem(), so these
to-be-freed elements could be linked together similar to lru map.

There are four callers for -&gt;map_fd_put_ptr:

(1) alloc_htab_elem() (through htab_put_fd_value())
It invokes -&gt;map_fd_put_ptr() under a raw_spinlock_t. The invocation of
htab_put_fd_value() can not simply move after htab_unlock_bucket(),
because the old element has already been stashed in htab-&gt;extra_elems.
It may be reused immediately after htab_unlock_bucket() and the
invocation of htab_put_fd_value() after htab_unlock_bucket() may release
the newly-added element incorrectly. Therefore, saving the map pointer
of the old element for htab of maps before unlocking the bucket and
releasing the map_ptr after unlock. Beside the map pointer in the old
element, should do the same thing for the special fields in the old
element as well.

(2) free_htab_elem() (through htab_put_fd_value())
Its caller includes __htab_map_lookup_and_delete_elem(),
htab_map_delete_elem() and __htab_map_lookup_and_delete_batch().

For htab_map_delete_elem(), simply invoke free_htab_elem() after
htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just
like lru map, linking the to-be-freed element into node_to_free list
and invoking free_htab_elem() for these element after unlock. It is safe
to reuse batch_flink as the link for node_to_free, because these
elements have been removed from the hash llist.

Because htab of maps doesn't support lookup_and_delete operation,
__htab_map_lookup_and_delete_elem() doesn't have the problem, so kept
it as is.

(3) fd_htab_map_free()
It invokes -&gt;map_fd_put_ptr without raw_spinlock_t.

(4) bpf_fd_htab_map_update_elem()
It invokes -&gt;map_fd_put_ptr without raw_spinlock_t.

After moving free_htab_elem() outside htab bucket lock scope, using
pcpu_freelist_push() instead of __pcpu_freelist_push() to disable
the irq before freeing elements, and protecting the invocations of
bpf_mem_cache_free() with migrate_{disable|enable} pair.

Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20241106063542.357743-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Check percpu map value size first</title>
<updated>2024-09-11T20:22:37+00:00</updated>
<author>
<name>Tao Chen</name>
<email>chen.dylane@gmail.com</email>
</author>
<published>2024-09-10T14:41:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1d244784be6b01162b732a5a7d637dfc024c3203'/>
<id>1d244784be6b01162b732a5a7d637dfc024c3203</id>
<content type='text'>
Percpu map is often used, but the map value size limit often ignored,
like issue: https://github.com/iovisor/bcc/issues/2519. Actually,
percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we
can check the value size whether it exceeds PCPU_MIN_UNIT_SIZE first,
like percpu map of local_storage. Maybe the error message seems clearer
compared with "cannot allocate memory".

Signed-off-by: Jinke Han &lt;jinkehan@didiglobal.com&gt;
Signed-off-by: Tao Chen &lt;chen.dylane@gmail.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Percpu map is often used, but the map value size limit often ignored,
like issue: https://github.com/iovisor/bcc/issues/2519. Actually,
percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we
can check the value size whether it exceeds PCPU_MIN_UNIT_SIZE first,
like percpu map of local_storage. Maybe the error message seems clearer
compared with "cannot allocate memory".

Signed-off-by: Jinke Han &lt;jinkehan@didiglobal.com&gt;
Signed-off-by: Tao Chen &lt;chen.dylane@gmail.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Fix percpu address space issues</title>
<updated>2024-08-22T15:01:50+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2024-08-11T16:13:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6d641ca50d7ec7d5e4e889c3f8ea22afebc2a403'/>
<id>6d641ca50d7ec7d5e4e889c3f8ea22afebc2a403</id>
<content type='text'>
In arraymap.c:

In bpf_array_map_seq_start() and bpf_array_map_seq_next()
cast return values from the __percpu address space to
the generic address space via uintptr_t [1].

Correct the declaration of pptr pointer in __bpf_array_map_seq_show()
to void __percpu * and cast the value from the generic address
space to the __percpu address space via uintptr_t [1].

In hashtab.c:

Assign the return value from bpf_mem_cache_alloc() to void pointer
and cast the value to void __percpu ** (void pointer to percpu void
pointer) before dereferencing.

In memalloc.c:

Explicitly declare __percpu variables.

Cast obj to void __percpu **.

In helpers.c:

Cast ptr in BPF_CALL_1 and BPF_CALL_2 from generic address space
to __percpu address space via const uintptr_t [1].

Found by GCC's named address space checks.

There were no changes in the resulting object files.

[1] https://sparse.docs.kernel.org/en/latest/annotations.html#address-space-name

Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Cc: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Cc: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: KP Singh &lt;kpsingh@kernel.org&gt;
Cc: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Cc: Hao Luo &lt;haoluo@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20240811161414.56744-1-ubizjak@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In arraymap.c:

In bpf_array_map_seq_start() and bpf_array_map_seq_next()
cast return values from the __percpu address space to
the generic address space via uintptr_t [1].

Correct the declaration of pptr pointer in __bpf_array_map_seq_show()
to void __percpu * and cast the value from the generic address
space to the __percpu address space via uintptr_t [1].

In hashtab.c:

Assign the return value from bpf_mem_cache_alloc() to void pointer
and cast the value to void __percpu ** (void pointer to percpu void
pointer) before dereferencing.

In memalloc.c:

Explicitly declare __percpu variables.

Cast obj to void __percpu **.

In helpers.c:

Cast ptr in BPF_CALL_1 and BPF_CALL_2 from generic address space
to __percpu address space via const uintptr_t [1].

Found by GCC's named address space checks.

There were no changes in the resulting object files.

[1] https://sparse.docs.kernel.org/en/latest/annotations.html#address-space-name

Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Cc: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Cc: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: KP Singh &lt;kpsingh@kernel.org&gt;
Cc: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Cc: Hao Luo &lt;haoluo@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20240811161414.56744-1-ubizjak@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Replace 8 seq_puts() calls by seq_putc() calls</title>
<updated>2024-07-29T19:53:00+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2024-07-14T14:15:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=df862de41fcde6a0a4906647b0cacec2a8db5cf3'/>
<id>df862de41fcde6a0a4906647b0cacec2a8db5cf3</id>
<content type='text'>
Single line breaks should occasionally be put into a sequence.
Thus use the corresponding function “seq_putc”.

This issue was transformed by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/e26b7df9-cd63-491f-85e8-8cabe60a85e5@web.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Single line breaks should occasionally be put into a sequence.
Thus use the corresponding function “seq_putc”.

This issue was transformed by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/e26b7df9-cd63-491f-85e8-8cabe60a85e5@web.de
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Do not walk twice the hash map on free</title>
<updated>2024-04-30T14:28:46+00:00</updated>
<author>
<name>Benjamin Tissoires</name>
<email>bentiss@kernel.org</email>
</author>
<published>2024-04-30T10:43:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a891711d0166133ec5120615fcf365d9745d82b2'/>
<id>a891711d0166133ec5120615fcf365d9745d82b2</id>
<content type='text'>
If someone stores both a timer and a workqueue in a hash map, on free, we
would walk it twice.

Add a check in htab_free_malloced_timers_or_wq and free the timers and
workqueues if they are present.

Fixes: 246331e3f1ea ("bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps")
Signed-off-by: Benjamin Tissoires &lt;bentiss@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20240430-bpf-next-v3-2-27afe7f3b17c@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If someone stores both a timer and a workqueue in a hash map, on free, we
would walk it twice.

Add a check in htab_free_malloced_timers_or_wq and free the timers and
workqueues if they are present.

Fixes: 246331e3f1ea ("bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps")
Signed-off-by: Benjamin Tissoires &lt;bentiss@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20240430-bpf-next-v3-2-27afe7f3b17c@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps</title>
<updated>2024-04-24T01:31:25+00:00</updated>
<author>
<name>Benjamin Tissoires</name>
<email>bentiss@kernel.org</email>
</author>
<published>2024-04-20T09:09:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=246331e3f1eac905170a923f0ec76725c2558232'/>
<id>246331e3f1eac905170a923f0ec76725c2558232</id>
<content type='text'>
Currently bpf_wq_cancel_and_free() is just a placeholder as there is
no memory allocation for bpf_wq just yet.

Again, duplication of the bpf_timer approach

Signed-off-by: Benjamin Tissoires &lt;bentiss@kernel.org&gt;
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-9-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently bpf_wq_cancel_and_free() is just a placeholder as there is
no memory allocation for bpf_wq just yet.

Again, duplication of the bpf_timer approach

Signed-off-by: Benjamin Tissoires &lt;bentiss@kernel.org&gt;
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-9-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Fix typos in comments</title>
<updated>2024-04-22T15:48:08+00:00</updated>
<author>
<name>Rafael Passos</name>
<email>rafael@rcpassos.me</email>
</author>
<published>2024-04-17T18:49:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a7de265cb2d849f8986a197499ad58dca0a4f209'/>
<id>a7de265cb2d849f8986a197499ad58dca0a4f209</id>
<content type='text'>
Found the following typos in comments, and fixed them:

s/unpriviledged/unprivileged/
s/reponsible/responsible/
s/possiblities/possibilities/
s/Divison/Division/
s/precsion/precision/
s/havea/have a/
s/reponsible/responsible/
s/responsibile/responsible/
s/tigher/tighter/
s/respecitve/respective/

Signed-off-by: Rafael Passos &lt;rafael@rcpassos.me&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/6af7deb4-bb24-49e8-b3f1-8dd410597337@smtp-relay.sendinblue.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Found the following typos in comments, and fixed them:

s/unpriviledged/unprivileged/
s/reponsible/responsible/
s/possiblities/possibilities/
s/Divison/Division/
s/precsion/precision/
s/havea/have a/
s/reponsible/responsible/
s/responsibile/responsible/
s/tigher/tighter/
s/respecitve/respective/

Signed-off-by: Rafael Passos &lt;rafael@rcpassos.me&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/6af7deb4-bb24-49e8-b3f1-8dd410597337@smtp-relay.sendinblue.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: inline bpf_map_lookup_elem() helper for PERCPU_HASH map</title>
<updated>2024-04-03T17:29:56+00:00</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2024-04-02T02:13:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b56e637f705836b9aa51e2b1058c3c814c121a8'/>
<id>0b56e637f705836b9aa51e2b1058c3c814c121a8</id>
<content type='text'>
Using new per-CPU BPF instruction, partially inline
bpf_map_lookup_elem() helper for per-CPU hashmap BPF map. Just like for
normal HASH map, we still generate a call into __htab_map_lookup_elem(),
but after that we resolve per-CPU element address using a new
instruction, saving on extra functions calls.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/r/20240402021307.1012571-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using new per-CPU BPF instruction, partially inline
bpf_map_lookup_elem() helper for per-CPU hashmap BPF map. Just like for
normal HASH map, we still generate a call into __htab_map_lookup_elem(),
but after that we resolve per-CPU element address using a new
instruction, saving on extra functions calls.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/r/20240402021307.1012571-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Mitigate latency spikes associated with freeing non-preallocated htab</title>
<updated>2024-03-29T01:31:40+00:00</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2024-03-27T03:20:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ee3bad033d01066348913f3220ea81987721ed53'/>
<id>ee3bad033d01066348913f3220ea81987721ed53</id>
<content type='text'>
Following the recent upgrade of one of our BPF programs, we encountered
significant latency spikes affecting other applications running on the same
host. After thorough investigation, we identified that these spikes were
primarily caused by the prolonged duration required to free a
non-preallocated htab with approximately 2 million keys.

Notably, our kernel configuration lacks the presence of CONFIG_PREEMPT. In
scenarios where kernel execution extends excessively, other threads might
be starved of CPU time, resulting in latency issues across the system. To
mitigate this, we've adopted a proactive approach by incorporating
cond_resched() calls within the kernel code. This ensures that during
lengthy kernel operations, the scheduler is invoked periodically to provide
opportunities for other threads to execute.

Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20240327032022.78391-1-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following the recent upgrade of one of our BPF programs, we encountered
significant latency spikes affecting other applications running on the same
host. After thorough investigation, we identified that these spikes were
primarily caused by the prolonged duration required to free a
non-preallocated htab with approximately 2 million keys.

Notably, our kernel configuration lacks the presence of CONFIG_PREEMPT. In
scenarios where kernel execution extends excessively, other threads might
be starved of CPU time, resulting in latency issues across the system. To
mitigate this, we've adopted a proactive approach by incorporating
cond_resched() calls within the kernel code. This ensures that during
lengthy kernel operations, the scheduler is invoked periodically to provide
opportunities for other threads to execute.

Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20240327032022.78391-1-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Fix hashtab overflow check on 32-bit arches</title>
<updated>2024-03-08T04:05:56+00:00</updated>
<author>
<name>Toke Høiland-Jørgensen</name>
<email>toke@redhat.com</email>
</author>
<published>2024-03-07T12:03:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6787d916c2cf9850c97a0a3f73e08c43e7d973b1'/>
<id>6787d916c2cf9850c97a0a3f73e08c43e7d973b1</id>
<content type='text'>
The hashtab code relies on roundup_pow_of_two() to compute the number of
hash buckets, and contains an overflow check by checking if the
resulting value is 0. However, on 32-bit arches, the roundup code itself
can overflow by doing a 32-bit left-shift of an unsigned long value,
which is undefined behaviour, so it is not guaranteed to truncate
neatly. This was triggered by syzbot on the DEVMAP_HASH type, which
contains the same check, copied from the hashtab code. So apply the same
fix to hashtab, by moving the overflow check to before the roundup.

Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
Signed-off-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Message-ID: &lt;20240307120340.99577-3-toke@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The hashtab code relies on roundup_pow_of_two() to compute the number of
hash buckets, and contains an overflow check by checking if the
resulting value is 0. However, on 32-bit arches, the roundup code itself
can overflow by doing a 32-bit left-shift of an unsigned long value,
which is undefined behaviour, so it is not guaranteed to truncate
neatly. This was triggered by syzbot on the DEVMAP_HASH type, which
contains the same check, copied from the hashtab code. So apply the same
fix to hashtab, by moving the overflow check to before the roundup.

Fixes: daaf427c6ab3 ("bpf: fix arraymap NULL deref and missing overflow and zero size checks")
Signed-off-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Message-ID: &lt;20240307120340.99577-3-toke@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
