<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/bpf/inode.c, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2026-06-17T08:18:14+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-17T08:18:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9c87e61e3c5797277407ba5eae4eac8a52be3fa3'/>
<id>9c87e61e3c5797277407ba5eae4eac8a52be3fa3</id>
<content type='text'>
Pull bpf updates from Alexei Starovoitov:
 "Major changes:

   - Recover from BPF arena page faults using a scratch page and add
     ptep_try_set() for lockless empty-slot installs on x86 and arm64.

     This allows BPF kfuncs to access arena pointers directly.

     The 'arena_direct_access' stable branch was created for this work
     and was pulled into sched-ext and bpf-next trees (Tejun Heo, Kumar
     Kartikeya Dwivedi)

   - Lift old restriction and support 6+ arguments in BPF programs and
     kfuncs on x86 and arm64 (Yonghong Song, Puranjay Mohan)

  Other features and fixes:

   - Add 24-bit BTF vlen and reclaim unused bits in the BTF UAPI to ease
     addition of new BTF kinds (Alan Maguire)

   - Raise the maximum BPF call chain depth from 8 to 16 frames (Alexei
     Starovoitov)

   - Refactor object relationship tracking in the verifier and fix a
     dynptr use-after-free bug (Amery Hung)

   - Harden the signed program loader and reject exclusive maps as inner
     maps (Daniel Borkmann)

   - Replace the verifier min/max bounds fields with a circular number
     (cnum) representation and improve 32-&gt;64 bit range refinements
     (Eduard Zingerman)

   - Introduce the arena library and runtime (libarena) with a buddy
     allocator, rbtree and SPMC queue data structures, ASAN support and
     a parallel test harness. Allow subprograms to return arena pointers
     and switch to a BTF type-tag based __arena annotation (Emil
     Tsalapatis)

   - Cache build IDs in the sleepable stackmap path and avoid faultable
     build ID reads under mm locks (Ihor Solodrai)

   - Introduce the tracing_multi link to attach a single BPF program to
     many kernel functions at once. Allow specifying the uprobe_multi
     target via FD (Jiri Olsa)

   - Extend the bpf_list family of kfuncs with bpf_list_add/del(), and
     bpf_list_is_first/is_last/empty() (Kaitao Cheng)

   - Extend the BPF syscall with common attributes support for
     prog_load, btf_load and map_create (Leon Hwang)

   - Wrap rhashtable as BPF map (Mykyta Yatsenko, Herbert Xu)

   - Add sleepable support for tracepoint programs and fix deadlocks in
     LRU map due to NMI reentry (Mykyta Yatsenko)

   - Fix OOB access in bpf_flow_keys, fix nullness analysis of inner
     arrays, enforce write checks for global subprograms (Nuoqi Gui)

   - Report the maximum combined stack depth and print a breakdown of
     instructions processed per subprogram (Paul Chaignon)

   - Add an XDP load-balancer benchmark and arm64 JIT support for stack
     arguments (Puranjay Mohan)

   - Add kfuncs to traverse over wakeup_sources (Samuel Wu)

   - Allow sleepable BPF programs to use LPM trie maps directly (Vlad
     Poenaru)

   - Many more fixes and cleanups across the verifier, BTF, sockmap,
     devmap, bpffs, security hooks, s390/riscv/loongarch JITs,
     rqspinlock, libbpf, bpftool, selftests"

* tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (336 commits)
  selftests/bpf: Work around llvm stack overflow in crypto progs
  selftests/bpf: add test for bpf_msg_pop_data() overflow
  bpf, sockmap: fix integer overflow in bpf_msg_pop_data() bounds check
  sockmap: Fix use-after-free in udp_bpf_recvmsg()
  bpf, sockmap: keep sk_msg copy state in sync
  bpf, sockmap: Fix wrong rsge offset in bpf_msg_push_data()
  bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
  selftsets/bpf: Retry map update on helper_fill_hashmap()
  selftests/bpf: Add test for sleepable lsm_cgroup rejection
  selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper
  bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket
  selftests/bpf: Avoid static LLVM linking for cross builds
  selftests/bpf: Use common CFLAGS for urandom_read
  selftests/bpf: Initialize operation name before use
  tools/bpf: build: Append extra cflags
  libbpf: Initialize CFLAGS before including Makefile.include
  bpftool: Append extra host flags
  bpftool: Avoid adding EXTRA_CFLAGS to HOST_CFLAGS
  bpftool: Pass host flags to bootstrap libbpf
  selftests/bpf: correct CONFIG_PPC64 macro name in comment
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull bpf updates from Alexei Starovoitov:
 "Major changes:

   - Recover from BPF arena page faults using a scratch page and add
     ptep_try_set() for lockless empty-slot installs on x86 and arm64.

     This allows BPF kfuncs to access arena pointers directly.

     The 'arena_direct_access' stable branch was created for this work
     and was pulled into sched-ext and bpf-next trees (Tejun Heo, Kumar
     Kartikeya Dwivedi)

   - Lift old restriction and support 6+ arguments in BPF programs and
     kfuncs on x86 and arm64 (Yonghong Song, Puranjay Mohan)

  Other features and fixes:

   - Add 24-bit BTF vlen and reclaim unused bits in the BTF UAPI to ease
     addition of new BTF kinds (Alan Maguire)

   - Raise the maximum BPF call chain depth from 8 to 16 frames (Alexei
     Starovoitov)

   - Refactor object relationship tracking in the verifier and fix a
     dynptr use-after-free bug (Amery Hung)

   - Harden the signed program loader and reject exclusive maps as inner
     maps (Daniel Borkmann)

   - Replace the verifier min/max bounds fields with a circular number
     (cnum) representation and improve 32-&gt;64 bit range refinements
     (Eduard Zingerman)

   - Introduce the arena library and runtime (libarena) with a buddy
     allocator, rbtree and SPMC queue data structures, ASAN support and
     a parallel test harness. Allow subprograms to return arena pointers
     and switch to a BTF type-tag based __arena annotation (Emil
     Tsalapatis)

   - Cache build IDs in the sleepable stackmap path and avoid faultable
     build ID reads under mm locks (Ihor Solodrai)

   - Introduce the tracing_multi link to attach a single BPF program to
     many kernel functions at once. Allow specifying the uprobe_multi
     target via FD (Jiri Olsa)

   - Extend the bpf_list family of kfuncs with bpf_list_add/del(), and
     bpf_list_is_first/is_last/empty() (Kaitao Cheng)

   - Extend the BPF syscall with common attributes support for
     prog_load, btf_load and map_create (Leon Hwang)

   - Wrap rhashtable as BPF map (Mykyta Yatsenko, Herbert Xu)

   - Add sleepable support for tracepoint programs and fix deadlocks in
     LRU map due to NMI reentry (Mykyta Yatsenko)

   - Fix OOB access in bpf_flow_keys, fix nullness analysis of inner
     arrays, enforce write checks for global subprograms (Nuoqi Gui)

   - Report the maximum combined stack depth and print a breakdown of
     instructions processed per subprogram (Paul Chaignon)

   - Add an XDP load-balancer benchmark and arm64 JIT support for stack
     arguments (Puranjay Mohan)

   - Add kfuncs to traverse over wakeup_sources (Samuel Wu)

   - Allow sleepable BPF programs to use LPM trie maps directly (Vlad
     Poenaru)

   - Many more fixes and cleanups across the verifier, BTF, sockmap,
     devmap, bpffs, security hooks, s390/riscv/loongarch JITs,
     rqspinlock, libbpf, bpftool, selftests"

* tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (336 commits)
  selftests/bpf: Work around llvm stack overflow in crypto progs
  selftests/bpf: add test for bpf_msg_pop_data() overflow
  bpf, sockmap: fix integer overflow in bpf_msg_pop_data() bounds check
  sockmap: Fix use-after-free in udp_bpf_recvmsg()
  bpf, sockmap: keep sk_msg copy state in sync
  bpf, sockmap: Fix wrong rsge offset in bpf_msg_push_data()
  bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
  selftsets/bpf: Retry map update on helper_fill_hashmap()
  selftests/bpf: Add test for sleepable lsm_cgroup rejection
  selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper
  bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket
  selftests/bpf: Avoid static LLVM linking for cross builds
  selftests/bpf: Use common CFLAGS for urandom_read
  selftests/bpf: Initialize operation name before use
  tools/bpf: build: Append extra cflags
  libbpf: Initialize CFLAGS before including Makefile.include
  bpftool: Append extra host flags
  bpftool: Avoid adding EXTRA_CFLAGS to HOST_CFLAGS
  bpftool: Pass host flags to bootstrap libbpf
  selftests/bpf: correct CONFIG_PPC64 macro name in comment
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Add simple xattr support to bpffs</title>
<updated>2026-06-06T13:22:44+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2026-06-02T07:40:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9722955b54307e9070994f2382ec06af3d7405e0'/>
<id>9722955b54307e9070994f2382ec06af3d7405e0</id>
<content type='text'>
Add support for extended attributes on bpffs inodes so that user space
and BPF LSM programs can attach metadata, for example, a content hash
or a security label - to a pinned object or directory. BPF LSM or user
space tooling can then uniformly look at this (e.g. security.bpf.*) in
similar way to other fs'es. The store is in-memory and non-persistent:
it lives only for the lifetime of the mount, like everything else in
bpffs. The modelling is similar to tmpfs.

bpffs serves the trusted.* and security.* namespaces; user.* is left
unsupported. As bpffs is FS_USERNS_MOUNT, security.* is reachable by
the unprivileged mounter in a user namespace, and thus we are using
the simple_xattr_set_limited infra there (trusted.* needs global
CAP_SYS_ADMIN).

bpf_fill_super() is open-coded instead of using simple_fill_super(),
because the root inode must now be allocated through bpf_fs_alloc_inode()
i.e. carry the bpf_fs_inode wrapper and come from the right cache -
which requires s_op (and s_xattr) to be installed before the first
inode is created. While at it, also harden s_iflags with SB_I_NOEXEC
and SB_I_NODEV.

bpf_fs_listxattr() is only reachable through the filesystem via
i_op-&gt;listxattr, so the BPF token inode is left untouched. Name-based
fsetxattr()/fgetxattr() on a token fd still work since the get/set
handlers are installed at the superblock.

For security.* namespace, we use simple_xattr_set_limited() but
there was no simple_xattr_add_limited() API yet which was needed
in bpf_fs_initxattrs() to avoid underflows in the accounting. The
symlink target is freed in bpf_free_inode() rather than in
bpf_destroy_inode() so that it is released only after an RCU grace
period, as an RCU path walk following the symlink may still
dereference inode-&gt;i_link in security_inode_follow_link(). Lastly,
the bpf_symlink() allocated the symlink target is switched to
GFP_KERNEL_ACCOUNT, so the string is charged to the caller's memcg.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://patch.msgid.link/20260602074012.416289-1-daniel@iogearbox.net
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add support for extended attributes on bpffs inodes so that user space
and BPF LSM programs can attach metadata, for example, a content hash
or a security label - to a pinned object or directory. BPF LSM or user
space tooling can then uniformly look at this (e.g. security.bpf.*) in
similar way to other fs'es. The store is in-memory and non-persistent:
it lives only for the lifetime of the mount, like everything else in
bpffs. The modelling is similar to tmpfs.

bpffs serves the trusted.* and security.* namespaces; user.* is left
unsupported. As bpffs is FS_USERNS_MOUNT, security.* is reachable by
the unprivileged mounter in a user namespace, and thus we are using
the simple_xattr_set_limited infra there (trusted.* needs global
CAP_SYS_ADMIN).

bpf_fill_super() is open-coded instead of using simple_fill_super(),
because the root inode must now be allocated through bpf_fs_alloc_inode()
i.e. carry the bpf_fs_inode wrapper and come from the right cache -
which requires s_op (and s_xattr) to be installed before the first
inode is created. While at it, also harden s_iflags with SB_I_NOEXEC
and SB_I_NODEV.

bpf_fs_listxattr() is only reachable through the filesystem via
i_op-&gt;listxattr, so the BPF token inode is left untouched. Name-based
fsetxattr()/fgetxattr() on a token fd still work since the get/set
handlers are installed at the superblock.

For security.* namespace, we use simple_xattr_set_limited() but
there was no simple_xattr_add_limited() API yet which was needed
in bpf_fs_initxattrs() to avoid underflows in the accounting. The
symlink target is freed in bpf_free_inode() rather than in
bpf_destroy_inode() so that it is released only after an RCU grace
period, as an RCU path walk following the symlink may still
dereference inode-&gt;i_link in security_inode_follow_link(). Lastly,
the bpf_symlink() allocated the symlink target is switched to
GFP_KERNEL_ACCOUNT, so the string is charged to the caller's memcg.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://patch.msgid.link/20260602074012.416289-1-daniel@iogearbox.net
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: fix UAF by restoring RCU-delayed inode freeing in bpffs</title>
<updated>2026-06-02T04:07:06+00:00</updated>
<author>
<name>Deepanshu Kartikey</name>
<email>kartikey406@gmail.com</email>
</author>
<published>2026-06-02T02:52:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b93c55b4932dd7e32dca8cf34a3443cc87a02906'/>
<id>b93c55b4932dd7e32dca8cf34a3443cc87a02906</id>
<content type='text'>
commit 4f375ade6aa9 ("bpf: Avoid RCU context warning when unpinning
htab with internal structs") moved inode cleanup from -&gt;free_inode()
into -&gt;destroy_inode() to avoid sleeping in RCU context when calling
bpf_any_put(). However this removed the RCU delay on freeing the
inode itself and the cached symlink body (i_link), both of which
can be accessed by RCU pathwalk (pick_link, may_lookup etc.).

This causes a use-after-free when a concurrent unlinkat() drops the
last inode reference and destroy_inode() frees the inode immediately,
while another task is still walking the path in RCU mode and reads
inode-&gt;i_opflags (offset +2) inside current_time() -&gt; is_mgtime().

KASAN reports:
  BUG: KASAN: slab-use-after-free in is_mgtime include/linux/fs.h:2313
  Read of size 2 at addr ffff8880407e4282 (offset +2 = i_opflags)

The rules (per Al Viro):
  -&gt;destroy_inode()  called immediately, can sleep, use for blocking
                     cleanup e.g. bpf_any_put()
  -&gt;free_inode()     called after RCU grace period, use for freeing
                     inode and anything RCU-accessible e.g. i_link

Fix: split the two concerns properly:
  - keep bpf_any_put() in bpf_destroy_inode() since it is blocking
    and needs to run promptly
  - introduce bpf_free_inode() to handle kfree(i_link) and
    free_inode_nonrcu() with proper RCU delay, preventing the UAF

Fixes: 4f375ade6aa9 ("bpf: Avoid RCU context warning when unpinning htab with internal structs")
Reported-by: syzbot+36e50496c8ac4bcde3f9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=36e50496c8ac4bcde3f9
Suggested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/all/20260423043906.GN3518998@ZenIV/
Link: https://lore.kernel.org/all/20260602002607.110866-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey &lt;kartikey406@gmail.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/r/20260602025249.113828-1-kartikey406@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4f375ade6aa9 ("bpf: Avoid RCU context warning when unpinning
htab with internal structs") moved inode cleanup from -&gt;free_inode()
into -&gt;destroy_inode() to avoid sleeping in RCU context when calling
bpf_any_put(). However this removed the RCU delay on freeing the
inode itself and the cached symlink body (i_link), both of which
can be accessed by RCU pathwalk (pick_link, may_lookup etc.).

This causes a use-after-free when a concurrent unlinkat() drops the
last inode reference and destroy_inode() frees the inode immediately,
while another task is still walking the path in RCU mode and reads
inode-&gt;i_opflags (offset +2) inside current_time() -&gt; is_mgtime().

KASAN reports:
  BUG: KASAN: slab-use-after-free in is_mgtime include/linux/fs.h:2313
  Read of size 2 at addr ffff8880407e4282 (offset +2 = i_opflags)

The rules (per Al Viro):
  -&gt;destroy_inode()  called immediately, can sleep, use for blocking
                     cleanup e.g. bpf_any_put()
  -&gt;free_inode()     called after RCU grace period, use for freeing
                     inode and anything RCU-accessible e.g. i_link

Fix: split the two concerns properly:
  - keep bpf_any_put() in bpf_destroy_inode() since it is blocking
    and needs to run promptly
  - introduce bpf_free_inode() to handle kfree(i_link) and
    free_inode_nonrcu() with proper RCU delay, preventing the UAF

Fixes: 4f375ade6aa9 ("bpf: Avoid RCU context warning when unpinning htab with internal structs")
Reported-by: syzbot+36e50496c8ac4bcde3f9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=36e50496c8ac4bcde3f9
Suggested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/all/20260423043906.GN3518998@ZenIV/
Link: https://lore.kernel.org/all/20260602002607.110866-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey &lt;kartikey406@gmail.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/r/20260602025249.113828-1-kartikey406@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Optimize the performance of find_bpffs_btf_enums</title>
<updated>2026-01-14T00:21:36+00:00</updated>
<author>
<name>Donglin Peng</name>
<email>pengdonglin@xiaomi.com</email>
</author>
<published>2026-01-09T13:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=434bcbc837a69baa2720b2ae5baba8b6e36898c0'/>
<id>434bcbc837a69baa2720b2ae5baba8b6e36898c0</id>
<content type='text'>
Currently, vmlinux BTF is unconditionally sorted during
the build phase. The function btf_find_by_name_kind
executes the binary search branch, so find_bpffs_btf_enums
can be optimized by using btf_find_by_name_kind.

Signed-off-by: Donglin Peng &lt;pengdonglin@xiaomi.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20260109130003.3313716-10-dolinux.peng@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, vmlinux BTF is unconditionally sorted during
the build phase. The function btf_find_by_name_kind
executes the binary search branch, so find_bpffs_btf_enums
can be optimized by using btf_find_by_name_kind.

Signed-off-by: Donglin Peng &lt;pengdonglin@xiaomi.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20260109130003.3313716-10-dolinux.peng@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>convert bpf</title>
<updated>2025-11-16T06:35:03+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2024-05-09T16:26:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c5055286f88fdedcff9c974b4b85fbe221bf0972'/>
<id>c5055286f88fdedcff9c974b4b85fbe221bf0972</id>
<content type='text'>
object creation goes through the normal VFS paths or approximation
thereof (user_path_create()/done_path_create() in case of bpf_obj_do_pin(),
open-coded simple_{start,done}_creating() in bpf_iter_link_pin_kernel()
at mount time), removals go entirely through the normal VFS paths (and
-&gt;unlink() is simple_unlink() there).

Enough to have bpf_dentry_finalize() use d_make_persistent() instead
of dget() and we are done.

Convert bpf_iter_link_pin_kernel() to simple_{start,done}_creating(),
while we are at it.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
object creation goes through the normal VFS paths or approximation
thereof (user_path_create()/done_path_create() in case of bpf_obj_do_pin(),
open-coded simple_{start,done}_creating() in bpf_iter_link_pin_kernel()
at mount time), removals go entirely through the normal VFS paths (and
-&gt;unlink() is simple_unlink() there).

Enough to have bpf_dentry_finalize() use d_make_persistent() instead
of dget() and we are done.

Convert bpf_iter_link_pin_kernel() to simple_{start,done}_creating(),
while we are at it.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Avoid RCU context warning when unpinning htab with internal structs</title>
<updated>2025-10-10T17:10:08+00:00</updated>
<author>
<name>KaFai Wan</name>
<email>kafai.wan@linux.dev</email>
</author>
<published>2025-10-08T10:26:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4f375ade6aa9f37fd72d7a78682f639772089eed'/>
<id>4f375ade6aa9f37fd72d7a78682f639772089eed</id>
<content type='text'>
When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
 BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
 ...

The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses -&gt;free_inode() which schedules cleanup via
   call_rcu(), deferring the actual freeing to an RCU callback that
   executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
   htab_map_free_internal_structs() is invoked, which includes
   cond_resched() or cond_resched_rcu() calls to yield the CPU during
   potentially long operations.

However, cond_resched() or cond_resched_rcu() cannot be safely called from
atomic RCU softirq context, leading to the BUG warning when attempting
to reschedule.

Fix this by changing from -&gt;free_inode() to -&gt;destroy_inode() and rename
bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link).
This allows direct inode freeing without RCU callback scheduling,
avoiding the invalid context warning.

Reported-by: Le Chen &lt;tom2cat@sjtu.edu.cn&gt;
Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
Suggested-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: KaFai Wan &lt;kafai.wan@linux.dev&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20251008102628.808045-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
 BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
 ...

The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses -&gt;free_inode() which schedules cleanup via
   call_rcu(), deferring the actual freeing to an RCU callback that
   executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
   htab_map_free_internal_structs() is invoked, which includes
   cond_resched() or cond_resched_rcu() calls to yield the CPU during
   potentially long operations.

However, cond_resched() or cond_resched_rcu() cannot be safely called from
atomic RCU softirq context, leading to the BUG warning when attempting
to reschedule.

Fix this by changing from -&gt;free_inode() to -&gt;destroy_inode() and rename
bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link).
This allows direct inode freeing without RCU callback scheduling,
avoiding the invalid context warning.

Reported-by: Le Chen &lt;tom2cat@sjtu.edu.cn&gt;
Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
Suggested-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: KaFai Wan &lt;kafai.wan@linux.dev&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20251008102628.808045-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2025-09-29T18:55:15+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-09-29T18:55:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=449c2b302c8e200558619821ced46cc13cdb9aa6'/>
<id>449c2b302c8e200558619821ced46cc13cdb9aa6</id>
<content type='text'>
Pull vfs async directory updates from Christian Brauner:
 "This contains further preparatory changes for the asynchronous directory
  locking scheme:

   - Add lookup_one_positive_killable() which allows overlayfs to
     perform lookup that won't block on a fatal signal

   - Unify the mount idmap handling in struct renamedata as a rename can
     only happen within a single mount

   - Introduce kern_path_parent() for audit which sets the path to the
     parent and returns a dentry for the target without holding any
     locks on return

   - Rename kern_path_locked() as it is only used to prepare for the
     removal of an object from the filesystem:

	kern_path_locked()    =&gt; start_removing_path()
	kern_path_create()    =&gt; start_creating_path()
	user_path_create()    =&gt; start_creating_user_path()
	user_path_locked_at() =&gt; start_removing_user_path_at()
	done_path_create()    =&gt; end_creating_path()
	NA                    =&gt; end_removing_path()"

* tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  debugfs: rename start_creating() to debugfs_start_creating()
  VFS: rename kern_path_locked() and related functions.
  VFS/audit: introduce kern_path_parent() for audit
  VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
  VFS: discard err2 in filename_create()
  VFS/ovl: add lookup_one_positive_killable()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs async directory updates from Christian Brauner:
 "This contains further preparatory changes for the asynchronous directory
  locking scheme:

   - Add lookup_one_positive_killable() which allows overlayfs to
     perform lookup that won't block on a fatal signal

   - Unify the mount idmap handling in struct renamedata as a rename can
     only happen within a single mount

   - Introduce kern_path_parent() for audit which sets the path to the
     parent and returns a dentry for the target without holding any
     locks on return

   - Rename kern_path_locked() as it is only used to prepare for the
     removal of an object from the filesystem:

	kern_path_locked()    =&gt; start_removing_path()
	kern_path_create()    =&gt; start_creating_path()
	user_path_create()    =&gt; start_creating_user_path()
	user_path_locked_at() =&gt; start_removing_user_path_at()
	done_path_create()    =&gt; end_creating_path()
	NA                    =&gt; end_removing_path()"

* tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  debugfs: rename start_creating() to debugfs_start_creating()
  VFS: rename kern_path_locked() and related functions.
  VFS/audit: introduce kern_path_parent() for audit
  VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
  VFS: discard err2 in filename_create()
  VFS/ovl: add lookup_one_positive_killable()
</pre>
</div>
</content>
</entry>
<entry>
<title>VFS: rename kern_path_locked() and related functions.</title>
<updated>2025-09-23T10:37:36+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neil@brown.name</email>
</author>
<published>2025-09-22T04:29:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3d18f80ce181ba27f37d0ec1c550b22acb01dd49'/>
<id>3d18f80ce181ba27f37d0ec1c550b22acb01dd49</id>
<content type='text'>
kern_path_locked() is now only used to prepare for removing an object
from the filesystem (and that is the only credible reason for wanting a
positive locked dentry).  Thus it corresponds to kern_path_create() and
so should have a corresponding name.

Unfortunately the name "kern_path_create" is somewhat misleading as it
doesn't actually create anything.  The recently added
simple_start_creating() provides a better pattern I believe.  The
"start" can be matched with "end" to bracket the creating or removing.

So this patch changes names:

 kern_path_locked -&gt; start_removing_path
 kern_path_create -&gt; start_creating_path
 user_path_create -&gt; start_creating_user_path
 user_path_locked_at -&gt; start_removing_user_path_at
 done_path_create -&gt; end_creating_path

and also introduces end_removing_path() which is identical to
end_creating_path().

__start_removing_path (which was __kern_path_locked) is enhanced to
call mnt_want_write() for consistency with the start_creating_path().

Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neil@brown.name&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kern_path_locked() is now only used to prepare for removing an object
from the filesystem (and that is the only credible reason for wanting a
positive locked dentry).  Thus it corresponds to kern_path_create() and
so should have a corresponding name.

Unfortunately the name "kern_path_create" is somewhat misleading as it
doesn't actually create anything.  The recently added
simple_start_creating() provides a better pattern I believe.  The
"start" can be matched with "end" to bracket the creating or removing.

So this patch changes names:

 kern_path_locked -&gt; start_removing_path
 kern_path_create -&gt; start_creating_path
 user_path_create -&gt; start_creating_user_path
 user_path_locked_at -&gt; start_removing_user_path_at
 done_path_create -&gt; end_creating_path

and also introduces end_removing_path() which is identical to
end_creating_path().

__start_removing_path (which was __kern_path_locked) is enhanced to
call mnt_want_write() for consistency with the start_creating_path().

Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neil@brown.name&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
