<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/um/kernel/skas, branch master</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>um: Replace strncpy() with strnlen()+memcpy_and_pad() in strncpy_chunk_from_user()</title>
<updated>2026-03-27T07:21:09+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-03-23T17:17:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8aae2da6104ab98799b203c10cb3e0bd719fe02b'/>
<id>8aae2da6104ab98799b203c10cb3e0bd719fe02b</id>
<content type='text'>
Replace the deprecated[1] strncpy() with strnlen() on the source
followed by memcpy_and_pad().

This function is a chunk callback for UML's strncpy_from_user()
implementation, called by buffer_op() to process userspace memory one
page at a time. The source is a kernel-mapped userspace address that
is not guaranteed to be NUL-terminated; "len" bounds how many bytes
to read from it.

By measuring the source string length first with strnlen(), we avoid
reading past the NUL terminator in the source. memcpy_and_pad() then
copies the string content and zero-fills the remainder of the chunk,
preserving the original strncpy() behavior exactly: copy up to the
first NUL, then pad with zeros to the full length.

strtomem_pad() would be the idiomatic helper for this strnlen() +
memcpy_and_pad() pattern, but it requires a compile-time-determinable
destination size (via ARRAY_SIZE()). Here the destination is a char *
into a caller-provided buffer and the chunk length is a runtime value,
so the explicit two-step is necessary.

No behavioral change: the same bytes are written to the destination
(string content followed by zero padding), the pointer advances by
the same amount, and the NUL-found return condition is unchanged.

Link: https://github.com/KSPP/linux/issues/90 [1]
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20260323171713.work.839-kees@kernel.org
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the deprecated[1] strncpy() with strnlen() on the source
followed by memcpy_and_pad().

This function is a chunk callback for UML's strncpy_from_user()
implementation, called by buffer_op() to process userspace memory one
page at a time. The source is a kernel-mapped userspace address that
is not guaranteed to be NUL-terminated; "len" bounds how many bytes
to read from it.

By measuring the source string length first with strnlen(), we avoid
reading past the NUL terminator in the source. memcpy_and_pad() then
copies the string content and zero-fills the remainder of the chunk,
preserving the original strncpy() behavior exactly: copy up to the
first NUL, then pad with zeros to the full length.

strtomem_pad() would be the idiomatic helper for this strnlen() +
memcpy_and_pad() pattern, but it requires a compile-time-determinable
destination size (via ARRAY_SIZE()). Here the destination is a char *
into a caller-provided buffer and the chunk length is a runtime value,
so the explicit two-step is necessary.

No behavioral change: the same bytes are written to the destination
(string content followed by zero padding), the pointer advances by
the same amount, and the NUL-found return condition is unchanged.

Link: https://github.com/KSPP/linux/issues/90 [1]
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20260323171713.work.839-kees@kernel.org
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: fix address-of CMSG_DATA() rvalue in stub</title>
<updated>2026-03-21T09:41:44+00:00</updated>
<author>
<name>Marcel W. Wysocki</name>
<email>maci.stgn@gmail.com</email>
</author>
<published>2026-02-15T14:28:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4076f7329832074196e050def49d22265fce2021'/>
<id>4076f7329832074196e050def49d22265fce2021</id>
<content type='text'>
The UML stub takes the address of CMSG_DATA(fd_msg):

    fd_map = (void *)&amp;CMSG_DATA(fd_msg);

CMSG_DATA() is specified by POSIX to return unsigned char *.  Taking
its address is semantically wrong -- the intent is to get a pointer
to the control message data, which is exactly what CMSG_DATA()
already returns.

This happens to compile with glibc because glibc's primary
CMSG_DATA definition accesses a flexible array member:

    #define CMSG_DATA(cmsg) ((cmsg)-&gt;__cmsg_data)

An array lvalue can have its address taken, and &amp;array yields the
same address as array.  However, glibc also has an alternative
definition that uses pointer arithmetic (returning an rvalue), and
musl's definition always uses pointer arithmetic:

    /* musl */
    #define CMSG_DATA(cmsg) \
        ((unsigned char *)(((struct cmsghdr *)(cmsg)) + 1))

Taking the address of an rvalue is a hard error in C, so the
current code fails to compile with musl libc.

Remove the erroneous &amp; operator.  The resulting code is correct
regardless of the CMSG_DATA implementation -- it simply assigns the
data pointer, which is what the subsequent code (fd_map[--num_fds])
expects.

No functional change with glibc; fixes the build with musl.

Signed-off-by: Marcel W. Wysocki &lt;maci.stgn@gmail.com&gt;
Link: https://patch.msgid.link/20260215142803.1455757-1-maci.stgn@gmail.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The UML stub takes the address of CMSG_DATA(fd_msg):

    fd_map = (void *)&amp;CMSG_DATA(fd_msg);

CMSG_DATA() is specified by POSIX to return unsigned char *.  Taking
its address is semantically wrong -- the intent is to get a pointer
to the control message data, which is exactly what CMSG_DATA()
already returns.

This happens to compile with glibc because glibc's primary
CMSG_DATA definition accesses a flexible array member:

    #define CMSG_DATA(cmsg) ((cmsg)-&gt;__cmsg_data)

An array lvalue can have its address taken, and &amp;array yields the
same address as array.  However, glibc also has an alternative
definition that uses pointer arithmetic (returning an rvalue), and
musl's definition always uses pointer arithmetic:

    /* musl */
    #define CMSG_DATA(cmsg) \
        ((unsigned char *)(((struct cmsghdr *)(cmsg)) + 1))

Taking the address of an rvalue is a hard error in C, so the
current code fails to compile with musl libc.

Remove the erroneous &amp; operator.  The resulting code is correct
regardless of the CMSG_DATA implementation -- it simply assigns the
data pointer, which is what the subsequent code (fd_map[--num_fds])
expects.

No functional change with glibc; fixes the build with musl.

Signed-off-by: Marcel W. Wysocki &lt;maci.stgn@gmail.com&gt;
Link: https://patch.msgid.link/20260215142803.1455757-1-maci.stgn@gmail.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Fix incorrect __acquires/__releases annotations</title>
<updated>2026-01-05T15:43:32+00:00</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2025-12-19T15:40:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4f109baeea4dc6fa1426ab559159d3bb35e05343'/>
<id>4f109baeea4dc6fa1426ab559159d3bb35e05343</id>
<content type='text'>
With Clang's context analysis, the compiler is a bit more strict about
what goes into the __acquires/__releases annotations and can't refer to
non-existent variables.

On an UM build, mm_id.h is transitively included into mm_types.h, and we
can observe the following error (if context analysis is enabled in e.g.
stackdepot.c):

   In file included from lib/stackdepot.c:17:
   In file included from include/linux/debugfs.h:15:
   In file included from include/linux/fs.h:5:
   In file included from include/linux/fs/super.h:5:
   In file included from include/linux/fs/super_types.h:7:
   In file included from include/linux/list_lru.h:14:
   In file included from include/linux/xarray.h:16:
   In file included from include/linux/gfp.h:7:
   In file included from include/linux/mmzone.h:22:
   In file included from include/linux/mm_types.h:26:
   In file included from arch/um/include/asm/mmu.h:12:
&gt;&gt; arch/um/include/shared/skas/mm_id.h:24:54: error: use of undeclared identifier 'turnstile'
      24 | void enter_turnstile(struct mm_id *mm_id) __acquires(turnstile);
         |                                                      ^~~~~~~~~
   arch/um/include/shared/skas/mm_id.h:25:53: error: use of undeclared identifier 'turnstile'
      25 | void exit_turnstile(struct mm_id *mm_id) __releases(turnstile);
         |                                                     ^~~~~~~~~

One (discarded) option was to use token_context_lock(turnstile) to just
define a token with the already used name, but that would not allow the
compiler to distinguish between different mm_id-dependent instances.

Another constraint is that struct mm_id is only declared and incomplete
in the header, so even if we tried to construct an expression to get to
the mutex instance, this would fail (including more headers transitively
everywhere should also be avoided).

Instead, just declare an mm_id-dependent helper to return the mutex, and
use the mm_id-dependent call expression in the __acquires/__releases
attributes; the compiler will consider the identity of the mutex to be
the call expression. Then using __get_turnstile() in the lock/unlock
wrappers (with context analysis enabled for mmu.c) the compiler will be
able to verify the implementation of the wrappers as-is.

We leave context analysis disabled in arch/um/kernel/skas/ for now. This
change is a preparatory change to allow enabling context analysis in
subsystems that include any of the above headers.

No functional change intended.

Closes: https://lore.kernel.org/oe-kbuild-all/202512171220.vHlvhpCr-lkp@intel.com/
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20251219154418.3592607-23-elver@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With Clang's context analysis, the compiler is a bit more strict about
what goes into the __acquires/__releases annotations and can't refer to
non-existent variables.

On an UM build, mm_id.h is transitively included into mm_types.h, and we
can observe the following error (if context analysis is enabled in e.g.
stackdepot.c):

   In file included from lib/stackdepot.c:17:
   In file included from include/linux/debugfs.h:15:
   In file included from include/linux/fs.h:5:
   In file included from include/linux/fs/super.h:5:
   In file included from include/linux/fs/super_types.h:7:
   In file included from include/linux/list_lru.h:14:
   In file included from include/linux/xarray.h:16:
   In file included from include/linux/gfp.h:7:
   In file included from include/linux/mmzone.h:22:
   In file included from include/linux/mm_types.h:26:
   In file included from arch/um/include/asm/mmu.h:12:
&gt;&gt; arch/um/include/shared/skas/mm_id.h:24:54: error: use of undeclared identifier 'turnstile'
      24 | void enter_turnstile(struct mm_id *mm_id) __acquires(turnstile);
         |                                                      ^~~~~~~~~
   arch/um/include/shared/skas/mm_id.h:25:53: error: use of undeclared identifier 'turnstile'
      25 | void exit_turnstile(struct mm_id *mm_id) __releases(turnstile);
         |                                                     ^~~~~~~~~

One (discarded) option was to use token_context_lock(turnstile) to just
define a token with the already used name, but that would not allow the
compiler to distinguish between different mm_id-dependent instances.

Another constraint is that struct mm_id is only declared and incomplete
in the header, so even if we tried to construct an expression to get to
the mutex instance, this would fail (including more headers transitively
everywhere should also be avoided).

Instead, just declare an mm_id-dependent helper to return the mutex, and
use the mm_id-dependent call expression in the __acquires/__releases
attributes; the compiler will consider the identity of the mutex to be
the call expression. Then using __get_turnstile() in the lock/unlock
wrappers (with context analysis enabled for mmu.c) the compiler will be
able to verify the implementation of the wrappers as-is.

We leave context analysis disabled in arch/um/kernel/skas/ for now. This
change is a preparatory change to allow enabling context analysis in
subsystems that include any of the above headers.

No functional change intended.

Closes: https://lore.kernel.org/oe-kbuild-all/202512171220.vHlvhpCr-lkp@intel.com/
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20251219154418.3592607-23-elver@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Add initial SMP support</title>
<updated>2025-10-27T15:41:15+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-10-27T00:18:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1e4ee5135d814fe4785890790cec81c3132888fb'/>
<id>1e4ee5135d814fe4785890790cec81c3132888fb</id>
<content type='text'>
Add initial symmetric multi-processing (SMP) support to UML. With
this support enabled, users can tell UML to start multiple virtual
processors, each represented as a separate host thread.

In UML, kthreads and normal threads (when running in kernel mode)
can be scheduled and executed simultaneously on different virtual
processors. However, the userspace code of normal threads still
runs within their respective single-threaded stubs.

That is, SMP support is currently available both within the kernel
and across different processes, but still remains limited within
threads of the same process in userspace.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20251027001815.1666872-6-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add initial symmetric multi-processing (SMP) support to UML. With
this support enabled, users can tell UML to start multiple virtual
processors, each represented as a separate host thread.

In UML, kthreads and normal threads (when running in kernel mode)
can be scheduled and executed simultaneously on different virtual
processors. However, the userspace code of normal threads still
runs within their respective single-threaded stubs.

That is, SMP support is currently available both within the kernel
and across different processes, but still remains limited within
threads of the same process in userspace.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20251027001815.1666872-6-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Stop tracking stub's PID via userspace_pid[]</title>
<updated>2025-07-13T17:42:49+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-07-11T06:50:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f7e9077a1649877d4b33ce91d58711d393a63c1b'/>
<id>f7e9077a1649877d4b33ce91d58711d393a63c1b</id>
<content type='text'>
The PID of the stub process can be obtained from current_mm_id().
There is no need to track it via userspace_pid[]. Stop doing that
to simplify the code.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250711065021.2535362-4-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The PID of the stub process can be obtained from current_mm_id().
There is no need to track it via userspace_pid[]. Stop doing that
to simplify the code.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250711065021.2535362-4-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Make mm_list and mm_list_lock static</title>
<updated>2025-07-11T06:49:18+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-07-08T09:04:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=409a0c00c490d3b7c077e316a9261462241acda7'/>
<id>409a0c00c490d3b7c077e316a9261462241acda7</id>
<content type='text'>
They are only used within mmu.c. Make them static.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250708090403.1067440-3-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
They are only used within mmu.c. Make them static.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250708090403.1067440-3-tiwei.bie@linux.dev
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: simplify syscall header files</title>
<updated>2025-07-11T06:49:02+00:00</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2025-07-04T12:12:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ac1ad16f10523c2c60aef0abeb8a850ea6d06ced'/>
<id>ac1ad16f10523c2c60aef0abeb8a850ea6d06ced</id>
<content type='text'>
Since Thomas's recent commit 2af10530639b ("um/x86: Add
system call table to header file") , we now have two
extern declarations of the syscall table, one internal
and one external, and they don't even match on 32-bit.
Clean this up and remove all the extra code.

Reviewed-by: Thomas Weißschuh &lt;thomas.weissschuh@linutronix.de&gt;
Link: https://patch.msgid.link/20250704141243.a68366f6acc3.If8587a4aafdb90644fc6d0b2f5e31a2d1887915f@changeid
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since Thomas's recent commit 2af10530639b ("um/x86: Add
system call table to header file") , we now have two
extern declarations of the syscall table, one internal
and one external, and they don't even match on 32-bit.
Clean this up and remove all the extra code.

Reviewed-by: Thomas Weißschuh &lt;thomas.weissschuh@linutronix.de&gt;
Link: https://patch.msgid.link/20250704141243.a68366f6acc3.If8587a4aafdb90644fc6d0b2f5e31a2d1887915f@changeid
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: pass FD for memory operations when needed</title>
<updated>2025-06-02T14:20:10+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2025-06-02T13:00:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e92e2552858142b60238b9828d802f128e4acccd'/>
<id>e92e2552858142b60238b9828d802f128e4acccd</id>
<content type='text'>
Instead of always sharing the FDs with the userspace process, only hand
over the FDs needed for mmap when required. The idea is that userspace
might be able to force the stub into executing an mmap syscall, however,
it will not be able to manipulate the control flow sufficiently to have
access to an FD that would allow mapping arbitrary memory.

Security wise, we need to be sure that only the expected syscalls are
executed after the kernel sends FDs through the socket. This is
currently not the case, as userspace can trivially jump to the
rt_sigreturn syscall instruction to execute any syscall that the stub is
permitted to do. With this, it can trick the kernel to send the FD,
which in turn allows userspace to freely map any physical memory.

As such, this is currently *not* secure. However, in principle the
approach should be fine with a more strict SECCOMP filter and a careful
review of the stub control flow (as userspace can prepare a stack). With
some care, it is likely possible to extend the security model to SMP if
desired.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of always sharing the FDs with the userspace process, only hand
over the FDs needed for mmap when required. The idea is that userspace
might be able to force the stub into executing an mmap syscall, however,
it will not be able to manipulate the control flow sufficiently to have
access to an FD that would allow mapping arbitrary memory.

Security wise, we need to be sure that only the expected syscalls are
executed after the kernel sends FDs through the socket. This is
currently not the case, as userspace can trivially jump to the
rt_sigreturn syscall instruction to execute any syscall that the stub is
permitted to do. With this, it can trick the kernel to send the FD,
which in turn allows userspace to freely map any physical memory.

As such, this is currently *not* secure. However, in principle the
approach should be fine with a more strict SECCOMP filter and a careful
review of the stub control flow (as userspace can prepare a stack). With
some care, it is likely possible to extend the security model to SMP if
desired.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Implement kernel side of SECCOMP based process handling</title>
<updated>2025-06-02T13:17:19+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=406d17c6c370a33cfb54067d9e205305293d4604'/>
<id>406d17c6c370a33cfb54067d9e205305293d4604</id>
<content type='text'>
This adds the kernel side of the seccomp based process handling.

Co-authored-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the kernel side of the seccomp based process handling.

Co-authored-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Track userspace children dying in SECCOMP mode</title>
<updated>2025-06-02T13:17:19+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8420e08fe3a594b6ffa07705ac270faa2ed452c5'/>
<id>8420e08fe3a594b6ffa07705ac270faa2ed452c5</id>
<content type='text'>
When in seccomp mode, we would hang forever on the futex if a child has
died unexpectedly. In contrast, ptrace mode will notice it and kill the
corresponding thread when it fails to run it.

Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping
an (internal) list of all MMs. In the IRQ handler, find the affected MM
and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN.

This, together with futex returning -EINTR after the signal is
sufficient to implement a race-free detection of a child dying.

Note that this also enables IRQ handling while starting a userspace
process. This should be safe and SECCOMP requires the IRQ in case the
process does not come up properly.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When in seccomp mode, we would hang forever on the futex if a child has
died unexpectedly. In contrast, ptrace mode will notice it and kill the
corresponding thread when it fails to run it.

Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping
an (internal) list of all MMs. In the IRQ handler, find the affected MM
and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN.

This, together with futex returning -EINTR after the signal is
sufficient to implement a race-free detection of a child dying.

Note that this also enables IRQ handling while starting a userspace
process. This should be safe and SECCOMP requires the IRQ in case the
process does not come up properly.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
