<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/um/os-Linux, branch v6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>um: fix SECCOMP 32bit xstate register restore</title>
<updated>2025-06-04T09:40:36+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2025-06-04T08:17:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=942349413a49670e8bed246e2185fd3a053227be'/>
<id>942349413a49670e8bed246e2185fd3a053227be</id>
<content type='text'>
There was a typo that caused the extended FP state to be copied into the
wrong location on 32 bit. On 32 bit we only store the xstate internally
as that already contains everything. However, for compatibility, the
mcontext on 32 bit first contains the legacy FP state and then the
xstate.

The code copied the xstate on top of the legacy FP state instead of
using the correct offset. This offset was already calculated in the
xstate_* variables, so simply switch to those to fix the problem.

With this SECCOMP mode works on 32 bit, so lift the restriction.

Fixes: b1e1bd2e6943 ("um: Add helper functions to get/set state for SECCOMP")
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250604081705.934112-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There was a typo that caused the extended FP state to be copied into the
wrong location on 32 bit. On 32 bit we only store the xstate internally
as that already contains everything. However, for compatibility, the
mcontext on 32 bit first contains the legacy FP state and then the
xstate.

The code copied the xstate on top of the legacy FP state instead of
using the correct offset. This offset was already calculated in the
xstate_* variables, so simply switch to those to fix the problem.

With this SECCOMP mode works on 32 bit, so lift the restriction.

Fixes: b1e1bd2e6943 ("um: Add helper functions to get/set state for SECCOMP")
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250604081705.934112-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: pass FD for memory operations when needed</title>
<updated>2025-06-02T14:20:10+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2025-06-02T13:00:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e92e2552858142b60238b9828d802f128e4acccd'/>
<id>e92e2552858142b60238b9828d802f128e4acccd</id>
<content type='text'>
Instead of always sharing the FDs with the userspace process, only hand
over the FDs needed for mmap when required. The idea is that userspace
might be able to force the stub into executing an mmap syscall, however,
it will not be able to manipulate the control flow sufficiently to have
access to an FD that would allow mapping arbitrary memory.

Security wise, we need to be sure that only the expected syscalls are
executed after the kernel sends FDs through the socket. This is
currently not the case, as userspace can trivially jump to the
rt_sigreturn syscall instruction to execute any syscall that the stub is
permitted to do. With this, it can trick the kernel to send the FD,
which in turn allows userspace to freely map any physical memory.

As such, this is currently *not* secure. However, in principle the
approach should be fine with a more strict SECCOMP filter and a careful
review of the stub control flow (as userspace can prepare a stack). With
some care, it is likely possible to extend the security model to SMP if
desired.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of always sharing the FDs with the userspace process, only hand
over the FDs needed for mmap when required. The idea is that userspace
might be able to force the stub into executing an mmap syscall, however,
it will not be able to manipulate the control flow sufficiently to have
access to an FD that would allow mapping arbitrary memory.

Security wise, we need to be sure that only the expected syscalls are
executed after the kernel sends FDs through the socket. This is
currently not the case, as userspace can trivially jump to the
rt_sigreturn syscall instruction to execute any syscall that the stub is
permitted to do. With this, it can trick the kernel to send the FD,
which in turn allows userspace to freely map any physical memory.

As such, this is currently *not* secure. However, in principle the
approach should be fine with a more strict SECCOMP filter and a careful
review of the stub control flow (as userspace can prepare a stack). With
some care, it is likely possible to extend the security model to SMP if
desired.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Add SECCOMP support detection and initialization</title>
<updated>2025-06-02T14:20:01+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=beddc9fb1cb161e1bf779b180750b648ff9690c7'/>
<id>beddc9fb1cb161e1bf779b180750b648ff9690c7</id>
<content type='text'>
This detects seccomp support, sets the global using_seccomp variable and
initilizes the exec registers. The support is only enabled if the
seccomp= kernel parameter is set to either "on" or "auto". With "auto" a
fallback to ptrace mode will happen if initialization failed.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-7-benjamin@sipsolutions.net
[extend help with Kconfig text from v2, use exit syscall instead of libc,
 remove unneeded mctx_offset assignment, disable on 32-bit for now]
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This detects seccomp support, sets the global using_seccomp variable and
initilizes the exec registers. The support is only enabled if the
seccomp= kernel parameter is set to either "on" or "auto". With "auto" a
fallback to ptrace mode will happen if initialization failed.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-7-benjamin@sipsolutions.net
[extend help with Kconfig text from v2, use exit syscall instead of libc,
 remove unneeded mctx_offset assignment, disable on 32-bit for now]
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Implement kernel side of SECCOMP based process handling</title>
<updated>2025-06-02T13:17:19+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=406d17c6c370a33cfb54067d9e205305293d4604'/>
<id>406d17c6c370a33cfb54067d9e205305293d4604</id>
<content type='text'>
This adds the kernel side of the seccomp based process handling.

Co-authored-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the kernel side of the seccomp based process handling.

Co-authored-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-6-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Track userspace children dying in SECCOMP mode</title>
<updated>2025-06-02T13:17:19+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8420e08fe3a594b6ffa07705ac270faa2ed452c5'/>
<id>8420e08fe3a594b6ffa07705ac270faa2ed452c5</id>
<content type='text'>
When in seccomp mode, we would hang forever on the futex if a child has
died unexpectedly. In contrast, ptrace mode will notice it and kill the
corresponding thread when it fails to run it.

Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping
an (internal) list of all MMs. In the IRQ handler, find the affected MM
and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN.

This, together with futex returning -EINTR after the signal is
sufficient to implement a race-free detection of a child dying.

Note that this also enables IRQ handling while starting a userspace
process. This should be safe and SECCOMP requires the IRQ in case the
process does not come up properly.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When in seccomp mode, we would hang forever on the futex if a child has
died unexpectedly. In contrast, ptrace mode will notice it and kill the
corresponding thread when it fails to run it.

Fix this issue using a new IRQ that is fired after a SIGCHLD and keeping
an (internal) list of all MMs. In the IRQ handler, find the affected MM
and set its PID to -1 as well as the futex variable to FUTEX_IN_KERN.

This, together with futex returning -EINTR after the signal is
sufficient to implement a race-free detection of a child dying.

Note that this also enables IRQ handling while starting a userspace
process. This should be safe and SECCOMP requires the IRQ in case the
process does not come up properly.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20250602130052.545733-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Move faultinfo extraction into userspace routine</title>
<updated>2025-06-02T13:17:19+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2025-06-02T13:00:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=247ed9e4a6869f3bf07bffd277a341a6833abdbc'/>
<id>247ed9e4a6869f3bf07bffd277a341a6833abdbc</id>
<content type='text'>
The segv handler is called slightly differently depending on whether
PTRACE_FULL_FAULTINFO is set or not (32bit vs. 64bit). The only
difference is that we don't try to pass the registers and instruction
pointer to the segv handler.

It would be good to either document or remove the difference, but I do
not know why this difference exists. And, passing NULL can even result
in a crash.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20250602130052.545733-2-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The segv handler is called slightly differently depending on whether
PTRACE_FULL_FAULTINFO is set or not (32bit vs. 64bit). The only
difference is that we don't try to pass the registers and instruction
pointer to the segv handler.

It would be good to either document or remove the difference, but I do
not know why this difference exists. And, passing NULL can even result
in a crash.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20250602130052.545733-2-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Fix tgkill compile error on old host OSes</title>
<updated>2025-06-02T09:22:55+00:00</updated>
<author>
<name>Yongting Lin</name>
<email>linyongting@gmail.com</email>
</author>
<published>2025-05-27T15:12:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fd054188999ff19746cc09f4e0f196a113964db9'/>
<id>fd054188999ff19746cc09f4e0f196a113964db9</id>
<content type='text'>
tgkill is a quite old syscall since kernel 2.5.75, but unfortunately glibc
doesn't support it before 2.30. Thus some systems fail to compile the
latest UserMode Linux.

Here is the compile error I encountered when I tried to compile UML in
my system shipped with glibc-2.28.

    CALL    scripts/checksyscalls.sh
    CC      arch/um/os-Linux/sigio.o
  In file included from arch/um/os-Linux/sigio.c:17:
  arch/um/os-Linux/sigio.c: In function ‘write_sigio_thread’:
  arch/um/os-Linux/sigio.c:49:19: error: implicit declaration of function ‘tgkill’; did you mean ‘kill’? [-Werror=implicit-function-declaration]
     CATCH_EINTR(r = tgkill(pid, pid, SIGIO));
                     ^~~~~~
  ./arch/um/include/shared/os.h:21:48: note: in definition of macro ‘CATCH_EINTR’
  #define CATCH_EINTR(expr) while ((errno = 0, ((expr) &lt; 0)) &amp;&amp; (errno == EINTR))
                                                ^~~~
  cc1: some warnings being treated as errors

Fix it by Replacing glibc call with raw syscall.

Fixes: 33c9da5dfb18 ("um: Rewrite the sigio workaround based on epoll and tgkill")
Signed-off-by: Yongting Lin &lt;linyongting@gmail.com&gt;
Link: https://patch.msgid.link/20250527151222.40371-1-linyongting@gmail.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tgkill is a quite old syscall since kernel 2.5.75, but unfortunately glibc
doesn't support it before 2.30. Thus some systems fail to compile the
latest UserMode Linux.

Here is the compile error I encountered when I tried to compile UML in
my system shipped with glibc-2.28.

    CALL    scripts/checksyscalls.sh
    CC      arch/um/os-Linux/sigio.o
  In file included from arch/um/os-Linux/sigio.c:17:
  arch/um/os-Linux/sigio.c: In function ‘write_sigio_thread’:
  arch/um/os-Linux/sigio.c:49:19: error: implicit declaration of function ‘tgkill’; did you mean ‘kill’? [-Werror=implicit-function-declaration]
     CATCH_EINTR(r = tgkill(pid, pid, SIGIO));
                     ^~~~~~
  ./arch/um/include/shared/os.h:21:48: note: in definition of macro ‘CATCH_EINTR’
  #define CATCH_EINTR(expr) while ((errno = 0, ((expr) &lt; 0)) &amp;&amp; (errno == EINTR))
                                                ^~~~
  cc1: some warnings being treated as errors

Fix it by Replacing glibc call with raw syscall.

Fixes: 33c9da5dfb18 ("um: Rewrite the sigio workaround based on epoll and tgkill")
Signed-off-by: Yongting Lin &lt;linyongting@gmail.com&gt;
Link: https://patch.msgid.link/20250527151222.40371-1-linyongting@gmail.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Remove obsolete legacy network transports</title>
<updated>2025-05-05T08:26:59+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-05-03T05:17:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=65eaac591b752042006d3a79c0cfba47e2a9aaac'/>
<id>65eaac591b752042006d3a79c0cfba47e2a9aaac</id>
<content type='text'>
These legacy network transports were marked as obsolete in commit
40814b98a570 ("um: Mark non-vector net transports as obsolete").
More than five years have passed since then. Remove these network
transports to reduce the maintenance burden.

Suggested-by: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Acked-By: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Link: https://patch.msgid.link/20250503051710.3286595-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These legacy network transports were marked as obsolete in commit
40814b98a570 ("um: Mark non-vector net transports as obsolete").
More than five years have passed since then. Remove these network
transports to reduce the maintenance burden.

Suggested-by: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Acked-By: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Link: https://patch.msgid.link/20250503051710.3286595-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Rewrite the sigio workaround based on epoll and tgkill</title>
<updated>2025-03-20T08:28:44+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-03-15T16:19:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33c9da5dfb18c2ff5a88d01aca2cf253cd0ac3bc'/>
<id>33c9da5dfb18c2ff5a88d01aca2cf253cd0ac3bc</id>
<content type='text'>
The existing sigio workaround implementation removes FDs from the
poll when events are triggered, requiring users to re-add them via
add_sigio_fd() after processing. This introduces a potential race
condition between FD removal in write_sigio_thread() and next_poll
update in __add_sigio_fd(), and is inefficient due to frequent FD
removal and re-addition. Rewrite the implementation based on epoll
and tgkill for improved efficiency and reliability.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250315161910.4082396-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The existing sigio workaround implementation removes FDs from the
poll when events are triggered, requiring users to re-add them via
add_sigio_fd() after processing. This introduces a potential race
condition between FD removal in write_sigio_thread() and next_poll
update in __add_sigio_fd(), and is inefficient due to frequent FD
removal and re-addition. Rewrite the implementation based on epoll
and tgkill for improved efficiency and reliability.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250315161910.4082396-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Prohibit the VM_CLONE flag in run_helper_thread()</title>
<updated>2025-03-20T08:26:38+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2025-03-19T13:55:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=69f52573c24de9d2919f83e3b3b396a09118b7c4'/>
<id>69f52573c24de9d2919f83e3b3b396a09118b7c4</id>
<content type='text'>
Directly creating helper threads with VM_CLONE using clone can
compromise the thread safety of errno. Since all these helper
threads have been converted to use os_run_helper_thread(), let's
prevent using this flag in run_helper_thread().

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250319135523.97050-5-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Directly creating helper threads with VM_CLONE using clone can
compromise the thread safety of errno. Since all these helper
threads have been converted to use os_run_helper_thread(), let's
prevent using this flag in run_helper_thread().

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20250319135523.97050-5-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
