<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/um/include/shared/skas, branch for-next</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>um: Abandon the _PAGE_NEWPROT bit</title>
<updated>2024-10-23T07:52:49+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2024-10-11T10:23:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2717c6b649e1840328c2758a478bf4034a22ac3e'/>
<id>2717c6b649e1840328c2758a478bf4034a22ac3e</id>
<content type='text'>
When a PTE is updated in the page table, the _PAGE_NEWPAGE bit will
always be set. And the corresponding page will always be mapped or
unmapped depending on whether the PTE is present or not. The check
on the _PAGE_NEWPROT bit is not really reachable. Abandoning it will
allow us to simplify the code and remove the unreachable code.

Reviewed-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20241011102354.1682626-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a PTE is updated in the page table, the _PAGE_NEWPAGE bit will
always be set. And the corresponding page will always be mapped or
unmapped depending on whether the PTE is present or not. The check
on the _PAGE_NEWPROT bit is not really reachable. Abandoning it will
allow us to simplify the code and remove the unreachable code.

Reviewed-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Link: https://patch.msgid.link/20241011102354.1682626-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: use execveat to create userspace MMs</title>
<updated>2024-10-10T11:37:16+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2024-09-19T12:45:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=32e8eaf263d9be014ba1970444f745682fa9c6c0'/>
<id>32e8eaf263d9be014ba1970444f745682fa9c6c0</id>
<content type='text'>
Using clone will not undo features that have been enabled by libc. An
example of this already happening is rseq, which could cause the kernel
to read/write memory of the userspace process. In the future the
standard library might also use mseal by default to protect itself,
which would also thwart our attempts at unmapping everything.

Solve all this by taking a step back and doing an execve into a tiny
static binary that sets up the minimal environment required for the
stub without using any standard library. That way we have a clean
execution environment that is fully under the control of UML.

Note that this changes things a bit as the FDs are not anymore shared
with the kernel. Instead, we explicitly share the FDs for the physical
memory and all existing iomem regions. Doing this is fine, as iomem
regions cannot be added at runtime.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240919124511.282088-3-benjamin@sipsolutions.net
[use pipe() instead of pipe2(), remove unneeded close() calls]
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using clone will not undo features that have been enabled by libc. An
example of this already happening is rseq, which could cause the kernel
to read/write memory of the userspace process. In the future the
standard library might also use mseal by default to protect itself,
which would also thwart our attempts at unmapping everything.

Solve all this by taking a step back and doing an execve into a tiny
static binary that sets up the minimal environment required for the
stub without using any standard library. That way we have a clean
execution environment that is fully under the control of UML.

Note that this changes things a bit as the FDs are not anymore shared
with the kernel. Instead, we explicitly share the FDs for the physical
memory and all existing iomem regions. Doing this is fine, as iomem
regions cannot be added at runtime.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240919124511.282088-3-benjamin@sipsolutions.net
[use pipe() instead of pipe2(), remove unneeded close() calls]
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Remove the declaration of user_thread function</title>
<updated>2024-09-12T18:43:26+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2024-08-26T10:08:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fe6abeba24996f826473630b2054699828fd9f18'/>
<id>fe6abeba24996f826473630b2054699828fd9f18</id>
<content type='text'>
This function has never been defined since its declaration was
introduced by commit 1da177e4c3f4 ("Linux-2.6.12-rc2").

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This function has never been defined since its declaration was
introduced by commit 1da177e4c3f4 ("Linux-2.6.12-rc2").

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Remove unused mm_fd field from mm_id</title>
<updated>2024-09-12T18:36:22+00:00</updated>
<author>
<name>Tiwei Bie</name>
<email>tiwei.btw@antgroup.com</email>
</author>
<published>2024-08-26T10:08:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=59376fb2a71b81dfc018db6fe9b6fa9cd3f41ce7'/>
<id>59376fb2a71b81dfc018db6fe9b6fa9cd3f41ce7</id>
<content type='text'>
It's no longer used since the removal of the SKAS3/4 support.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's no longer used since the removal of the SKAS3/4 support.

Signed-off-by: Tiwei Bie &lt;tiwei.btw@antgroup.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Remove obsoleted declaration for execute_syscall_skas</title>
<updated>2024-09-12T18:23:33+00:00</updated>
<author>
<name>Gaosheng Cui</name>
<email>cuigaosheng1@huawei.com</email>
</author>
<published>2024-08-24T12:04:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2df8c8d118c750054fdf2c047d2eb3c0ed854dc7'/>
<id>2df8c8d118c750054fdf2c047d2eb3c0ed854dc7</id>
<content type='text'>
The execute_syscall_skas() have been removed since
commit e32dacb9f481 ("[PATCH] uml: system call path cleanup"),
and now it is useless, so remove it.

Signed-off-by: Gaosheng Cui &lt;cuigaosheng1@huawei.com&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The execute_syscall_skas() have been removed since
commit e32dacb9f481 ("[PATCH] uml: system call path cleanup"),
and now it is useless, so remove it.

Signed-off-by: Gaosheng Cui &lt;cuigaosheng1@huawei.com&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: refactor TLB update handling</title>
<updated>2024-07-03T15:09:50+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2024-07-03T13:45:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bcf3d957c63d8b6d718b862fea18c5f14ce803e2'/>
<id>bcf3d957c63d8b6d718b862fea18c5f14ce803e2</id>
<content type='text'>
Conceptually, we want the memory mappings to always be up to date and
represent whatever is in the TLB. To ensure that, we need to sync them
over in the userspace case and for the kernel we need to process the
mappings.

The kernel will call flush_tlb_* if page table entries that were valid
before become invalid. Unfortunately, this is not the case if entries
are added.

As such, change both flush_tlb_* and set_ptes to track the memory range
that has to be synchronized. For the kernel, we need to execute a
flush_tlb_kern_* immediately but we can wait for the first page fault in
case of set_ptes. For userspace in contrast we only store that a range
of memory needs to be synced and do so whenever we switch to that
process.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-13-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conceptually, we want the memory mappings to always be up to date and
represent whatever is in the TLB. To ensure that, we need to sync them
over in the userspace case and for the kernel we need to process the
mappings.

The kernel will call flush_tlb_* if page table entries that were valid
before become invalid. Unfortunately, this is not the case if entries
are added.

As such, change both flush_tlb_* and set_ptes to track the memory range
that has to be synchronized. For the kernel, we need to execute a
flush_tlb_kern_* immediately but we can wait for the first page fault in
case of set_ptes. For userspace in contrast we only store that a range
of memory needs to be synced and do so whenever we switch to that
process.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-13-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Delay flushing syscalls until the thread is restarted</title>
<updated>2024-07-03T15:09:49+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2024-07-03T13:45:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c83170d7cdf3df12e430c429462776dcb52ff87'/>
<id>3c83170d7cdf3df12e430c429462776dcb52ff87</id>
<content type='text'>
As running the syscalls is expensive due to context switches, we should
do so as late as possible in case more syscalls need to be queued later
on. This will also benefit a later move to a SECCOMP enabled userspace
as in that case the need for extra context switches is removed entirely.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20240703134536.1161108-9-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As running the syscalls is expensive due to context switches, we should
do so as late as possible in case more syscalls need to be queued later
on. This will also benefit a later move to a SECCOMP enabled userspace
as in that case the need for extra context switches is removed entirely.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20240703134536.1161108-9-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: remove copy_context_skas0</title>
<updated>2024-07-03T15:09:49+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2024-07-03T13:45:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a5d2cfe749e2917266956751262328da35e0925d'/>
<id>a5d2cfe749e2917266956751262328da35e0925d</id>
<content type='text'>
The kernel flushes the memory ranges anyway for CoW and does not assume
that the userspace process has anything set up already. So, start with a
fresh process for the new mm context.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel flushes the memory ranges anyway for CoW and does not assume
that the userspace process has anything set up already. So, start with a
fresh process for the new mm context.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-8-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: remove LDT support</title>
<updated>2024-07-03T15:09:49+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin.berg@intel.com</email>
</author>
<published>2024-07-03T13:45:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7911b650a0708a3ee3412d80838b9574b21a53c8'/>
<id>7911b650a0708a3ee3412d80838b9574b21a53c8</id>
<content type='text'>
The current LDT code has a few issues that mean it should be redone in a
different way once we always start with a fresh MM even when cloning.

In a new and better world, the kernel would just ensure its own LDT is
clear at startup. At that point, all that is needed is a simple function
to populate the LDT from another MM in arch_dup_mmap combined with some
tracking of the installed LDT entries for each MM.

Note that the old implementation was even incorrect with regard to
reading, as it copied out the LDT entries in the internal format rather
than converting them to the userspace structure.

Removal should be fine as the LDT is not used for thread-local storage
anymore.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-7-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current LDT code has a few issues that mean it should be redone in a
different way once we always start with a fresh MM even when cloning.

In a new and better world, the kernel would just ensure its own LDT is
clear at startup. At that point, all that is needed is a simple function
to populate the LDT from another MM in arch_dup_mmap combined with some
tracking of the installed LDT entries for each MM.

Note that the old implementation was even incorrect with regard to
reading, as it copied out the LDT entries in the internal format rather
than converting them to the userspace structure.

Removal should be fine as the LDT is not used for thread-local storage
anymore.

Signed-off-by: Benjamin Berg &lt;benjamin.berg@intel.com&gt;
Link: https://patch.msgid.link/20240703134536.1161108-7-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>um: Rework syscall handling</title>
<updated>2024-07-03T15:09:49+00:00</updated>
<author>
<name>Benjamin Berg</name>
<email>benjamin@sipsolutions.net</email>
</author>
<published>2024-07-03T13:45:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=76ed9158e1d474e963fc59da7a461b27a2212c5a'/>
<id>76ed9158e1d474e963fc59da7a461b27a2212c5a</id>
<content type='text'>
Rework syscall handling to be platform independent. Also create a clean
split between queueing of syscalls and flushing them out, removing the
need to keep state in the code that triggers the syscalls.

The code adds syscall_data_len to the global mm_id structure. This will
be used later to allow surrounding code to track whether syscalls still
need to run and if errors occurred.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20240703134536.1161108-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rework syscall handling to be platform independent. Also create a clean
split between queueing of syscalls and flushing them out, removing the
need to keep state in the code that triggers the syscalls.

The code adds syscall_data_len to the global mm_id structure. This will
be used later to allow surrounding code to track whether syscalls still
need to run and if errors occurred.

Signed-off-by: Benjamin Berg &lt;benjamin@sipsolutions.net&gt;
Link: https://patch.msgid.link/20240703134536.1161108-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
