diff options
| author | Christian Brauner <brauner@kernel.org> | 2026-03-06 17:28:37 +0100 |
|---|---|---|
| committer | Christian Brauner <brauner@kernel.org> | 2026-03-12 13:33:55 +0100 |
| commit | 9d4e752a24f740b31ca827bfab07010e4e7f34b0 (patch) | |
| tree | 870d6a4e6493541f3db525eccb9c4b5648d0a0dc /include/uapi/linux | |
| parent | 0209e31659d6908c6d0788c8a495b43d0a1f6f6c (diff) | |
namespace: allow creating empty mount namespaces
Add support for creating a mount namespace that contains only a copy of
the root mount from the caller's mount namespace, with none of the
child mounts. This is useful for containers and sandboxes that want to
start with a minimal mount table and populate it from scratch rather
than inheriting and then tearing down the full mount tree.
Two new flags are introduced:
- CLONE_EMPTY_MNTNS for clone3(), using the 64-bit flag space.
- UNSHARE_EMPTY_MNTNS for unshare(), reusing the
CLONE_PARENT_SETTID bit which has no meaning for unshare.
Both flags imply CLONE_NEWNS. For the unshare path,
UNSHARE_EMPTY_MNTNS is converted to CLONE_EMPTY_MNTNS in
unshare_nsproxy_namespaces() before it reaches copy_mnt_ns(), so the
mount namespace code only needs to handle a single flag.
In copy_mnt_ns(), when CLONE_EMPTY_MNTNS is set, clone_mnt() is used
instead of copy_tree() to clone only the root mount. The caller's root
and working directory are both reset to the root dentry of the new
mount.
The cleanup variables are changed from vfsmount pointers with
__free(mntput) to struct path with __free(path_put) because the empty
mount namespace path needs to release both mount and dentry references
when replacing the caller's root and pwd. In the normal (non-empty)
path only the mount component is set, and dput(NULL) is a no-op so
path_put remains correct there as well.
Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-1-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Diffstat (limited to 'include/uapi/linux')
| -rw-r--r-- | include/uapi/linux/sched.h | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h index 359a14cc76a4..4e76fce9f777 100644 --- a/include/uapi/linux/sched.h +++ b/include/uapi/linux/sched.h @@ -36,6 +36,7 @@ /* Flags for the clone3() syscall. */ #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ +#define CLONE_EMPTY_MNTNS (1ULL << 37) /* Create an empty mount namespace. */ /* * cloning flags intersect with CSIGNAL so can be used with unshare and clone3 @@ -43,6 +44,12 @@ */ #define CLONE_NEWTIME 0x00000080 /* New time namespace */ +/* + * unshare flags share the bit space with clone flags but only apply to the + * unshare syscall: + */ +#define UNSHARE_EMPTY_MNTNS 0x00100000 /* Unshare an empty mount namespace. */ + #ifndef __ASSEMBLY__ /** * struct clone_args - arguments for the clone3 syscall |
