Kernel+Userland: Split bind-mounting and re-mounting from mount syscall

These two are really separate kinds of syscalls, so let's stop using
special flags for bind-mounting or remounting and instead let userspace
call dedicated syscalls directly for these actions.
Liav A 2023-02-25 19:30:28 +02:00 committed by Andrew Kaster
parent 04b44a827a
commit 0bbd9040ef
Notes: sideshowbarker 2024-07-17 06:35:23 +09:00
10 changed files with 204 additions and 42 deletions


@ -0,0 +1,39 @@
## Name
bindmount - create a bindmount from `source_fd` to a target path.
## Synopsis
```**c++
#include <LibCore/System.h>
ErrorOr<void> bindmount(int source_fd, StringView target, int flags);
```
## Description
`bindmount()` creates a bind mount from `source_fd` to the target path `target`, using the mount flags given in `flags`.
The following `flags` are supported:
* `MS_NODEV`: Disallow opening any devices from this file system.
* `MS_NOEXEC`: Disallow executing any executables from this file system.
* `MS_NOSUID`: Ignore set-user-id bits on executables from this file system.
* `MS_RDONLY`: Mount the filesystem read-only.
* `MS_WXALLOWED`: Allow W^X protection circumvention for executables on this file system.
* `MS_AXALLOWED`: Allow anonymous executable mappings for executables on this file system.
* `MS_NOREGULAR`: Disallow opening any regular files from this file system.
These flags can be used as a security measure to limit the possible abuses of the mounted file system.
## Errors
* `EINVAL`: The `flags` value contains deprecated flags such as `MS_REMOUNT` or `MS_BIND`.
* `EPERM`: The current process does not have superuser privileges.
* `ENODEV`: The `source_fd` is not an open file descriptor to a valid filesystem inode.
All of the usual path resolution errors may also occur.
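The `EINVAL` rule above can be pictured with a small, self-contained sketch. It is not the kernel code: the namespace, the helper name `check_flags`, and the bit values are assumptions made for illustration, and the real `MS_*` definitions may differ.

```c++
#include <cerrno>

namespace FlagCheckSketch {

// Placeholder bit values standing in for the MS_* flags (assumed, not
// taken from the real headers).
constexpr int NoDev = 1 << 0;    // stands in for MS_NODEV
constexpr int Bind = 1 << 3;     // stands in for MS_BIND
constexpr int ReadOnly = 1 << 4; // stands in for MS_RDONLY
constexpr int Remount = 1 << 5;  // stands in for MS_REMOUNT

// Hypothetical helper: returns 0 when `flags` is acceptable, or EINVAL when
// it still carries one of the deprecated selector flags.
inline int check_flags(int flags)
{
    if (flags & Remount)
        return EINVAL;
    if (flags & Bind)
        return EINVAL;
    return 0;
}

}
```

Under these assumptions, a caller that still passes the old selector bits gets `EINVAL` back, while ordinary restriction flags such as the read-only bit pass through unchanged.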
## See also
* [`mount`(2)](help://man/2/mount)


@ -34,9 +34,7 @@ The following `flags` are supported:
* `MS_NODEV`: Disallow opening any devices from this file system.
* `MS_NOEXEC`: Disallow executing any executables from this file system.
* `MS_NOSUID`: Ignore set-user-id bits on executables from this file system.
* `MS_BIND`: Perform a bind-mount (see below).
* `MS_RDONLY`: Mount the filesystem read-only.
* `MS_REMOUNT`: Remount an already mounted filesystem (see below).
* `MS_WXALLOWED`: Allow W^X protection circumvention for executables on this file system.
* `MS_AXALLOWED`: Allow anonymous executable mappings for executables on this file system.
* `MS_NOREGULAR`: Disallow opening any regular files from this file system.
@ -57,11 +55,6 @@ itself, which may be useful for changing mount flags for a part of a filesystem.
### Remounting
If `MS_REMOUNT` is specified in `flags`, `source_fd` and `fs_type` are ignored,
and a remount is performed instead. `target` must point to an existing mount
point. The mount flags for that mount point are reset to `flags` (except the
`MS_REMOUNT` flag itself, which is stripped from the value).
Note that remounting a file system will only affect future operations with the
file system, not any already opened files. For example, if you open a directory
on a filesystem that's mounted with `MS_NODEV`, then remount the filesystem to
@ -74,14 +67,9 @@ in mount flags of the underlying file system. To "refresh" the working directory
to use the new mount flags after remounting a filesystem, a process can call
`chdir()` with the path to the same directory.
Similarly, to change the mount flags used by the root directory, a process can
remount the root filesystem using `MS_REMOUNT`.
However, this only has a noticeable effect if
the kernel were to launch more userspace processes directly, the way it
launches the initial userspace process.
## Errors
* `EINVAL`: The `flags` value contains deprecated flags such as `MS_REMOUNT` or `MS_BIND`.
* `EFAULT`: The `fs_type` or `target` are invalid strings.
* `EPERM`: The current process does not have superuser privileges.
* `ENODEV`: The `fs_type` is unrecognized, or the file descriptor to source is
@ -99,3 +87,5 @@ All of the usual path resolution errors may also occur.
## See also
* [`mount`(8)](help://man/8/mount)
* [`remount`(2)](help://man/2/remount)
* [`bindmount`(2)](help://man/2/bindmount)


@ -0,0 +1,39 @@
## Name
remount - remount a filesystem with new mount flags
## Synopsis
```**c++
#include <LibCore/System.h>
ErrorOr<void> remount(StringView target, int flags);
```
## Description
`remount()` remounts the filesystem that is mounted at `target`, using the new mount flags given in `flags`.
The following `flags` are supported:
* `MS_NODEV`: Disallow opening any devices from this file system.
* `MS_NOEXEC`: Disallow executing any executables from this file system.
* `MS_NOSUID`: Ignore set-user-id bits on executables from this file system.
* `MS_RDONLY`: Mount the filesystem read-only.
* `MS_WXALLOWED`: Allow W^X protection circumvention for executables on this file system.
* `MS_AXALLOWED`: Allow anonymous executable mappings for executables on this file system.
* `MS_NOREGULAR`: Disallow opening any regular files from this file system.
These flags can be used as a security measure to limit the possible abuses of the mounted file system.
## Errors
* `EINVAL`: The `flags` value contains deprecated flags such as `MS_REMOUNT` or `MS_BIND`.
* `EPERM`: The current process does not have superuser privileges.
* `ENODEV`: No mount point was found for the `target` path.
All of the usual path resolution errors may also occur.
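As a toy model of the behaviour described here, the sketch below treats the mount table as a plain map from mount point to flags. The `MountTable` type and this free-standing `remount` function are hypothetical stand-ins, not the kernel's data structures. It assumes the new flags replace the old ones wholesale, and that a missing mount point yields `ENODEV`.

```c++
#include <cerrno>
#include <map>
#include <string>

namespace RemountSketch {

// Toy stand-in for the mount table: mount point path -> current mount flags.
using MountTable = std::map<std::string, int>;

// Hypothetical model of remount(): replaces the flags of an existing mount
// point. Returns 0 on success, or ENODEV if `target` is not a mount point.
inline int remount(MountTable& table, std::string const& target, int flags)
{
    auto it = table.find(target);
    if (it == table.end())
        return ENODEV;
    it->second = flags; // The old flags are replaced, not OR-ed with the new ones.
    return 0;
}

}
```

Because the flags are replaced rather than merged in this model, a caller that wants to keep an existing restriction has to pass it again alongside any new ones.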
## See also
* [`mount`(2)](help://man/2/mount)


@ -53,6 +53,7 @@ enum class NeedsBigProcessLock {
S(annotate_mapping, NeedsBigProcessLock::No) \
S(beep, NeedsBigProcessLock::No) \
S(bind, NeedsBigProcessLock::No) \
S(bindmount, NeedsBigProcessLock::No) \
S(chdir, NeedsBigProcessLock::No) \
S(chmod, NeedsBigProcessLock::No) \
S(chown, NeedsBigProcessLock::No) \
@ -153,6 +154,7 @@ enum class NeedsBigProcessLock {
S(recvfd, NeedsBigProcessLock::No) \
S(recvmsg, NeedsBigProcessLock::Yes) \
S(rename, NeedsBigProcessLock::No) \
S(remount, NeedsBigProcessLock::No) \
S(rmdir, NeedsBigProcessLock::No) \
S(scheduler_get_parameters, NeedsBigProcessLock::No) \
S(scheduler_set_parameters, NeedsBigProcessLock::No) \
@ -437,6 +439,17 @@ struct SC_mount_params {
int flags;
};
struct SC_remount_params {
StringArgument target;
int flags;
};
struct SC_bindmount_params {
StringArgument target;
int source_fd;
int flags;
};
struct SC_pledge_params {
StringArgument promises;
StringArgument execpromises;


@ -454,6 +454,8 @@ public:
ErrorOr<FlatPtr> sys$jail_create(Userspace<Syscall::SC_jail_create_params*> user_params);
ErrorOr<FlatPtr> sys$jail_attach(Userspace<Syscall::SC_jail_attach_params const*> user_params);
ErrorOr<FlatPtr> sys$get_root_session_id(pid_t force_sid);
ErrorOr<FlatPtr> sys$remount(Userspace<Syscall::SC_remount_params const*> user_params);
ErrorOr<FlatPtr> sys$bindmount(Userspace<Syscall::SC_bindmount_params const*> user_params);
enum SockOrPeerName {
SockName,


@ -77,6 +77,10 @@ ErrorOr<FlatPtr> Process::sys$mount(Userspace<Syscall::SC_mount_params const*> u
return EPERM;
auto params = TRY(copy_typed_from_user(user_params));
if (params.flags & MS_REMOUNT)
return EINVAL;
if (params.flags & MS_BIND)
return EINVAL;
auto source_fd = params.source_fd;
auto target = TRY(try_copy_kstring_from_user(params.target));
@ -91,25 +95,6 @@ ErrorOr<FlatPtr> Process::sys$mount(Userspace<Syscall::SC_mount_params const*> u
auto target_custody = TRY(VirtualFileSystem::the().resolve_path(credentials, target->view(), current_directory()));
if (params.flags & MS_REMOUNT) {
// We're not creating a new mount, we're updating an existing one!
TRY(VirtualFileSystem::the().remount(target_custody, params.flags & ~MS_REMOUNT));
return 0;
}
if (params.flags & MS_BIND) {
// We're doing a bind mount.
if (description_or_error.is_error())
return description_or_error.release_error();
auto description = description_or_error.release_value();
if (!description->custody()) {
// We only support bind-mounting inodes, not arbitrary files.
return ENODEV;
}
TRY(VirtualFileSystem::the().bind_mount(*description->custody(), target_custody, params.flags));
return 0;
}
RefPtr<FileSystem> fs;
if (!description_or_error.is_error()) {
@ -126,6 +111,54 @@ ErrorOr<FlatPtr> Process::sys$mount(Userspace<Syscall::SC_mount_params const*> u
return 0;
}
ErrorOr<FlatPtr> Process::sys$remount(Userspace<Syscall::SC_remount_params const*> user_params)
{
VERIFY_NO_PROCESS_BIG_LOCK(this);
TRY(require_no_promises());
auto credentials = this->credentials();
if (!credentials->is_superuser())
return EPERM;
auto params = TRY(copy_typed_from_user(user_params));
if (params.flags & MS_REMOUNT)
return EINVAL;
if (params.flags & MS_BIND)
return EINVAL;
auto target = TRY(try_copy_kstring_from_user(params.target));
auto target_custody = TRY(VirtualFileSystem::the().resolve_path(credentials, target->view(), current_directory()));
TRY(VirtualFileSystem::the().remount(target_custody, params.flags));
return 0;
}
ErrorOr<FlatPtr> Process::sys$bindmount(Userspace<Syscall::SC_bindmount_params const*> user_params)
{
VERIFY_NO_PROCESS_BIG_LOCK(this);
TRY(require_no_promises());
auto credentials = this->credentials();
if (!credentials->is_superuser())
return EPERM;
auto params = TRY(copy_typed_from_user(user_params));
if (params.flags & MS_REMOUNT)
return EINVAL;
if (params.flags & MS_BIND)
return EINVAL;
auto source_fd = params.source_fd;
auto target = TRY(try_copy_kstring_from_user(params.target));
auto target_custody = TRY(VirtualFileSystem::the().resolve_path(credentials, target->view(), current_directory()));
auto description = TRY(open_file_description(source_fd));
if (!description->custody()) {
// NOTE: We only support bind-mounting inodes, not arbitrary files.
return ENODEV;
}
TRY(VirtualFileSystem::the().bind_mount(*description->custody(), target_custody, params.flags));
return 0;
}
ErrorOr<FlatPtr> Process::sys$umount(Userspace<char const*> user_mountpoint, size_t mountpoint_length)
{
VERIFY_PROCESS_BIG_LOCK_ACQUIRED(this);


@ -232,6 +232,33 @@ ErrorOr<void> ptrace_peekbuf(pid_t tid, void const* tracee_addr, Bytes destinati
HANDLE_SYSCALL_RETURN_VALUE("ptrace_peekbuf", rc, {});
}
ErrorOr<void> bindmount(int source_fd, StringView target, int flags)
{
if (target.is_null())
return Error::from_errno(EFAULT);
Syscall::SC_bindmount_params params {
{ target.characters_without_null_termination(), target.length() },
source_fd,
flags,
};
int rc = syscall(SC_bindmount, &params);
HANDLE_SYSCALL_RETURN_VALUE("bindmount", rc, {});
}
ErrorOr<void> remount(StringView target, int flags)
{
if (target.is_null())
return Error::from_errno(EFAULT);
Syscall::SC_remount_params params {
{ target.characters_without_null_termination(), target.length() },
flags
};
int rc = syscall(SC_remount, &params);
HANDLE_SYSCALL_RETURN_VALUE("remount", rc, {});
}
ErrorOr<void> mount(int source_fd, StringView target, StringView fs_type, int flags)
{
if (target.is_null() || fs_type.is_null())


@ -59,6 +59,8 @@ ErrorOr<void> sendfd(int sockfd, int fd);
ErrorOr<int> recvfd(int sockfd, int options);
ErrorOr<void> ptrace_peekbuf(pid_t tid, void const* tracee_addr, Bytes destination_buf);
ErrorOr<void> mount(int source_fd, StringView target, StringView fs_type, int flags);
ErrorOr<void> bindmount(int source_fd, StringView target, int flags);
ErrorOr<void> remount(StringView target, int flags);
ErrorOr<void> umount(StringView mount_point);
ErrorOr<long> ptrace(int request, pid_t tid, void* address, void* data);
ErrorOr<void> disown(pid_t pid);


@ -401,10 +401,7 @@ static ErrorOr<void> populate_devtmpfs()
static ErrorOr<void> prepare_synthetic_filesystems()
{
// FIXME: Don't hardcode the fs type as the ext2 filesystem and once there's
// more than this filesystem implementation (which is suitable for usage on
// physical storage), find a way to detect it.
TRY(Core::System::mount(-1, "/"sv, "ext2"sv, MS_REMOUNT | MS_NODEV | MS_NOSUID | MS_RDONLY));
TRY(Core::System::remount("/"sv, MS_NODEV | MS_NOSUID | MS_RDONLY));
// FIXME: Find a better way to do all of this stuff, without hardcoding all of this!
TRY(Core::System::mount(-1, "/proc"sv, "proc"sv, MS_NOSUID));
TRY(Core::System::mount(-1, "/sys"sv, "sys"sv, 0));


@ -95,7 +95,15 @@ static bool mount_by_line(DeprecatedString const& line)
dbgln("Mounting {} ({}) on {}", filename, fstype, mountpoint);
auto error_or_void = Core::System::mount(fd, mountpoint, fstype, flags);
ErrorOr<void> error_or_void;
if (flags & MS_BIND)
error_or_void = Core::System::bindmount(fd, mountpoint, flags & ~MS_BIND);
else if (flags & MS_REMOUNT)
error_or_void = Core::System::remount(mountpoint, flags & ~MS_REMOUNT);
else
error_or_void = Core::System::mount(fd, mountpoint, fstype, flags);
if (error_or_void.is_error()) {
warnln("Failed to mount {} (FD: {}) ({}) on {}: {}", filename, fd, fstype, mountpoint, error_or_void.error());
return false;
@ -215,15 +223,27 @@ ErrorOr<int> serenity_main(Main::Arguments arguments)
return 0;
}
if (!source.is_empty() && !mountpoint.is_empty()) {
if (fs_type.is_empty())
fs_type = "ext2"sv;
if (source.is_empty() && !mountpoint.is_empty()) {
int flags = !options.is_empty() ? parse_options(options) : 0;
if (!(flags & MS_REMOUNT))
return Error::from_string_literal("Expected valid source.");
TRY(Core::System::remount(mountpoint, flags & ~MS_REMOUNT));
return 0;
}
if (!source.is_empty() && !mountpoint.is_empty()) {
int flags = !options.is_empty() ? parse_options(options) : 0;
int const fd = TRY(get_source_fd(source));
TRY(Core::System::mount(fd, mountpoint, fs_type, flags));
if (flags & MS_BIND) {
TRY(Core::System::bindmount(fd, mountpoint, flags & ~MS_BIND));
} else if (flags & MS_REMOUNT) {
TRY(Core::System::remount(mountpoint, flags & ~MS_REMOUNT));
} else {
if (fs_type.is_empty())
fs_type = "ext2"sv;
TRY(Core::System::mount(fd, mountpoint, fs_type, flags));
}
return 0;
}
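
The dispatch that the mount utility now performs in userspace can be summarised with a small stand-alone sketch. Everything in it (the namespace, the `Call` enum, the `Decision` struct, `choose_call`, and the placeholder bit values) is hypothetical and only mirrors the branching shown in the hunk above: the selector flag picks the call and is stripped before the remaining flags are passed on.

```c++
namespace DispatchSketch {

// Placeholder bit values standing in for MS_BIND and MS_REMOUNT (assumed).
constexpr int Bind = 1 << 3;
constexpr int Remount = 1 << 5;

enum class Call {
    Mount,
    BindMount,
    Remount,
};

struct Decision {
    Call call; // Which call the utility would make.
    int flags; // The flags it would pass, with the selector bit removed.
};

// Mirrors the if / else-if / else chain in the utility: bind is checked
// first, then remount, and a plain mount is the fallback.
inline Decision choose_call(int flags)
{
    if (flags & Bind)
        return { Call::BindMount, flags & ~Bind };
    if (flags & Remount)
        return { Call::Remount, flags & ~Remount };
    return { Call::Mount, flags };
}

}
```

Note that only the selector for the chosen branch is stripped. If both bits were set, the bind branch would win and the leftover remount bit would still be present, which the new `EINVAL` checks in the kernel would then reject.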