These 2 are an actual separate types of syscalls, so let's stop using
special flags for bind mounting or re-mounting and instead let userspace
calling directly for this kind of actions.
This filesystem is based on the code of the long-lived TmpFS. It differs
from that filesystem in one keypoint - its root inode doesn't have a
sticky bit on it.
Therefore, we mount it on /dev, to ensure only root can modify files on
that directory. In addition to that, /tmp is mounted directly in the
SystemServer main (start) code, so it's no longer specified in the fstab
file. We ensure that /tmp has a sticky bit and has the value 0777 for
root directory permissions, which is certainly a special case when using
RAM-backed (and in general other) filesystems.
Because of these 2 changes, it's no longer needed to maintain the TmpFS
filesystem, hence it's removed (renamed to RAMFS), because the RAMFS
represents the purpose of this filesystem in a much better way - it
relies on being backed by RAM "storage", and therefore it's easy to
conclude it's temporary and volatile, so its content is gone on either
system shutdown or unmounting of the filesystem.
This flag doesn't conform to any POSIX standard nor is found in any OS
out there. The idea behind this mount flag is to ensure that only
non-regular files will be placed in a filesystem, which includes device
nodes, symbolic links, directories, FIFOs and sockets. Currently, the
only valid case for using this mount flag is for TmpFS instances, where
we want to mount a TmpFS but disallow any kind of regular file and only
allow other types of files on the filesystem.
This makes pledge() ignore promises that would otherwise cause it to
fail with EPERM, which is very useful for allowing programs to run under
a "jail" so to speak, without having them termiate early due to a
failing pledge() call.
The URLs of the form `help://man/<section>/<page>` link to another help
page inside the help application. All previous relative page links are
replaced by this new form. This doesn't change any behavior but it looks
much nicer :^)
Note that man doesn't handle these new links, but the previous relative
links didn't work either.
This feature was introduced in version 4.17 of the Linux kernel, and
while it's not specified by POSIX, I think it will be a nice addition to
our system.
MAP_FIXED_NOREPLACE provides a less error-prone alternative to
MAP_FIXED: while regular fixed mappings would cause any intersecting
ranges to be unmapped, MAP_FIXED_NOREPLACE returns EEXIST instead. This
ensures that we don't corrupt our process's address space if something
is already at the requested address.
Note that the more portable way to do this is to use regular
MAP_ANONYMOUS, and check afterwards whether the returned address matches
what we wanted. This, however, has a large performance impact on
programs like Wine which try to reserve large portions of the address
space at once, as the non-matching addresses have to be unmapped
separately.
For setreuid and setresuid syscalls, -1 means to set the current
uid/euid/gid/egid value, to be more convenient for programming.
However, for other syscalls where we pass only one argument, there's no
justification to specify -1.
This behavior is identical to how Linux handles the value -1, and is
influenced by the fact that the manual pages for the group of one
argument syscalls that handle ID operations is ambiguous about this
topic.
Chroot exists neither in code nor in documentation. If we add-in the
feature again, it will be simple enough to add it back in to the
documentation. For now, let's clean it up, instead of refering to things
that don't exist.
Found by markdown-checker.
These interfaces are broken for about 9 months, maybe longer than that.
At this point, this is just a dead code nobody tests or tries to use, so
let's remove it instead of keeping a stale code just for the sake of
keeping it and hoping someone will fix it.
To better justify this, I read that OpenBSD removed loadable kernel
modules in 5.7 release (2014), mainly for the same reason we do -
nobody used it so they had no good reason to maintain it.
Still, OpenBSD had LKMs being effectively working, which is not the
current state in our project for a long time.
An arguably better approach to minimize the Kernel image size is to
allow dropping drivers and features while compiling a new image.
A quick grep revealed these stats (counting only the first occurrence
per line):
`thing`(1): 154
`thing(1)`: 9
thing(1): 4
This commit converts all occurrences to the `thing`(1) format.
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
This is a new promise that guards access to mmap() with MAP_FIXED.
Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.
None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
We had an exception that allowed SOL_SOCKET + SO_PEERCRED on local
socket to support LibIPC's PID exchange mechanism. This is no longer
needed so let's just remove the exception.
This prevents sys$mmap() and sys$mprotect() from creating executable
memory mappings in pledged programs that don't have this promise.
Note that the dynamic loader runs before pledging happens, so it's
unaffected by this.
When mounting an Ext2FS, a block device source is required. All other
filesystem types are unaffected, as most of them ignore the source file
descriptor anyway.
Fixes#5153.
All users of this mechanism have been switched to anonymous files and
passing file descriptors with sendfd()/recvfd().
Shbufs got us where we are today, but it's time we say good-bye to them
and welcome a much more idiomatic replacement. :^)
The vast majority of programs don't ever need to use sys$ptrace(),
and it seems like a high-value system call to prevent a compromised
process from using.
This patch moves sys$ptrace() from the "proc" promise to its own,
new "ptrace" promise and updates the affected apps.
Problem:
- C functions with no arguments require a single `void` in the argument list.
Solution:
- Put the `void` in the argument list of functions in C header files.
Problem:
- `(void)` simply casts the expression to void. This is understood to
indicate that it is ignored, but this is really a compiler trick to
get the compiler to not generate a warning.
Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.
Note:
- Functions taking a `(void)` argument list have also been changed to
`()` because this is not needed and shows up in the same grep
command.
This is a new "browse" permission that lets you open (and subsequently list
contents of) directories underneath the path, but not regular files or any other
types of files.
Most systems (Linux, OpenBSD) adjust 0.5 ms per second, or 0.5 us per
1 ms tick. That is, the clock is sped up or slowed down by at most
0.05%. This means adjusting the clock by 1 s takes 2000 s, and the
clock an be adjusted by at most 1.8 s per hour.
FreeBSD adjusts 5 ms per second if the remaining time adjustment is
>= 1 s (0.5%) , else it adjusts by 0.5 ms as well. This allows adjusting
by (almost) 18 s per hour.
Since Serenity OS can lose more than 22 s per hour (#3429), this
picks an adjustment rate up to 1% for now. This allows us to
adjust up to 36s per hour, which should be sufficient to adjust
the clock fast enough to keep up with how much time the clock
currently loses. Once we have a fancier NTP implementation that can
adjust tick rate in addition to offset, we can think about reducing
this.
adjtime is a bit old-school and most current POSIX-y OSs instead
implement adjtimex/ntp_adjtime, but a) we have to start somewhere
b) ntp_adjtime() is a fairly gnarly API. OpenBSD's adjfreq looks
like it might provide similar functionality with a nicer API. But
before worrying about all this, it's probably a good idea to get
to a place where the kernel APIs are (barely) good enough so that
we can write an ntp service, and once we have that we should write
a way to automatically evaluate how well it keeps the time adjusted,
and only then should we add improvements ot the adjustment mechanism.
This syscall allows a parent process to disown a child process, setting
its parent PID to 0.
Unparented processes are automatically reaped by the kernel upon exit,
and no sys$waitid() is required. This will make it much nicer to do
spawn-and-forget which is common in the GUI environment.