VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.
If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.
Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
While executing in the kernel, a thread can acquire various resources
that need cleanup, such as locks and references to RefCounted objects.
This cleanup normally happens on the exit path, such as in destructors
for various RAII guards. But we weren't calling those exit paths when
killing threads that have been executing in the kernel, such as threads
blocked on reading or sleeping, thus causing leaks.
This commit changes how killing threads works. Now, instead of killing
a thread directly, one is supposed to call thread->set_should_die(),
which will unblock it and make it unwind the stack if it is blocked
in the kernel. Then, just before returning to the userspace, the thread
will automatically die.
This patch adds pthread_create() and pthread_exit(), which currently
simply wrap our existing create_thread() and exit_thread() syscalls.
LibThread is also ported to using LibPthread.
The SysV ABI says that the DF flag should be clear on function entry.
That means we have to clear it when jumping into the kernel from some
random userspace context.
Instead of the big ugly switch statement, build a lookup table using
the syscall enumeration macro.
This greatly simplifies the syscall implementation. :^)
The way it gets the entropy and blasts it to the buffer is pretty
ugly IMHO, but it does work for now. (It should be replaced, by
not truncating a u32.)
It implements an (unused for now) flags argument, like Linux but
instead of OpenBSD's. This is in case we want to distinguish
between entropy sources or any other reason and have to implement
a new syscall later. Of course, learn from Linux's struggles with
entropy sourcing too.
Added the exception_code field to RegisterDump, removing the need
for RegisterDumpWithExceptionCode. To accomplish this, I had to
push a dummy exception code during some interrupt entries to properly
pad out the RegisterDump. Note that we also needed to change some code
in sys$sigreturn to deal with the new RegisterDump layout.
The fchdir() function is equivalent to chdir() except that the
directory that is to be the new current working directory is
specified by a file descriptor.
This commit drastically changes how signals are handled.
In the case that an unblocked thread is signaled it works much
in the same way as previously. However, when a blocking syscall
is interrupted, we set up the signal trampoline on the user
stack, complete the blocking syscall, return down the kernel
stack and then jump to the handler. This means that from the
kernel stack's perspective, we only ever get one system call deep.
The signal trampoline has also been changed in order to properly
store the return value from system calls. This is necessary due
to the new way we exit from signaled system calls.
It is now possible to unmount file systems from the VFS via `umount`.
It works via looking up the `fsid` of the filesystem from the `Inode`'s
metatdata so I'm not sure how fragile it is. It seems to work for now
though as something to get us going.
This patch adds the mprotect() syscall to allow changing the protection
flags for memory regions. We don't do any region splitting/merging yet,
so this only works on whole mmap() regions.
Added a "crash -r" flag to verify that we crash when you attempt to
write to read-only memory. :^)
In the userspace, this mimics the Linux pipe2() syscall;
in the kernel, the Process::sys$pipe() now always accepts
a flags argument, the no-argument pipe() syscall is now a
userspace wrapper over pipe2().
It is now possible to mount ext2 `DiskDevice` devices under Serenity on
any folder in the root filesystem. Currently any user can do this with
any permissions. There's a fair amount of assumptions made here too,
that might not be too good, but can be worked on in the future. This is
a good start to allow more dynamic operation under the OS itself.
It is also currently impossible to unmount and such, and devices will
fail to mount in Linux as the FS 'needs to be cleaned'. I'll work on
getting `umount` done ASAP to rectify this (as well as working on less
assumption-making in the mount syscall. We don't want to just be able
to mount DiskDevices!). This could probably be fixed with some `-t`
flag or something similar.
Processes can now have an icon assigned, which is essentially a 16x16 RGBA32
bitmap exposed as a shared buffer ID.
You set the icon ID by calling set_process_icon(int) and the icon ID will be
exposed through /proc/all.
To make this work, I added a mechanism for making shared buffers globally
accessible. For safety reasons, each app seals the icon buffer before making
it global.
Right now the first call to GWindow::set_icon() is what determines the
process icon. We'll probably change this in the future. :^)
The syscall is quite simple:
int watch_file(const char* path, int path_length);
It returns a file descriptor referring to a "InodeWatcher" object in the
kernel. It becomes readable whenever something changes about the inode.
Currently this is implemented by hooking the "metadata dirty bit" in
Inode which isn't perfect, but it's a start. :^)
The "stddbg" stream was a cute idea but we never ended up using it in
practice, so let's simplify this and implement userspace dbgprintf() on top
of a simple dbgputch() syscall instead.
This makes debugging LibC startup a little bit easier. :^)
This is very simple but already very useful. Now you're able to call to
dump_backtrace() from anywhere userspace to get a nice symbolicated
backtrace in the debugger output. :^)
Rolling with the theme of adding a dialog to shutdown the machine, it is
probably nice to have a way to reboot the machine without performing a full
system powerdown.
A reboot program has been added to `/bin/` as well as a corresponding
`syscall` (SC_reboot). This syscall works by attempting to pulse the 8042
keyboard controller. Note that this is NOT supported on new machines, and
should only be a fallback until we have proper ACPI support.
The implementation causes a triple fault in QEMU, which then restarts the
system. The filesystems are locked and synchronized before this occurs,
so there shouldn't be any corruption etctera.
This allows us to seal a buffer *before* anyone else has access to it
(well, ok, the creating process still does, but you can't win them all).
It also means that a SharedBuffer can be shared with multiple clients:
all you need is to have access to it to share it on again.
Instead of computing the path length inside the syscall handler, let the
caller do that work. This allows us to implement to new variants of open()
and creat(), called open_with_path_length() and creat_with_path_length().
These are suitable for use with e.g StringView.
Right now, we allow anything inside a user to raise or lower any other process's
priority. This feels simple enough to me. Linux disallows raising, but
that's annoying in practice.
Hook this up in Terminal so that the '\a' character generates a beep.
Finally emit an '\a' character in the shell line editing code when
backspacing at the start of the line.
Calling systrace(pid) gives you a file descriptor with a stream of the
syscalls made by a peer process. The process must be owned by the same
UID who calls systrace(). :^)
We can't have multiple threads in the same process running in the kernel
at the same time, so let's have a per-process lock that threads have to
acquire on syscall entry/exit (and yield while blocked.)
It's basically a userspace port of the kernel's Lock class.
Added gettid() and donate() syscalls to support the timeslice donation
feature we already enjoyed in the kernel.