Commit graph

1188 commits

Author SHA1 Message Date
Liav A
633006926f Kernel: Make the Jails' internal design a lot more sane
This is done with 2 major steps:
1. Remove JailManagement singleton and use a structure that resembles
    what we have with the Process object. This is required later for the
    second step in this commit, but on its own, is a major change that
    removes this clunky singleton that had no real usage by itself.
2. Use IntrusiveLists to keep references to Process objects in the same
    Jail so it will be much more straightforward to iterate on this kind
    of objects when needed. Previously we locked the entire Process list
    and we did a simple pointer comparison to check if the checked
    Process we iterate on is in the same Jail or not, which required
    taking multiple Spinlocks in a very clumsy and heavyweight way.
2023-03-12 10:21:59 -06:00
Andreas Kling
d1371d66f7 Kernel: Use non-locking {Nonnull,}RefPtr for OpenFileDescription
This patch switches away from {Nonnull,}LockRefPtr to the non-locking
smart pointers throughout the kernel.

I've looked at the handful of places where these were being persisted
and I don't see any race situations.

Note that the process file descriptor table (Process::m_fds) was already
guarded via MutexProtected.
2023-03-07 00:30:12 +01:00
Andreas Kling
36b0ecfe9e Kernel: Remove two outdated FIXMEs about the file descriptor table mutex
These functions cannot be called without already holding the relevant
mutex these days, since m_fds is a MutexProtected object. :^)
2023-03-06 23:46:36 +01:00
Andreas Kling
359d6e7b0b Everywhere: Stop using NonnullOwnPtrVector
Same as NonnullRefPtrVector: weird semantics, questionable benefits.
2023-03-06 23:46:35 +01:00
Liav A
39de5b7f82 Kernel: Actually check Process unveil data when creating perfcore dump
Before of this patch, we looked at the unveil data of the FinalizerTask,
which naturally doesn't have any unveil restrictions, therefore allowing
an unveil bypass for a process that enabled performance coredumps.

To ensure we always check the dumped process unveil data, an option to
pass a Process& has been added to a couple of methods in the class of
VirtualFileSystem.
2023-03-05 15:15:55 +00:00
Liav A
bedd90b1f0 Kernel: Properly lock Process protected data in the prctl syscall 2023-02-24 22:26:07 +01:00
Liav A
c56e1c5378 Kernel/FileSystem: Simplify the ProcFS significantly
Since the ProcFS doesn't hold many global objects within it, the need
for a fully-structured design of backing components and a registry like
with the SysFS is no longer true.

To acommodate this, let's remove all backing store and components of the
ProcFS, so now it resembles what we had in the early days of ProcFS in
the project - a mostly-static filesystem, with very small amount of
kmalloc allocations needed.
We still use the inode index mechanism to understand the role of each
inode, but this is done in a much "static"ier way than before.
2023-02-24 22:14:18 +01:00
Timon Kruiper
3295137224 Kernel: Add optional userspace backtrace to Process::crash
This is very useful for debugging the initial userspace applications, as
the CrashReporter is not yet running.
2023-02-08 18:19:48 +00:00
Sam Atkins
fe7b08dad7 Kernel: Protect Process::m_name with a spinlock
This also lets us remove the `get_process_name` and `set_process_name`
syscalls from the big lock. :^)
2023-02-06 20:36:53 +01:00
Agustin Gianni
e71c320154 Kernel: Change the way we call a syscall in signal_trampoline_dummy
The function signal_trampoline_dummy was using int 0x82 to call
SC_sigreturn. Since x86 is no longer supported, the correct way
to call a syscall is using the syscall instruction.

This paves the way to remove the syscall trap handling mechanism.
2023-02-02 01:52:52 -07:00
Timon Kruiper
12322670cb Kernel: Use InterruptsState abstraction in execve.cpp
This was using the x86_64 specific cpu_flags abstraction, which is not
compatible with aarch64.
2023-01-27 20:47:08 +00:00
Timon Kruiper
697c5ca5e5 Kernel: Move Memory/PageDirectory.{cpp,h} to arch-specific directory
The handling of page tables is very architecture specific, so belongs
in the Arch directory. Some parts were already architecture-specific,
however this commit moves the rest of the PageDirectory class into the
Arch directory.

While we're here the aarch64/PageDirectory.{h,cpp} files are updated to
be aarch64 specific, by renaming some members and removing x86_64
specific code.
2023-01-27 11:41:43 +01:00
Andrew Kaster
ddea37b521 Kernel+LibC: Move name length constants to Kernel/API from limits.h
Reduce inclusion of limits.h as much as possible at the same time.

This does mean that kmalloc.h is now including Kernel/API/POSIX/limits.h
instead of LibC/limits.h, but the scope could be limited a lot more.
Basically every file in the kernel includes kmalloc.h, and needs the
limits.h include for PAGE_SIZE.
2023-01-21 10:43:59 -07:00
Liav A
04221a7533 Kernel: Mark Process::jail() method as const
We really don't want callers of this function to accidentally change
the jail, or even worse - remove the Process from an attached jail.
To ensure this never happens, we can just declare this method as const
so nobody can mutate it this way.
2023-01-07 03:44:59 +03:30
yyny
9ca979846c Kernel: Add sid and pgid to Credentials
There are places in the kernel that would like to have access
to `pgid` credentials in certain circumstances.

I haven't found any use cases for `sid` yet, but `sid` and `pgid` are
both changed with `sys$setpgid`, so it seemed sensical to add it.

In Linux, `man 7 credentials` also mentions both the session id and
process group id, so this isn't unprecedented.
2023-01-03 18:13:11 +01:00
kleines Filmröllchen
a6a439243f Kernel: Turn lock ranks into template parameters
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
  obtain any lock rank just via the template parameter. It was not
  previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
  of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
  readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
  wanted in the first place (or made use of). Locks randomly changing
  their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
  taken in the right order (with the right lock rank checking
  implementation) as rank information is fully statically known.

This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
2023-01-02 18:15:27 -05:00
Timon Kruiper
1da84c2a2c Kernel: Factor out setting Thread entry function
This adds ThreadRegisters::set_entry_function, and also implements it
for aarch64.
2022-12-29 19:32:20 -07:00
Timon Kruiper
21deb603de Kernel/aarch64: Implement stub for asm_signal_trampoline
This get us further into the boot process, since Process::initialize
does not crash anymore.
2022-12-29 19:32:20 -07:00
Liav A
5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
sin-ack
5c1d5ed51d Kernel: Implement Process::custody_for_dirfd
This allows deduplicating a bunch of code that has to work with
POSIX' *at syscall semantics.
2022-12-11 19:55:37 -07:00
Liav A
0bb7c8f4c4 Kernel+SystemServer: Don't hardcode coredump directory path
Instead, allow userspace to decide on the coredump directory path. By
default, SystemServer sets it to the /tmp/coredump directory, but users
can now change this by writing a new path to the sysfs node at
/sys/kernel/variables/coredump_directory, and also to read this node to
check where coredumps are currently generated at.
2022-12-03 05:56:59 -07:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but is still can only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
Liav A
5e062414c1 Kernel: Add support for jails
Our implementation for Jails resembles much of how FreeBSD jails are
working - it's essentially only a matter of using a RefPtr in the
Process class to a Jail object. Then, when we iterate over all processes
in various cases, we could ensure if either the current process is in
jail and therefore should be restricted what is visible in terms of
PID isolation, and also to be able to expose metadata about Jails in
/sys/kernel/jails node (which does not reveal anything to a process
which is in jail).

A lifetime model for the Jail object is currently plain simple - there's
simpy no way to manually delete a Jail object once it was created. Such
feature should be carefully designed to allow safe destruction of a Jail
without the possibility of releasing a process which is in Jail from the
actual jail. Each process which is attached into a Jail cannot leave it
until the end of a Process (i.e. when finalizing a Process). All jails
are kept being referenced in the JailManagement. When a last attached
process is finalized, the Jail is automatically destroyed.
2022-11-05 18:00:58 -06:00
Gunnar Beutner
63a91d6971 Kernel: Add more AARCH64 stubs 2022-10-18 13:08:25 +02:00
Timon Kruiper
9827c11d8b Kernel: Move InterruptDisabler out of Arch directory
The code in this file is not architecture specific, so it can be moved
to the base Kernel directory.
2022-10-17 20:11:31 +02:00
kleines Filmröllchen
1a7d6508e3 Kernel: By default, don't dump regions when a userspace crash happens
There is the DUMP_REGIONS_ON_CRASH debug macro which re-enables this
(old) behavior.
2022-09-24 14:22:09 +02:00
Andreas Kling
cf16b2c8e6 Kernel: Wrap process address spaces in SpinlockProtected
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.

Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as wlel.
2022-08-24 14:57:51 +02:00
Anthony Iacono
ec3d8a7a18 Kernel: Remove unused Process::in_group() 2022-08-23 01:01:48 +02:00
Anthony Iacono
f86b671de2 Kernel: Use Process::credentials() and remove user ID/group ID helpers
Move away from using the group ID/user ID helpers in the process to
allow for us to take advantage of the immutable credentials instead.
2022-08-22 12:46:32 +02:00
Andreas Kling
c3351d4b9f Kernel: Make VirtualFileSystem functions take credentials as input
Instead of getting credentials from Process::current(), we now require
that they be provided as input to the various VFS functions.

This ensures that an atomic set of credentials is used throughout an
entire VFS operation.
2022-08-21 16:02:24 +02:00
Andreas Kling
8ed06ad814 Kernel: Guard Process "protected data" with a spinlock
This ensures that both mutable and immutable access to the protected
data of a process is serialized.

Note that there may still be multiple TOCTOU issues around this, as we
have a bunch of convenience accessors that make it easy to introduce
them. We'll need to audit those as well.
2022-08-21 12:25:14 +02:00
Andreas Kling
728c3fbd14 Kernel: Use RefPtr instead of LockRefPtr for Custody
By protecting all the RefPtr<Custody> objects that may be accessed from
multiple threads at the same time (with spinlocks), we remove the need
for using LockRefPtr<Custody> (which is basically a RefPtr with a
built-in spinlock.)
2022-08-21 12:25:14 +02:00
Andreas Kling
122d7d9533 Kernel: Add Credentials to hold a set of user and group IDs
This patch adds a new object to hold a Process's user credentials:

- UID, EUID, SUID
- GID, EGID, SGID, extra GIDs

Credentials are immutable and child processes initially inherit the
Credentials object from their parent.

Whenever a process changes one or more of its user/group IDs, a new
Credentials object is constructed.

Any code that wants to inspect and act on a set of credentials can now
do so without worrying about data races.
2022-08-20 18:32:50 +02:00
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
kleines Filmröllchen
4314c25cf2 Kernel: Require lock rank for Spinlock construction
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to in the future remove LockRank and
actually annotate the ranks by searching for None.
2022-08-19 20:26:47 -07:00
Brian Gianforcaro
00936e151e Kernel: Make failure to write coredump or perfcore a regular dmesg
This does not need to be a critical dmesg, as the system stays up
it makes more sense for it to be a normal dmesg message.

Luke mentioned this on discord, they really deserve the credit :^)

Reported-by: Luke Wilde <lukew@serenityos.org>
2022-08-10 11:38:18 -04:00
Undefine
97cc33ca47 Everywhere: Make the codebase more architecture aware 2022-07-27 21:46:42 +00:00
Tim Schumacher
e79f0e2ee9 Kernel+LibC: Don't hardcode the maximum signal number everywhere 2022-07-22 10:07:15 -07:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Tim Schumacher
67f352b824 Kernel: Copy signal handlers when forking 2022-07-05 20:58:38 +03:00
Timon Kruiper
a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Luke Wilde
1682b0b6d8 Kernel: Remove big lock from sys$set_coredump_metadata
The only requirement for this syscall is to make
Process::m_coredump_properties SpinlockProtected.
2022-04-09 21:51:16 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Daniel Bertalan
70ccdb300b Kernel: Panic if the init process dies
If init crashes, all other userspace processes exit too, thus rendering
the system unusable. Previously, the kernel would still keep running
even without a userland, showing just a black screen without any
indication of the issue.

We now panic the kernel, which shows a message on the console. In the
case of the CI runners, it shuts down the virtual machine, so we don't
have to wait for the 1 hour timeout if an issue arises with
SystemServer.
2022-03-08 23:30:47 +01:00
Andreas Kling
580d89f093 Kernel: Put Process unveil state in a SpinlockProtected container
This makes path resolution safe to perform without holding the big lock.
2022-03-08 00:19:49 +01:00
Andreas Kling
24f02bd421 Kernel: Put Process's current directory in a SpinlockProtected
Also let's call it "current_directory" instead of "cwd" everywhere.
2022-03-08 00:19:49 +01:00
Ali Mohammad Pur
88d7bf7362 Kernel: Save and restore FPU state on signal dispatch on i386/x86_64 2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
4bd01b7fe9 Kernel: Add support for SA_SIGINFO
We currently don't really populate most of the fields, but that can
wait :^)
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
585054d68b Kernel: Comment the living daylights out of signal trampoline/sigreturn
Mere mortals like myself cannot understand more than two lines of
assembly without a million comments explaining what's happening, so do
that and make sure no one has to go on a wild stack state chase when
hacking on these.
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
cf63447044 Kernel: Move signal handlers from being thread state to process state
POSIX requires that sigaction() and friends set a _process-wide_ signal
handler, so move signal handlers and flags inside Process.
This also fixes a "pid/tid confusion" FIXME, as we can now send the
signal to the process and let that decide which thread should get the
signal (which is the thread with tid==pid, but that's now the Process's
problem).
Note that each thread still retains its signal mask, as that is local to
each thread.
2022-03-04 20:07:05 +01:00