This was another vestige from a long time ago, when exiting a thread
would mutate global data structures that were only protected by the
interrupt flag.
This was necessary in the past when crash handling would modify
various global things, but all that stuff is long gone, so we can
simplify crash handling by leaving the interrupt flag alone.
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr).
There's a lot of work to do here before the kernel will even compile.
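For illustration, the gist of the FlatPtr approach (the real type
comes from AK; this sketch just shows the idea):

    #include <cstdint>

    // Hedged sketch; the real FlatPtr definition lives in AK.
    using FlatPtr = std::uintptr_t; // always exactly pointer-sized

    FlatPtr to_address(void const* ptr)
    {
        // A hard-coded cast to u32 would truncate on x86_64;
        // FlatPtr keeps this correct in 32-bit and 64-bit builds.
        return reinterpret_cast<FlatPtr>(ptr);
    }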
Let's help our future selves find this problem sooner next time it
happens. Hopefully we'll come up with a nicer loader before then,
but who knows. :^)
Instead of writing to the userspace utsname struct one field at a time,
build up a utsname on the kernel stack and copy it out to userspace
once it's finished. This is both simpler and gives us validity
checking built in for free.
Found by KUBSAN! :^)
Fixes #5499.
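A minimal sketch of the new shape (syscall signature and the
copy_to_user helper are approximated, not the exact Serenity code):

    // Hedged sketch: fill a utsname on the kernel stack, then copy it
    // out to userspace in a single, validity-checked operation.
    int Process::sys$uname(utsname* user_buf)
    {
        utsname buf {};
        strlcpy(buf.sysname, "SerenityOS", sizeof(buf.sysname));
        // ... fill in the remaining fields the same way ...
        if (!copy_to_user(user_buf, &buf))
            return -EFAULT;
        return 0;
    }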
For some reason I don't yet understand, building the kernel with -O2
produces a way-too-large kernel on some people's systems.
Since there are some really nice performance benefits from -O2 in
userspace, let's compromise: build Userland with -O2, but put the
Kernel back into the -Os box for now.
Due to the non-standard way the boot assembler code is linked into
the kernel (not an actual dependency, but linked via the linker.ld
script), both make and ninja weren't re-linking the kernel when boot.S
was changed. Re-linking should theoretically work, since we use the
CMake `add_dependencies(..)` directive to express a manual dependency
on boot from Kernel, but something is obviously broken in CMake.
We can work around that with a hack, which forces a dependency on
a file we know will always exist in the kernel (init.cpp). So if
boot.S is rebuilt, then init.cpp is forced to be rebuilt, and then
we re-link the kernel. init.cpp is also relatively small, so it
compiles fast.
We were only 448 KiB away from filling up the old slot size we reserve
for the kernel above the 3 GiB mark. This expands the slot to 16 MiB,
which allows us to continue booting the kernel until somebody takes
the time to improve our loader.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
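For illustration, the defining property of VERIFY is that it's
compiled in unconditionally; a rough sketch, where assertion_failed()
stands in for the kernel's real handler in AK/Assertions.h:

    // Hedged sketch, not the exact AK definition:
    [[noreturn]] void assertion_failed(char const* expr, char const* file, int line);

    #define VERIFY(expr)                                     \
        do {                                                 \
            if (!static_cast<bool>(expr))                    \
                assertion_failed(#expr, __FILE__, __LINE__); \
        } while (0)

    #define VERIFY_NOT_REACHED() VERIFY(false)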
With the kernel command line issue fixed, we can now enable these
KUBSAN options without getting triple faults on startup:
* alignment
* null
* pointer-overflow
When building the kernel with -O2, we somehow ended up with the kernel
command line outside of the lower 8 MB of physical memory. Since we don't
map that area in our initial page table setup, we would triple fault
when trying to parse the command line.
This patch sidesteps the issue by copying (the first 4 KB of) the
kernel command line to a buffer in a known-safe location at boot.
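A sketch of the workaround (buffer and function names are made up for
illustration, assuming a strlcpy-style helper in the kernel):

    // Hedged sketch: copy the bootloader-provided command line into a
    // statically allocated buffer inside the kernel image, which is
    // guaranteed to be mapped, before anything tries to parse it.
    static char s_cmd_line_buffer[4096];

    void copy_kernel_command_line(char const* cmd_line_from_bootloader)
    {
        strlcpy(s_cmd_line_buffer, cmd_line_from_bootloader,
                sizeof(s_cmd_line_buffer));
    }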
Clangd (CLion) was choking on some of the -fsanitize options, and since
we're not building the kernel with Clang anyway, let's just disable
the options for non-GCC compilers for now.
- We've already computed the number of fds * sizeof(pollfd), so use it
instead of needlessly doing it again.
- Use fds_copy.data() instead of taking the address of the vector's
  first element by indexing, as sketched below.
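Roughly what the two cleanups look like together (variable and helper
names approximated):

    // Hedged sketch of the copy-in path after the cleanup:
    size_t fds_size = nfds * sizeof(pollfd);   // computed once, reused
    Vector<pollfd> fds_copy;
    fds_copy.resize(nfds);
    if (!copy_from_user(fds_copy.data(), user_fds, fds_size)) // was &fds_copy[0]
        return -EFAULT;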
The `default_signal_action(u8 signal)` function already has the
full mapping. The only caveat is that the Thread constructor and the
clear_signals() method now need to do the work of resetting the
m_signal_action_data array, instead of relying on the previous logic
in set_default_signal_dispositions().
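In sketch form, the caveat amounts to something like:

    // Hedged sketch: reset each entry so signals fall back to
    // default_signal_action(signal) the next time they're looked up.
    void Thread::clear_signals()
    {
        for (auto& action : m_signal_action_data)
            action = {};
    }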
This is a new promise that guards access to mmap() with MAP_FIXED.
Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.
None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
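Usage-wise, a program that legitimately needs fixed-address mappings
now has to ask for them up front; a hedged sketch, assuming the
promise is spelled "map_fixed":

    #include <serenity.h>
    #include <sys/mman.h>

    // Hedged sketch: without the promise, the mmap() below would be
    // denied (or the process killed, per pledge semantics).
    int main()
    {
        if (pledge("stdio map_fixed", nullptr) < 0)
            return 1;
        void* region = mmap((void*)0x20000000, 0x1000,
                            PROT_READ | PROT_WRITE,
                            MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
        return region == MAP_FAILED ? 1 : 0;
    }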
We want to make sure these functions actually do get unmapped. If they
were inlined somewhere, the inlined version(s) would remain mapped.
Thanks to "thislooksfun" for the suggestion! :^)
There's no real system here; I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.
With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
You can now declare functions with UNMAP_AFTER_INIT and they'll get
segregated into a separate kernel section that gets completely
unmapped at the end of initialization.
This can be used for anything we don't need to call once we've booted
into userspace.
There are two nice things about this mechanism:
- It allows us to free up entire pages of memory for other use.
(Note that this patch does not actually make use of the freed
pages yet, but in the future we totally could!)
- It allows us to get rid of obviously dangerous gadgets like
write-to-CR0 and write-to-CR4 which are very useful for an attacker
trying to disable SMAP/SMEP/etc.
I've also made sure to include a helpful panic message in case you
hit a kernel crash because of this protection. :^)
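Mechanically, it boils down to a section attribute plus a region in
the linker script that gets unmapped when init finishes; a rough
sketch (setup_acpi() is a made-up example, and per the earlier note,
the real macro also has to keep these functions from being inlined):

    // Hedged sketch; the real macro lives in the kernel's headers:
    #define UNMAP_AFTER_INIT __attribute__((section(".unmap_after_init")))

    UNMAP_AFTER_INIT void setup_acpi()
    {
        // Only ever runs during boot. Once init is over, the pages
        // backing this function are unmapped, and calling it panics
        // with the helpful message mentioned above.
    }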
If we have a tracer process waiting for us to exec, we need to release
the ptrace lock before stopping ourselves, since otherwise the tracer
will block forever on the lock.
Fixes #5409.
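A sketch of the ordering fix (names approximated):

    // Hedged sketch: drop the lock before stopping, so the waiting
    // tracer can acquire it while we're stopped.
    if (tracer_is_waiting_for_exec) {
        ptrace_locker.unlock();
        Thread::current()->send_urgent_signal_to_self(SIGSTOP);
    }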
This enabled trivial ASLR bypass for non-dumpable programs by simply
opening /proc/PID/vm before exec'ing.
We now hold the target process's ptrace lock across the refresh/write
operations, and deny access if the process is non-dumpable. The lock
is necessary to prevent a TOCTOU race on Process::is_dumpable() while
the target is exec'ing.
Fixes #5270.
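In sketch form (lock and guard names approximated):

    // Hedged sketch: hold the target's ptrace lock across both the
    // dumpability check and the read, closing the TOCTOU window where
    // an exec() could flip is_dumpable() under us.
    Locker ptrace_locker(process.ptrace_lock());
    if (!process.is_dumpable())
        return EPERM;
    // ... refresh and serve the /proc/PID/vm contents while still
    //     holding the lock ...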