It would be nice to do this in the assembly code, but we have to check
if the feature is available before doing a CLAC, so I've put this in
the C++ code for now.
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
THis patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
We now have these API's in <Kernel/Random.h>:
- get_fast_random_bytes(u8* buffer, size_t buffer_size)
- get_good_random_bytes(u8* buffer, size_t buffer_size)
- get_fast_random<T>()
- get_good_random<T>()
Internally they both use x86 RDRAND if available, otherwise they fall
back to the same LCG we had in RandomDevice all along.
The main purpose of this patch is to give kernel code a way to better
express its needs for random data.
Randomness is something that will require a lot more work, but this is
hopefully a step in the right direction.
This prevents code running outside of kernel mode from using the
following instructions:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register
There's no need for userspace to be able to use these instructions so
let's just disable them to prevent information leakage.
We now refuse to boot on machines that don't support PAE since all
of our paging code depends on it.
Also let's only enable SSE and PGE support if the CPU advertises it.
Instead of having a common entry point and looking at the PIC ISR to
figure out which IRQ we're servicing, just make a separate entryway
for each IRQ that pushes the IRQ number and jumps to a common routine.
This fixes a weird issue where incoming network packets would sometimes
cause the mouse to stop working. I didn't track it down further than
realizing we were sometimes EOI'ing the wrong IRQ.
Now that we have proper wait queues to drive waiter wakeup, we can use
the wake actions to break out of the scheduler's idle loop when we've
got a thread to run.
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.
If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.
Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
The SysV ABI says that the DF flag should be clear on function entry.
That means we have to clear it when jumping into the kernel from some
random userspace context.
After we clear the FPU state in a thread when it uses the FPU for the
first time, we also save the clean slate in the thread's FPU state
buffer. When we're doing that, let's write through current->fpu_state()
just to make it clear what's going on.
It was actually safe, since we'd just overwritten the g_last_fpu_thread
pointer anyway, but this patch improves the communication of intent.
Spotted by Bryan Steele, thanks!
Cloned threads (basically, forked processes) inherit the complete FPU
state of their origin thread. There was a bug in the lazy FPU state
save/restore mechanism where a cloned thread would believe it had a
buffer full of valid FPU state (because the inherited flag said so)
but the origin thread had never actually copied any FPU state into it.
This patch fixes that by forcing out an FPU state save after doing
the initial FPU initialization (FNINIT) in a thread. :^)
Now programs can catch the SIGSEGV signal when they segfault.
This commit also introduced the send_urgent_signal_to_self method,
which is needed to send signals to a thread when handling exceptions
caused by the same thread.
Added the exception_code field to RegisterDump, removing the need
for RegisterDumpWithExceptionCode. To accomplish this, I had to
push a dummy exception code during some interrupt entries to properly
pad out the RegisterDump. Note that we also needed to change some code
in sys$sigreturn to deal with the new RegisterDump layout.
If we receive an IRQ while the idle task is running, prevent it from
re-halting the CPU after the IRQ handler returns.
Instead have the idle task yield to the scheduler, so we can see if
the IRQ has unblocked something.
Due to the changes in signal handling m_kernel_stack_for_signal_handler_region
and m_signal_stack_user_region are no longer necessary, and so, have been
removed. I've also removed the similarly reduntant m_tss_to_resume_kernel.
We'll now try to detect crashes that were due to dereferencing nullptr,
uninitialized malloc() memory, or recently free()'d memory.
It's not perfect but I think it's pretty good. :^)
Also added some color to the most important parts of the crash log,
and added some more modes to /bin/crash for exercising this code.
Fixes#243.