In order to ensure a specific owner and mode when the local socket
filesystem endpoint is instantiated, we need to be able to call
fchmod() and fchown() on a socket fd between socket() and bind().
This is because until we call bind(), there is no filesystem inode
for the socket yet.
We now have these API's in <Kernel/Random.h>:
- get_fast_random_bytes(u8* buffer, size_t buffer_size)
- get_good_random_bytes(u8* buffer, size_t buffer_size)
- get_fast_random<T>()
- get_good_random<T>()
Internally they both use x86 RDRAND if available, otherwise they fall
back to the same LCG we had in RandomDevice all along.
The main purpose of this patch is to give kernel code a way to better
express its needs for random data.
Randomness is something that will require a lot more work, but this is
hopefully a step in the right direction.
It was previously possible to write to read-only file descriptors,
and read from write-only file descriptors.
All FileDescription objects now start out non-readable + non-writable,
and whoever is creating them has to "manually" enable reading/writing
by calling set_readable() and/or set_writable() on them.
This code never worked, as was never used for anything. We can build
a much better SHM implementation on top of TmpFS or similar when we
get to the point when we need one.
Split a region into two/three if the desired mprotect range is a strict
subset of an existing region. We can then set the access bits on a new
region that is just our desired range and add both the new
desired subregion and the leftovers back to our page tables.
We now validate the full range of userspace memory passed into syscalls
instead of just checking that the first and last byte of the memory are
in process-owned regions.
This fixes an issue where it was possible to avoid rejection of invalid
addresses that sat between two valid ones, simply by passing a valid
address and a size large enough to put the end of the range at another
valid address.
I added a little test utility that tries to provoke EFAULT in various
ways to help verify this. I'm sure we can think of more ways to test
this but it's at least a start. :^)
Thanks to mozjag for pointing out that this code was still lacking!
Incidentally this also makes backtraces work again.
Fixes#989.
All threads were running with iomapbase=0 in their TSS, which the CPU
interprets as "there's an I/O permission bitmap starting at offset 0
into my TSS".
Because of that, any bits that were 1 inside the TSS would allow the
thread to execute I/O instructions on the port with that bit index.
Fix this by always setting the iomapbase to sizeof(TSS32), and also
setting the TSS descriptor's limit to sizeof(TSS32), effectively making
the I/O permissions bitmap zero-length.
This should make it no longer possible to do I/O from userspace. :^)
This patch introduces a syscall:
int set_thread_boost(int tid, int amount)
You can use this to add a permanent boost value to the effective thread
priority of any thread with your UID (or any thread in the system if
you are the superuser.)
This is quite crude, but opens up some interesting opportunities. :^)
Threads now have numeric priorities with a base priority in the 1-99
range.
Whenever a runnable thread is *not* scheduled, its effective priority
is incremented by 1. This is tracked in Thread::m_extra_priority.
The effective priority of a thread is m_priority + m_extra_priority.
When a runnable thread *is* scheduled, its m_extra_priority is reset to
zero and the effective priority returns to base.
This means that lower-priority threads will always eventually get
scheduled to run, once its effective priority becomes high enough to
exceed the base priority of threads "above" it.
The previous values for ThreadPriority (Low, Normal and High) are now
replaced as follows:
Low -> 10
Normal -> 30
High -> 50
In other words, it will take 20 ticks for a "Low" priority thread to
get to "Normal" effective priority, and another 20 to reach "High".
This is not perfect, and I've used some quite naive data structures,
but I think the mechanism will allow us to build various new and
interesting optimizations, and we can figure out better data structures
later on. :^)
If an mmap fails to allocate a region, but the addr passed in was
non-zero, non-fixed mmaps should attempt to allocate at any available
virtual address.
This is memory that's loaded from an inode (file) but not modified in
memory, so still identical to what's on disk. This kind of memory can
be freed and reloaded transparently from disk if needed.
Dirty private memory is all memory in non-inode-backed mappings that's
process-private, meaning it's not shared with any other process.
This patch exposes that number via SystemMonitor, giving us an idea of
how much memory each process is responsible for all on its own.
This patch implements a simple version of the futex (fast userspace
mutex) API in the kernel and uses it to make the pthread_cond_t API's
block instead of busily sched_yield().
An arbitrary userspace address is passed to the kernel as a "token"
that identifies the futex and you can then FUTEX_WAIT and FUTEX_WAKE
that specific userspace address.
FUTEX_WAIT corresponds to pthread_cond_wait() and FUTEX_WAKE is used
for pthread_cond_signal() and pthread_cond_broadcast().
I'm pretty sure I'm missing something in this implementation, but it's
hopefully okay for a start. :^)
This is a little strange, but it's how I understand things should work.
The first thread in a new process now has TID == PID.
Additional threads subsequently spawned in that process all have unique
TID's generated by the PID allocator. TIDs are now globally unique.
The idea of all processes reliably having a main thread was nice in
some ways, but cumbersome in others. More importantly, it didn't match
up with POSIX thread semantics, so let's move away from it.
This thread gets rid of Process::main_thread() and you now we just have
a bunch of Thread objects floating around each Process.
When the finalizer nukes the last Thread in a Process, it will also
tear down the Process.
There's a bunch of more things to fix around this, but this is where we
get started :^)