This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.
This patch doesn't attempt to address that of course. That's work for
our future selves.
The default template argument is only used in one place, and it
looks like it was probably just an oversight. The rest of the Kernel
code all uses u8 as the type. So lets make that the default and remove
the unused template argument, as there doesn't seem to be a reason to
allow the size to be customizable.
This commit moves the KResult and KResultOr objects to Kernel/API to
signify that they may now be freely used by userspace code at points
where a syscall-related error result is to be expected. It also exposes
KResult and KResultOr to the global namespace to make it nicer to use
for userspace code.
This is the idiomatic way to declare type aliases in modern C++.
Flagged by Sonar Cloud as a "Code Smell", but I happen to agree
with this particular one. :^)
Previous implementation sometimes didn't release the key after pressing
and holding shift due to repeating key updates when holding keys. This
meant repeating updates would set/unset `m_both_shift_keys_pressed`
repeatedly, sometimes resulting in shift still being considered pressed
even after you released it.
Simplify left and right shift key pressed logic by tracking both key
states separately and always updating modifiers based on them.
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.
This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
Two new ioctl requests are used to get and set the sample rate of the
sound card. The SB16 device keeps track of the sample rate separately,
because I don't want to figure out how to read the sample rate from the
device; it's easier that way.
The soundcard write doesn't set the sample rate to 44100 Hz every time
anymore, as we want to change it externally.
Now that the old PCI::Device was removed, we can complete the PCI
changes by making the PCI::DeviceController to be named PCI::Device.
Really the entire purpose and the distinction between the two was about
interrupts, but since this is no longer a problem, just rename it to
simplify things further.
I created this class a long time ago just to be able to quickly make a
PCI device to also represent an interrupt handler (because PCI devices
have this capability for most devices).
Then after a while I introduced the PCI::DeviceController, which is
really almost the same thing (a PCI device class that has Address member
in it), but is not tied to interrupts so it can have no interrupts, or
spawn interrupt handlers however it wants to seems fit.
However I decided it's time to say goodbye for this class for
a couple of reasons:
1. It made a whole bunch of weird patterns where you had a PCI::Device
and a PCI::DeviceController being used in the topic of implementation,
where originally, they meant to be used mutually exclusively (you
can't and really don't want to use both).
2. We can really make all the classes that inherit from PCI::Device
to inherit from IRQHandler at this point. Later on, when we have MSI
interrupts support, we can go further and untie things even more.
3. It makes it possible to simplify the VirtIO implementation to a great
extent. While this commit almost doesn't change it, future changes
can untangle some complexity in the VirtIO code.
For UHCIController, E1000NetworkAdapter, NE2000NetworkAdapter,
RTL8139NetworkAdapter, RTL8168NetworkAdapter, E1000ENetworkAdapter we
are simply making them to inherit the IRQHandler. This makes some sense,
because the first 3 devices will never support anything besides IRQs.
For the last 2, they might have MSI support, so when we start to utilize
those, we might need to untie these classes from IRQHandler and spawn
IRQHandler(s) or MSIHandler(s) as needed.
The VirtIODevice class is also a case where we currently need to use
both PCI::DeviceController and IRQHandler classes as parents, but it
could also be untied from the latter.
This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
This makes for nicer handling of errors compared to checking whether a
RefPtr is null. Additionally, this will give way to return different
types of errors in the future.
...and also RangeAllocator => VirtualRangeAllocator.
This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
Now that all KResult and KResultOr are used consistently throughout the
kernel, it's no longer necessary to return negative error codes.
However, we were still doing that in some places, so let's fix all those
(bugs) by removing the minuses. :^)
It's easy to forget the responsibility of validating and safely copying
kernel parameters in code that is far away from syscalls. ioctl's are
one such example, and bugs there are just as dangerous as at the root
syscall level.
To avoid this case, utilize the AK::Userspace<T> template in the ioctl
kernel interface so that implementors have no choice but to properly
validate and copy ioctl pointer arguments.
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage guided fuzzers like AFL and LibFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmaped from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^) :^)
We don't need to have a dedicated API for creating a VMObject with a
single page, the multi-page API option works in all cases.
Also make the API take a Span<NonnullRefPtr<PhysicalPage>> instead of
a NonnullRefPtrVector<PhysicalPage>.
These small changes fix the remaining warnings that come up during
kernel compilation with Clang. These specific fixes were for benign
things: unused lambda captures and braces around scalar initializers.
The `#pragma GCC diagnostic` part is needed because the class has
virtual methods with the same name but different arguments, and Clang
tries to warn us that we are not actually overriding anything with
these.
Weirdly enough, GCC does not seem to care.
This hack allows self-test mode run-tests-and-shutdown.sh to give
TestProcFs a stat(2)-able /proc/self/fd/0. For some reason, when
stdin is a SerialDevice, /proc/self/fd/0 will be a symlink to the device
as expected, but, calling realpath or stat on /proc/self/fd/0 will error
out. realpath will give the string from Device::absolute_path() which
would be something like "device:4,64 (SerialDevice)". When VFS is trying
to resolve_path so that we can stat the file, it would bail out on this
fake-y path.
Change the fake path (that doesn't show up when you ls a device, nor
when checking the devices tab in SystemMonitor) from the major/minor
device number and class_name() to /dev/device_name(). There's probably
a very hairy yak standing behind this issue that was only discovered due
to the ProcFS rework.
The new ProcFS design consists of two main parts:
1. The representative ProcFS class, which is derived from the FS class.
The ProcFS and its inodes are much more lean - merely 3 classes to
represent the common type of inodes - regular files, symbolic links and
directories. They're backed by a ProcFSExposedComponent object, which
is responsible for the functional operation behind the scenes.
2. The backend of the ProcFS - the ProcFSComponentsRegistrar class
and all derived classes from the ProcFSExposedComponent class. These
together form the entire backend and handle all the functions you can
expect from the ProcFS.
The ProcFSExposedComponent derived classes split to 3 types in the
manner of lifetime in the kernel:
1. Persistent objects - this category includes all basic objects, like
the root folder, /proc/bus folder, main blob files in the root folders,
etc. These objects are persistent and cannot die ever.
2. Semi-persistent objects - this category includes all PID folders,
and subdirectories to the PID folders. It also includes exposed objects
like the unveil JSON'ed blob. These object are persistent as long as the
the responsible process they represent is still alive.
3. Dynamic objects - this category includes files in the subdirectories
of a PID folder, like /proc/PID/fd/* or /proc/PID/stacks/*. Essentially,
these objects are always created dynamically and when no longer in need
after being used, they're deallocated.
Nevertheless, the new allocated backend objects and inodes try to use
the same InodeIndex if possible - this might change only when a thread
dies and a new thread is born with a new thread stack, or when a file
descriptor is closed and a new one within the same file descriptor
number is opened. This is needed to actually be able to do something
useful with these objects.
The new design assures that many ProcFS instances can be used at once,
with one backend for usage for all instances.
This does the exact thing as `adopt_ref`, which is a recent addition to
AK.
Note that pointers returned by a bare new (without `nothrow`) are
guaranteed not to return null, so they can safely be converted into
references.
This commit converts naked `new`s to `AK::try_make` and `AK::try_create`
wherever possible. If the called constructor is private, this can not be
done, so we instead now use the standard-defined and compiler-agnostic
`new (nothrow)`.
Instead, try to create the device objects in separate static methods,
and if we fail for some odd reason to allocate memory for such devices,
just panic with that reason.
We now store the device descriptor obtained from the device
during enumeration in the device's object in memory instead
of exposing all of the different members contained within it.
If we are in a shared interrupt handler, the called handlers might
indicate it was not their interrupt, so we should not increment the
call counter of these handlers.
These are the actual structures that allow USB to work (i.e the ones
actually defined in the specification). This should provide us enough
of a baseline implementation that we can build on to support
different types of USB device.
The changes in commit 20743e8 removed the s_max_virtual_consoles
constant and hardcoded the number of consoles to 4. But in
PS2KeyboardDevice the keyboard shortcuts for switching to consoles were
hardcoded to 6.
I reintroduced the constant and added it in both places.
Previously reads and writes to /dev/zero, /dev/full, /dev/null and
/dev/random were limited to 4096 bytes.
This removes that restriction so that users can enjoy more zero bytes
in their buffers.
Instead of processing the input after receiving an IRQ, we shift the
responsibility to the io work queue to handle this for us, so if a page
fault occurs when trying to switch the VirtualConsole, the kernel can
handle that.
Problem:
- `static` variables consume memory and sometimes are less
optimizable.
- `static const` variables can be `constexpr`, usually.
- `static` function-local variables require an initialization check
every time the function is run.
Solution:
- If a global `static` variable is only used in a single function then
move it into the function and make it non-`static` and `constexpr`.
- Make all global `static` variables `constexpr` instead of `const`.
- Change function-local `static const[expr]` variables to be just
`constexpr`.
There is a slight race condition in our implementation of write().
We call File::can_write() before attempting to write to it (blocking if
it returns false). If it returns true, we assume that we can write to
the file, and our code assumes that File::write() cannot possibly fail
by being blocked. There is, however, the rare case where another process
writes to the file and prevents further writes in between the call to
Files::can_write() and File::write() in the first process. This would
result in the first process calling File::write() when it cannot be
written to.
We fix this by adding a mechanism for File::can_write() to signal that
it was blocked, making it the responsibilty of File::write() to check
whether it can write and then finally making sys$write() check if the
write failed due to it being blocked.
This commit adds support for initializing multiple serial ports per
PCI board, as well as initializing multiple different pci serial boards
Currently we just choose the first PCI serial port seen as the debug
port, but this should probably be made configurable some how in the
future.
This simple driver simply finds a device in a device definitions list
and then sets up a SerialDevice instance based on the definition.
The driver currently only supports "WCH CH382 2S" pci serial boards,
as that is the only device available for me to test with, but most
other pci serial devices should be as easily addable as adding a
board_definitions entry.
The line control option bits (parity, stop bits, word length) were
masked and then combined incorrectly, resulting in them not being set
when requested.
These were accidentally the wrong way around (LSB part of the divisor
into the MSB register, MSB part of the divisor into the LSB register)
as can be seen in the specification (and in the comments themselves)
As we removed the support of VBE modesetting that was done by GRUB early
on boot, we need to determine if we can modeset the resolution with our
drivers, and if not, we should enable text mode and ensure that
SystemServer knows about it too.
Also, SystemServer should first check if there's a framebuffer device
node, which is an indication that text mode was not even if it was
requested. Then, if it doesn't find it, it should check what boot_mode
argument the user specified (in case it's self-test). This way if we
try to use bochs-display device (which is not VGA compatible) and
request a text mode, it will not honor the request and will continue
with graphical mode.
Also try to print critical messages with mininum memory allocations
possible.
In LibVT, We make the implementation flexible for kernel-specific
methods that are implemented in ConsoleImpl class.
This new subsystem is replacing the old code that was used to
create device nodes of framebuffer devices in /dev.
This subsystem includes for now 3 roles:
1. GraphicsManagement singleton object that is used in the boot
process to enumerate and initialize display devices.
2. GraphicsDevice(s) that are used to control the display adapter.
3. FramebufferDevice(s) that are used to control the device node in
/dev.
For now, we support the Bochs display adapter and any other
generic VGA compatible adapter that was configured by the boot
loader to a known and fixed resolution.
Two improvements in the Bochs display adapter code are that
we can support native bochs-display device (this device doesn't
expose any VGA capabilities) and also that we use the MMIO region,
to configure the device, instead of setting IO ports for such tasks.
This device is a graphics display device that is not supporting
VGA functionality.
Therefore, it exposes a MMIO region to configure it, so we use that
region to set the framebuffer resolution.
The following KUBSAN crash on startup was reported on discord:
```
UHCI: Started
KUBSAN: reference binding to null pointer of type struct UHCIController
KUBSAN: at ../../Kernel/Devices/USB/UHCIController.cpp, line 67
```
After inspecting the code, it became clear that there's a window of time
where the kernel task which monitors the UHCI port can startup and start
executing before the UHCIController constructor completes. This leaves
the singleton pointing to nullptr, thus in the duration of this race
window the "UHCI port proc" thread will go an and de-reference the null
pointer when trying to read for status changes on the UHCI root ports.
Reported-by: @stelar7
Reported-by: @bcoles
Fixes: #6154
AnonymousVMObject::create_with_physical_page(s) can't be NonnullRefPtr
as it allocates internally. Fixing the API then surfaced an issue in
ScatterGatherList, where the code was attempting to create an
AnonymousVMObject in the constructor which will not be observable
during OOM.
Fix all of these issues and start propagating errors at the callers
of the AnonymousVMObject and ScatterGatherList APis.
We use a global setting to determine if Caps Lock should be remapped to
Control because we don't care how keyboard events come in, just that they
should be massaged into different scan codes.
The `proc` filesystem is able to manipulate this global variable using
the `sysctl` utility like so:
```
# sysctl caps_lock_to_ctrl=1
```
Our current implementation does not work in the special case in which
both shift keys are pressed, and then only one of the keys is released,
as this would result in writing lower case letters, instead of the
expected upper case letters.
This commit fixes that by keeping track of the amount of shift keys
that are pressed (instead of if any are at all), and only switching to
the unshifted keymap once all of them are released.
We had some inconsistencies before:
- Sometimes "The", sometimes "the"
- Sometimes trailing ".", sometimes no trailing "."
I picked the most common one (lowecase "the", trailing ".") and applied
it to all copyright headers.
By using the exact same string everywhere we can ensure nothing gets
missed during a global search (and replace), and that these
inconsistencies are not spread any further (as copyright headers are
commonly copied to new files).
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The previous implementation could allocate on insertion into the completed / pending
sub request vectors. There's no reason these can't be intrusive lists instead.
This is a very minor step towards improving the ability to handle OOM, as tracked by #6369
It might also help improve performance on the IO path in certain situations.
I'll benchmark that later.
Pressing this combo will dump a list of all threads and their state
to the debug console.
This might be useful to figure out why the system is not responding.
Helps with bare metal debugging, as we can't be sure our implementation
will work with a given machine.
As reported by someone on Discord, their machine hangs when we attempt
the dummy transfer.
The first one is for disabling the PS2 controller, the other one is for
disabling physical storage enumeration.
We can't be sure any machine will work with our implementation,
therefore this will help us to test more machines.
The end goal of this commit is to allow to boot on bare metal with no
PS/2 device connected to the system. It turned out that the original
code relied on the existence of the PS/2 keyboard, so VirtualConsole
called it even though ACPI indicated the there's no i8042 controller on
my real machine because I didn't plug any PS/2 device.
The code is much more flexible, so adding HID support for other type of
hardware (e.g. USB HID) could be much simpler.
Briefly describing the change, we have a new singleton called
HIDManagement, which is responsible to initialize the i8042 controller
if exists, and to enumerate its devices. I also abstracted a bit
things, so now every Human interface device is represented with the
HIDDevice class. Then, there are 2 types of it - the MouseDevice and
KeyboardDevice classes; both are responsible to handle the interface in
the DevFS.
PS2KeyboardDevice, PS2MouseDevice and VMWareMouseDevice classes are
responsible for handling the hardware-specific interface they are
assigned to. Therefore, they are inheriting from the IRQHandler class.
We can't use deferred functions for anything that may require preemption,
such as copying from/to user or accessing the disk. For those purposes
we should use a work queue, which is essentially a kernel thread that
may be preempted or blocked.
Alot of code is shared between i386/i686/x86 and x86_64
and a lot probably will be used for compatability modes.
So we start by moving the headers into one Directory.
We will probalby be able to move some cpp files aswell.
Previously all of the CommandLine parsing was spread out around the
Kernel. Instead move it all into the Kernel CommandLine class, and
expose a strongly typed API for querying the state of options.
This commit is very invasive, because Thread likes to take a pointer and write
to it. This means that translating between timespec/timeval/Time would have been
more difficult than just changing everything that hands a raw pointer to Thread,
in bulk.
This may seem like a no-op change, however it shrinks down the Kernel by a bit:
.text -432
.unmap_after_init -60
.data -480
.debug_info -673
.debug_aranges 8
.debug_ranges -232
.debug_line -558
.debug_str -308
.debug_frame -40
With '= default', the compiler can do more inlining, hence the savings.
I intentionally omitted some opportunities for '= default', because they
would increase the Kernel size.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
There's no real system here, I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.
With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
In preparation for marking BlockingResult [[nodiscard]], there are a few
places that perform infinite waits, which we never observe the result of
the wait. Instead of suppressing them, add an alternate function which
returns void when performing and infinite wait.
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
This patch adds Space, a class representing a process's address space.
- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.
This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.
Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.
Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
Because it was 'static const' and also shared with userland programs,
the default keymap was defined in multiple places. This commit should
save several kilobytes! :^)
In ab14b0ac64, mmap was changed so that
the size of the region is aligned before it was passed to the device
driver. The previous logic would assert when the framebuffer size was
not a multiple of the page size. I've also taken the liberty of
returning an error on mmap failure rather than asserting.
Instead of letting each File subclass do range allocation in their
mmap() override, do it up front in sys$mmap().
This makes us honor alignment requests for file-backed memory mappings
and simplifies the code somwhat.
This was done with the help of several scripts, I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
Besides removing the monolithic DevFSDeviceInode::determine_name()
method, being able to determine a device's name inside the /dev
hierarchy outside of DevFS has its uses.
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.
There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
Problem:
- The implementation of `find` is coupled to the implementation of
`SinglyLinkedList`.
Solution:
- Decouple the implementation of `find` from the class by using a
generic `find` algorithm.
Problem:
- The implementation of `find` is coupled to the implementation of
`DoublyLinkedList`.
- `append` and `prepend` are implemented multiple times so that
r-value references can be moved from into the new node. This is
probably not called very often because a pr-value or x-value needs
to be used here.
Solution:
- Decouple the implementation of `find` from the class by using a
generic `find` algorithm.
- Make `append` and `prepend` be function templates so that they can
have binding references which can be forwarded.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
We can now test a _very_ basic transaction via `do_debug_transfer()`.
This function merely attaches some TDs to the LSCTRL queue head
and points some input and output buffers. We then sense an interrupt
with USBSTS value of 1, meaning Interrupt On Completion
(of the transaction). At this point, the input buffer is filled with
some data.
According the USB spec/UHCI datasheet (as well as the Linux and
BSD source code), if we receive an IRQ and USBSTS is 0, then
the IRQ does not belong to us and we should immediately jump
out of the handler.
We can now read/write to the two root ports exposed to the
UHCI controller, and detect when a device is plugged in or
out via a kernel process that constantly scans the port
for any changes. This is very basic, but is a bit of fun to see
the kernel detecting hardware on the fly :^)
Implemented both Queue Heads and Transfer Descriptors. These
are required to actually perform USB transactions. The UHCI
driver sets up a pool of these that can be allocated when we
need them. It seems some drivers have these statically
allocated, so it might be worth looking into that, but
for now, the simple way seems to be to allocate them on
the fly as we need them, and then release them.
It seems that not setting the framelist address register
was causing the entire system to lock up as it generated an insane
interrupt storm in the IRQ handler for the UHCI controller.
We now allocate a 4KiB aligned page via
`MemoryManager::allocate_supervisor_physical_page()` and set every
value to 1. In effect, this creates a framelist with each entry
being a "TERMINATE" entry in which the controller stalls until its'
1mS time slice is up.
Some more registers have also been set for consistency, though it
seems like this don't need to be set explicitly in software.
Before this change, we would sometimes map a region into the address
space with !is_shared(), and then moments later call set_shared(true).
I found this very confusing while debugging, so this patch makes us pass
the initial shared flag to the Region constructor, ensuring that it's in
the correct state by the time we first map the region.
This new subsystem is somewhat replacing the IDE disk code we had with a
new flexible design.
StorageDevice is a generic class that represent a generic storage
device. It is meant that specific storage hardware will override the
interface. StorageController is a generic class that represent
a storage controller that can be found in a machine.
The IDEController class governs two IDEChannels. An IDEChannel is
responsible to manage the master & slave devices of the channel,
therefore an IDEChannel is an IRQHandler.
Fix some problems with join blocks where the joining thread block
condition was added twice, which lead to a crash when trying to
unblock that condition a second time.
Deferred block condition evaluation by File objects were also not
properly keeping the File object alive, which lead to some random
crashes and corruption problems.
Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.
Also, deliver signals even if there isn't going to be a context switch
to another thread.
Fixes#4336 and #4330
This makes the Scheduler a lot leaner by not having to evaluate
block conditions every time it is invoked. Instead evaluate them as
the states change, and unblock threads at that point.
This also implements some more waitid/waitpid/wait features and
behavior. For example, WUNTRACED and WNOWAIT are now supported. And
wait will now not return EINTR when SIGCHLD is delivered at the
same time.
Use the TimerQueue to expire blocking operations, which is one less thing
the Scheduler needs to check on every iteration.
Also, add a BlockTimeout class that will automatically handle relative or
absolute timeouts as well as overriding timeouts (e.g. socket timeouts)
more consistently.
Also, rework the TimerQueue class to be able to fire events from
any processor, which requires Timer to be RefCounted. Also allow
creating id-less timers for use by blocking operations.
We won't be receiving full PS/2 mouse packets when the VMWareBackdoor
absolute mouse mode is enabled. So, read just one byte every time
and retrieve the latest mouse packet from VMWareBackdoor immediately.
Fixes#4086
This allows issuing asynchronous requests for devices and waiting
on the completion of the request. The requests can cascade into
multiple sub-requests.
Since IRQs may complete at any time, if the current process is no
longer the same that started the process, we need to swich the
paging context before accessing user buffers.
Change the PATA driver to use this model.
Rework the PS/2 keyboard and mouse drivers to use a common 8042
controller driver. Also, reset and reconfigure the 8042 controller
as they are not guaranteed to be in the state that we expect.
This allows issuing asynchronous requests for devices and waiting
on the completion of the request. The requests can cascade into
multiple sub-requests.
Since IRQs may complete at any time, if the current process is no
longer the same that started the process, we need to swich the
paging context before accessing user buffers.
Change the PATA driver to use this model.
There are plenty of places in the kernel that aren't
checking if they actually got their allocation.
This fixes some of them, but definitely not all.
Fixes#3390Fixes#3391
Also, let's make find_one_free_page() return nullptr
if it doesn't get a free index. This stops the kernel
crashing when out of memory and allows memory purging
to take place again.
Fixes#3487
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.