There are now 2 separate classes for almost the same object type:
- EnumerableDeviceIdentifier, which is used in the enumeration code for
all PCI host controller classes. This is allowed to be moved and
copied, as it doesn't support ref-counting.
- DeviceIdentifier, which inherits from EnumerableDeviceIdentifier. This
class uses ref-counting, and is not allowed to be copied. It has a
spinlock member in its structure to allow safely executing complicated
IO sequences on a PCI device and its space configuration.
There's a static method that allows a quick conversion from
EnumerableDeviceIdentifier to DeviceIdentifier while creating a
NonnullRefPtr out of it.
The reason for doing this is for the sake of integrity and reliablity of
the system in 2 places:
- Ensure that "complicated" tasks that rely on manipulating PCI device
registers are done in a safe manner. For example, determining a PCI
BAR space size requires multiple read and writes to the same register,
and if another CPU tries to do something else with our selected
register, then the result will be a catastrophe.
- Allow the PCI API to have a united form around a shared object which
actually holds much more data than the PCI::Address structure. This is
fundamental if we want to do certain types of optimizations, and be
able to support more features of the PCI bus in the foreseeable
future.
This patch already has several implications:
- All PCI::Device(s) hold a reference to a DeviceIdentifier structure
being given originally from the PCI::Access singleton. This means that
all instances of DeviceIdentifier structures are located in one place,
and all references are pointing to that location. This ensures that
locking the operation spinlock will take effect in all the appropriate
places.
- We no longer support adding PCI host controllers and then immediately
allow for enumerating it with a lambda function. It was found that
this method is extremely broken and too much complicated to work
reliably with the new paradigm being introduced in this patch. This
means that for Volume Management Devices (Intel VMD devices), we
simply first enumerate the PCI bus for such devices in the storage
code, and if we find a device, we attach it in the PCI::Access method
which will scan for devices behind that bridge and will add new
DeviceIdentifier(s) objects to its internal Vector. Afterwards, we
just continue as usual with scanning for actual storage controllers,
so we will find a corresponding NVMe controllers if there were any
behind that VMD bridge.
This header has always been fundamentally a Kernel API file. Move it
where it belongs. Include it directly in Kernel files, and make
Userland applications include it via sys/ioctl.h rather than directly.
Instead of just returning nothing, let's return Error or nothing.
This would help later on with error propagation in case of failure
during this method.
This also makes us more paranoid about failure in this method, so when
initializing a DisplayConnector we safely tear down the internal members
of the object. This applies the same for a StorageDevice object, but its
after_inserting method is much smaller compared to the DisplayConnector
overriden method.
We really don't want callers of this function to accidentally change
the jail, or even worse - remove the Process from an attached jail.
To ensure this never happens, we can just declare this method as const
so nobody can mutate it this way.
The setting of scan code set sequence is removed, as it's buggy and
could lead the controller to fail immediately when doing self-test
afterwards. We will restore it when we understand how to do so safely.
Allow the user to determine a preferred detection path with a new kernel
command line argument. The defualt option is to check i8042 presence
with an ACPI check and if necessary - an "aggressive" test to determine
i8042 existence in the system.
Also, keep the i8042 controller pointer on the stack, so don't assign
m_i8042_controller member pointer if it does not exist.
A virtual method named device_name() was added to
Kernel::PCI to support logging the PCI::Device name
and address using dmesgln_pci. Previously, PCI::Device
did not store the device name.
All devices inheriting from PCI::Device now use dmesgln_pci where
they previously used dmesgln.
These instances were detected by searching for files that include
AK/Memory.h, but don't match the regex:
\\b(fast_u32_copy|fast_u32_fill|secure_zero|timing_safe_compare)\\b
This regex is pessimistic, so there might be more files that don't
actually use any memory function.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
These instances were detected by searching for files that include
AK/StdLibExtras.h, but don't match the regex:
\\b(abs|AK_REPLACED_STD_NAMESPACE|array_size|ceil_div|clamp|exchange|for
ward|is_constant_evaluated|is_power_of_two|max|min|mix|move|_RawPtr|RawP
tr|round_up_to_power_of_two|swap|to_underlying)\\b
(Without the linebreaks.)
This regex is pessimistic, so there might be more files that don't
actually use any "extra stdlib" functions.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
obtain any lock rank just via the template parameter. It was not
previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
wanted in the first place (or made use of). Locks randomly changing
their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
taken in the right order (with the right lock rank checking
implementation) as rank information is fully statically known.
This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
From now on, we don't allow jailed processes to open all device nodes in
/dev, but only allow jailed processes to open /dev/full, /dev/zero,
/dev/null, and various TTY and PTY devices (and not including virtual
consoles) so we basically restrict applications to what they can do when
they are in jail.
The motivation for this type of restriction is to ensure that even if a
remote code execution occurred, the damage that can be done is very
small.
We also don't restrict reading and writing on device nodes that were
already opened, because that limit seems not useful, especially in the
case where we do want to provide an OpenFileDescription to such device
but nothing further than that.
The ProcFS is an utter mess currently, so let's start move things that
are not related to processes-info. To ensure it's done in a sane manner,
we start by duplicating all /proc/ global nodes to the /sys/kernel/
directory, then we will move Userland to use the new directory so the
old directory nodes can be removed from the /proc directory.
This class is intended to replace all IOAddress usages in the Kernel
codebase altogether. The idea is to ensure IO can be done in
arch-specific manner that is determined mostly in compile-time, but to
still be able to use most of the Kernel code in non-x86 builds. Specific
devices that rely on x86-specific IO instructions are already placed in
the Arch/x86 directory and are omitted for non-x86 builds.
The reason this works so well is the fact that x86 IO space acts in a
similar fashion to the traditional memory space being available in most
CPU architectures - the x86 IO space is essentially just an array of
bytes like the physical memory address space, but requires x86 IO
instructions to load and store data. Therefore, many devices allow host
software to interact with the hardware registers in both ways, with a
noticeable trend even in the modern x86 hardware to move away from the
old x86 IO space to exclusively using memory-mapped IO.
Therefore, the IOWindow class encapsulates both methods for x86 builds.
The idea is to allow PCI devices to be used in either way in x86 builds,
so when trying to map an IOWindow on a PCI BAR, the Kernel will try to
find the proper method being declared with the PCI BAR flags.
For old PCI hardware on non-x86 builds this might turn into a problem as
we can't use port mapped IO, so the Kernel will gracefully fail with
ENOTSUP error code if that's the case, as there's really nothing we can
do within such case.
For general IO, the read{8,16,32} and write{8,16,32} methods are
available as a convenient API for other places in the Kernel. There are
simply no direct 64-bit IO API methods yet, as it's not needed right now
and is not considered to be Arch-agnostic too - the x86 IO space doesn't
support generating 64 bit cycle on IO bus and instead requires two 2
32-bit accesses. If for whatever reason it appears to be necessary to do
IO in such manner, it could probably be added with some neat tricks to
do so. It is recommended to use Memory::TypedMapping struct if direct 64
bit IO is actually needed.
The i8042 controller with its attached devices, the PS2 keyboard and
mouse, rely on x86-specific IO instructions to work. Therefore, move
them to the Arch/x86 directory to make it easier to omit the handling
code of these devices.
The AHCI code doesn't rely on x86 IO at all as it only uses memory
mapped IO so we can simply remove the header.
We also simply don't use x86 IO in the Intel graphics driver, so we can
simply remove the include of the x86 IO header there too.
Everything else was a bunch of stale includes to the x86 IO header and
are actually not necessary, so let's remove them to make it easier to
compile non-x86 Kernel builds.
The VMWare backdoor handling code involves many x86-specific
instructions and therefore should be in the Arch/x86 directory. This
ensures we can easily omit the code in compile-time for non-x86 builds.
Only use the Bochs debug output if we compile a x86 build since bochs
debug output relies on x86 specific instructions.
We also remove the CONSOLE_OUT_TO_BOCHS_DEBUG_PORT flag as we always
compile bochs debug output for x86 builds and we always want to include
the bochs debug output capability as it is very handy and doesn't hurt
bare metal hardware or do any other problem besides taking a small
amount of CPU cycles.
Many code patterns and hardware procedures rely on reliable delay in the
microseconds granularity, and since they are using such delays which are
valid cases, but should not rely on x86 specific code, we allow to
determine in compile time the proper platform-specific code to use to
invoke such delays.
Before this change, we had File::mmap() which did all the work of
setting up a VMObject, and then creating a Region in the current
process's address space.
This patch simplifies the interface by removing the region part.
Files now only have to return a suitable VMObject from
vmobject_for_mmap(), and then sys$mmap() itself will take care of
actually mapping it into the address space.
This fixes an issue where we'd try to block on I/O (for inode metadata
lookup) while holding the address space spinlock. It also reduces time
spent holding the address space lock.
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.
Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as wlel.
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to in the future remove LockRank and
actually annotate the ranks by searching for None.
It is starting to get a little messy with how each device can try to add
or remove itself to either /sys/dev/block or /sys/dev/char directories.
To better do this, we introduce 4 virtual methods to take care of that,
so until we ensure all nodes in /sys/dev/block and /sys/dev/char are
actual symlinks, we allow the Device base class to call virtual methods
upon insertion or before being destroying, so it add itself elegantly to
either of these directories or remove itself when needed.
For special cases where we need to create symlinks, we have two virtual
methods to be called otherwise to do almost the same thing mentioned
before, but to use symlinks instead.
This change in fact does the following:
1. Use support for symlinks between /sys/dev/block/ storage device
identifier nodes and devices in /sys/devices/storage/{LUN}.
2. Add basic nodes in a /sys/devices/storage/{LUN} directory, to let
userspace to know about the device and its details.
These methods are essentially splitted from the after_inserting method
and the will_be_destroyed method so later on we can allow Storage
devices to override the after_inserting method and the will_be_destroyed
method while still being able to use shared functionality as before,
such as adding the device to and removing it from the device list.
This folder in the SysFS code represents everything related to /sys/dev,
which is a directory meant to be a convenient interface to track all IDs
of all block and character devices (ID = major:minor numbers).
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
When the size of the audio data was not a multiple of a page size,
subtracting the page size from this unsigned variable would underflow it
close to 2^32 and be clamped to the page size again. This would lead to
writes into garbage addresses because of an incorrect write size,
interestingly only causing the write() call to error out.
Using saturating math neatly fixes this problem and allows buffer
lengths that are not a multiple of a page size.
In a previous commit I moved everything into the new subdirectories in
FileSystem/SysFS directory without trying to actually make changes in
the code itself too much. Now it's time to split the code to make it
more readable and understandable, hence this change occurs now.