Commit graph

7407 commits

Author SHA1 Message Date
demostanis
c56cbf8027 CMake: Quote all CMAKE_COMMAND occurences
Building might fail if the cmake command path contains
whitespace. See https://stackoverflow.com/a/35853080.
2022-09-02 23:34:47 +01:00
Tim Schumacher
5e11a512d6 Kernel: Buffer an entire region when generating coredumps
This allows us to unlock the region tree lock early, to avoid keeping
the lock while we are doing IO.
2022-08-31 16:28:47 +02:00
Tim Schumacher
32a03cffeb Kernel: Work using copies of specific region data during a coredump
This limits our interaction with the "real" region tree (and therefore
its lock) to the time where we actually read from the user address
space.
2022-08-31 16:28:47 +02:00
Liav A
2c84466ad8 Kernel/Storage: Introduce new boot device addressing modes
Before of this patch, we supported two methods to address a boot device:
1. Specifying root=/dev/hdXY, where X is a-z letter which corresponds to
a boot device, and Y as number from 1 to 16, to indicate the partition
number, which can be omitted to instruct the kernel to use a raw device
rather than a partition on a raw device.
2. Specifying root=PARTUUID: with a GUID string of a GUID partition. In
case of existing storage device with GPT partitions, this is most likely
the safest option to ensure booting from persistent storage.

While option 2 is more advanced and reliable, the first option has 2
caveats:
1. The string prefix "/dev/hd" doesn't mean anything beside a convention
on Linux installations, that was taken into use in Serenity. In Serenity
we don't mount DevTmpFS before we mount the boot device on /, so the
kernel doesn't really access /dev anyway, so this convention is only a
big misleading relic that can easily make the user to assume we access
/dev early on boot.
2. This convention although resemble the simple linux convention, is
quite limited in specifying a correct boot device across hardware setup
changes, so option 2 was recommended to ensure the system is always
bootable.

With these caveats in mind, this commit tries to fix the problem with
adding more addressing options as well as to remove the first option
being mentioned above of addressing.
To sum it up, there are 4 addressing options:
1. Hardware relative address - Each instance of StorageController is
assigned with a index number relative to the type of hardware it handles
which makes it possible to address storage devices with a prefix of the
commandset ("ata" for ATA, "nvme" for NVMe, "ramdisk" for Plain memory),
and then the number for the parent controller relative hardware index,
another number LUN target_id, and a third number for LUN disk_id.
2. LUN address - Similar to the previous option, but instead we rely on
the parent controller absolute index for the first number.
3. Block device major and minor numbers - by specifying the major and
minor numbers, the kernel can simply try to get the corresponding block
device and use it as the boot device.
4. GUID string, in the same fashion like before, so the user use the
"PARTUUID:" string prefix and add the GUID of the GPT partition.

For the new address modes 1 and 2, the user can choose to also specify a
partition out of the selected boot device. To do that, the user needs to
append the semicolon character and then add the string "partX" where X
is to be changed for the partition number. We start counting from 0, and
therefore the first partition number is 0 and not 1 in the kernel boot
argument.
2022-08-30 00:50:15 +01:00
b14ckcat
550b3c7330 Kernel/USB: Rework UHCI interrupt transfer schedule
This reworks the way the UHCI schedule is set up to handle interrupt
transfers, creating 11 queue heads each assigned a different
period/latency, so that interrupt transfers can be linked into the
schedule with their specified period more easily.
2022-08-28 13:40:07 +02:00
b14ckcat
4a3a0ac19e Kernel/USB: Rework queued transfer schedule
Modifies the way the UHCI schedule is set up & modified to allow for
multiple transfers of the same type, from one or more devices, to be
queued up and handled simultaneously.
2022-08-28 13:40:07 +02:00
Idan Horowitz
12300b7d0b Kernel: Dump OOM debug info after releasing the MM global data lock
Otherwise we would be holding the MM global data lock and the Process
address space locks in reversed order to the rest of the system, which
can lead to deadlocks.
2022-08-27 21:54:13 +03:00
Idan Horowitz
4ce326205e Kernel: Stop verifying interrupts are disabled in Process::for_each
This is a left-over from back when we didn't have any locking on the
global Process list, nor did we have SMP support, so this acted as some
kind of locking mechanism. We now have proper locks around the Process
list, so this is no longer relevant.
2022-08-27 21:54:13 +03:00
Skye Sprung
eca85f2050 Kernel: Changed serial debug functions and variables for readability
Change the name of set_serial_debug(bool on_or_off) to
set_serial_debug_enabled(bool desired_state). This is to make the names
more expressive and less unclear as to what the function does, as it
only sets the enabled state.
Likewise, change the name of get_serial_debug() to
is_serial_debug_enabled() in order to make clear from the name that
this is simply the state of s_serial_debug_enabled.
Change the name of serial_debug to s_serial_debug_enabled since this is
a static bool describing this state.
Finally, change the signature of set_serial_debug_enabled to return a
bool, as this is more logical and understandable.
2022-08-27 19:42:52 +01:00
Tim Schumacher
8ba6e96d05 Kernel: Reorganize and colorize the scheduler thread list dump 2022-08-26 13:07:07 +02:00
Tim Schumacher
2bf5052608 Kernel: Show more (b)locking info when dumping the process list 2022-08-26 13:07:07 +02:00
Timon Kruiper
47af812a23 Kernel/aarch64: Implement VERIFY_INTERRUPTS_{ENABLED, DISABLED}
This requires to stub `Process::all_instances()`.
2022-08-26 12:51:57 +02:00
Timon Kruiper
026f37b031 Kernel: Move Spinlock functions back to arch independent Locking folder
Now that the Spinlock code is not dependent on architectural specific
code anymore, we can move it back to the Locking folder. This also means
that the Spinlock implemenation is now used for the aarch64 kernel.
2022-08-26 12:51:57 +02:00
Timon Kruiper
c9118de5a6 Kernel/aarch64: Implement critical section related functions 2022-08-26 12:51:57 +02:00
Timon Kruiper
e8aff0c1c8 Kernel: Use InterruptsState in Spinlock code
This commit updates the lock function from Spinlock and
RecursiveSpinlock to return the InterruptsState of the processor,
instead of the processor flags. The unlock functions would only look at
the interrupt flag of the processor flags, so we now use the
InterruptsState enum to clarify the intent, and such that we can use the
same Spinlock code for the aarch64 build.

To not break the build, all the call sites are updated aswell.
2022-08-26 12:51:57 +02:00
Timon Kruiper
6432f3eee8 Kernel: Add enum InterruptsState and helper functions
This commit adds the concept of an InterruptsState to the kernel. This
will be used to make the Spinlock code architecture independent. A new
Processor.cpp file is added such that we don't have to duplicate the
code.
2022-08-26 12:51:57 +02:00
Timon Kruiper
c1b0d254fd Kernel/aarch64: Add stubs for Mutex::lock and Mutex::unlock
Also add the g_scheduler_lock to the Dummy.cpp. This makes the aarch64
build succeeded again.
2022-08-26 12:51:57 +02:00
Andreas Kling
a3b2b20782 Kernel: Remove global MM lock in favor of SpinlockProtected
Globally shared MemoryManager state is now kept in a GlobalData struct
and wrapped in SpinlockProtected.

A small set of members are left outside the GlobalData struct as they
are only set during boot initialization, and then remain constant.
This allows us to access those members without taking any locks.
2022-08-26 01:04:51 +02:00
Andreas Kling
2c72d495a3 Kernel: Use RefPtr instead of LockRefPtr for PhysicalPage
I believe this to be safe, as the main thing that LockRefPtr provides
over RefPtr is safe copying from a shared LockRefPtr instance. I've
inspected the uses of RefPtr<PhysicalPage> and it seems they're all
guarded by external locking. Some of it is less obvious, but this is
an area where we're making continuous headway.
2022-08-24 18:35:41 +02:00
Andreas Kling
5a804b9a1d Kernel: Make PhysicalPage::ref() use relaxed memory order
When incrementing a reference count, it should be sufficient to use
relaxed ordering. Note that unref() still uses acquire-release.
2022-08-24 18:35:41 +02:00
Andreas Kling
0dd88fd836 Kernel: Remove unnecessary forward declaration of s_mm_lock 2022-08-24 14:57:51 +02:00
Andreas Kling
ac3ea277aa Kernel: Don't take MM lock in ~PageDirectory()
We don't need the MM lock to unregister a PageDirectory from the CR3
map. This is already protected by the CR3 map's own lock.
2022-08-24 14:57:51 +02:00
Andreas Kling
5beed613ca Kernel: Don't take MM lock in MemoryManager::dump_kernel_regions()
We have to hold the region tree lock while dumping its regions anyway,
and taking the MM lock here was unnecessary.
2022-08-24 14:57:51 +02:00
Andreas Kling
05156cac94 Kernel: Don't take MM lock in MemoryManager::enter_address_space()
We're not accessing any of the MM members here. Also remove some
redundant code to update CR3, since it calls activate_page_directory()
which does exactly the same thing.
2022-08-24 14:57:51 +02:00
Andreas Kling
2607a6a4bd Kernel: Update comment about what the MM lock protects 2022-08-24 14:57:51 +02:00
Andreas Kling
da24a937f5 Kernel: Don't wrap AddressSpace's RegionTree in SpinlockProtected
Now that AddressSpace itself is always SpinlockProtected, we don't
need to also wrap the RegionTree. Whoever has the AddressSpace locked
is free to poke around its tree.
2022-08-24 14:57:51 +02:00
Andreas Kling
d3e8eb5918 Kernel: Make file-backed memory regions remember description permissions
This allows sys$mprotect() to honor the original readable & writable
flags of the open file description as they were at the point we did the
original sys$mmap().

IIUC, this is what Dr. POSIX wants us to do:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html

Also, remove the bogus and racy "W^X" checking we did against mappings
based on their current inode metadata. If we want to do this, we can do
it properly. For now, it was not only racy, but also did blocking I/O
while holding a spinlock.
2022-08-24 14:57:51 +02:00
Andreas Kling
30861daa93 Kernel: Simplify the File memory-mapping API
Before this change, we had File::mmap() which did all the work of
setting up a VMObject, and then creating a Region in the current
process's address space.

This patch simplifies the interface by removing the region part.
Files now only have to return a suitable VMObject from
vmobject_for_mmap(), and then sys$mmap() itself will take care of
actually mapping it into the address space.

This fixes an issue where we'd try to block on I/O (for inode metadata
lookup) while holding the address space spinlock. It also reduces time
spent holding the address space lock.
2022-08-24 14:57:51 +02:00
Andreas Kling
cf16b2c8e6 Kernel: Wrap process address spaces in SpinlockProtected
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.

Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as wlel.
2022-08-24 14:57:51 +02:00
Andreas Kling
d6ef18f587 Kernel: Don't hog the MM lock while unmapping regions
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.

Now that it's no longer necessary, we can leave the MM lock alone here.
2022-08-24 14:57:51 +02:00
Andreas Kling
dc9d2c1b10 Kernel: Wrap RegionTree objects in SpinlockProtected
This makes locking them much more straightforward, and we can remove
a bunch of confusing use of AddressSpace::m_lock. That lock will also
be converted to use of SpinlockProtected in a subsequent patch.
2022-08-24 14:57:51 +02:00
James Bellamy
9c1ee8cbd1 Kernel: Remove big lock from sys$socket
With the implementation of the credentials object the socket syscall no
longer needs the big lock.
2022-08-23 20:29:50 +02:00
Timon Kruiper
d62bd3c635 Kernel/aarch64: Properly initialize T0SZ and T1SZ fields in TCR_EL1
By default these 2 fields were zero, which made it rely on
implementation defined behavior whether these fields internally would be
set to the correct value. The ARM processor in the Raspberry PI (and
QEMU 6.x) would actually fixup these values, whereas QEMU 7.x now does
not do that anymore, and a translation fault would be generated instead.

For more context see the relevant QEMU issue:
 - https://gitlab.com/qemu-project/qemu/-/issues/1157

Fixes #14856
2022-08-23 09:23:27 -04:00
Samuel Bowman
91574ed677 Kernel: Fix boot profiling
Boot profiling was previously broken due to init_stage2() passing the
event mask to sys$profiling_enable() via kernel pointer, but a user
pointer is expected.

To fix this, I added Process::profiling_enable() as an alternative to
Process::sys$profiling_enable which takes a u64 rather than a
Userspace<u64 const*>. It's a bit of a hack, but it works.
2022-08-23 11:48:50 +02:00
Anthony Iacono
ec3d8a7a18 Kernel: Remove unused Process::in_group() 2022-08-23 01:01:48 +02:00
Andreas Kling
434d77cd43 Kernel/ProcFS: Silently ignore attempts to update ProcFS timestamps
We have to override Inode::update_timestamps() for ProcFS inodes,
otherwise we'll get the default behavior of erroring with ENOTIMPL.
2022-08-23 01:00:40 +02:00
Andreas Kling
5307e1bf01 Kernel/SysFS: Silently ignore attempts to update SysFS timestamps
We have to override Inode::update_timestamps() for SysFS inodes,
otherwise we'll get the default behavior of erroring with ENOTIMPL.
2022-08-23 00:55:41 +02:00
Andreas Kling
4c081e0479 Kernel/x86: Protect the CR3->PD map with a spinlock
This can be accessed from multiple CPUs at the same time, so relying on
the interrupt flag is clearly insufficient.
2022-08-22 17:56:03 +02:00
Andreas Kling
6cd3695761 Kernel: Stop taking MM lock while using regular quickmaps
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
2022-08-22 17:56:03 +02:00
Andreas Kling
c8375c51ff Kernel: Stop taking MM lock while using PD/PT quickmaps
This is no longer required as these quickmaps are now per-CPU. :^)
2022-08-22 17:56:03 +02:00
Andreas Kling
a838fdfd88 Kernel: Make the page table quickmaps per-CPU
While the "regular" quickmap (used to temporarily map a physical page
at a known address for quick access) has been per-CPU for a while,
we also have the PD (page directory) and PT (page table) quickmaps
used by the memory management code to edit page tables. These have been
global, which meant that SMP systems had to keep fighting over them.

This patch makes *all* quickmaps per-CPU. We reserve virtual addresses
for up to 64 CPUs worth of quickmaps for now.

Note that all quickmaps are still protected by the MM lock, and we'll
have to fix that too, before seeing any real throughput improvements.
2022-08-22 17:56:03 +02:00
Andreas Kling
930dedfbd8 Kernel: Make sys$utime() and sys$utimensat() not take the big lock 2022-08-22 17:56:03 +02:00
Andreas Kling
280694bb46 Kernel: Update atime/ctime/mtime timestamps atomically
Instead of having three separate APIs (one for each timestamp),
there's now only Inode::update_timestamps() and it takes 3x optional
timestamps. The non-empty timestamps are updated while holding the inode
mutex, and the outside world no longer has to look at intermediate
timestamp states.
2022-08-22 17:56:03 +02:00
Andreas Kling
35b2e9c663 Kernel: Make sys$mknod() not take the big lock 2022-08-22 17:56:03 +02:00
Anthony Iacono
f86b671de2 Kernel: Use Process::credentials() and remove user ID/group ID helpers
Move away from using the group ID/user ID helpers in the process to
allow for us to take advantage of the immutable credentials instead.
2022-08-22 12:46:32 +02:00
Andreas Kling
42435ce5e4 Kernel: Make sys$recvfrom() with MSG_DONTWAIT not so racy
Instead of temporary changing the open file description's "blocking"
flag while doing a non-waiting recvfrom, we instead plumb the currently
wanted blocking behavior all the way through to the underlying socket.
2022-08-21 16:45:42 +02:00
Andreas Kling
8997c6a4d1 Kernel: Make Socket::connect() take credentials as input 2022-08-21 16:35:03 +02:00
Andreas Kling
51318d51a4 Kernel: Make Socket::bind() take credentials as input 2022-08-21 16:33:09 +02:00
Andreas Kling
8d0bd3f225 Kernel: Make LocalSocket do chown/chmod through VFS
This ensures that all the permissions checks are made against the
provided credentials. Previously we were just calling through directly
to the inode setters, which did no security checks!
2022-08-21 16:22:34 +02:00
Andreas Kling
dbe182f1c6 Kernel: Make Inode::resolve_as_link() take credentials as input 2022-08-21 16:17:13 +02:00