Commit graph

73 commits

Author SHA1 Message Date
Liav A
60b088b89a Kernel: Send SIGBUS to threads that use after valid Inode mmaped range
According to Dr. POSIX, we should allow to call mmap on inodes even on
ranges that currently don't map to any actual data. Trying to read or
write to those ranges should result in SIGBUS being sent to the thread
that did violating memory access.

To implement this restriction, we simply check if the result of
read_bytes on an Inode returns 0, which means we have nothing valid to
map to the program, hence it should receive a SIGBUS in that case.
2022-09-26 20:00:34 +03:00
Liav A
6e26e9fb29 Revert "Kernel: Send SIGBUS to threads that use after valid Inode mmaped range"
This reverts commit 0c675192c9.
2022-09-24 13:49:40 +02:00
Liav A
0c675192c9 Kernel: Send SIGBUS to threads that use after valid Inode mmaped range
According to Dr. POSIX, we should allow to call mmap on inodes even on
ranges that currently don't map to any actual data. Trying to read or
write to those ranges should result in SIGBUS being sent to the thread
that did violating memory access.
2022-09-16 14:55:45 +03:00
Andreas Kling
a3b2b20782 Kernel: Remove global MM lock in favor of SpinlockProtected
Globally shared MemoryManager state is now kept in a GlobalData struct
and wrapped in SpinlockProtected.

A small set of members are left outside the GlobalData struct as they
are only set during boot initialization, and then remain constant.
This allows us to access those members without taking any locks.
2022-08-26 01:04:51 +02:00
Andreas Kling
2c72d495a3 Kernel: Use RefPtr instead of LockRefPtr for PhysicalPage
I believe this to be safe, as the main thing that LockRefPtr provides
over RefPtr is safe copying from a shared LockRefPtr instance. I've
inspected the uses of RefPtr<PhysicalPage> and it seems they're all
guarded by external locking. Some of it is less obvious, but this is
an area where we're making continuous headway.
2022-08-24 18:35:41 +02:00
Andreas Kling
d3e8eb5918 Kernel: Make file-backed memory regions remember description permissions
This allows sys$mprotect() to honor the original readable & writable
flags of the open file description as they were at the point we did the
original sys$mmap().

IIUC, this is what Dr. POSIX wants us to do:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html

Also, remove the bogus and racy "W^X" checking we did against mappings
based on their current inode metadata. If we want to do this, we can do
it properly. For now, it was not only racy, but also did blocking I/O
while holding a spinlock.
2022-08-24 14:57:51 +02:00
Andreas Kling
d6ef18f587 Kernel: Don't hog the MM lock while unmapping regions
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.

Now that it's no longer necessary, we can leave the MM lock alone here.
2022-08-24 14:57:51 +02:00
Andreas Kling
6cd3695761 Kernel: Stop taking MM lock while using regular quickmaps
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
2022-08-22 17:56:03 +02:00
Andreas Kling
c8375c51ff Kernel: Stop taking MM lock while using PD/PT quickmaps
This is no longer required as these quickmaps are now per-CPU. :^)
2022-08-22 17:56:03 +02:00
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling
5ada38f9c3 Kernel: Reduce time under VMObject lock while handling zero faults
We only need to hold the VMObject lock while inspecting and/or updating
the physical page array in the VMObject.
2022-08-19 12:52:48 +02:00
Andreas Kling
a84d893af8 Kernel/x86: Re-enable interrupts ASAP when handling page faults
As soon as we've saved CR2 (the faulting address), we can re-enable
interrupt processing. This should make the kernel more responsive under
heavy fault loads.
2022-08-19 12:14:57 +02:00
Andreas Kling
4bc3745ce6 Kernel: Make Region's physical page accessors safer to use
Region::physical_page() now takes the VMObject lock while accessing the
physical pages array, and returns a RefPtr<PhysicalPage>. This ensures
that the array access is safe.

Region::physical_page_slot() now VERIFY()'s that the VMObject lock is
held by the caller. Since we're returning a reference to the physical
page slot in the VMObject's physical page array, this is the best we
can do here.
2022-08-18 19:20:33 +02:00
Andreas Kling
b560442fe1 Kernel: Don't hog VMObject lock when remapping a region page
We really only need the VMObject lock when accessing the physical pages
array, so once we have a strong pointer to the physical page we want to
remap, we can give up the VMObject lock.

This fixes a deadlock I encountered while building DOOM on SMP.
2022-08-18 18:56:35 +02:00
Andreas Kling
10399a258f Kernel: Move Region physical page accessors out of line 2022-08-18 18:52:34 +02:00
Andreas Kling
75348bdfd3 Kernel: Don't require MM lock for Region::set_page_directory()
The MM lock is not required for this, it's just a simple ref-counted
pointer assignment.
2022-08-18 18:52:34 +02:00
Andreas Kling
27c1135d30 Kernel: Don't remap all regions from Region::remap_vmobject_page()
When handling a page fault, we only need to remap the faulting region in
the current process. There's no need to traverse *all* regions that map
the same VMObject and remap them cross-process as well.

Those other regions will get remapped lazily by their own page fault
handlers eventually. Or maybe they won't and we avoided some work. :^)
2022-08-18 18:52:34 +02:00
Andreas Kling
45e6123de8 Kernel: Shorten time under spinlocks while handling inode faults
- Instead of holding the VMObject lock across physical page allocation
  and quick-map + copy, we now only hold it when updating the VMObject's
  physical page slot.
2022-08-18 18:52:34 +02:00
Liav A
e4e5fa74d0 Kernel+Userland: Rename prefix of user_physical => physical
There's no such supervisor pages concept, so there's no need to call
physical pages with the "user_physical" prefix anymore.
2022-07-14 23:27:46 +02:00
Andrew Kaster
1d3b5d330d Kernel: Tolerate cloning MAP_STACK regions that are PROT_NONE
There's nothing stopping a userspace program from keeping a bunch of
threads around with a custom signal stack in a suspended state with
their normal thread stack mprotected to PROT_NONE.

OpenJDK seems to do this, for example.
2022-06-19 09:05:35 +02:00
Andreas Kling
0a83c03546 Kernel: Don't unregister Region from RegionTree *before* unmapping it
If we unregister from the RegionTree before unmapping, there's a race
where a new region can get inserted at the same address that we're about
to unmap. If this happens, ~Region() will then unmap the newly inserted
region, which now finds itself with cleared-out page table entries.
2022-04-05 13:46:50 +02:00
Andreas Kling
e3e1d79a7d Kernel: Remove unused ShouldDeallocateVirtualRange parameters
Since there is no separate virtual range allocator anymore, this is
no longer used for anything.
2022-04-05 01:15:22 +02:00
Andreas Kling
9bb45ab3a6 Kernel: Add debug logging to learn more about unexpected NP faults 2022-04-04 17:10:30 +02:00
Andreas Kling
d1f2d63840 Kernel: Remove unused Region::try_create_kernel_only() 2022-04-04 12:34:13 +02:00
Andreas Kling
07f3d09c55 Kernel: Make VM allocation atomic for userspace regions
This patch move AddressSpace (the per-process memory manager) to using
the new atomic "place" APIs in RegionTree as well, just like we did for
MemoryManager in the previous commit.

This required updating quite a few places where VM allocation and
actually committing a Region object to the AddressSpace were separated
by other code.

All you have to do now is call into AddressSpace once and it'll take
care of everything for you.
2022-04-03 21:51:58 +02:00
Andreas Kling
e852a69a06 LibWeb: Make VM allocation atomic for kernel regions
Instead of first allocating the VM range, and then inserting a region
with that range into the MM region tree, we now do both things in a
single atomic operation:

    - RegionTree::place_anywhere(Region&, size, alignment)
    - RegionTree::place_specifically(Region&, address, size)

To reduce the number of things we do while locking the region tree,
we also require callers to provide a constructed Region object.
2022-04-03 21:51:58 +02:00
Andreas Kling
e8f543c390 Kernel: Use intrusive RegionTree solution for kernel regions as well
This patch ports MemoryManager to RegionTree as well. The biggest
difference between this and the userspace code is that kernel regions
are owned by extant OwnPtr<Region> objects spread around the kernel,
while userspace regions are owned by the AddressSpace itself.

For kernelspace, there are a couple of situations where we need to make
large VM reservations that never get backed by regular VMObjects
(for example the kernel image reservation, or the big kmalloc range.)
Since we can't make a VM reservation without a Region object anymore,
this patch adds a way to create unbacked Region objects that can be
used for this exact purpose. They have no internal VMObject.)
2022-04-03 21:51:58 +02:00
Andreas Kling
02a95a196f Kernel: Use AddressSpace region tree for range allocation
This patch stops using VirtualRangeAllocator in AddressSpace and instead
looks for holes in the region tree when allocating VM space.

There are many benefits:

- VirtualRangeAllocator is non-intrusive and would call kmalloc/kfree
  when used. This new solution is allocation-free. This was a source
  of unpleasant MM/kmalloc deadlocks.

- We consolidate authority on what the address space looks like in a
  single place. Previously, we had both the range allocator *and* the
  region tree both being used to determine if an address was valid.
  Now there is only the region tree.

- Deallocation of VM when splitting regions is no longer complicated,
  as we don't need to keep two separate trees in sync.
2022-04-03 21:51:58 +02:00
James Mintram
627fd231d5 Kernel: Make Region.cpp compile on aarch64 2022-04-02 19:34:20 -07:00
Idan Horowitz
8030e2a88f Kernel: Make AnonymousVMObject COW-Bitmap allocation OOM-fallible 2022-02-11 17:49:46 +02:00
Andreas Kling
d85f062990 Revert "Kernel: Only update page tables for faulting region"
This reverts commit 1c5ffaae41.

This broke shared memory as used by OutOfProcessWebView. Let's do
a revert until we can figure out what went wrong.
2022-02-02 11:02:54 +01:00
Andreas Kling
1c5ffaae41 Kernel: Only update page tables for faulting region
When a page fault led to the mapping of a new physical page, we were
updating the page tables for *every* region that shared the same
underlying VMObject.

Let's just not do that, avoiding a bunch of unnecessary page table
updates and TLB invalidations.
2022-02-02 02:16:49 +01:00
Andreas Kling
3845c90e08 Kernel: Remove unnecessary includes from Thread.h
...and deal with the fallout by adding missing includes everywhere.
2022-01-30 16:21:59 +01:00
Idan Horowitz
5146315a15 Kernel: Convert MemoryManager::allocate_user_physical_page to ErrorOr
This allows is to use the TRY macro at the call sites, instead of using
clunky null checks.
2022-01-28 19:05:52 +02:00
Tom
6e46e21c42 Kernel: Implement Page Attribute Table (PAT) support and Write-Combine
This allows us to enable Write-Combine on e.g. framebuffers,
significantly improving performance on bare metal.

To keep things simple we right now only use one of up to three bits
(bit 7 in the PTE), which maps to the PA4 entry in the PAT MSR, which
we set to the Write-Combine mode on each CPU at boot time.
2022-01-26 09:21:04 +02:00
Andreas Kling
094b88e6a5 Kernel: Don't remap already non-writable regions when they become CoW
The only purpose of the remap() in Region::try_clone() is to ensure
non-writable page table entries for CoW regions. If a region is already
non-writable, there's no need to waste time updating the page tables.
2022-01-15 19:51:16 +01:00
Andreas Kling
c6adefcfc0 Kernel: Don't bother with page tables for PROT_NONE mappings
When mapping or unmapping completely inaccessible memory regions,
we don't need to update the page tables at all. This saves a bunch of
time in some situations, most notably during dynamic linking, where we
make a large VM reservation and immediately throw it away. :^)
2022-01-15 19:51:16 +01:00
Andreas Kling
9ddccbc6da Kernel: Use move() in Region::try_clone() to avoid a VMObject::unref() 2022-01-15 19:51:15 +01:00
Andreas Kling
8e0387e674 Kernel: Only register kernel regions with MemoryManager
We were already only tracking kernel regions, this patch just makes it
more clear by having it reflected in the name of the registration
helpers.

We also stop calling them for userspace regions, avoiding some spinlock
action in such cases.
2022-01-15 19:51:15 +01:00
Andreas Kling
5f71925aa4 Kernel: Actually clear page slots in Region::clear_to_zero()
We were copying the RefPtr<PhysicalPage> and zeroing the copy instead
of zeroing the slot itself.
2022-01-12 14:52:47 +01:00
Andreas Kling
d8206c1059 Kernel: Don't release/relock spinlocks repeatedly during space teardown
Grab the page directory and MM locks once at the start of address space
teardown, then hold onto them across all the region unmapping work.
2022-01-12 14:52:47 +01:00
Andreas Kling
2323cdd914 Kernel: Do less unnecessary work when tearing down process address space
When deleting an entire AddressSpace, we don't need to do TLB flushes
at all (since the entire page directory is going away anyway).

We also don't need to deallocate VM ranges one by one, since the entire
VM range allocator will be deleted anyway.
2022-01-12 14:52:47 +01:00
Andreas Kling
bdbff9df24 Kernel: Don't relock MM lock for every page when remapping region
Make sure that callers already hold the MM lock, and we don't have to
worry about reacquiring it every time.
2022-01-10 16:22:37 +01:00
Hendiadyoin1
1cdace7898 Kernel: Add implied auto qualifiers in Memory 2022-01-09 23:29:57 -08:00
Idan Horowitz
5f4a67434c Kernel: Move userspace virtual address range base to 0x10000
Now that the shared bottom 2 MiB virtual address mappings are gone
userspace can use lower virtual addresses.
2021-12-22 00:02:36 -08:00
Daniel Bertalan
4fc28bfe02 Kernel: Unmap Prekernel pages after they are no longer needed
The Prekernel's memory is only accessed until MemoryManager has been
initialized. Keeping them around afterwards is both unnecessary and bad,
as it prevents the userland from using the 0x100000-0x155000 virtual
address range.

Co-authored-by: Idan Horowitz <idan.horowitz@gmail.com>
2021-12-22 00:02:36 -08:00
Idan Horowitz
ff6b43734c Kernel: Add Region::clear_to_zero
This helper method can be used to quickly and efficiently zero out a
region.
2021-12-01 21:44:11 +02:00
James Mintram
eb33df0c30 Kernel: Add an x86 include check+error in x86/PageFault.h 2021-12-01 11:22:04 -08:00
Andreas Kling
1f894cee59 Kernel: Automatically sync shared file mappings when unmapped
To make sure we don't lose changes, shared file mappings will now be
fully synced when they are unmapped, whether explicitly or implicitly
(by the program exiting/crashing/etc.)

This can incur a lot of work, since we don't keep track of dirty pages,
but that's something we can optimize down the road. :^)
2021-11-17 19:35:53 +01:00
Andreas Kling
79fa9765ca Kernel: Replace KResult and KResultOr<T> with Error and ErrorOr<T>
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.

Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
2021-11-08 01:10:53 +01:00