Commit graph

257 commits

Author SHA1 Message Date
Andreas Kling
05156cac94 Kernel: Don't take MM lock in MemoryManager::enter_address_space()
We're not accessing any of the MM members here. Also remove some
redundant code to update CR3, since it calls activate_page_directory()
which does exactly the same thing.
2022-08-24 14:57:51 +02:00
Andreas Kling
2607a6a4bd Kernel: Update comment about what the MM lock protects 2022-08-24 14:57:51 +02:00
Andreas Kling
da24a937f5 Kernel: Don't wrap AddressSpace's RegionTree in SpinlockProtected
Now that AddressSpace itself is always SpinlockProtected, we don't
need to also wrap the RegionTree. Whoever has the AddressSpace locked
is free to poke around its tree.
2022-08-24 14:57:51 +02:00
Andreas Kling
d3e8eb5918 Kernel: Make file-backed memory regions remember description permissions
This allows sys$mprotect() to honor the original readable & writable
flags of the open file description as they were at the point we did the
original sys$mmap().

IIUC, this is what Dr. POSIX wants us to do:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html

Also, remove the bogus and racy "W^X" checking we did against mappings
based on their current inode metadata. If we want to do this, we can do
it properly. For now, it was not only racy, but also did blocking I/O
while holding a spinlock.
2022-08-24 14:57:51 +02:00
Andreas Kling
cf16b2c8e6 Kernel: Wrap process address spaces in SpinlockProtected
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.

Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as wlel.
2022-08-24 14:57:51 +02:00
Andreas Kling
d6ef18f587 Kernel: Don't hog the MM lock while unmapping regions
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.

Now that it's no longer necessary, we can leave the MM lock alone here.
2022-08-24 14:57:51 +02:00
Andreas Kling
dc9d2c1b10 Kernel: Wrap RegionTree objects in SpinlockProtected
This makes locking them much more straightforward, and we can remove
a bunch of confusing use of AddressSpace::m_lock. That lock will also
be converted to use of SpinlockProtected in a subsequent patch.
2022-08-24 14:57:51 +02:00
Andreas Kling
6cd3695761 Kernel: Stop taking MM lock while using regular quickmaps
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
2022-08-22 17:56:03 +02:00
Andreas Kling
c8375c51ff Kernel: Stop taking MM lock while using PD/PT quickmaps
This is no longer required as these quickmaps are now per-CPU. :^)
2022-08-22 17:56:03 +02:00
Andreas Kling
a838fdfd88 Kernel: Make the page table quickmaps per-CPU
While the "regular" quickmap (used to temporarily map a physical page
at a known address for quick access) has been per-CPU for a while,
we also have the PD (page directory) and PT (page table) quickmaps
used by the memory management code to edit page tables. These have been
global, which meant that SMP systems had to keep fighting over them.

This patch makes *all* quickmaps per-CPU. We reserve virtual addresses
for up to 64 CPUs worth of quickmaps for now.

Note that all quickmaps are still protected by the MM lock, and we'll
have to fix that too, before seeing any real throughput improvements.
2022-08-22 17:56:03 +02:00
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling
e475263113 AK+Kernel: Add AK::AtomicRefCounted and use everywhere in the kernel
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
2022-08-20 17:15:52 +02:00
kleines Filmröllchen
4314c25cf2 Kernel: Require lock rank for Spinlock construction
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to in the future remove LockRank and
actually annotate the ranks by searching for None.
2022-08-19 20:26:47 -07:00
Liav A
a1a1462a22 Kernel/Memory: Use scope guard to remove a region if we failed to map it 2022-08-19 15:26:04 +03:00
Andreas Kling
5ada38f9c3 Kernel: Reduce time under VMObject lock while handling zero faults
We only need to hold the VMObject lock while inspecting and/or updating
the physical page array in the VMObject.
2022-08-19 12:52:48 +02:00
Andreas Kling
a84d893af8 Kernel/x86: Re-enable interrupts ASAP when handling page faults
As soon as we've saved CR2 (the faulting address), we can re-enable
interrupt processing. This should make the kernel more responsive under
heavy fault loads.
2022-08-19 12:14:57 +02:00
Andreas Kling
4bc3745ce6 Kernel: Make Region's physical page accessors safer to use
Region::physical_page() now takes the VMObject lock while accessing the
physical pages array, and returns a RefPtr<PhysicalPage>. This ensures
that the array access is safe.

Region::physical_page_slot() now VERIFY()'s that the VMObject lock is
held by the caller. Since we're returning a reference to the physical
page slot in the VMObject's physical page array, this is the best we
can do here.
2022-08-18 19:20:33 +02:00
Andreas Kling
b560442fe1 Kernel: Don't hog VMObject lock when remapping a region page
We really only need the VMObject lock when accessing the physical pages
array, so once we have a strong pointer to the physical page we want to
remap, we can give up the VMObject lock.

This fixes a deadlock I encountered while building DOOM on SMP.
2022-08-18 18:56:35 +02:00
Andreas Kling
10399a258f Kernel: Move Region physical page accessors out of line 2022-08-18 18:52:34 +02:00
Andreas Kling
c14dda14c4 Kernel: Add a comment about what the MM lock protects 2022-08-18 18:52:34 +02:00
Andreas Kling
75348bdfd3 Kernel: Don't require MM lock for Region::set_page_directory()
The MM lock is not required for this, it's just a simple ref-counted
pointer assignment.
2022-08-18 18:52:34 +02:00
Andreas Kling
27c1135d30 Kernel: Don't remap all regions from Region::remap_vmobject_page()
When handling a page fault, we only need to remap the faulting region in
the current process. There's no need to traverse *all* regions that map
the same VMObject and remap them cross-process as well.

Those other regions will get remapped lazily by their own page fault
handlers eventually. Or maybe they won't and we avoided some work. :^)
2022-08-18 18:52:34 +02:00
Andreas Kling
45e6123de8 Kernel: Shorten time under spinlocks while handling inode faults
- Instead of holding the VMObject lock across physical page allocation
  and quick-map + copy, we now only hold it when updating the VMObject's
  physical page slot.
2022-08-18 18:52:34 +02:00
Mike Akers
de980de0e4 Kernel: Lock the inode before writing in SharedInodeVMObject::sync
We ensure that when we call SharedInodeVMObject::sync we lock the inode
lock before calling Inode virtual write_bytes method directly to avoid
assertion on the unlocked inode lock, as it was regressed recently. This
is not a complete fix as the need to lock from each path before calling
the write_bytes method should be avoided because it can lead to
hard-to-find bugs, and this commit only fixes the problem temporarily.
2022-08-16 16:54:03 +02:00
dylanbobb
8180211431 Kernel: Release 1 page instead of all pages when starved for pages
Previously, when starved for pages, *all* clean file-backed memory
would be released, which is quite excessive.

This patch instead releases just 1 page, since only 1 page is needed
to satisfy the request to `allocate_physical_page()`
2022-08-16 01:13:17 +02:00
dylanbobb
09d0fae2c6 Kernel: Allow release of a specific amount of of clean pages
Previously, we could only release *all* clean pages.
This patch makes it possible to release a specific amount of clean
pages. If the attempted number of pages to release is more than the
amount of clean pages, all clean pages will be released.
2022-08-16 01:13:17 +02:00
Idan Horowitz
4edae21bd1 Kernel: Remove regions from the region tree after failing to map them
At the point at which we try to map the Region it was already added to
the Process region tree, so we have to make sure to remove it before
freeing it in the mapping failure path, otherwise the tree will contain
a dangling pointer to the free'd instance.
2022-08-15 02:42:28 +02:00
Jorropo
ec4b83326b Kernel: Don't release file-pages if volatile memory purge did it 2022-08-15 00:11:33 +02:00
Andreas Kling
3c7b0dab0b Kernel: Dump list of processes and their memory usage when OOMing 2022-08-14 23:33:28 +02:00
Andreas Kling
9e9924115f Kernel: Release some clean file-backed memory when starved for pages
Until now, our only backup plan when running out of physical pages
was to try and purge volatile memory. If that didn't work out, we just
hung userspace out to dry with an ENOMEM.

This patch improves the situation by also considering clean, file-backed
pages (that we could page back in from disk).

This could be better in many ways, but it already allows us to boot to
WindowServer with 256 MiB of RAM. :^)
2022-08-14 23:33:28 +02:00
Andreas Kling
92556e07d3 Kernel: Update outdated "user physical pages" terminology
These are now just "physical pages".
2022-08-14 23:33:28 +02:00
Brian Gianforcaro
2d06f6399f Kernel: Fix SMP deadlock in MM::allocate_contiguous_physical_pages
This deadlock was introduced with the creation of this API. The lock
order is such that we always need to take the page directory lock
before we ever take the MM lock.

This function violated that, as both Region creation and region
destruction require the pd and mm locks, but with the mm lock
already acquired we deadlocked with SMP mode enabled while other
threads were allocating regions.

With this change SMP boots to the desktop successfully for me,
(and then subsequently has other issues). :^)
2022-08-09 12:09:59 +02:00
Hendiadyoin1
66fc06001d Kernel: Add some inline capacity to find_regions_intersecting
This should avoid some allocations during simple cases of munmap,
mprotect and msync, where you usually don't have a lot of regions anyway
2022-07-15 12:42:43 +02:00
Liav A
e4e5fa74d0 Kernel+Userland: Rename prefix of user_physical => physical
There's no such supervisor pages concept, so there's no need to call
physical pages with the "user_physical" prefix anymore.
2022-07-14 23:27:46 +02:00
Liav A
1c499e75bd Kernel+Userland: Remove supervisor pages concept
There's no real value in separating physical pages to supervisor and
user types, so let's remove the concept and just let everyone to use
"user" physical pages which can be allocated from any PhysicalRegion
we want to use. Later on, we will remove the "user" prefix as this
prefix is not needed anymore.
2022-07-14 23:27:46 +02:00
Liav A
37b4133c51 Kernel: Allocate user physical pages instead of supervisor ones for DMA
We are limited on the amount of supervisor pages we can allocate, so
don't allocate from that pool. Supervisor pages are always below 16 MiB
barrier so using those was crucial when we used devices like the ISA
SoundBlaster 16 card, because that device required very low physical
addresses to be used.
2022-07-14 13:15:24 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Idan Horowitz
8717e78918 Kernel: Stop committing pages for COW of uncommitted pages on sys$fork
Uncommitted pages (shared zero pages) can not contain any existing data
and can not be modified, so there's no point to committing a bunch of
extra pages to cover for them in the forked child.
2022-07-11 16:29:10 +02:00
Idan Horowitz
1d96c30488 Kernel: Stop leaking leftover committed cow pages from forked processes
Since both the parent process and child process hold a reference to the
COW committed set, once the child process exits, the committed COW
pages are effectively leaked, only being slowly re-claimed each time
the parent process writes to one of them, realizing it's no longer
shared, and uncommitting it.
In order to mitigate this we now hold a weak reference the parent
VMObject from which the pages are cloned, and we use it on destruction
when available to drop the reference to the committed set from it as
well.
2022-07-10 22:17:21 +03:00
Tim Schumacher
cedec9751a Kernel: Decrease the amount of address space offset randomization
This is basically unchanged since the beginning of 2020, which is a year
before we had proper ASLR.

Now that we have a proper ASLR implementation, we can turn this down a
bit, as it is no longer our only protection against predictable dynamic
loader addresses, and it actually obstructs the default loading address
of x86_64 quite frequently.
2022-06-21 22:38:15 +01:00
Andrew Kaster
1d3b5d330d Kernel: Tolerate cloning MAP_STACK regions that are PROT_NONE
There's nothing stopping a userspace program from keeping a bunch of
threads around with a custom signal stack in a suspended state with
their normal thread stack mprotected to PROT_NONE.

OpenJDK seems to do this, for example.
2022-06-19 09:05:35 +02:00
Liav A
3d22917548 Kernel/Memory: Introduce the SharedFramebufferVMObject class
This new type of VMObject will be used to coordinate switching safely
from graphical mode to text mode and vice-versa, by supplying a way to
remap all Regions that were created with this object, so mappings can be
changed according to the given state of system mode. This makes it quite
easy to give applications like WindowServer the feeling of having full
access to the framebuffer device from a DisplayConnector, but still keep
the Kernel in control to be able to safely switch to text console.
2022-06-06 20:11:05 +01:00
Idan Horowitz
ea1e5b630d Kernel: Verify system memory info consistency 2022-06-06 01:36:18 +03:00
Idan Horowitz
b4e45a6636 Kernel: Tighten assertion in MM::find_free_user_physical_page
If our book-keeping of user physical pages is correct, we should always
find a physical page, regardless if it was committed or uncommitted.
2022-06-06 01:36:18 +03:00
Idan Horowitz
427f1f7e31 Kernel: Only use uncommitted pages when allocating contiguous user pages 2022-06-06 01:36:18 +03:00
Timon Kruiper
a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Liav A
e6ebf9e5c1 Kernel/Memory: Add TypedMapping base_address method
This method will be used to ease usage with the structure when we need
to do virtual pointer arithmetics.
2022-05-05 20:55:57 +02:00
Timon Kruiper
feba7bc8a8 Kernel: Move Kernel/Arch/x86/SafeMem.h to Kernel/Arch/SafeMem.h
The file does not contain any specific architectural code, thus it can
be moved to the Kernel/Arch directory.
2022-05-03 21:53:36 +02:00
Tim Schumacher
71f2c342c4 Kernel: Limit free space between randomized memory allocations 2022-04-21 13:16:56 +02:00
Andreas Kling
0a83c03546 Kernel: Don't unregister Region from RegionTree *before* unmapping it
If we unregister from the RegionTree before unmapping, there's a race
where a new region can get inserted at the same address that we're about
to unmap. If this happens, ~Region() will then unmap the newly inserted
region, which now finds itself with cleared-out page table entries.
2022-04-05 13:46:50 +02:00