Commit graph

267 commits

Author SHA1 Message Date
Liav A
0c675192c9 Kernel: Send SIGBUS to threads that use after valid Inode mmaped range
According to Dr. POSIX, we should allow calling mmap on inodes even for
ranges that currently don't map to any actual data. Trying to read or
write those ranges should result in SIGBUS being sent to the thread
that performed the violating memory access.
2022-09-16 14:55:45 +03:00
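
A minimal userspace sketch of the behavior described above, assuming a hypothetical /tmp/short-file (illustrative POSIX code, not the kernel change itself):

```cpp
#include <csignal>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static void sigbus_handler(int)
{
    // The faulting access landed past the inode's valid range,
    // so the kernel delivered SIGBUS to this thread.
    _exit(0);
}

int main()
{
    signal(SIGBUS, sigbus_handler);

    int fd = open("/tmp/short-file", O_RDWR | O_CREAT | O_TRUNC, 0644); // hypothetical path
    if (fd < 0)
        return 1;
    ftruncate(fd, 16); // the inode only backs 16 bytes of the mapping

    long page_size = sysconf(_SC_PAGESIZE);
    auto* p = static_cast<unsigned char*>(
        mmap(nullptr, 2 * page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (p == MAP_FAILED)
        return 1;

    p[0] = 1;         // inside the file's data: fine
    p[page_size] = 1; // page entirely past the data: expect SIGBUS
    fprintf(stderr, "not reached\n");
    return 1;
}
```
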
Liav A
3ad0e1a1d5 Kernel: Handle mmap requests on zero-length data file inodes safely 2022-09-16 14:55:45 +03:00
Filiph Sandström
7e1e208d08 Kernel: Add basic aarch64 support to MemoryManager
FIXME: There's still a lot to do, for example porting `quickmap_page`.
This does however get us further into the boot process than before.
2022-09-12 00:56:44 +01:00
Idan Horowitz
12300b7d0b Kernel: Dump OOM debug info after releasing the MM global data lock
Otherwise we would be holding the MM global data lock and the Process
address space locks in the reverse order from the rest of the system, which
can lead to deadlocks.
2022-08-27 21:54:13 +03:00
Timon Kruiper
e8aff0c1c8 Kernel: Use InterruptsState in Spinlock code
This commit updates the lock function from Spinlock and
RecursiveSpinlock to return the InterruptsState of the processor,
instead of the processor flags. The unlock functions would only look at
the interrupt flag of the processor flags, so we now use the
InterruptsState enum to clarify the intent, and so that we can use the
same Spinlock code for the aarch64 build.

To not break the build, all the call sites are updated as well.
2022-08-26 12:51:57 +02:00
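
A rough sketch of the pattern this moves to; the helpers standing in for the per-architecture interrupt routines are assumptions, not the kernel's actual interfaces:

```cpp
#include <atomic>

enum class InterruptsState { Enabled, Disabled };

// Hypothetical stand-ins for the real per-architecture helpers.
static InterruptsState current_interrupts_state() { return InterruptsState::Enabled; }
static void disable_interrupts() { }
static void restore_interrupts(InterruptsState) { }

class Spinlock {
public:
    InterruptsState lock()
    {
        InterruptsState previous = current_interrupts_state();
        disable_interrupts();
        while (m_locked.exchange(true, std::memory_order_acquire)) { }
        return previous; // caller passes this back to unlock()
    }

    void unlock(InterruptsState previous)
    {
        m_locked.store(false, std::memory_order_release);
        restore_interrupts(previous);
    }

private:
    std::atomic<bool> m_locked { false };
};

// Call sites take the same shape on x86_64 and aarch64:
//   InterruptsState state = lock.lock();
//   /* critical section */
//   lock.unlock(state);
```
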
Andreas Kling
a3b2b20782 Kernel: Remove global MM lock in favor of SpinlockProtected
Globally shared MemoryManager state is now kept in a GlobalData struct
and wrapped in SpinlockProtected.

A small set of members are left outside the GlobalData struct as they
are only set during boot initialization, and then remain constant.
This allows us to access those members without taking any locks.
2022-08-26 01:04:51 +02:00
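
A minimal sketch of the SpinlockProtected pattern (std::mutex stands in for the kernel's Spinlock, and the GlobalData members are assumptions):

```cpp
#include <cstddef>
#include <mutex>

template<typename T>
class SpinlockProtected {
public:
    // The protected value is only reachable through with(), so every access
    // is forced to hold the lock for exactly as long as the callback runs.
    template<typename Callback>
    decltype(auto) with(Callback callback)
    {
        std::lock_guard guard(m_lock);
        return callback(m_value);
    }

private:
    T m_value {};
    std::mutex m_lock; // stand-in for the kernel Spinlock
};

// Shape of the MemoryManager change (member names are illustrative):
struct GlobalData {
    size_t committed_pages { 0 };
    // ... other globally shared MemoryManager state ...
};

static SpinlockProtected<GlobalData> s_global_data;

size_t committed_pages()
{
    return s_global_data.with([](GlobalData& data) { return data.committed_pages; });
}
```
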
Andreas Kling
2c72d495a3 Kernel: Use RefPtr instead of LockRefPtr for PhysicalPage
I believe this to be safe, as the main thing that LockRefPtr provides
over RefPtr is safe copying from a shared LockRefPtr instance. I've
inspected the uses of RefPtr<PhysicalPage> and it seems they're all
guarded by external locking. Some of it is less obvious, but this is
an area where we're making continuous headway.
2022-08-24 18:35:41 +02:00
Andreas Kling
5a804b9a1d Kernel: Make PhysicalPage::ref() use relaxed memory order
When incrementing a reference count, it should be sufficient to use
relaxed ordering. Note that unref() still uses acquire-release.
2022-08-24 18:35:41 +02:00
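
A small sketch of the ordering choice, assuming a simplified refcount (not the actual PhysicalPage internals): the increment only needs atomicity, while the decrement that can free the object keeps acquire-release so earlier writes are visible before teardown.

```cpp
#include <atomic>
#include <cstdint>

class RefCountedPage {
public:
    void ref()
    {
        // Relaxed is enough: we only need the increment itself to be atomic.
        m_ref_count.fetch_add(1, std::memory_order_relaxed);
    }

    void unref()
    {
        // Acquire-release so all writes to the object happen-before its destruction.
        if (m_ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete this;
    }

private:
    std::atomic<uint32_t> m_ref_count { 1 };
};
```
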
Andreas Kling
ac3ea277aa Kernel: Don't take MM lock in ~PageDirectory()
We don't need the MM lock to unregister a PageDirectory from the CR3
map. This is already protected by the CR3 map's own lock.
2022-08-24 14:57:51 +02:00
Andreas Kling
5beed613ca Kernel: Don't take MM lock in MemoryManager::dump_kernel_regions()
We have to hold the region tree lock while dumping its regions anyway,
and taking the MM lock here was unnecessary.
2022-08-24 14:57:51 +02:00
Andreas Kling
05156cac94 Kernel: Don't take MM lock in MemoryManager::enter_address_space()
We're not accessing any of the MM members here. Also remove some
redundant code to update CR3, since it calls activate_page_directory()
which does exactly the same thing.
2022-08-24 14:57:51 +02:00
Andreas Kling
2607a6a4bd Kernel: Update comment about what the MM lock protects 2022-08-24 14:57:51 +02:00
Andreas Kling
da24a937f5 Kernel: Don't wrap AddressSpace's RegionTree in SpinlockProtected
Now that AddressSpace itself is always SpinlockProtected, we don't
need to also wrap the RegionTree. Whoever has the AddressSpace locked
is free to poke around its tree.
2022-08-24 14:57:51 +02:00
Andreas Kling
d3e8eb5918 Kernel: Make file-backed memory regions remember description permissions
This allows sys$mprotect() to honor the original readable & writable
flags of the open file description as they were at the point we did the
original sys$mmap().

IIUC, this is what Dr. POSIX wants us to do:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html

Also, remove the bogus and racy "W^X" checking we did against mappings
based on their current inode metadata. If we want to do this, we can do
it properly. For now, it was not only racy, but also did blocking I/O
while holding a spinlock.
2022-08-24 14:57:51 +02:00
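
A rough sketch of the check this enables (the struct and field names are illustrative, not the kernel's actual Region interface): the region remembers the file description's readability and writability from mmap() time, and mprotect() refuses to grant more than that.

```cpp
#include <cerrno>
#include <sys/mman.h>

struct FileBackedRegion {
    bool description_was_readable { false }; // captured at sys$mmap() time
    bool description_was_writable { false };
};

// Returns 0 on success or a negative errno, kernel-style.
int validate_mprotect(FileBackedRegion const& region, int prot)
{
    if ((prot & PROT_READ) && !region.description_was_readable)
        return -EACCES;
    if ((prot & PROT_WRITE) && !region.description_was_writable)
        return -EACCES;
    return 0;
}
```
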
Andreas Kling
cf16b2c8e6 Kernel: Wrap process address spaces in SpinlockProtected
This forces anyone who wants to look into and/or manipulate an address
space to lock it. And this replaces the previous, more flimsy, manual
spinlock use.

Note that pointers *into* the address space are not safe to use after
you unlock the space. We've got many issues like this, and we'll have
to track those down as well.
2022-08-24 14:57:51 +02:00
Andreas Kling
d6ef18f587 Kernel: Don't hog the MM lock while unmapping regions
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.

Now that it's no longer necessary, we can leave the MM lock alone here.
2022-08-24 14:57:51 +02:00
Andreas Kling
dc9d2c1b10 Kernel: Wrap RegionTree objects in SpinlockProtected
This makes locking them much more straightforward, and we can remove
a bunch of confusing use of AddressSpace::m_lock. That lock will also
be converted to use of SpinlockProtected in a subsequent patch.
2022-08-24 14:57:51 +02:00
Andreas Kling
6cd3695761 Kernel: Stop taking MM lock while using regular quickmaps
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
2022-08-22 17:56:03 +02:00
Andreas Kling
c8375c51ff Kernel: Stop taking MM lock while using PD/PT quickmaps
This is no longer required as these quickmaps are now per-CPU. :^)
2022-08-22 17:56:03 +02:00
Andreas Kling
a838fdfd88 Kernel: Make the page table quickmaps per-CPU
While the "regular" quickmap (used to temporarily map a physical page
at a known address for quick access) has been per-CPU for a while,
we also have the PD (page directory) and PT (page table) quickmaps
used by the memory management code to edit page tables. These have been
global, which meant that SMP systems had to keep fighting over them.

This patch makes *all* quickmaps per-CPU. We reserve virtual addresses
for up to 64 CPUs worth of quickmaps for now.

Note that all quickmaps are still protected by the MM lock, and we'll
have to fix that too, before seeing any real throughput improvements.
2022-08-22 17:56:03 +02:00
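
A back-of-the-envelope sketch of the reservation: one quickmap slot per CPU, so page-table edits on different processors never fight over the same virtual address. The base address and constants are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

static constexpr size_t PAGE_SIZE = 4096;
static constexpr size_t MAX_CPUS = 64;                               // "up to 64 CPUs worth"
static constexpr uintptr_t QUICKMAP_BASE = 0xffff'ff00'0000'0000ULL; // hypothetical kernel VA

// One page per CPU for the page-directory quickmap...
constexpr uintptr_t quickmap_pd_address(size_t cpu_index)
{
    return QUICKMAP_BASE + cpu_index * PAGE_SIZE;
}

// ...followed by one page per CPU for the page-table quickmap.
constexpr uintptr_t quickmap_pt_address(size_t cpu_index)
{
    return QUICKMAP_BASE + (MAX_CPUS + cpu_index) * PAGE_SIZE;
}
```
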
Andreas Kling
11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling
e475263113 AK+Kernel: Add AK::AtomicRefCounted and use everywhere in the kernel
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
2022-08-20 17:15:52 +02:00
kleines Filmröllchen
4314c25cf2 Kernel: Require lock rank for Spinlock construction
All users that relied on the default constructor use a None lock rank
for now. This will make it easier in the future to remove LockRank and
actually annotate the ranks by searching for None.
2022-08-19 20:26:47 -07:00
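
A small sketch of what the new requirement looks like at a call site (the LockRank values shown are illustrative):

```cpp
enum class LockRank {
    None,
    // ... real ranks get annotated over time ...
};

class Spinlock {
public:
    explicit Spinlock(LockRank rank)
        : m_rank(rank)
    {
    }

private:
    LockRank m_rank;
};

// Before: Spinlock m_lock;                    // default constructor, no rank
// Now:    Spinlock m_lock { LockRank::None }; // the rank is always spelled out
static Spinlock g_example_lock { LockRank::None };
```
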
Liav A
a1a1462a22 Kernel/Memory: Use scope guard to remove a region if we failed to map it 2022-08-19 15:26:04 +03:00
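
A minimal sketch of the scope-guard pattern (AK's ArmedScopeGuard works along these lines; the region and tree names in the usage comment are stand-ins):

```cpp
#include <utility>

template<typename Callback>
class ArmedScopeGuard {
public:
    explicit ArmedScopeGuard(Callback callback)
        : m_callback(std::move(callback))
    {
    }
    ~ArmedScopeGuard()
    {
        if (m_armed)
            m_callback(); // still armed: the failure path runs the cleanup
    }
    void disarm() { m_armed = false; }

private:
    Callback m_callback;
    bool m_armed { true };
};

// Usage shape for the mapping path:
//
//   region_tree.insert(region);
//   ArmedScopeGuard remove_on_failure([&] { region_tree.remove(region); });
//   if (!region.map(page_directory))
//       return ENOMEM;           // guard fires; the region is removed again
//   remove_on_failure.disarm();  // mapping succeeded; keep the region
```
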
Andreas Kling
5ada38f9c3 Kernel: Reduce time under VMObject lock while handling zero faults
We only need to hold the VMObject lock while inspecting and/or updating
the physical page array in the VMObject.
2022-08-19 12:52:48 +02:00
Andreas Kling
a84d893af8 Kernel/x86: Re-enable interrupts ASAP when handling page faults
As soon as we've saved CR2 (the faulting address), we can re-enable
interrupt processing. This should make the kernel more responsive under
heavy fault loads.
2022-08-19 12:14:57 +02:00
Andreas Kling
4bc3745ce6 Kernel: Make Region's physical page accessors safer to use
Region::physical_page() now takes the VMObject lock while accessing the
physical pages array, and returns a RefPtr<PhysicalPage>. This ensures
that the array access is safe.

Region::physical_page_slot() now VERIFY()'s that the VMObject lock is
held by the caller. Since we're returning a reference to the physical
page slot in the VMObject's physical page array, this is the best we
can do here.
2022-08-18 19:20:33 +02:00
Andreas Kling
b560442fe1 Kernel: Don't hog VMObject lock when remapping a region page
We really only need the VMObject lock when accessing the physical pages
array, so once we have a strong pointer to the physical page we want to
remap, we can give up the VMObject lock.

This fixes a deadlock I encountered while building DOOM on SMP.
2022-08-18 18:56:35 +02:00
Andreas Kling
10399a258f Kernel: Move Region physical page accessors out of line 2022-08-18 18:52:34 +02:00
Andreas Kling
c14dda14c4 Kernel: Add a comment about what the MM lock protects 2022-08-18 18:52:34 +02:00
Andreas Kling
75348bdfd3 Kernel: Don't require MM lock for Region::set_page_directory()
The MM lock is not required for this, it's just a simple ref-counted
pointer assignment.
2022-08-18 18:52:34 +02:00
Andreas Kling
27c1135d30 Kernel: Don't remap all regions from Region::remap_vmobject_page()
When handling a page fault, we only need to remap the faulting region in
the current process. There's no need to traverse *all* regions that map
the same VMObject and remap them cross-process as well.

Those other regions will get remapped lazily by their own page fault
handlers eventually. Or maybe they won't and we avoided some work. :^)
2022-08-18 18:52:34 +02:00
Andreas Kling
45e6123de8 Kernel: Shorten time under spinlocks while handling inode faults
- Instead of holding the VMObject lock across physical page allocation
  and quick-map + copy, we now only hold it when updating the VMObject's
  physical page slot.
2022-08-18 18:52:34 +02:00
Mike Akers
de980de0e4 Kernel: Lock the inode before writing in SharedInodeVMObject::sync
We now ensure that when we call SharedInodeVMObject::sync, we take the
inode lock before calling the Inode's virtual write_bytes method
directly, to avoid hitting the assertion on the unlocked inode lock
(a recent regression). This is not a complete fix: the need to lock
from each path before calling write_bytes should be avoided, because it
can lead to hard-to-find bugs, and this commit only fixes the problem
temporarily.
2022-08-16 16:54:03 +02:00
dylanbobb
8180211431 Kernel: Release 1 page instead of all pages when starved for pages
Previously, when starved for pages, *all* clean file-backed memory
would be released, which is quite excessive.

This patch instead releases just 1 page, since only 1 page is needed
to satisfy the request to `allocate_physical_page()`.
2022-08-16 01:13:17 +02:00
dylanbobb
09d0fae2c6 Kernel: Allow release of a specific amount of clean pages
Previously, we could only release *all* clean pages.
This patch makes it possible to release a specific amount of clean
pages. If the attempted number of pages to release is more than the
amount of clean pages, all clean pages will be released.
2022-08-16 01:13:17 +02:00
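
A toy model of the new interface (the names are illustrative): callers ask for exactly the number of clean pages they need, and get fewer only if fewer are available.

```cpp
#include <algorithm>
#include <cstddef>

struct CleanPageCache {
    size_t clean_page_count { 0 };

    // Release up to `page_count` clean pages; if fewer are available, release them all.
    size_t release_clean_pages(size_t page_count)
    {
        size_t released = std::min(page_count, clean_page_count);
        clean_page_count -= released;
        return released;
    }
};

// The OOM path can now ask for a single page:
//   cache.release_clean_pages(1);
// instead of dropping every clean file-backed page at once.
```
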
Idan Horowitz
4edae21bd1 Kernel: Remove regions from the region tree after failing to map them
At the point at which we try to map the Region it was already added to
the Process region tree, so we have to make sure to remove it before
freeing it in the mapping failure path, otherwise the tree will contain
a dangling pointer to the freed instance.
2022-08-15 02:42:28 +02:00
Jorropo
ec4b83326b Kernel: Don't release file-pages if volatile memory purge did it 2022-08-15 00:11:33 +02:00
Andreas Kling
3c7b0dab0b Kernel: Dump list of processes and their memory usage when OOMing 2022-08-14 23:33:28 +02:00
Andreas Kling
9e9924115f Kernel: Release some clean file-backed memory when starved for pages
Until now, our only backup plan when running out of physical pages
was to try and purge volatile memory. If that didn't work out, we just
hung userspace out to dry with an ENOMEM.

This patch improves the situation by also considering clean, file-backed
pages (that we could page back in from disk).

This could be better in many ways, but it already allows us to boot to
WindowServer with 256 MiB of RAM. :^)
2022-08-14 23:33:28 +02:00
Andreas Kling
92556e07d3 Kernel: Update outdated "user physical pages" terminology
These are now just "physical pages".
2022-08-14 23:33:28 +02:00
Brian Gianforcaro
2d06f6399f Kernel: Fix SMP deadlock in MM::allocate_contiguous_physical_pages
This deadlock was introduced with the creation of this API. The lock
order is such that we always need to take the page directory lock
before we ever take the MM lock.

This function violated that: both Region creation and region
destruction require the PD and MM locks, but with the MM lock already
acquired, we deadlocked with SMP mode enabled while other threads were
allocating regions.

With this change, SMP boots to the desktop successfully for me
(and then subsequently has other issues). :^)
2022-08-09 12:09:59 +02:00
Hendiadyoin1
66fc06001d Kernel: Add some inline capacity to find_regions_intersecting
This should avoid some allocations during simple cases of munmap,
mprotect and msync, where you usually don't have a lot of regions
anyway.
2022-07-15 12:42:43 +02:00
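
A sketch (within the SerenityOS tree) of what an inline capacity buys here; the element count of 4 and the function shape are assumptions:

```cpp
#include <AK/Vector.h>

struct Region;

// AK::Vector's second template parameter reserves storage inside the object
// itself, so collecting a handful of intersecting regions during munmap,
// mprotect or msync needs no heap allocation.
Vector<Region*, 4> collect_intersecting_regions()
{
    Vector<Region*, 4> regions;
    // regions.append(...) for each intersecting region; up to 4 entries stay inline.
    return regions;
}
```
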
Liav A
e4e5fa74d0 Kernel+Userland: Rename prefix of user_physical => physical
The supervisor pages concept no longer exists, so there's no need to
refer to physical pages with the "user_physical" prefix anymore.
2022-07-14 23:27:46 +02:00
Liav A
1c499e75bd Kernel+Userland: Remove supervisor pages concept
There's no real value in separating physical pages into supervisor and
user types, so let's remove the concept and just let everyone use
"user" physical pages, which can be allocated from any PhysicalRegion
we want to use. Later on, we will remove the "user" prefix, as it is
no longer needed.
2022-07-14 23:27:46 +02:00
Liav A
37b4133c51 Kernel: Allocate user physical pages instead of supervisor ones for DMA
We are limited in the number of supervisor pages we can allocate, so
don't allocate from that pool. Supervisor pages always sit below the
16 MiB barrier, so using them was crucial when we used devices like the
ISA SoundBlaster 16 card, because that device required very low
physical addresses.
2022-07-14 13:15:24 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
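
An illustration of the pattern (within the SerenityOS tree; std::string_view's ""sv works the same way in standard C++):

```cpp
#include <AK/StringView.h>

// Before: relies on StringView(char const*), which has to call __builtin_strlen.
StringView greeting_old { "Well, hello friends!" };

// After: operator""sv carries the length directly, computed at compile time.
StringView greeting_new = "Well, hello friends!"sv;
```
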
Idan Horowitz
8717e78918 Kernel: Stop committing pages for COW of uncommitted pages on sys$fork
Uncommitted pages (shared zero pages) cannot contain any existing data
and cannot be modified, so there's no point in committing a bunch of
extra pages to cover for them in the forked child.
2022-07-11 16:29:10 +02:00
Idan Horowitz
1d96c30488 Kernel: Stop leaking leftover committed cow pages from forked processes
Since both the parent process and the child process hold a reference
to the COW committed set, once the child process exits, the committed
COW pages are effectively leaked, only being slowly reclaimed each time
the parent process writes to one of them, realizes it's no longer
shared, and uncommits it.

To mitigate this, we now hold a weak reference to the parent VMObject
from which the pages are cloned, and on destruction we use it, when
available, to drop the reference to the committed set from it as well.
2022-07-10 22:17:21 +03:00
Tim Schumacher
cedec9751a Kernel: Decrease the amount of address space offset randomization
This is basically unchanged since the beginning of 2020, which is a year
before we had proper ASLR.

Now that we have a proper ASLR implementation, we can turn this down a
bit, as it is no longer our only protection against predictable dynamic
loader addresses, and it actually obstructs the default load address
on x86_64 quite frequently.
2022-06-21 22:38:15 +01:00