Commit graph

57 commits

Author SHA1 Message Date
Andreas Kling
32ec1e5aed Kernel: Mask kernel addresses in backtraces and profiles
Addresses outside the userspace virtual range will now show up as
0xdeadc0de in backtraces and profiles generated by unprivileged users.
2020-01-02 20:51:31 +01:00
Andreas Kling
3dcec260ed Kernel: Validate the full range of user memory passed to syscalls
We now validate the full range of userspace memory passed into syscalls
instead of just checking that the first and last byte of the memory are
in process-owned regions.

This fixes an issue where it was possible to avoid rejection of invalid
addresses that sat between two valid ones, simply by passing a valid
address and a size large enough to put the end of the range at another
valid address.

I added a little test utility that tries to provoke EFAULT in various
ways to help verify this. I'm sure we can think of more ways to test
this but it's at least a start. :^)

Thanks to mozjag for pointing out that this code was still lacking!

Incidentally this also makes backtraces work again.

Fixes #989.
2020-01-02 02:17:12 +01:00
Andreas Kling
5aeaab601e Kernel: Move CPU feature detection to Arch/x86/CPU.{cpp.h}
We now refuse to boot on machines that don't support PAE since all
of our paging code depends on it.

Also let's only enable SSE and PGE support if the CPU advertises it.
2020-01-01 12:57:00 +01:00
Andreas Kling
8602fa5b49 Kernel: Enable x86 SMEP (Supervisor Mode Execution Protection)
This prevents the kernel from jumping to code in userspace memory.
2020-01-01 01:59:52 +01:00
Conrad Pankoff
17aef7dc99 Kernel: Detect support for no-execute (NX) CPU features
Previously we assumed all hosts would have support for IA32_EFER.NXE.
This is mostly true for newer hardware, but older hardware will crash
and burn if you try to use this feature.

Now we check for support via CPUID.80000001[20].
2019-12-26 10:05:51 +01:00
Andreas Kling
9e55bcb7da Kernel: Make kernel memory regions be non-executable by default
From now on, you'll have to request executable memory specifically
if you want some.
2019-12-25 22:41:34 +01:00
Andreas Kling
52deb09382 Kernel: Enable PAE (Physical Address Extension)
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.

A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.

This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
2019-12-25 13:35:57 +01:00
Andreas Kling
b6ee8a2c8d Kernel: Rename vmo => vmobject everywhere 2019-12-19 19:15:27 +01:00
Andreas Kling
0a75a46501 Kernel: Make sure the kernel info page is read-only for userspace
To enforce this, we create two separate mappings of the same underlying
physical page. A writable mapping for the kernel, and a read-only one
for userspace (the one returned by sys$get_kernel_info_page.)
2019-12-15 22:21:28 +01:00
Andreas Kling
65229a4082 Kernel: Move VMObject::for_each_region() to MemoryManager.h
It can't be in VMObject.h since it depends on MemoryManager.h
2019-12-09 20:06:03 +01:00
Andreas Kling
e56daf547c Kernel: Disallow syscalls from writeable memory
Processes will now crash with SIGSEGV if they attempt making a syscall
from PROT_WRITE memory.

This neat idea comes from OpenBSD. :^)
2019-11-29 16:30:05 +01:00
Andreas Kling
9a157b5e81 Revert "Kernel: Move Kernel mapping to 0xc0000000"
This reverts commit bd33c66273.

This broke the network card drivers, since they depended on kmalloc
addresses being identity-mapped.
2019-11-23 17:27:09 +01:00
Jesse Buhagiar
bd33c66273 Kernel: Move Kernel mapping to 0xc0000000
The kernel is now no longer identity mapped to the bottom 8MiB of
memory, and is now mapped at the higher address of `0xc0000000`.

The lower ~1MiB of memory (from GRUB's mmap), however is still
identity mapped to provide an easy way for the kernel to get
physical pages for things such as DMA etc. These could later be
mapped to the higher address too, as I'm not too sure how to
go about doing this elegantly without a lot of address subtractions.
2019-11-22 16:23:23 +01:00
Andreas Kling
794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Liav A
bce510bf6f Kernel: Fix the search method of free userspace physical pages (#742)
Now the userspace page allocator will search through physical regions,
and stop the search as it finds an available page.

Also remove an "address of" sign since we don't need that when
counting size of physical regions
2019-11-08 22:39:29 +01:00
supercomputer7
c3c905aa6c Kernel: Removing hardcoded offsets from Memory Manager
Now the kernel page directory and the page tables are located at a
safe address, to prevent from paging data colliding with garbage.
2019-11-08 17:38:23 +01:00
Andreas Kling
d67c6a92db Kernel: Move page fault handling from MemoryManager to Region
After the page fault handler has found the region in which the fault
occurred, do the rest of the work in the region itself.

This patch also makes all fault types consistently crash the process
if a new page is needed but we're all out of pages.
2019-11-04 00:47:03 +01:00
Andreas Kling
9b2dc36229 Kernel: Merge MemoryManager::map_region_at_address() into Region::map() 2019-11-04 00:05:57 +01:00
Andreas Kling
4bf1a72d21 Kernel: Teach Region how to remap itself
Now remapping (i.e flushing kernel metadata to the CPU page tables)
is done by simply calling Region::remap().
2019-11-03 21:11:08 +01:00
Andreas Kling
2cfc43c982 Kernel: Move region map/unmap operations into the Region class
The more Region can take care of itself, the better.
2019-11-03 21:11:08 +01:00
Andreas Kling
fe455c5ac4 Kernel: Move page remapping into Region::remap_page(index)
Let Region deal with this, instead of everyone calling MemoryManager.
2019-11-03 15:32:11 +01:00
Tom
00a7c48d6e APIC: Enable APIC and start APs 2019-10-16 19:14:02 +02:00
Andreas Kling
2584636d19 Kernel: Fix partial munmap() deallocating still-in-use VM
We were always returning the full VM range of the partially-unmapped
Region to the range allocator. This caused us to re-use those addresses
for subsequent VM allocations.

This patch also skips creating a new VMObject in partial munmap().
Instead we just make split regions that point into the same VMObject.

This fixes the mysterious GCC ICE on large C++ programs.
2019-09-27 20:21:52 +02:00
Andreas Kling
7f9a33dba1 Kernel: Make Region single-owner instead of ref-counted
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.

Kernel Regions are kept in various OwnPtr<Regions>'s.

Regions now only ever get unmapped when they are destroyed.
2019-09-27 14:25:42 +02:00
Andreas Kling
a40afc4562 Kernel: Get rid of MemoryManager::allocate_page_table()
We can just use the physical page allocator directly, there's no need
for a dedicated function for page tables.
2019-09-15 20:34:03 +02:00
Andreas Kling
73fdbba59c AK: Rename <AK/AKString.h> to <AK/String.h>
This was a workaround to be able to build on case-insensitive file
systems where it might get confused about <string.h> vs <String.h>.

Let's just not support building that way, so String.h can have an
objectively nicer name. :^)
2019-09-06 15:36:54 +02:00
Andreas Kling
9104d32341 Kernel: Use range-for with InlineLinkedList 2019-08-08 13:40:58 +02:00
Andreas Kling
07425580a8 Kernel: Put all Regions on InlineLinkedLists (separated by user/kernel)
Remove the global hash tables and replace them with InlineLinkedLists.
This significantly reduces the kernel heap pressure from doing many
small mmap()'s.
2019-08-08 11:11:22 +02:00
Andreas Kling
a96d76fd90 Kernel: Put all VMObjects in an InlineLinkedList instead of a HashTable
Using a HashTable to track "all instances of Foo" is only useful if we
actually need to look up entries by some kind of index. And since they
are HashTable (not HashMap), the pointer *is* the index.

Since we have the pointer, we can just use it directly. Duh.
This increase sizeof(VMObject) by two pointers, but removes a global
table that had an entry for every VMObject, where the cost was higher.
It also avoids all the general hash tabling business when creating or
destroying VMObjects. Generally we should do more of this. :^)
2019-08-08 11:11:22 +02:00
Andreas Kling
98ce498922 Kernel: Remove unused MemoryManager::remove_identity_mapping()
This was not actually used and just sitting there being confusing.
2019-08-07 22:13:10 +02:00
Andreas Kling
c973a51a23 Kernel: Make KBuffer lazily populated
KBuffers are now zero-filled on demand instead of up front. This means
that you can create a huge KBuffer and it will only take up VM, not
physical pages (until you access them.)
2019-08-06 15:06:31 +02:00
Andreas Kling
2ad963d261 Kernel: Add mapping from page directory base (PDB) to PageDirectory
This allows the page fault code to find the owning PageDirectory and
corresponding process for faulting regions.

The mapping is implemented as a global hash map right now, which is
definitely not optimal. We can come up with something better when it
becomes necessary.
2019-08-06 11:30:26 +02:00
Andreas Kling
8d07bce12a Kernel: Break region_from_vaddr() into {user,kernel}_region_from_vaddr
Sometimes you're only interested in either user OR kernel regions but
not both. Let's break this into two functions so the caller can choose
what he's interested in.
2019-08-06 10:33:22 +02:00
Andreas Kling
79e22acb22 Kernel: Use KBuffers for ProcFS and SynthFS
Instead of generating ByteBuffers and keeping those lying around, have
these filesystems generate KBuffers instead. These are way less spooky
to leave around for a while.

Since FileDescription will keep a generated file buffer around until
userspace has read the whole thing, this prevents trivially exhausting
the kmalloc heap by opening many files in /proc for example.

The code responsible for generating each /proc file is not perfectly
efficient and many of them still use ByteBuffers internally but they
at least go away when we return now. :^)
2019-08-05 11:37:48 +02:00
Andreas Kling
f8beb0f665 Kernel: Share the "return to ring 0/3 from signal" trampolines globally.
Generate a special page containing the "return from signal" trampoline code
on startup and then route signalled threads to it. This avoids a page
allocation in every process that ever receives a signal.
2019-07-19 17:01:16 +02:00
Andreas Kling
5b2447a27b Kernel: Track user accessibility per Region.
Region now has is_user_accessible(), which informs the memory manager how
to map these pages. Previously, we were just passing a "bool user_allowed"
to various functions and I'm not at all sure that any of that was correct.

All the Region constructors are now hidden, and you must go through one of
these helpers to construct a region:

- Region::create_user_accessible(...)
- Region::create_kernel_only(...)

That ensures that we don't accidentally create a Region without specifying
user accessibility. :^)
2019-07-19 16:11:52 +02:00
Andreas Kling
3dac1f8ac5 Kernel: Remove use of [[gnu::pure]].
I was messing around with this to tell the compiler that these functions
always return the same value no matter how many times you call them.

It doesn't really seem to improve code generation and it looks weird so
let's just get rid of it.
2019-07-16 13:44:41 +02:00
Andreas Kling
eca5c2bdf8 Kernel: Move VirtualAddress.h into VM/ 2019-07-09 15:04:45 +02:00
Andreas Kling
27f699ef0c AK: Rename the common integer typedefs to make it obvious what they are.
These types can be picked up by including <AK/Types.h>:

* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
2019-07-03 21:20:13 +02:00
Andreas Kling
601b0a8c68 Kernel: Use NonnullRefPtrVector in parts of the kernel. 2019-06-27 13:35:02 +02:00
Andreas Kling
183205d51c Kernel: Make the x86 paging code slightly less insane.
Instead of PDE's and PTE's being weird wrappers around dword*, just have
MemoryManager::ensure_pte() return a PageDirectoryEntry&, which in turn has
a PageTableEntry* entries().

I've been trying to understand how things ended up this way, and I suspect
it was because I inadvertently invoked the PageDirectoryEntry copy ctor in
the original work on this, which must have made me very confused..

Anyways, now things are a bit saner and we can move forward towards a better
future, etc. :^)
2019-06-26 21:45:56 +02:00
Andreas Kling
d343fb2429 AK: Rename Retainable.h => RefCounted.h. 2019-06-21 18:58:45 +02:00
Andreas Kling
550b0b062b AK: Rename RetainPtr.h => RefPtr.h, Retained.h => NonnullRefPtr.h. 2019-06-21 18:45:59 +02:00
Andreas Kling
90b1354688 AK: Rename RetainPtr => RefPtr and Retained => NonnullRefPtr. 2019-06-21 18:37:47 +02:00
Sergey Bugaev
118cb391dd VM: Pass a PhysicalPage by rvalue reference when returning it to the freelist.
This makes no functional difference, but it makes it clear that
MemoryManager and PhysicalRegion take over the actual physical
page represented by this PhysicalPage instance.
2019-06-14 16:14:49 +02:00
Conrad Pankoff
aee9317d86 Kernel: Refactor MemoryManager to use a Bitmap rather than a Vector
This significantly reduces the pressure on the kernel heap when
allocating a lot of pages.

Previously at about 250MB allocated, the free page list would outgrow
the kernel's heap. Given that there is no longer a page list, this does
not happen.

The next barrier will be the kernel memory used by the page records for
in-use memory. This kicks in at about 1GB.
2019-06-12 15:38:17 +02:00
Andreas Kling
9da62f52a1 Kernel: Use the Multiboot memory map info to inform our paging setup.
This makes it possible to run Serenity with more than 64 MB of RAM.
Because each physical page is represented by a PhysicalPage object, and such
objects are allocated using kmalloc_eternal(), more RAM means more pressure
on kmalloc_eternal(), so we're gonna need a better strategy for this.

But for now, let's just celebrate that we can use the 128 MB of RAM we've
been telling QEMU to run with. :^)
2019-06-09 11:48:58 +02:00
Andreas Kling
736092a087 Kernel: Move i386.{cpp,h} => Arch/i386/CPU.{cpp,h}
There's a ton of work that would need to be done before we could spin up on
another architecture, but let's at least try to separate things out a bit.
2019-06-07 20:02:01 +02:00
Andreas Kling
39d1a9ae66 Meta: Tweak .clang-format to not wrap braces after enums. 2019-06-07 17:13:23 +02:00
Andreas Kling
e42c3b4fd7 Kernel: Rename LinearAddress => VirtualAddress. 2019-06-07 12:56:50 +02:00