Commit graph

3483 commits

Author SHA1 Message Date
Andreas Kling
14493645e0 Kernel: Make Region::amount_shared() and amount_resident() lazy-aware
Don't count the lazy-committed page towards shared/resident amounts.
2021-01-02 00:47:55 +01:00
Tom
a1904b67e9 Kernel: Fix dirty page map bitmap
We also need to check against the new lazy allocation page
when generating the dirty page bitmap.
2021-01-02 00:10:21 +01:00
Tom
e87eaf5df0 Kernel: Fix memory corruption when rolling back regions in execve
We need to free the regions before reverting the paging scope to the
original one when rolling back changes due to an error. This fixes
silent memory corruption.
2021-01-01 23:43:44 +01:00
Tom
2f429bd2d5 Kernel: Pass new region owner to Region::clone 2021-01-01 23:43:44 +01:00
Tom
a0c91719d8 Kernel: Restore thread count if thread cannot be fully created 2021-01-01 23:43:44 +01:00
Tom
bf9be3ec01 Kernel: More gracefully handle out-of-memory when creating PageDirectory 2021-01-01 23:43:44 +01:00
Tom
ae956edf6e Kernel: Improve some low-memory situations with ext2 2021-01-01 23:43:44 +01:00
Tom
476f17b3f1 Kernel: Merge PurgeableVMObject into AnonymousVMObject
This implements memory commitments and lazy-allocation of committed
memory.
2021-01-01 23:43:44 +01:00
Tom
b2a52f6208 Kernel: Implement lazy committed page allocation
By designating a committed page pool we can guarantee to have physical
pages available for lazy allocation in mappings. However, when forking
we will overcommit. The assumption is that worst-case it's better for
the fork to die due to insufficient physical memory on COW access than
the parent that created the region. If a fork wants to ensure that all
memory is available (trigger a commit) then it can use madvise.

This also means that fork now can gracefully fail if we don't have
enough physical pages available.
2021-01-01 23:43:44 +01:00
Tom
e21cc4cff6 Kernel: Remove MAP_PURGEABLE from mmap
This brings mmap more in line with other operating systems. Prior to
this, it was impossible to request memory that was definitely committed,
instead MAP_PURGEABLE would provide a region that was not actually
purgeable, but also not fully committed, which meant that using such memory
still could cause crashes when the underlying pages could no longer be
allocated.

This fixes some random crashes in low-memory situations where non-volatile
memory is mapped (e.g. malloc, tls, Gfx::Bitmap, etc) but when a page in
these regions is first accessed, there is insufficient physical memory
available to commit a new page.
2021-01-01 23:43:44 +01:00
Tom
c3451899bc Kernel: Add MAP_NORESERVE support to mmap
Rather than lazily committing regions by default, we now commit
the entire region unless MAP_NORESERVE is specified.

This solves random crashes in low-memory situations where e.g. the
malloc heap allocated memory, but using pages that haven't been
used before triggers a crash when no more physical memory is available.

Use this flag to create large regions without actually committing
the backing memory. madvise() can be used to commit arbitrary areas
of such regions after creating them.
2021-01-01 23:43:44 +01:00
Tom
bc5d6992a4 Kernel: Memory purging improvements
This adds the ability for a Region to define volatile/nonvolatile
areas within mapped memory using madvise(). This also means that
memory purging takes into account all views of the PurgeableVMObject
and only purges memory that is not needed by all of them. When calling
madvise() to change an area to nonvolatile memory, return whether
memory from that area was purged. At that time also try to remap
all memory that is requested to be nonvolatile, and if insufficient
pages are available notify the caller of that fact.
2021-01-01 23:43:44 +01:00
Liav A
9dc8bea3e7 Kernel: Allow to boot from a partition with partition UUID
Instead of specifying the boot argument to be root=/dev/hdXY, now
one can write root=PARTUUID= with the right UUID, and if the partition
is found, the kernel will boot from it.

This feature is mainly used with GUID partitions, and is considered to
be the most reliable way for the kernel to identify partitions.
2021-01-01 22:59:48 +01:00
Andreas Kling
7c3b6b10e4 Kernel: Remove the limited use of AK::TypeTraits we had in the kernel
This was only used for VMObject and we can do without it there. This is
preparation for migrating to dynamic_cast-based helpers in userspace.
2021-01-01 15:32:44 +01:00
Andrew Kaster
350d4d3543 Meta: Enable RTTI for Userspace programs
RTTI is still disabled for the Kernel, and for the Dynamic Loader. This
allows for much less awkward navigation of class heirarchies in LibCore,
LibGUI, LibWeb, and LibJS (eventually). Measured RootFS size increase
was < 1%, and libgui.so binary size was ~3.3%. The small binary size
increase here seems worth it :^)
2021-01-01 14:45:09 +01:00
Brian Gianforcaro
ab6ee9f7b2 CMake: Remove some trailing whitespace from a few CMakeLists.txt files 2021-01-01 14:37:04 +01:00
Andrew Kaster
a3a9016701 DynamicLoader: Tell the linker to not add a PT_INTERP header
Use the GNU LD option --no-dynamic-linker. This allows uncommenting some
code in the Kernel that gets upset if your ELF interpreter has its own
interpreter.
2021-01-01 02:12:28 +01:00
Linus Groh
bbe787a0af Everywhere: Re-format with clang-format-11
Compared to version 10 this fixes a bunch of formatting issues, mostly
around structs/classes with attributes like [[gnu::packed]], and
incorrect insertion of spaces in parameter types ("T &"/"T &&").
I also removed a bunch of // clang-format off/on and FIXME comments that
are no longer relevant - on the other hand it tried to destroy a couple of
neatly formatted comments, so I had to add some as well.
2020-12-31 21:51:00 +01:00
Tom
72440d90fe Kernel: Fix BlockCondition::unblock return value
BlockCondition::unblock should return true if it unblocked at
least one thread, not if iterating the blockers had been stopped.
This is a regression introduced by 49a76164c.

Fixes #4670
2020-12-31 10:52:58 +01:00
Tom
82c4812730 Kernel: Remove flawed SharedInodeVMObject assertion
This assertion cannot be safely/reliably made in the
~SharedInodeVMObject destructor. The problem is that
Inode::is_shared_vmobject holds a weak reference to the instance
that is being destroyed (ref count 0). Checking the pointer using
WeakPtr::unsafe_ptr will produce nullptr depending on timing in
this case, and WeakPtr::safe_ref will reliably produce a nullptr
as soon as the reference count drops to 0. The only case where
this assertion could succeed is when WeakPtr::unsafe_ptr returned
the pointer because it won the race against revoking it. And
because WeakPtr::safe_ref will always return a nullptr, we cannot
reliably assert this from the ~SharedInodeVMObject destructor.

Fixes #4621
2020-12-31 10:52:45 +01:00
Andreas Kling
1fdd39ff14 Kernel: Sprinkle some lockers in Inode
It did look pretty suspicious the way we were accessing members in some
of these functions without taking the lock first.
2020-12-31 02:10:31 +01:00
Luke
0f66589007 Everywhere: Fix more typos 2020-12-31 01:47:41 +01:00
Tom
22250780ff Kernel: Fix heap expansions deadlock
If a heap expansion is triggered by allocating from e.g. the
RangeAllocator, which may be holding a spin lock, we cannot
immediately allocate another block of backup memory, which could
require the same locks to be acquired. So, defer allocating the
backup memory

Fixes #4675
2020-12-31 01:15:37 +01:00
asynts
7e62ffbc6e AK+Format: Remove TypeErasedFormatParams& from format function. 2020-12-30 20:33:53 +01:00
Luke
865f5ed4f6 Kernel: Prevent sign bit extension when creating a PDPTE
When doing the cast to u64 on the page directory physical address,
the sign bit was being extended. This only beomes an issue when
crossing the 2 GiB boundary. At >= 2 GiB, the physical address
has the sign bit set. For example, 0x80000000.

This set all the reserved bits in the PDPTE, causing a GPF
when loading the PDPT pointer into CR3. The reserved bits are
presumably there to stop you writing out a physical address that
the CPU physically cannot handle, as the size of the reserved bits
is determined by the physical address width of the CPU.

This fixes this by casting to FlatPtr instead. I believe the sign
extension only happens when casting to a bigger type. I'm also using
FlatPtr because it's a pointer we're writing into the PDPTE.
sizeof(FlatPtr) will always be the same size as sizeof(void*).

This also now asserts that the physical address in the PDPTE is
within the max physical address the CPU supports. This is better
than getting a GPF, because CPU::handle_crash tries to do the same
operation that caused the GPF in the first place. That would cause
an infinite loop of GPFs until the stack was exhausted, causing a
triple fault.

As far as I know and tested, I believe we can now use the full 32-bit
physical range without crashing.

Fixes #4584. See that issue for the full debugging story.
2020-12-30 20:33:15 +01:00
Linus Groh
d84b96bddc Kernel: Embed a Metadata notes entry in coredumps 2020-12-30 16:28:27 +01:00
Linus Groh
91332515a6 Kernel: Add sys$set_coredump_metadata() syscall
This can be used by applications to store information (key/value pairs)
likely useful for debugging, which will then be embedded in the coredump.
2020-12-30 16:28:27 +01:00
Linus Groh
6fe6e0a36a Kernel: Embed a ProcessInfo notes entry in coredumps 2020-12-30 15:00:17 +01:00
Tom
49a76164c8 Kernel: Consolidate the various BlockCondition::unblock variants
The unblock_all variant used to ASSERT if a blocker didn't unblock,
but it wasn't clear from the name that it would do that. Because
the BlockCondition already asserts that no blockers are left at
destruction time, it would still catch blockers that haven't been
unblocked for whatever reason.

Fixes #4496
2020-12-30 13:23:17 +01:00
asynts
50d24e4f98 AK: Make binary_search signature more generic. 2020-12-30 02:13:30 +01:00
Tom
c2332780ee Kernel: Fix HPET::update_time to set ticks within the valid range
ticks_this_second must be less than the ticks per second (frequency).
2020-12-30 02:11:06 +01:00
meme
23b23cee5a Build: Support non-i686 toolchains
* Add SERENITY_ARCH option to CMake for selecting the target toolchain
* Port all build scripts but continue to use i686
* Update GitHub Actions cache to include BuildIt.sh
2020-12-29 17:42:04 +01:00
Andreas Kling
af28a8ad11 Kernel: Hold InodeVMObject reference while inspecting it in sys$mmap() 2020-12-29 15:43:35 +01:00
Andreas Kling
b8db585a83 Kernel: Remove unnecessary non-const Inode::shared_vmobject() 2020-12-29 15:43:35 +01:00
Andreas Kling
30dbe9c78a Kernel+LibC: Add a very limited sys$mremap() implementation
This syscall can currently only remap a shared file-backed mapping into
a private file-backed mapping.
2020-12-29 02:20:43 +01:00
Luke
b980782343 Kernel/VM: Make local_offset in PhysicalRegion::find_one_free_page unsigned
An extension to #4613, as I didn't notice that it also happens here.
2020-12-29 02:20:26 +01:00
Luke
eb38fe4a82 Kernel/VM: Make local_offset in PhysicalRegion::free_page_at unsigned
Anything above or equal to the 2 GB mark has the left most bit set
(0x8000...), which was falsely interpreted as negative due to
local_offset being signed.

This makes it unsigned by using FlatPtr. To check for underflow as
was intended, lets use Checked instead.

Fixes #4585
2020-12-29 01:41:16 +01:00
Andreas Kling
43d9fe15f9 Revert "Kernel: Convert read_block method to get a reference instead of pointer"
This reverts commit 092a13211a.

Fixes #4611.
2020-12-29 00:06:52 +01:00
Liav A
72b1998f0d Kernel: Introduce a new partitioning subsystem
The partitioning code was very outdated, and required a full refactor.
The new subsystem removes duplicated code and uses more AK containers.

The most important change is that all implementations of the
PartitionTable class conform to one interface, which made it possible
to remove unnecessary code in the EBRPartitionTable class.

Finding partitions is now done in the StorageManagement singleton,
instead of doing so in init.cpp.

Also, now we don't try to find partitions on demand - the kernel will
try to detect if a StorageDevice is partitioned, and if so, will check
what is the partition table, which could be MBR, GUID or EBR.
Then, it will create DiskPartitionMetadata object for each partition
that is available in the partition table. This object will be used
by the partition enumeration code to create a DiskPartition with the
correct minor number.
2020-12-27 23:07:44 +01:00
Liav A
43d833d94f Kernel: Add DiskPartitionMetadata Class
This class will be used to describe a partition of a StorageDevice,
without creating a DiskPartition object.
2020-12-27 23:07:44 +01:00
Liav A
3a19e18d1e Kernel: Move Partition code files to the Storage folder
This folder is more appropriate for these files.
2020-12-27 23:07:44 +01:00
Liav A
247517cd4a Kernel: Introduce the DevFS
The DevFS along with DevPtsFS give a complete solution for populating
device nodes in /dev. The main purpose of DevFS is to eliminate the
need of device nodes generation when building the system.

Later on, DevFS will assist with exposing disk partition nodes.
2020-12-27 23:07:44 +01:00
Liav A
18e77aa285 Kernel: Add a method to determine the desired permissions of a Device
This method will be used later in DevFS, to set the appropriate
permissions for each device node.
2020-12-27 23:07:44 +01:00
Liav A
092a13211a Kernel: Convert read_block method to get a reference instead of pointer
BlockBasedFileSystem::read_block method should get a reference of
a UserOrKernelBuffer.

If we need to force caching a block, we will call other method to do so.
2020-12-27 23:07:44 +01:00
Nathan Lanza
d1891f67ac
AK: Use direct-list-initialization for Vector::empend() (#4564)
clang trunk with -std=c++20 doesn't seem to properly look for an
aggregate initializer here when the type being constructed is a simple
aggregate (e.g. `struct Thing { int a; int b; };`). This template fails
to compile in a usage added 12/16/2020 in `AK/Trie.h`.

Both forms of initialization are supposed to call the
aggregate-initializers but direct-list-initialization delegating to
aggregate initializers is a new addition in c++20 that might not be
implemented yet.
2020-12-27 23:06:37 +01:00
Brendan Coles
fae2304c67 Kernel: CoreDump::write_program_headers: set NOTE p_memsz to p_filesz 2020-12-27 22:45:25 +01:00
Andreas Kling
ddaedbca87 Kernel: Allow sys$rename() to rename symlinks
Previously, this syscall would try to rename the target of the link,
not the link itself.
2020-12-27 15:38:07 +01:00
Brian Gianforcaro
815d39886f Kernel: Tag more methods and types as [[nodiscard]]
Tag methods at where not obvserving the return value is an obvious error
with [[nodiscard]] to catch potential future bugs.
2020-12-27 11:09:30 +01:00
Tom
f1534ff36e Kernel: Take into account the time keeper's frequency (if no HPET)
The PIT is now also running at a rate of ~250 ticks/second, so rather
than assuming there are 1000 ticks/second we need to query the timer
being used for the actual frequency.

Fixes #4508
2020-12-27 01:17:50 +01:00
Andreas Kling
0e2b7f9c9a Kernel: Remove the per-process icon_id and sys$set_process_icon()
This was a goofy kernel API where you could assign an icon_id (int) to
a process which referred to a global shbuf with a 16x16 icon bitmap
inside it.

Instead of this, programs that want to display a process icon now
retrieve it from the process executable instead.
2020-12-27 01:16:56 +01:00