The pattern of having Prekernel inherit all of the build flags of the
Kernel, and then disabling some flags by adding `-fno-<flag>` options
to then disable those options doesn't work in all scenarios. For example
the ASAN flag `-fasan-shadow-offset=<offset>` has no option to disable
it once it's been passed, so in a future change where this flag is added
we need to be able to disable it cleanly.
The cleaner way is to just allow the Prekernel CMake logic to filter out
the COMPILE_OPTIONS specified for that specific target. This allows us
to remove individual options without trashing all inherited options.
Now that the old PCI::Device was removed, we can complete the PCI
changes by making the PCI::DeviceController to be named PCI::Device.
Really the entire purpose and the distinction between the two was about
interrupts, but since this is no longer a problem, just rename it to
simplify things further.
I created this class a long time ago just to be able to quickly make a
PCI device to also represent an interrupt handler (because PCI devices
have this capability for most devices).
Then after a while I introduced the PCI::DeviceController, which is
really almost the same thing (a PCI device class that has Address member
in it), but is not tied to interrupts so it can have no interrupts, or
spawn interrupt handlers however it wants to seems fit.
However I decided it's time to say goodbye for this class for
a couple of reasons:
1. It made a whole bunch of weird patterns where you had a PCI::Device
and a PCI::DeviceController being used in the topic of implementation,
where originally, they meant to be used mutually exclusively (you
can't and really don't want to use both).
2. We can really make all the classes that inherit from PCI::Device
to inherit from IRQHandler at this point. Later on, when we have MSI
interrupts support, we can go further and untie things even more.
3. It makes it possible to simplify the VirtIO implementation to a great
extent. While this commit almost doesn't change it, future changes
can untangle some complexity in the VirtIO code.
For UHCIController, E1000NetworkAdapter, NE2000NetworkAdapter,
RTL8139NetworkAdapter, RTL8168NetworkAdapter, E1000ENetworkAdapter we
are simply making them to inherit the IRQHandler. This makes some sense,
because the first 3 devices will never support anything besides IRQs.
For the last 2, they might have MSI support, so when we start to utilize
those, we might need to untie these classes from IRQHandler and spawn
IRQHandler(s) or MSIHandler(s) as needed.
The VirtIODevice class is also a case where we currently need to use
both PCI::DeviceController and IRQHandler classes as parents, but it
could also be untied from the latter.
The number of UHCI related files is starting to expand to the point
where it's best if we move this into their own subdirectory. It'll
also make it easier to manage when we decide to add some more
controller types (whenever that may be)
The `-z,text` linker flag causes the linker to reject shared libraries
and PIE executables that have textrels. Our code mostly did not use
these except in one place in LibC, which is changed in this commit.
This makes GNU ld match LLD's behavior, which has this option enabled by
default.
TEXTRELs pose a security risk, as performing these relocations require
executable pages to be written to by the dynamic linker. This can
significantly weaken W^X hardening mitigations.
Note that after this change, TEXTRELs can still be used in ports, as the
dynamic loader code is not changed. There are also uses of it in the
kernel, removing which are outside the scope of this PR. To allow those,
`-z,notext` is added.
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
This allows us to 1) let go of the Process when an inode is ref'ing for
ProcFSExposedComponent related reasons, and 2) change our ref/unref
implementation.
Currently, to append a UTF-16 view to a StringBuilder, callers must
first convert the view to UTF-8 and then append the copy. Add a UTF-16
overload so callers do not need to hold an entire copy in memory.
This removes Pipes dependency on the UHCIController by introducing a
controller base class. This will be used to implement other controllers
such as OHCI.
Additionally, there can be multiple instances of a UHCI controller.
For example, multiple UHCI instances can be required for systems with
EHCI controllers. EHCI relies on using multiple of either UHCI or OHCI
controllers to drive USB 1.x devices.
This means UHCIController can no longer be a singleton. Multiple
instances of it can now be created and passed to the device and then to
the pipe.
To handle finding and creating these instances, USBManagement has been
introduced. It has the same pattern as the other management classes
such as NetworkManagement.
Taking a reference or a pointer to a value that's not aligned properly
is undefined behavior. While `[[gnu::packed]]` ensures that reads from
and writes to fields of packed structs is a safe operation, the
information about the reduced alignment is lost when creating pointers
to these values.
Weirdly enough, GCC's undefined behavior sanitizer doesn't flag these,
even though the doc of `-Waddress-of-packed-member` says that it usually
leads to UB. In contrast, x86_64 Clang does flag these, which renders
the 64-bit kernel unable to boot.
For now, the `address-of-packed-member` warning will only be enabled in
the kernel, as it is absolutely crucial there because of KUBSAN, but
might get excessively noisy for the userland in the future.
Also note that we can't append to `CMAKE_CXX_FLAGS` like we do for other
flags in the kernel, because flags added via `add_compile_options` come
after these, so the `-Wno-address-of-packed-member` in the root would
cancel it out.
This commit implements the ISO 9660 filesystem as specified in ECMA 119.
Currently, it only supports the base specification and Joliet or Rock
Ridge support is not present. The filesystem will normalize all
filenames to be lowercase (same as Linux).
The filesystem can be mounted directly from a file. Loop devices are
currently not supported by SerenityOS.
Special thanks to Lubrsi for testing on real hardware and providing
profiling help.
Co-Authored-By: Luke <luke.wilde@live.co.uk>
...and also RangeAllocator => VirtualRangeAllocator.
This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage guided fuzzers like AFL and LibFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmaped from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^) :^)
We don't need an entirely separate VMObject subclass to influence the
location of the physical pages.
Instead, we simply allocate enough physically contiguous memory first,
and then pass it to the AnonymousVMObject constructor that takes a span
of physical pages.
This patch changes the semantics of purgeable memory.
- AnonymousVMObject now has a "purgeable" flag. It can only be set when
constructing the object. (Previously, all anonymous memory was
effectively purgeable.)
- AnonymousVMObject now has a "volatile" flag. It covers the entire
range of physical pages. (Previously, we tracked ranges of volatile
pages, effectively making it a page-level concept.)
- Non-volatile objects maintain a physical page reservation via the
committed pages mechanism, to ensure full coverage for page faults.
- When an object is made volatile, it relinquishes any unused committed
pages immediately. If later made non-volatile again, we then attempt
to make a new committed pages reservation. If this fails, we return
ENOMEM to userspace.
mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
GCC-11 added a new option `-fzero-call-used-regs` which causes the
compiler to zero function arguments before return of a function. The
goal being to reduce the possible attack surface by disarming ROP
gadgets that might be potentially useful to attackers, and reducing
the risk of information leaks via stale register data. You can find
the GCC commit below[0].
This is a mitigation I noticed on the Linux KSPP issue tracker[1] and
thought it would be useful mitigation for the SerenityOS Kernel.
The reduction in ROP gadgets is observable using the ropgadget utility:
$ ROPgadget --nosys --nojop --binary Kernel | tail -n1
Unique gadgets found: 42754
$ ROPgadget --nosys --nojop --binary Kernel.RegZeroing | tail -n1
Unique gadgets found: 41238
The size difference for the i686 Kernel binary is negligible:
$ size Kernel Kernel.RegZerogin
text data bss dec hex filename
13253648 7729637 6302360 27285645 1a0588d Kernel
13277504 7729637 6302360 27309501 1a0b5bd Kernel.RegZeroing
We don't have any great workloads to measure regressions in Kernel
performance, but Kees Cook mentioned he measured only around %1
performance regression with this enabled on his Linux kernel build.[2]
References:
[0] d10f3e900b
[1] https://github.com/KSPP/linux/issues/84
[2] https://lore.kernel.org/lkml/20210714220129.844345-1-keescook@chromium.org/
This implements a simple bootloader that is capable of loading ELF64
kernel images. It does this by using QEMU/GRUB to load the kernel image
from disk and pass it to our bootloader as a Multiboot module.
The bootloader then parses the ELF image and sets it up appropriately.
The kernel's entry point is a C++ function with architecture-native
code.
Co-authored-by: Liav A <liavalb@gmail.com>
By default, the compiler will assume that `operator new` returns
pointers that are aligned correctly for every built-in type. This is not
the case in the kernel on x64, since the assumed alignment is 16
(because of long double), but the kmalloc blocks are only
`alignas(void*)`.
This adds a new section .ksyms at the end of the linker map, reserves
5MiB for it (which are after end_of_kernel_image so they get re-used
once MemoryManager is initialized) and then embeds the symbol map into
the kernel binary with objcopy. This also shrinks the .ksyms section to
the real size of the symbol file (around 900KiB at the moment).
By doing this we can make the symbol map available much earlier in the
boot process, i.e. even before VFS is available.
The previous allocator was very naive and kept the state of all pages
in one big bitmap. When allocating, we had to scan through the bitmap
until we found an unset bit.
This patch introduces a new binary buddy allocator that manages the
physical memory pages.
Each PhysicalRegion is divided into zones (PhysicalZone) of 16MB each.
Any extra pages at the end of physical RAM that don't fit into a 16MB
zone are turned into 15 or fewer 1MB zones.
Each zone starts out with one full-sized block, which is then
recursively subdivided into halves upon allocation, until a block of
the request size can be returned.
There are more opportunities for improvement here: the way zone objects
are allocated and stored is non-optimal. Same goes for the allocation
of buddy block state bitmaps.
This involves refactoring VirtIOConsole into VirtIOConsole and
VirtIOConsolePort. VirtIOConsole is the VirtIODevice, it owns multiple
VirtIOConsolePorts as well as two control queues. Each
VirtIOConsolePort is a CharacterDevice.
This replaces all uses of LexicalPath in the Kernel with the functions
from KLexicalPath. This also allows the Kernel to stop including
AK::LexicalPath.
This adds KLexicalPath, which are a few static functions which aim to
mostly emulate AK::LexicalPath. They are however constrained to work
with absolute paths only, containing no '.' or '..' path segments and no
consecutive slashes. This way, it is possible to avoid use StringView
for the return values and thus avoid allocating new String objects.
As explained above, the functions are currently very strict about the
allowed input paths. This seems to not be a problem currently. Since the
functions VERIFY this, potential bugs caused by this will become
immediately obvious.