This removes some hard references to the toolchain, some unnecessary
uses of an external install command, and disables a -Werror flag (for
the time being) - only if run inside serenity.
With this, we can build and link the kernel :^)
Running 'ninja install && ninja image && ninja run` is kind of
annoying. I got tired, and came up with this instead, which does the
right thing and I don't have to type out the incantation.
KASAN is a dynamic analysis tool that finds memory errors. It focuses
mostly on finding use-after-free and out-of-bound read/writes bugs.
KASAN works by allocating a "shadow memory" region which is used to store
whether each byte of memory is safe to access. The compiler then instruments
the kernel code and a check is inserted which validates the state of the
shadow memory region on every memory access (load or store).
To fully integrate KASAN into the SerenityOS kernel we need to:
a) Implement the KASAN interface to intercept the injected loads/stores.
void __asan_load*(address);
void __asan_store(address);
b) Setup KASAN region and determine the shadow memory offset + translation.
This might be challenging since Serenity is only 32bit at this time.
Ex: Linux implements kernel address -> shadow address translation like:
static inline void *kasan_mem_to_shadow(const void *addr)
{
return ((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
+ KASAN_SHADOW_OFFSET;
}
c) Integrating KASAN with Kernel allocators.
The kernel allocators need to be taught how to record allocation state
in the shadow memory region.
This commit only implements the initial steps of this long process:
- A new (default OFF) CMake build flag `ENABLE_KERNEL_ADDRESS_SANITIZER`
- Stubs out enough of the KASAN interface to allow the Kernel to link clean.
Currently the KASAN kernel crashes on boot (triple fault because of the crash
in strlen other sanitizer are seeing) but the goal here is to just get started,
and this should help others jump in and continue making progress on KASAN.
References:
* ASAN Paper: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37752.pdf
* KASAN Docs: https://github.com/google/kasan
* NetBSD KASAN Blog: https://blog.netbsd.org/tnf/entry/kernel_address_sanitizer_part_3
* LWN KASAN Article: https://lwn.net/Articles/612153/
* Tracking Issue #5351
This is an external file from https://pci-ids.ucw.cz that's being updated
daily, which was imported a while ago but probably shouldn't live in the
SerenityOS repository in the first place (or else would need manual
maintenance). The legal aspects of redistributing this file as we
currently do are not quite clear to me, they require either GPL (version
2 or later) or 3-clause BSD - Serenity is 2-clause BSD...
The current version we use is 2019.08.08, so quite outdated - and while
most of these devices are obviously not supported, we're still capable
of *listing* them, so having an up-to-date version with recent additions
and fixes would be nice.
This updates the root CMakeLists.txt to check for existence of the file
and download it if not found - effectively on every fresh build. Do note
that this is not a critical file, and the system runs just fine should
this ever fail. :^)
This achieves two things:
- Programs can now intentionally perform arbitrary syscalls by calling
syscall(). This allows us to work on things like syscall fuzzing.
- It restricts the ability of userspace to make syscalls to a single
4KB page of code. In order to call the kernel directly, an attacker
must now locate this page and call through it.
This was done with the help of several scripts, I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
Else, there's tons of "-- Set runtime path of" spam at build time,
with apparently no way of disabling the build noise other than turning
of rpaths. If the dynamic loader uses them at some point, we probably
want to set them through cflags/ldflags instead of through cmake's
built-in thing anyways, for that reason.
Modify the user mode runtime to insert stack canaries to find stack corruptions.
The `-fstack-protector-strong` variant was chosen because it catches more
issues than vanilla `-fstack-protector`, but doesn't have substantial
performance impact like `-fstack-protector-all`.
Details:
-fstack-protector enables stack protection for vulnerable functions that contain:
* A character array larger than 8 bytes.
* An 8-bit integer array larger than 8 bytes.
* A call to alloca() with either a variable size or a constant size bigger than 8 bytes.
-fstack-protector-strong enables stack protection for vulnerable functions that contain:
* An array of any size and type.
* A call to alloca().
* A local variable that has its address taken.
Example of it catching corrupting in the `stack-smash` test:
```
courage ~ $ ./user/Tests/LibC/stack-smash
[+] Starting the stack smash ...
Error: Stack protector failure, stack smashing detected!
Shell: Job 1 (/usr/Tests/LibC/stack-smash) Aborted
```
RTTI is still disabled for the Kernel, and for the Dynamic Loader. This
allows for much less awkward navigation of class heirarchies in LibCore,
LibGUI, LibWeb, and LibJS (eventually). Measured RootFS size increase
was < 1%, and libgui.so binary size was ~3.3%. The small binary size
increase here seems worth it :^)
* Add SERENITY_ARCH option to CMake for selecting the target toolchain
* Port all build scripts but continue to use i686
* Update GitHub Actions cache to include BuildIt.sh
Previosuly, generation of the SONAME attribute was disabled.
This caused libraries to have relative paths in DT_NEEDED attributes
(e.g "Libraries/libcore.so" instead of just "libcore.so"),
which caused build errors when the working directory during build was
not $SERENITY_ROOT/Build.
This caused the build of ports that use libraries other than libc.so
to fail (e.g the nesalizer port).
Closes#4457
We now configure the gcc spec files to use a different crt files for
static & PIE binaries.
This relieves us from the need to explicitly specify the desired crt0
file in cmake scripts.
Problem:
- These utility functions are only used in `AK`, but are being defined
in the top-level. This clutters the top-level.
Solution:
- Move the utility functions to `Meta/CMake/utils.cmake` and include
where needed.
- Also, move `all_the_debug_macros.cmake` into `Meta/CMake` directory
to consolidate the location of `*.cmake` script files.
Problem:
- Modifying CXXFLAGS directly is an old CMake style.
- The giant and ever-growing list of `*_DEBUG` macros clutters the
top-level CMakeLists.txt.
Solution:
- Use the more current `add_compile_definitions` function.
- Sort all the debug options so that they are easy to view.
- Move the `*_DEBUG` macros to their own file which can be included
directly.
Problem:
- Functions are duplicated in [PBM,PGM,PPM]Loader class
implementations. They are functionally equivalent. This does not
follow the DRY (Don't Repeat Yourself) principle.
Solution:
- Factor out the common functions into a separate file.
- Refactor common code to generic functions.
- Change `PPM_DEBUG` macro to be `PORTABLE_IMAGE_LOADER_DEBUG` to work
with all the supported types. This requires adding the image type to
the debug log messages for easier debugging.
Problem:
- Appending to CMAKE_CXX_FLAGS for everything is cumbersome.
Solution:
- Use the `add_compile_options` built-in function to handle adding
compiler options (and even de-duplicating).
Problem:
- Setting `CMAKE_CXX_FLAGS` directly to effect the version of the C++
standard being used is no longer the recommended best practice.
Solution:
- Set C++20 mode in the compiler by setting `CMAKE_CXX_STANDARD`.
- Force the build system generator not to fallback to the latest
standard supported by the compiler by enabling
`CMAKE_CXX_STANDARD_REQUIRED`. This shouldn't ever be a problem
though since the toolchain is tightly controlled.
- Disable GNU compiler extensions by disabling `CMAKE_CXX_EXTENSIONS`
to preserve the previous flags.
This new subsystem is somewhat replacing the IDE disk code we had with a
new flexible design.
StorageDevice is a generic class that represent a generic storage
device. It is meant that specific storage hardware will override the
interface. StorageController is a generic class that represent
a storage controller that can be found in a machine.
The IDEController class governs two IDEChannels. An IDEChannel is
responsible to manage the master & slave devices of the channel,
therefore an IDEChannel is an IRQHandler.
New serenity_app() targets can be defined which allows application
icons to be emedded directly into the executable. The embedded
icons will then be used when creating an icon for that file in
LibGUI.
This patch replaces the UI-from-JSON mechanism with a more
human-friendly DSL.
The current implementation simply converts the GML into a JSON object
that can be consumed by GUI::Widget::load_from_json(). The parser is
not very helpful if you make a mistake.
The language offers a very simple way to instantiate any registered
Core::Object class by simply saying @ClassName
@GUI::Label {
text: "Hello friends!"
tooltip: ":^)"
}
Layouts are Core::Objects and can be assigned to the "layout" property:
@GUI::Widget {
layout: @GUI::VerticalBoxLayout {
spacing: 2
margins: [8, 8, 8, 8]
}
}
And finally, child objects are simply nested within their parent:
@GUI::Widget {
layout: @GUI::HorizontalBoxLayout {
}
@GUI::Button {
text: "OK"
}
@GUI::Button {
text: "Cancel"
}
}
This feels a *lot* more pleasant to write than the JSON we had. The fact
that no new code was being written with the JSON mechanism was pretty
telling, so let's approach this with developer convenience in mind. :^)
We need to account for how many shared lock instances the current
thread owns, so that we can properly release such references when
yielding execution.
We also need to release the process lock when donating.
The dynamic loader exists as /usr/lib/Loader.so and is loaded by the
kernel when ET_DYN programs are executed.
The dynamic loader is responsible for loading the dependencies of the
main program, allocating TLS storage, preparing all loaded objects for
execution and finally jumping to the entry of the main program.
This prevents zombies created by multi-threaded applications and brings
our model back to closer to what other OSs do.
This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
This changes the Thread::wait_on function to not enable interrupts
upon leaving, which caused some problems with page fault handlers
and in other situations. It may now be called from critical
sections, with interrupts enabled or disabled, and returns to the
same state.
This also requires some fixes to Lock. To aid debugging, a new
define LOCK_DEBUG is added that enables checking for Lock leaks
upon finalization of a Thread.
This commit is a mix of several commits, squashed into one because the
commits before 'Move regex to own Library and fix all the broken stuff'
were not fixable in any elegant way.
The commits are listed below for "historical" purposes:
- AK: Add options/flags and Errors for regular expressions
Flags can be provided for any possible flavour by adding a new scoped enum.
Handling of flags is done by templated Options class and the overloaded
'|' and '&' operators.
- AK: Add Lexer for regular expressions
The lexer parses the input and extracts tokens needed to parse a regular
expression.
- AK: Add regex Parser and PosixExtendedParser
This patchset adds a abstract parser class that can be derived to implement
different parsers. A parser produces bytecode to be executed within the
regex matcher.
- AK: Add regex matcher
This patchset adds an regex matcher based on the principles of the T-REX VM.
The bytecode pruduced by the respective Parser is put into the matcher and
the VM will recursively execute the bytecode according to the available OpCodes.
Possible improvement: the recursion could be replaced by multi threading capabilities.
To match a Regular expression, e.g. for the Posix standard regular expression matcher
use the following API:
```
Pattern<PosixExtendedParser> pattern("^.*$");
auto result = pattern.match("Well, hello friends!\nHello World!"); // Match whole needle
EXPECT(result.count == 1);
EXPECT(result.matches.at(0).view.starts_with("Well"));
EXPECT(result.matches.at(0).view.end() == "!");
result = pattern.match("Well, hello friends!\nHello World!", PosixFlags::Multiline); // Match line by line
EXPECT(result.count == 2);
EXPECT(result.matches.at(0).view == "Well, hello friends!");
EXPECT(result.matches.at(1).view == "Hello World!");
EXPECT(pattern.has_match("Well,....")); // Just check if match without a result, which saves some resources.
```
- AK: Rework regex to work with opcodes objects
This patchsets reworks the matcher to work on a more structured base.
For that an abstract OpCode class and derived classes for the specific
OpCodes have been added. The respective opcode logic is contained in
each respective execute() method.
- AK: Add benchmark for regex
- AK: Some optimization in regex for runtime and memory
- LibRegex: Move regex to own Library and fix all the broken stuff
Now regex works again and grep utility is also in place for testing.
This commit also fixes the use of regex.h in C by making `regex_t`
an opaque (-ish) type, which makes its behaviour consistent between
C and C++ compilers.
Previously, <regex.h> would've blown C compilers up, and even if it
didn't, would've caused a leak in C code, and not in C++ code (due to
the existence of `OwnPtr` inside the struct).
To make this whole ordeal easier to deal with (for now), this pulls the
definitions of `reg*()` into LibRegex.
pros:
- The circular dependency between LibC and LibRegex is broken
- Eaiser to test (without accidentally pulling in the host's libc!)
cons:
- Using any of the regex.h functions will require the user to link -lregex
- The symbols will be missing from libc, which will be a big surprise
down the line (especially with shared libs).
Co-Authored-By: Ali Mohammad Pur <ali.mpfard@gmail.com>
Problem:
- CMake is not outputting `compile_commands.json`.
- `compile_commands.json` is used by build integration tooling such as
`clang-tidy`.
Solution:
- Enable `CMAKE_EXPORT_COMPILE_COMMANDS` option so that the file is
output.
This is our first client of the new JSON GUI declaration thingy.
The skeleton of the TextEditor app GUI is now declared separately from
the C++ logic, and we use the Core::Object::name() of widgets to locate
them once they have been instantiated by the GUI builder.
I know, the tags don't actually matter. However, clang warns by default,
and instead of disabling the warning for clang I'd rather enable the warning
for gcc.
Concludes #3096.
Phew! From here on, build system and CI will ensure that all new code
defines compilation-unit-only code as 'static', and that dead code can
be found more easily. Also, this style encourages type checking
by suggesting that you put a proper declaration in a shared header.
It looks like PR #2986 mistakenly removed this from both the
Clang and GCC CXX_FLAGS, when the intention seems to have been
to only disable it for Clang.
Useful for sanitizer fuzzer builds.
clang doesn't have a -fconcepts switch (I'm guessing it just enables
concepts automatically with -std=c++2a, but I haven't checked),
and at least the version on my system doesn't understand
-Wno-deprecated-move, so pass these two flags only to gcc.
In return, disable -Woverloaded-virtual which fires in many places.
The preceding commits fixed the handful of -Wunused-private-field
warnings that clang emitted.
After running a build command, make by default stat()s the command's
output, and if it wasn't touched, then it cancels all build steps
that were scheduled only because this command was expected to change
the output.
Ninja has the same feature, but it's opt-in behind the per-command
"restat = 1" setting. However, CMake enables it by default for all
custom commands.
Use Meta/write-only-on-difference.sh to write the output to a temporary
file, and then copy the temporary file only to the final location if the
contents of the output have changed since last time.
write-only-on-difference.sh automatically creates the output's parent
directory, so stop doing that in CMake.
Reduces the number of build steps that run after touching a file
in LibCore from 522 to 312.
Since we now no longer trigger the CMake special case "If COMMAND
specifies an executable target name (created by the add_executable()
command), it will automatically be replaced by the location of the
executable created at build time", we now need to use qualified paths to
the generators.
Somewhat related to #2877.
Seems like we can build without these two flags now:
-Wno-sized-deallocation
-fno-sized-deallocation
I don't remember why they were needed in the first place.
This allows us to look up source file/line information from addresses
without bloating the build too much. It could probably be made smaller
with some tricks.
I tried setting it to Release, then noticed that it didn't build
due to gcc's optimizer-level dependent warnings and -Werror, then
started fixing the warnings for a bit (all false positives),
then looked at the global CMakeLists.txt and realized that the
default build is aleady using compiler optimizations. It looks like
people aren't supposed to change this, so make that explicit to
be friendly to people familiar with cmake but new to serenity.
The SDL port failed to build because the CMake toolchain filed pointed
to the old root. Now the toolchain file assumes that the Root is in
Build/Root.
Additionally, the AK/ and Kernel/ headers need to be installed in the
root too.
Make sure that userspace is always referencing "system" headers in a way
that would build on target :). This means removing the explicit
include_directories of Libraries/LibC in favor of having it export its
headers as SYSTEM. Also remove a redundant include_directories of
Libraries in the 'serenity build' part of the build script. It's already
set at the top.
This causes issues for the Kernel, and for crt0.o. These special cases
are handled individually.
This stopped working quite some time ago due to Clang losing track of
typestates for some reason and everything becoming "unknown".
Since we're primarily using GCC anyway, it doesn't seem worth it to try
and maintain this non-working experiment for a secondary compiler.
Also it doesn't look like the Clang team is actively maintaining this
flag anyway. So good-bye, -Wconsumed. :/