The current emoji_txt.cmake does not handle download errors (which were
a common source of issues in the build problems channel) or Unicode
versioning. These are both handled by unicode_data.cmake. Move the
download to unicode_data.cmake so that we can more easily handle next
month's Unicode 15 release.
Instead of manually updating emoji.txt whenever new emoji are added,
we use Unicode's emoji-test.txt to generate emoji.txt on each build,
including only the emojis that Serenity supports at that time.
By using emoji-test.txt, we can also include all forms of each emoji
(fully-qualified, minimally-qualified, and unqualified) which can be
helpful when double-checking how certain forms are handled.
Not sure why that happens and how it worked until now, but we need to be
more precise about the location of PCI and USB IDs when decompressing
them while building the OS.
In commit 02e8f29560 we started exporting
the `CMAKE_INSTALL_*DIR` variables without prefix in order to make
better use of the actual `PREFIX` settings.
However, commands like `file(MAKE_DIRECTORY ...)` don't understand the
GNUInstallDirs way of building paths, so we ended up creating
directories in our main project directory by accident.
Fix that by manually adding the correct prefix onto the path.
In preparation for future refactoring of Lagom, let's use the variables
from GNUInstallDirs as much as possible for the helper macros and other
scripts used by the main build already.
This release brings support for various C++23 constructs like `if
consteval` and multidimensional subscript operators. Vectorization is
now enabled for O2 too, and `-ftrivial-auto-var-init` has been added
which can help us find and prevent security issues coming from
uninitialized variables.
Toolchain/Patches/gcc.patch is now significanly smaller as some unused,
autoconf-generated code has been removed.
Besides a version bump, the following changes have been made to our
toolchain infrastructure:
- LLVM/Clang is now built with -march=native if the host compiler
supports it. An exception to this is CI, as the toolchain cache is
shared among many different machines there.
- The LLVM tarball is not re-extracted if the hash of the applied
patches doesn't differ.
- The patches have been split up into atomic chunks.
- Port-specific patches have been integrated into the main patches,
which will aid in the work towards self-hosting.
- <sysroot>/usr/local/lib is now appended to the linker's search path by
default.
- --pack-dyn-relocs=relr is appended to the linker command line by
default, meaning ports take advantage of RELR relocations without any
patches or additional compiler flags.
The formatting of LLVM port's package.sh has been bothering me, so I
also indented the arguments to the CMake invocation.
We have seen some cases where the build fails for folks, and they are
missing unzip/tar/gzip etc. We can catch some of these in CMake itself,
so lets make sure to handle that uniformly across the build system.
The REQUIRED flag to `find_program` was only added on in CMake 3.18 and
above, so we can't rely on that to actually halt the program execution.
With regular builds, the generated IPC headers exist inside the Build
directory. The path Userland/Services under the build directory is
added to the include path.
For in-system builds the IPC headers are installed at /usr/include/.
To support this, we add /usr/include/Userland/Services to the build path
when building from Hack Studio.
Co-Authored-By: Andrew Kaster <akaster@serenityos.org>
This variable was originally called USE_MOLD_LINKER, but it was changed
to ENABLE_MOLD_LINKER during review to be consistent with other
configuration options. I branched off the commits that added RELR
support before this change, and I failed to update the variable name
there.
While playing around with getting serenity to run on my main desktop
machine I wanted a way of easily updating my physical serenity
partition.
To use it you just need to:
- Create and format your local partition to ext4
- Set `SERENITY_TARGET_INSTALL_PARTITION` to the partition /dev path.
- Run the `install-native-partition` build target.
Example:
$ export SERENITY_TARGET_INSTALL_PARTITION=/dev/nvme1n1p3
$ cd serenity/Build/x86_64
$ ninja install-native-partition
This commit adds support for building the SerenityOS userland with the
new [mold linker].
This is not enabled by default yet; to link using mold, run the
`Toolchain/BuildMold.sh` script to build the latest release of mold, and
set the `ENABLE_MOLD_LINKER` CMake variable to ON. This option relies on
toolchain support that has been added just recently, so you might need
to rebuild your toolchain for mold to work.
[mold linker]: https://github.com/rui314/mold
If this option is set, we will not build all components.
Instead, we include an external CMake file passed in via a variable
named HACKSTUDIO_BUILD_CMAKE_FILE.
This will be used to build serenity components from Hack Studio.
The `--allow-shlib-undefined` option is a bit of a misnomer. It actually
controls whether we should be allowed to have undefined references after
symbols from all dependencies have been resolved, so it applies both to
shared libraries and executables.
LLD defaults to allowing undefined references in shared libraries, but
not in executables. Previously, we had to disable this check for
executables too, as it caused a build failure due to the
LibC-LibPthread-libc++ and the LibCore-LibCrypto circular dependencies.
Now that those have been resolved, we can enable this warning, in the
hopes that it will prevent us from introducing circular libraries and
missing dependencies that might cause unexpected breakage.
There's only two places where we're using the C99 feature of array
designated initalizers. This feature seemingly wasn't included with
C++20 designated initalizers for classes and structs. The only two
places we were using this feature are suitably old and isolated that
it makes sense to just suppress the warning at the usage sites while
discouraging future array designated intializers in new code.
Enable the warning project-wide. It catches when a non-virtual method
creates an overload set with a virtual method. This might cause
surprising overload resolution depending on how the method is invoked.
The Clang implementation of this warning protects against some undefined
pre-processor behavior while ignoring function-like macros. The gcc
implementation also warns on function-like macros, and is therefore
noisy.
These were removed in the Superbuild conversion. Re-add the checks that
make sure that if there's a toolchain update, developers re-build their
toolchain.
gzip -c is supported in both Linux and BSD flavors of gzip. The -o flag
was introduced in a previous commit which is present in OpenBSD, but not
other flavors of Linux. -c will write to stdout which is redirected to
the target files. As a side benefit, we no longer need to copy files
anywhere
OpenBSD gzip does not have the -k flag to keep the original after
extraction. Work around this by copying the original gzip to the dest
and then extracting. A bit of a hack, but only needs to be done for the
first-time or rebuilds
OpenBSD provides crypt in libc, not libcrypt. Adjust if/else to check
for either and proceed accordingly
Remove outdated OpenBSD checks when building the toolchain
This option is already enabled when building Lagom, so let's enable it
for the main build too. We will no longer be surprised by Lagom Clang
CI builds failing while everything compiles locally.
Furthermore, the stronger `-Wsuggest-override` warning is enabled in
this commit, which enforces the use of the `override` keyword in all
classes, not just those which already have some methods marked as
`override`. This works with both GCC and Clang.
This commit updates the Clang toolchain's version to 13.0.0, which comes
with better C++20 support and improved handling of new features by
clang-format. Due to the newly enabled `-Bsymbolic-functions` flag, our
Clang binaries will only be 2-4% slower than if we dynamically linked
them, but we save hundreds of megabytes of disk space.
The `BuildClang.sh` script has been reworked to build the entire
toolchain in just three steps: one for the compiler, one for GNU
binutils, and one for the runtime libraries. This reduces the complexity
of the build script, and will allow us to modify the CI configuration to
only rebuild the libraries when our libc headers change.
Most of the compile flags have been moved out to a separate CMake cache
file, similarly to how the Android and Fuchsia toolchains are
implemented within the LLVM repo. This provides a nicer interface than
the heaps of command-line arguments.
We no longer build separate toolchains for each architecture, as the
same Clang binary can compile code for multiple targets.
The horrible mess that `SERENITY_CLANG_ARCH` was, has been removed in
this commit. Clang happily accepts an `i686-pc-serenity` target triple,
which matches what our GCC toolchain accepts.
Replace the old logic where we would start with a host build, and swap
all the CMake compiler and target variables underneath it to trick
CMake into building for Serenity after we configured and built the Lagom
code generators.
The SuperBuild creates two ExternalProjects, one for Lagom and one for
Serenity. The Serenity project depends on the install stage for the
Lagom build. The SuperBuild also generates a CMakeToolchain file for the
Serenity build to use that replaces the old toolchain file that was only
used for Ports.
To ensure that code generators are rebuilt when core libraries such as
AK and LibCore are modified, developers will need to direct their manual
`ninja` invocations to the SuperBuild's binary directory instead of the
Serenity binary directory.
This commit includes warning coalescing and option style cleanup for the
affected CMakeLists in the Kernel, top level, and runtime support
libraries. A large part of the cleanup is replacing USE_CLANG_TOOLCHAIN
with the proper CMAKE_CXX_COMPILER_ID variable, which will no longer be
confused by a host clang compiler.
This common strategy of having a serenity_option() macro defined in
either the Lagom or top level CMakeLists.txt allows us to do two things:
First, we can more clearly see which options are Serenity-specific,
Lagom-specific, or common between the target and host builds.
Second, it enables the upcoming SuperBuild changes to set() the options
in the SuperBuild's CMake cache and forward each target's options to the
corresponding ExternalProject.
This makes it so we don't need to specify the full path to all the
helper scripts we include() from different places in the codebase and
feels a lot cleaner.
This prevents GCC and Clang from deleting null pointer checks for
optimization purposes. I think we're strictly better off crashing
in those cases instead of the compiler hiding errors from us.
This tells the linker to not combine read-only data and executable code,
instead favoring multiple PT_LOAD headers with more precise permissions.
This greatly reduces the amount of executable pages in all our programs
and libraries.
/usr/lib/libjs.so before:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00000000 0x00000000 0x2fc77c 0x2fc77c R E 0x1000
LOAD 0x2fc900 0x002fd900 0x002fd900 0x0c708 0x0dd1c RW 0x1000
/usr/lib/libjs.so after:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00000000 0x00000000 0x80e60 0x80e60 R 0x1000
LOAD 0x081000 0x00081000 0x00081000 0x25f6c9 0x25f6c9 R E 0x1000
LOAD 0x2e1000 0x002e1000 0x002e1000 0x1c27c 0x1c27c R 0x1000
LOAD 0x2fd900 0x002fe900 0x002fe900 0x0c708 0x0dd1c RW 0x1000
As you can see, we go from 0x2fc77c bytes of executable memory down to
0x25f6c9 (a ~20% reduction!) The memory that was previous executable is
now simply read-only instead. :^)
This is needed so all headers and files exist on disk, so that
the sonar cloud analyzer can find them when executing the compilation
commands contained in compile_commands.json, without actually building.
Co-authored-by: Andrew Kaster <akaster@serenityos.org>
This allows us to remove all the add_subdirectory calls from the top
level CMakeLists.txt that referred to targets linking LagomCore.
Segregating the host tools and Serenity targets helps us get to a place
where the main Serenity build can simply use a CMake toolchain file
rather than swapping all the compiler/sysroot variables after building
host libraries and tools.
By using SerenityOS_SOURCE_DIR we can make custom targets and commands
agnostic to the actual location of the root CMakeLists directory.
All we care about is the root of the SerenityOS project.
The `-z,text` linker flag causes the linker to reject shared libraries
and PIE executables that have textrels. Our code mostly did not use
these except in one place in LibC, which is changed in this commit.
This makes GNU ld match LLD's behavior, which has this option enabled by
default.
TEXTRELs pose a security risk, as performing these relocations require
executable pages to be written to by the dynamic linker. This can
significantly weaken W^X hardening mitigations.
Note that after this change, TEXTRELs can still be used in ports, as the
dynamic loader code is not changed. There are also uses of it in the
kernel, removing which are outside the scope of this PR. To allow those,
`-z,notext` is added.
Previously, this was disabled because GCC flagged seemingly correct and
well-defined code. This was however not the case because GCC implicitly
marked some pointers non-null, even if we wanted to handle them
ourselves, and deleted null checks on them. By re-introducing this
warning, we will know if the compiler tries to discard our code again.
This is primarily to allow using LibUnicode within LibJS and its REPL.
Note: this seems to be the first time that a Lagom dependency requires
generated source files. For this to work, some of Lagom's CMakeLists.txt
commands needed to be re-organized to include the CMake files that fetch
and parse UnicodeData.txt. The paths required to invoke the generator
also differ depending on what is currently building (SerenityOS vs.
Lagom as part of the Serenity build vs. a standalone Lagom build).
The Unicode standard publishes the Unicode Character Database (UCD) with
information about every code point, such as each code point's upper case
mapping. LibUnicode exists to download and parse UCD files at build time
and to provide accessors to that data.
As a start, LibUnicode includes upper- and lower-case code point
converters.
GCC and Clang allow us to inject a call to a function named
__sanitizer_cov_trace_pc on every edge. This function has to be defined
by us. By noting down the caller in that function we can trace the code
we have encountered during execution. Such information is used by
coverage guided fuzzers like AFL and LibFuzzer to determine if a new
input resulted in a new code path. This makes fuzzing much more
effective.
Additionally this adds a basic KCOV implementation. KCOV is an API that
allows user space to request the kernel to start collecting coverage
information for a given user space thread. Furthermore KCOV then exposes
the collected program counters to user space via a BlockDevice which can
be mmaped from user space.
This work is required to add effective support for fuzzing SerenityOS to
the Syzkaller syscall fuzzer. :^) :^)
The GCC documentation says that since it's officially a part of C++20,
this flag does nothing. Clang, however, does complain that it does not
recognize it, so it's better to just remove it.
This adds a utility program which is essentially a command generator for
CMake. It reads the 'components.ini' file generated by CMake in the
build directory, prompts the user to select a build type and optionally
customize it, generates and runs a CMake command as well as 'ninja
clean' and 'rm -rf Root', which are needed to properly remove system
components.
The program uses whiptail(1) for user interaction.
Neither the kernel nor LibELF support loading libraries with larger
PT_LOAD alignment. The default on x86 is 4096 while it's 2MiB on x86_64.
This changes the alignment to 4096 on all platforms.
Components are a group of build targets that can be built and installed
separately. Whether a component should be built can be configured with
CMake arguments: -DBUILD_<NAME>=ON|OFF, where <NAME> is the name of the
component (in all caps).
Components can be marked as REQUIRED if they're necessary for a
minimally functional base system or they can be marked as RECOMMENDED
if they're not strictly necessary but are useful for most users.
A component can have an optional description which isn't used by the
build system but may be useful for a configuration UI.
Components specify the TARGETS which should be built when the component
is enabled. They can also specify other components which they depend on
(with DEPENDS).
This also adds the BUILD_EVERYTHING CMake variable which lets the user
build all optional components. For now this defaults to ON to make the
transition to the components-based build system easier.
The list of components is exported as an INI file in the build directory
(e.g. Build/i686/components.ini).
Fixes#8048.
It's prone to finding "technically uninitialized but can never happen"
cases, particularly in Optional<T> and Variant<Ts...>.
The general case seems to be that it cannot infer the dependency
between Variant's index (or Optional's boolean state) and a particular
alternative (or Optional's buffer) being untouched.
So it can flag cases like this:
```c++
if (index == StaticIndexForF)
new (new_buffer) F(move(*bit_cast<F*>(old_buffer)));
```
The code in that branch can _technically_ make a partially initialized
`F`, but that path can never be taken since the buffer holding an
object of type `F` and the condition being true are correlated, and so
will never be taken _unless_ the buffer holds an object of type `F`.
This commit also removed the various 'diagnostic ignored' pragmas used
to work around this warning, as they no longer do anything.
Since I introduced this functionality there has been a steady stream of
people building with `ALL_THE_DEBUG_MACROS` and trying to boot the
system, and immediately hitting this assert. I have no idea why people
try to build with all the debugging enabled, but I'm tired of seeing the
bug reports about asserts we know are going to happen at this point.
So I'm hiding this value under the new ENABLE_ALL_DEBUG_FACILITIES flag
instead. This is only set by CI, and hopefully no-one will try to build
with this thing (It's documented as not recommended).
Fixes: #7527
There are lots of people who have issues building serenity because
they don't read the build directions closely enough and have an
unsupported GCC version as their host compiler. Instead of repeatedly
having to answer these kinds of questions, lets just error out upfront.
Take Kernel/UBSanitizer.cpp and make a copy in LibSanitizer.
We can use LibSanitizer to hold other sanitizers as people implement
them :^).
To enable UBSAN for LibC, DynamicLoader, and other low level system
libraries, LibUBSanitizer is built as a serenity_libc, and has a static
version for LibCStatic to use. The approach is the same as that taken in
Note that this means now UBSAN is enabled for code generators, Lagom,
Kernel, and Userspace with -DENABLE_UNDEFINED_SANTIZER=ON. In userspace
however, UBSAN is not deadly (yet).
Co-authored-by: ForLoveOfCats <ForLoveOfCats@vivaldi.net>
This option replaces the use of ENABLE_ALL_THE_DEBUG_MACROS in CI runs,
and enables all debug options that might be broken by developers
unintentionally that are only used in specific debugging situations.
When debugging kernel code, it's necessary to set extra flags. Normal
advice is to set -ggdb3. Sometimes that still doesn't provide enough
debugging information for complex functions that still get optimized.
Compiling with -Og gives the best optimizations for debugging, but can
sometimes be broken by changes that are innocuous when the compiler gets
more of a chance to look at them. The new CMake option enables both
compile options for kernel code.
This also optionally generates a test suite from the WebAssembly
testsuite, which can be enabled via passing `INCLUDE_WASM_SPEC_TESTS`
to cmake, which will generate test-wasm-compatible tests and the
required fixtures.
The generated directories are excluded from git since there's no point
in committing them.
This only tests "can it be parsed", but the goal of this commit is to
provide a test framework that can be built upon :)
The conformance tests are downloaded, compiled* and installed only if
the INCLUDE_WASM_SPEC_TESTS cmake option is enabled.
(*) Since we do not yet have a wast parser, the compilation is delegated
to an external tool from binaryen, `wasm-as`, which is required for the
test suite download/install to succeed.
This *does* run the tests in CI, but it currently does not include the
spec conformance tests.
This had very bad interactions with ccache, often leading to rebuilds
with 100% cache misses, etc. Ali says it wasn't that big of a speedup
in the end anyway, so let's not bother with it.
We can always bring it back in the future if it seems like a good idea.
This program turns a description of a state machine that takes its input
byte-by-byte into C++ code. The state machine is described in a custom
format as specified below:
```
// Comments are started by two slashes, and cause the rest of the line
// to be ignored
@name ExampleStateMachine // sets the name of the generated class
@namespace Test // sets the namespace (optional)
@begin Begin // sets the state the parser will start in
// The rest of the file contains one or more states and an optional
// @anywhere directive. Each of these is a curly bracket delimited set
// of state transitions. State transitions contain a selector, the
// literal "=>" and a (new_state, action) tuple. Examples:
// 0x0a => (Begin, PrintLine)
// [0x00..0x1f] => (_, Warn) // '_' means no change
// [0x41..0x5a] => (BeginWord, _) // '_' means no action
// Rules common to all states. These take precedence over rules in the
// specific states.
@anywhere {
0x0a => (Begin, PrintLine)
[0x00..0x1f] => (_, Warn)
}
Begin {
[0x41..0x5a] => (Word, _)
[0x61..0x7a] => (Word, _)
// For missing values, the transition (_, _) is implied
}
Word {
// The entry action is run when we transition to this state from a
// *different* state. @anywhere can't have this
@entry IncreaseWordCount
0x09 => (Begin, _)
0x20 => (Begin, _)
// The exit action is run before we transition to any *other* state
// from here. @anywhere can't have this
@exit EndOfWord
}
```
The generated code consists of a single class which takes a
`Function<Action, u8>` as a parameter in its constructor. This gets
called whenever an action is to be done. This is because some input
might not produce an action, but others might produce up to 3 (exit,
state transition, entry). The actions allow us to build a more
advanced parser over the simple state machine.
The sole public method, `void advance(u8)`, handles the input
byte-by-byte, managing the state changes and requesting the appropriate
Action from the handler.
Internally, the state transitions are resolved via a lookup table. This
is a bit wasteful for more complex state machines, therefore the
generator is designed to be easily extendable with a switch-based
resolver; only the private `lookup_state_transition` method needs to be
re-implemented.
My goal for this tool is to use it for implementing a standard-compliant
ANSI escape sequence parser for LibVT, as described on
<https://vt100.net/emu/dec_ansi_parser>
Make messages which should be fatal, actually fail the build.
- FATAL is not a valid mode keyword. The full list is available in the
docs: https://cmake.org/cmake/help/v3.19/command/message.html
- SEND_ERROR doesn't immediately stop processing, FATAL_ERROR does.
We should immediately stop if the Toolchain is not present.
- The app icon size validation was just a WARNING that is easy to
overlook. We should promote it to a FATAL_ERROR so that people will
not overlook the issue when adding a new application. We can only make
the small icon message FATAL_ERROR, as there is currently one
violation of the medium app icon validation.
With the goal of centralizing all tests in the system, this is a
first step to establish a Tests sub-tree. It will contain all of
the unit tests and test harnesses for the various components in the
system.
When building libraries on macOS they'd be missing the SONAME
attribute which causes the linker to embed relative paths into
other libraries and executables:
Dynamic section at offset 0x52794 contains 28 entries:
Type Name/Value
(NEEDED) Shared library: [libgcc_s.so]
(NEEDED) Shared library: [Userland/Libraries/LibCrypt/libcrypt.so]
(NEEDED) Shared library: [Userland/Libraries/LibCrypto/libcrypto.so]
(NEEDED) Shared library: [Userland/Libraries/LibC/libc.so]
(NEEDED) Shared library: [libsystem.so]
(NEEDED) Shared library: [libm.so]
(NEEDED) Shared library: [libc.so]
The dynamic linker then fails to load those libraries which makes
the system unbootable.
Make this stuff a bit easier to maintain by using the
root level variables to build up the Toolchain paths.
Also leave a note for future editors of BuildIt.sh to
give them warning about the other changes they'll need
to make.
While this has a rather significant impact for me, it appears to have
very minimal build time improvements (or in some cases, regressions).
Also appears to cause some issues when building on macOS.
So disable it by default, but leave the option so people that get
something out of it (seems to mostly be a case of "is reading the
headers fast enough") can turn it on for their builds.
Until we get the goodness that C++ modules are supposed to be, let's try
to shave off some parse time using precompiled headers.
This commit only adds some very common AK headers, only to binaries,
libraries and the kernel (tests are not covered due to incompatibility
with AK/TestSuite.h).
This option is on by default, but can be disabled by passing
`-DPRECOMPILE_COMMON_HEADERS=OFF` to cmake, which will disable all
header precompilations.
This makes the build about 30 seconds faster on my machine (about 7%).
Problem:
- Newer versions of clang (ToT) have a similar `-Wliteral-suffix`
warning as GCC. A previous commit enabled it for all compilers. This
needs to be silenced for the entire build, but it currently only is
silenced for some directories.
Solution:
- Move the `-Wno-literal-suffix` option up in the CMakeLists.txt so
that it gets applied everywhere.
Problem:
- There are redundant options being set for some directories.
- Clang ToT fails to compile the project.
Solution:
- Remove redundancies.
- Fix clang error list.
This warning informs of float-to-double conversions. The best solution
seems to be to do math *either* in 32-bit *or* in 64-bit, and only to
cross over when absolutely necessary.
This flag warns on classes which have `virtual` functions but do not
have a `virtual` destructor.
This patch adds both the flag and missing destructors. The access level
of the destructors was determined by a two rules of thumb:
1. A destructor should have a similar or lower access level to that of a
constructor.
2. Having a `private` destructor implicitly deletes the default
constructor, which is probably undesirable for "interface" types
(classes with only virtual functions and no data).
In short, most of the added destructors are `protected`, unless the
compiler complained about access.
The following warnings do not occur anywhere in the codebase and so
enabling them is effectivly free:
* `-Wcast-align`
* `-Wduplicated-cond`
* `-Wformat=2`
* `-Wlogical-op`
* `-Wmisleading-indentation`
* `-Wunused`
These are taken as a strict subset of the list in #5487.
install-ports copys the necessary files from Ports/ to /usr/Ports. Also
refactor the compiler and destiation variables from .port_include.sh
into .hosted_defs.sh. .hosted_defs.sh does not exists when ports are
built in serenity