This library is used by virtually all executables in the Clang
toolchain. By default, it is linked statically, which leads to huge
file sizes and us running out of artifact storage disk space on CI.
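A minimal sketch of how a shared libLLVM can be requested with the stock
upstream CMake options (the actual build invocation may differ):

    # Sketch only; assumes the standard upstream LLVM CMake options.
    # LLVM_LINK_LLVM_DYLIB links the tools against one shared libLLVM,
    # CLANG_LINK_CLANG_DYLIB does the same for the clang libraries.
    cmake -G Ninja ../llvm \
        -DLLVM_LINK_LLVM_DYLIB=ON \
        -DCLANG_LINK_CLANG_DYLIB=ON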
This contains all the bits and pieces necessary to build a Clang binary
that will correctly compile SerenityOS.
I had some trouble getting LLVM to build with a single command, so
for now, I decided to build each LLVM component in a separate command
invocation. In the future, we can also make the main llvm build step
architecture-independent, but that would come with extra work to make
library and include paths work.
The binutils build invocation and related boilerplate is duplicated
because we only use `objdump` from GNU binutils in the Clang toolchain,
so most features can be disabled.
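A rough sketch of such a reduced binutils build (flag and variable names
are illustrative, not the exact invocation):

    # Only configure and build the binutils subdirectory (for objdump),
    # skipping gas, ld and friends. $PREFIX/$TARGET/$MAKEJOBS are placeholders.
    "$BINUTILS_SRC"/configure --prefix="$PREFIX" --target="$TARGET" \
        --disable-nls --disable-werror
    make -j "$MAKEJOBS" all-binutils
    make install-binutils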
CMake specifies -arch arm64 for our toolchain. Unfortunately, that's an
option GCC only understands when built for macOS, which causes the build
to fail.
I haven't been able to get CMake to not specify that option, so this adds
a dummy option to GCC.
Previously we'd place the QEMU binaries into the architecture-specific
toolchain directory. This is a problem because the BuildIt.sh script
clears those directories which also removes the QEMU binaries users
may have built earlier. Also, the QEMU binaries are not specific to
the target architecture.
Docker is a nice way of doing build automation, or just
containerizing builds for increased safety and isolating unstable
packages. The old Dockerfile in the toolchain did not satisfy these
needs. The new Dockerfile is known to run successfully on Docker
version 20.10.7. It clones the SerenityOS repo and builds the
toolchain. In this way, it is intended to be a starting point for other
Docker images that can e.g. run builds. For example, one can simply run
this docker image as-is, exec a shell in it and run a build there.
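For example (the image name and build context are made up for illustration):

    # Build the image, then start an interactive shell in a container
    # based on it and run a build from there.
    docker build -t serenity-toolchain .
    docker run -it serenity-toolchain bash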
Rather than having the toolchain build fail half-way through we should
check whether the user has installed all the required tools and
libraries early on.
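A minimal sketch of such a check (the tool list and message are illustrative):

    # Fail fast if a required tool is missing, instead of half-way into the build.
    for tool in cmake curl make patch tar; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Please install $tool before building the toolchain." >&2
            exit 1
        fi
    done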
Previously the buildstep function would obscure error codes because
the return value of the function was the exit code of the sed command,
which caused us to continue execution even though one of the build
steps had failed.
With set -o pipefail the return value of the buildstep function is
the real command's exit code.
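Roughly, the function looks like this (a sketch; the real buildstep in
BuildIt.sh may differ slightly, and $MAKEJOBS is a placeholder):

    set -o pipefail

    buildstep() {
        local name="$1"
        shift
        # Without pipefail the pipeline's exit code is sed's (almost always 0),
        # hiding failures of "$@"; with pipefail the real command's status wins.
        "$@" 2>&1 | sed "s|^|[${name}] |"
    }

    buildstep "gcc/build" make -j "$MAKEJOBS"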
This ensures inter-machine compatibility by not emitting any
processor-specific instructions. This fixes the issue raised by the
GitHub Actions runners that don't support AVX-512.
-march=native specializes the binaries for the CPU features available on
the CPU the binary is being compiled on. This matches the needs of the
Toolchain, as it's always built and used on that machine only.
This should be safe for the GitHub Actions VMs as well, as they all run
on a standard VM SKU in "the cloud".
I saw small but notable improvements in end-to-end build times in my
local testing. Each compilation unit is on average around a second
faster on my Intel(R) Core(TM) i7-8705G CPU @ 3.10GHz.
This makes stdlib.h and stdio.h functions available in the std
namespace for C++.
libstdc++v3's link tests can fail if you don't have an up-to-date
build directory, for example:
1. Have libc with missing _Exit symbol because you haven't done
a build since that was added.
2. Run toolchain rebuild. libstdc++v3's configure script will
realize that it can do link tests in general but will fail
later on when it tries to link a program that tests for _Exit.
Even though this is a toolchain patch this does not necessarily
require rebuilding the toolchain right away. This is only required
once we start using any of these new members in the std namespace,
e.g. for ports.
This fixes the -nodefaultlibs flag for gcc which previously
linked against libgcc_s anyway. Even though this is a toolchain
patch we don't need to rebuild the toolchain right away.
BuildIt.sh had a bunch of SC2086 errors, where we were not quoting
variables in variable expansions. The logic being:
Quoting variables prevents word splitting and glob expansion,
and prevents the script from breaking when input contains spaces,
line feeds, glob characters and such.
Reference: https://github.com/koalaman/shellcheck/wiki/SC2086
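For example:

    BUILD_DIR="$HOME/My Projects/serenity/Build"   # a path with a space in it
    mkdir -p $BUILD_DIR     # SC2086: creates "$HOME/My" and "Projects/serenity/Build"
    mkdir -p "$BUILD_DIR"   # quoted: creates the intended directory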
As bcoles noticed in #6772, shellcheck actually found a real bug here,
where the user's build directory included spaces.
Close: #6772
BuildFuseExt2.sh was saying it should be run under /bin/sh but it is
using bash extensions like pushd/popd, ${BASH_SOURCE[0]}, etc. So just
run it under bash to avoid any potential issues.
Ordinarily this would force the compiler to not inline certain
symbols and call them via the PLT instead. To counteract this
I've also added -fno-semantic-interposition which disables
ELF symbol interposition. Our dynamic loader doesn't support
this anyway and we might even consider not implementing this
at all.
Even though this is a toolchain change this doesn't require
rebuilding the toolchain unless you're planning to build
for the x86_64 arch.
Previously the toolchain's binutils would not have been able to
build binaries on 32-bit host systems (not that this would be
much of an issue nowadays) because one of the #ifdefs was in
the wrong place.
I moved the #ifdef in the port's patch and this now updates
the toolchain's patch file to match the port's patch.
Changes since rc4:
0cef06d187: Update version for v6.0.0-rc5 release
5351fb7cb2: hw/block/nvme: fix invalid msix exclusive uninit
ffa090bc56: target/s390x: fix s390_probe_access to check PAGE_WRITE_ORG
bc38e31b4e: net: check the existence of peer before trying to pad
Make this stuff a bit easier to maintain by using the
root-level variables to build up the Toolchain paths.
Also leave a note for future editors of BuildIt.sh to
warn them about the other changes they'll need to make.
This enables building usermode programs with exception handling. It also
builds a libstdc++ without exception support for the kernel.
This is necessary because the libstdc++ that gets built is different
when exceptions are enabled. Using the same library binary would
require extensive stubs for exception-related functionality in the
kernel.
Instead GCC should be used to automatically link against crt0
and crt0_shared depending on the type of object file that is being
built.
Unfortunately this requires a rebuild of the toolchain as well
as everything that has been built with the old GCC.
GCC determines whether the system's <limits.h> header is usable
and installs a different version of its own <limits.h> header
depending on whether the system header file exists.
If the system header is missing, GCC's <limits.h> header does not
include the system header via #include_next.
For this to work we need to install LibC's headers before
attempting to build GCC.
Also, re-running BuildIt.sh "hides" this problem because at that
point the sysroot directory also already has a <limits.h> header
file from the previous build.
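A sketch of the ordering that matters here (paths and variable names are
illustrative, not the literal BuildIt.sh steps):

    # Install LibC's headers into the sysroot first...
    mkdir -p "$SYSROOT"/usr/include
    cp -r "$SERENITY_ROOT"/Userland/Libraries/LibC/*.h "$SYSROOT"/usr/include/
    # ...then configure GCC against it, so its configure step finds <limits.h>
    # and installs the #include_next-ing variant of its own header.
    "$GCC_SRC"/configure --target="$TARGET" --prefix="$PREFIX" --with-sysroot="$SYSROOT"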
Our TLS implementation relies on the TLS model being "initial-exec".
We previously enforced this by adding the '-ftls-model=initial-exec'
flag in the root CMakeLists file, but that did not affect ports, so
now we put that flag in the gcc spec files.
Closes #5366
realpath(1) is specific to coreutils and its behavior can be had
with readlink -f
Create the Toolchain Build directory if it doesn't exist before
calling readlink, since realpath(3) on at least OpenBSD will error
on a non-existent path
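Sketch of the replacement (directory names are illustrative):

    # Portable stand-in for coreutils' realpath(1); create the directory first
    # because realpath(3)/readlink on some systems errors out on missing paths.
    mkdir -p "$DIR/Build"
    BUILD_DIR="$(readlink -f "$DIR/Build")"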
The current version of our Python port (3.6.0) is over four years old by
now and has (or had, I haven't actually tried it in a while) some
limitations - time for an upgrade! The latest Python release is 3.9.1,
so I used that version. It's a from-scratch port; no patches are taken
from the previous port, to ensure the smallest possible amount of code
is patched. The BuildPython.sh script is useful so I kept it, with some
tweaks. I added a short document explaining each patch to ease judging
their underlying problem and necessity in the future.
Compared to the old Python port, this one does support both the time
module as well as threading (at least _thread) just fine. Importing
modules written in C (everything in /usr/local/lib/python3.9/lib-dynload)
currently asserts in Serenity's dynamic loader, which is unfortunate but
probably solvable. Possibly related to #4642. I didn't try building
Python statically, which might be one possibility to circumvent this
issue.
I also renamed the directory to just "python3", which is analogous to
the Python 3.x package most Linux distributions provide. That implicitly
means that we likely will not support multiple versions of the Python
port at any given time, but again, neither do many other systems by
default. Recent versions are usually backwards compatible anyway though,
so having the latest shouldn't be a problem.
On the other hand, bumping the version should now be as simple as
updating the variables in version.sh, given that no new patches are
required.
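For illustration, the variables would look roughly like this (names are
assumptions, not the literal contents of version.sh):

    # Ports/python3/version.sh (sketch)
    PYTHON_VERSION="3.9.1"
    PYTHON_ARCHIVE="Python-${PYTHON_VERSION}.tar.xz"
    PYTHON_ARCHIVE_URL="https://www.python.org/ftp/python/${PYTHON_VERSION}/${PYTHON_ARCHIVE}"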
These core modules currently do not build - I chose to ignore that for
now rather than adding more patches to make them work somehow, which
means they're fully unavailable. This should probably be fixed in
Serenity itself.
_ctypes, _decimal, _socket, mmap, resource, termios
These optional modules requiring 3rd-party dependencies currently do not
build (even with depends="ncurses openssl zlib"). Especially the absence
of a readline port makes the REPL a bit painful to use. :^)
_bz2, _curses, _curses_panel, _dbm, _gdbm, _hashlib, _lzma, _sqlite3,
_ssl, _tkinter, _uuid, nis, ossaudiodev, readline, spwd, zlib
I did some work on LibC and LibM beforehand to add at least stubs of
missing required functions, it still encounters an ASSERT_NOT_REACHED()
/ TODO() every now and then, notably frexp() (implementations of that
can be found online easily if you want to get that working right now).
But then again that's our fault and not this port's. :^)
We now configure gcc to always use the -fno-exceptions flag.
This does not affect our code since we do not use exceptions, and also
fixes the gcc port.
RTTI is still disabled for the Kernel, and for the Dynamic Loader. This
allows for much less awkward navigation of class hierarchies in LibCore,
LibGUI, LibWeb, and LibJS (eventually). Measured RootFS size increase
was < 1%, and the libgui.so binary size increase was ~3.3%. The small
binary size increase here seems worth it :^)
* Add SERENITY_ARCH option to CMake for selecting the target toolchain (see the sketch below)
* Port all build scripts but continue to use i686
* Update GitHub Actions cache to include BuildIt.sh
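For example, selecting the target toolchain at configure time could look
like this (the exact invocation is a sketch):

    # Pick the target toolchain at configure time (i686 stays the default).
    cmake -GNinja -DSERENITY_ARCH=x86_64 ..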
A good number of contributors use macOS. However, we have a bit of
a tendency to break the macOS build without realising it.
Luckily, GitHub Actions does actually supply macOS environments,
so let's use it.
We now configure the gcc spec files to use different crt files for
static & PIE binaries.
This relieves us from the need to explicitly specify the desired crt0
file in cmake scripts.
This is necessary because cache reusability will be determined by
GitHub Actions.
Note that we only cache if explicitly asked to do so,
which only happens on GitHub Actions.
When libstdc++ was added in 4977fd22b8, just calling
'make install' was the easiest way to install the headers. And the headers are all
that is needed for libstdc++ to determine the ABI. Since then, BuildIt.sh was
rewritten again and again, and somehow everyone just silently assumed that
libstdc++ also depends on libc.a and libm.a, because surely it does?
Turns out, it doesn't! This massively reduces the dependencies of libstdc++,
hopefully meaning that the Toolchain doesn't need to be rebuilt so often on Travis.
Furthermore, the old method of trying to determine the dependency tree with
bash/grep/etc. has finally broken anyway:
https://travis-ci.com/github/SerenityOS/serenity/builds/179805569#L567
In summary, this should eliminate most of the Toolchain rebuilds on Travis,
and therefore make Travis build blazingly fast! :^)
./configure generates about 3500 lines in a few seconds. No one will ever read
those lines and they make loading the Travis webpage slower. And if there is
ever a problem, it will be because the Travis base image changed (which happens
only rarely) in a way that interferes with compiling gcc (which is incredibly
unlikely), or we update gcc (which happens very rarely) and gcc doesn't like
the Travis image (which again is incredibly unlikely). In all of these cases,
finding the culprit will be self-evident.
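A sketch of the idea (the redirection target is illustrative):

    # Keep ./configure's ~3500 lines out of the Travis log; stash them in a file
    # that can be inspected manually if something ever breaks.
    ./configure --target="$TARGET" --prefix="$PREFIX" > configure.log 2>&1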
Empirically, every single push or PR has to download *and then upload*
about 3.6 GiB of "cache stuff", which takes up about 400 seconds:
https://travis-ci.com/github/SerenityOS/serenity/builds/177500795
On every single push/PR! No matter what!
Those 3.6 GB consist of:
- 3.2 GB Toolchain cache (around 260 MB per compressed item)
- 0.4 GB ccache (capped at 0.5 GB): https://travis-ci.com/github/BenWiederhake/serenity/builds/177528549
- (And 200 KB for some weird debian package? Dunno.)
Investigating the size, the Toolchain consists mostly of *DEBUG SYMBOLS IN
THE COMPILER BINARIES* which comically misses the point. If we ever run into
compiler crashes, any stacktrace would be lost anyway as soon as the Travis VM
shuts down. Furthermore, Travis will only ever compile Serenity itself, and
Serenity forbids C in its Contribution Guidelines. That's another 20 MB we
don't need to cache.
Stripping the binaries and deleting the C compiler reduces the uncompressed size
from 1200 MB down to 220 MB. The compressed size gets reduced from 260 MB to 70 MB.
That's a reduction of 73%.
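Roughly like this (paths are illustrative; the actual cache-preparation step
may differ):

    # Strip debug info from the toolchain binaries and drop the C frontend (cc1),
    # which a Serenity build never invokes.
    find "$PREFIX" -type f -perm -u+x -exec strip --strip-debug {} \; 2>/dev/null
    rm -f "$PREFIX"/libexec/gcc/*/*/cc1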
It'll take a while until the 'old' toolchains get deleted.
I guess it'll take less than a week.
From that point onward, the Travis cache will be 1.2 GB, consisting of:
- 0.7 GB Toolchain cache
- 0.5 GB ccache
- (And that weird 200 KB deb file)
If network speeds are linear, then this should reduce the "cache network
overhead time" from about 400 seconds to about 120 seconds.
tl;dr: Strip unnecessary debug info, delete unused files, and speed
everything up by two minutes. (Both Toolchain cache hits and Toolchain rebuilds!)