Commit graph

131 commits

Author SHA1 Message Date
Sönke Holz
9437b29b43 LibELF+LibC: Add support for Variant I of the TLS data structures
We currently only supported Variant II which is used by x86-64.
Variant I is used by both AArch64 (when using the traditional
non-TLSDESC model) and RISC-V, although with small differences.

The TLS layout for Variant I is essentially flipped. The static TLS
blocks are after the thread pointer for Variant I, while on Variant II
they are before it.

Some code using ELF TLS already worked on AArch64 and RISC-V even though
we only support Variant II. This is because only the local-exec model
directly uses TLS offsets, other models use relocations or
__tls_get_addr().
2024-04-19 16:46:47 -06:00
Sönke Holz
aa44fe860d LibELF: Change copy_initial_tls_data_into argument type to Bytes 2024-04-19 16:46:47 -06:00
Dan Klishch
5ed7cd6e32 Everywhere: Use east const in more places
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
2024-04-19 06:31:19 -04:00
Sönke Holz
0b0ea19d12 LibELF+readelf: Add support for RISC-V dynamic relocation types 2024-02-24 16:05:50 -07:00
Sönke Holz
f8628f94b8 LibELF: Refactor how arch-specific dynamic relocation types are handled
We currently expect that the relocation type numbers are unique across
all architectures. But RISC-V and x86_64 use the same numbers for
different relocation types (R_X86_64_COPY = R_RISCV_JUMP_SLOT = 5).

So create a generic reloc type enum which maps to the arch-specific
reloc types instead of checking for all arch reloc types individually
everywhere.
2024-02-24 16:05:50 -07:00
Sönke Holz
525555181e LibELF: Add riscv64 PLT trampoline
This code is based on the aarch64 implementation.
2024-02-24 15:41:23 -07:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Andrew Kaster
87cbc63334 LibELF: Remove loader reservation after most allocating operations
It's possible for a malloc inside load_program_headers() to steal the
reserved memory space we created for the program headers. Remove the
reservation later in the method.
2023-12-12 17:41:44 +01:00
Daniel Bertalan
45d81dceed Everywhere: Replace ElfW(type) macro usage with Elf_type
This works around a `clang-format-17` bug which caused certain usages to
be misformatted and fail to compile.

Fixes #8315
2023-12-01 10:02:39 +02:00
Sönke Holz
9d7e217566 LibELF: Handle TLSDESC relocations in .rela.plt for GNU ld
GNU ld for some reason might put R_*_TLSDESC relocations in .rela.plt.
2023-10-14 19:16:22 +02:00
Sönke Holz
0bff1f61b6 LibC+LibELF: Correctly call destructors on exit()
We currently don't call any DT_FINI_ARRAY functions, so change that.

The call to `_fini` in `exit` is unnecessary, as we now call the
function referenced by DT_FINI in `__call_fini_functions`.
2023-10-12 15:20:50 +02:00
Andrew Kaster
1cd3826ad6 Userland+Tests: Don't use MAP_FILE when mmap-ing
MAP_FILE is not in POSIX, and is simply in most LibCs as a "default"
mode. Our own LibC defines it as 0, meaning "no flags". It is also not
defined in some OS's, such as Haiku. Let's be more portable and not use
the unnecessary flag.
2023-09-01 19:50:35 +02:00
Daniel Bertalan
1adf06c9f0 LibELF: Cache consecutive lookups for the same symbol
This reduces the startup time of LibWeb by 10%, and eliminates 156'000
of the total 481'000 global symbol lookups during a self-test run.
2023-08-19 05:15:08 +02:00
Daniel Bertalan
ad9e674fa0 LibC+LibELF: Support loading shared libraries compiled with dynamic TLS
This is a prerequisite for upstreaming our LLVM patches, as our current
hack forcing `-ftls-model=initial-exec` in the Clang driver is not
acceptable upstream.

Currently, our kernel-managed TLS implementation limits us to only
having a single block of storage for all thread-local variables that's
initialized at load time. This PR merely implements the dynamic TLS
interface (`__tls_get_addr` and TLSDESC) on top of our static TLS
infrastructure. The current model's limitations still stand:
- a single static TLS block is reserved at load time, `dlopen()`-ing
  shared libraries that define thread-local variables might cause us to
  run out of space.
- the initial TLS image is not changeable post-load, so `dlopen()`-ing
  libraries with non-zero-initialized TLS variables is not supported.

The way we repurpose `ti_module` to mean "offset within static TLS
block" instead of "module index" is not ABI-compliant.
2023-08-18 16:20:13 +02:00
Daniel Bertalan
70fcbcf54b LibELF+readelf: Add missing constants for dynamic relocations
These should cover all relocation types we can possibly see in an x86_64
or AArch64 final linked ELF image.
2023-08-18 16:20:13 +02:00
Daniel Bertalan
e2b1f9447c LibELF: Only call IFUNC resolvers after populating the PLT
As IFUNC resolvers may call arbitrary functions though the PLT, they can
only be called after the PLT has been populated. This is true of the
`[[gnu::target_clones]]` attribute, which makes a call to
`__cpu_indicator_init`, which is defined in `libgcc_s.so`, through the
PLT.

`do_plt_relocation` and `do_direct_relocation` are given a parameter
that controls whether IFUNCs are immediately resolved. In the first
pass, relocations pointing to IFUNCs are put on a worklist, while all
other relocations are performed. Only after non-IFUNC relocations are
done and the PLT is set up do we deal with these.
2023-05-14 13:47:53 +02:00
Daniel Bertalan
cd45c2d295 LibELF: Split do_relocation into do_{direct,plt}_relocation
No functional changes intended. This is in preparation of a commit that
overhauls how IFUNCs are resolved.

This commit lets us move the implementation of PLT patching from
`DynamicObject` to `DynamicLoader` where all other relocation code
lives. For this, got[2] now stores the loader's address instead of the
object's.
2023-05-14 13:47:53 +02:00
Daniel Bertalan
c4e0f5e5ee LibC+LibELF: Handle the R_AARCH64_IRELATIVE relocation type
This is the AArch64 equivalent of `R_X86_64_IRELATIVE`, which specifies
a symbol whose address is determined by calling a local IFUNC resolver
function.
2023-05-14 13:47:53 +02:00
Idan Horowitz
f412e73bba DynamicLoader: Remove the unused load_regions vector 2023-04-09 11:10:37 +03:00
Timon Kruiper
ed3be5b7f5 LibELF+LibC: Add support for aarch64 relocations
This commit adds the used relocation types to elf.h, and handles the
types in DynamicLoader and DynamicObject. No new functionalitty has to
be added, as the same code can be reused between aarch64 and x86_64.
2023-02-15 22:53:19 +01:00
Ben Wiederhake
8a331d4fa0 Everywhere: Move AK/Debug.h include to using files or remove 2023-01-02 20:27:20 -05:00
Liav A
a4c87fac56 LibELF+LibSymbolication: Remove i686 support 2022-12-28 11:53:41 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Tim Schumacher
d0d494a151 LibELF: Drop the separate file name member from DynamicLoader 2022-10-31 19:23:02 +00:00
Tim Schumacher
177a5baf60 LibELF: Ensure that DynamicLoader only receives absolute paths
While at it, start renaming variables where we know that they store a
path, so that we will get less confused in the future.
2022-10-31 19:23:02 +00:00
Andrew Kaster
828441852f Everywhere: Replace uses of __serenity__ with AK_OS_SERENITY
Now that we have OS macros for essentially every supported OS, let's try
to use them everywhere.
2022-10-10 12:23:12 +02:00
Tim Schumacher
e2c55ee0a8 LibC: Move dlfcn_integration.h to the bits directory 2022-09-05 10:12:02 +01:00
Tim Schumacher
27bfb81702 Everywhere: Refer to dlfcn*.h by its non-prefixed name 2022-09-05 10:12:02 +01:00
Tim Schumacher
3f59cb5e70 LibELF: Copy the entire TLS segment instead of each symbol one-by-one
This automatically fixes an issue where we were accidentally copying
garbage data from beyond the TLS segment as uninitialized data isn't
actually stored inside the image.
2022-07-20 18:24:13 +02:00
Tim Schumacher
6799b271bf LibELF: Remove outdated TLS handling in generic program header code 2022-07-20 18:24:13 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Idan Horowitz
fbeef409c6 DynamicLoader: Stop performing relative relocations on non-pie objects
Co-authored-by: Daniel Bertalan <dani@danielbertalan.dev>
2022-07-10 14:24:34 +02:00
Idan Horowitz
753844ec96 LibELF: Take TLS segment alignment into account in DynamicLoader
Previously we would just tightly pack the different libraries' TLS
segments together, but that is incorrect, as they might require some
kind of minimum alignment for their TLS base address.

We now plumb the required TLS segment alignment down to the TLS block
linear allocator and align the base address down to the appropriate
alignment.
2022-07-05 11:26:10 +02:00
Tim Schumacher
e2036ca2ca LibELF: Store the full file path in DynamicObject
Otherwise, our `dirname` call on the parent object will always be empty
when trying to resolve dependencies.
2022-06-30 11:57:10 +02:00
Tim Schumacher
6732fec8b8 LibELF: Warn on self-dlopening libraries while initializing 2022-06-24 11:28:05 +01:00
Tim Schumacher
082a7baa3b LibELF: Check if initializers ran instead of trusting s_global_objects
The original heuristic of "a library being in `s_global_objects` means
that it was fully initialized already" doesn't hold up anymore since we
changed the loading order. This was causing us to skip parts of the
initialization of dependency libraries when running dlopen (since it was
the only user of that setting).

Instead, set a flag after we run stage 4 (which is the "run the global
initializers" stage) and check that flag when determining unfinished
dependencies. This entirely replaces the `skip_global_objects` logic.
2022-06-24 11:28:05 +01:00
Tim Schumacher
d2b87419ac LibELF: Only collect region sizes before reserving memory
This keeps us from needlessly allocating storage via `malloc` as part
of the `Vector`s that early, which we might conflict on while reserving
memory for the main executable.
2022-06-21 22:38:15 +01:00
Tim Schumacher
c0b31796a8 LibELF: Unmap the source file temporarily while reserving space
This further reduces the chance that we will conflict with data that is
already present at the target location.
2022-06-21 22:38:15 +01:00
Tim Schumacher
c1d8612eb5 LibELF: Store DynamicLoader ELF images using an OwnPtr
This is preparation work for the next commit, where we will replace the
stored ELF image mid-load.
2022-06-21 22:38:15 +01:00
Tim Schumacher
89da0f2da5 LibELF: Name library maps with the full file path 2022-05-07 20:02:00 +02:00
Daniel Bertalan
7aca408993 LibELF: Fail gracefully when IFUNC resolver's object has textrels
.text sections of objects that contain textrels have to be writable
during the relocation procedure. Because of this, we would segfault if
we tried to execute IFUNC resolvers defined in them. Let's print a
meaningful error message instead.

Additionally, a warning is now printed when we load objects with
textrels, as in the future, additional security mitigations might
interfere with them being loaded.
2022-05-01 12:42:01 +02:00
Daniel Bertalan
08c459e495 LibELF: Add support for IFUNCs
IFUNC is a GNU extension to the ELF standard that allows a function to
have multiple implementations. A resolver function has to be called at
load time to choose the right one to use. The PLT will contain the entry
to the resolved function, so branching and more indirect jumps can be
avoided at run-time.

This mechanism is usually used when a routine can be made faster using
CPU features that are available in only some models, and a fallback
implementation has to exist for others.

We will use this feature to have two separate memset implementations for
CPUs with and without ERMS (Enhanced REP MOVSB/STOSB) support.
2022-05-01 12:42:01 +02:00
Daniel Bertalan
ed5f110b40 LibELF: Perform .relr.dyn relocations before .rel.dyn
IFUNC resolvers depend on the resolved function's address having been
relocated by the time they are called. This means that relative
relocations have to be done first.

The linker is kind enough to put R_*_RELATIVE before R_*_IRELATIVE in
.rel.dyn, but .relr.dyn contains relative relocations too.
2022-05-01 12:42:01 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Brian Gianforcaro
7d667b9f69 LibELF: Remove unused m_program_interpreter member from DynamicLoader
While profiling I realized that this member is unused, so the
StringBuilder and String allocation are completely un-necessary.
2022-03-31 10:18:07 +02:00
Daniel Bertalan
3974cac148 LibELF: Implement support for DT_RELR relative relocations
The DT_RELR relocation is a relatively new relocation encoding designed
to achieve space-efficient relative relocations in PIE programs.

The description of the format is available here:
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/Pi9aSwwABgAJ

It works by using a bitmap to store the offsets which need to be
relocated. Even entries are *address* entries: they contain an address
(relative to the base of the executable) which needs to be relocated.
Subsequent even entries are *bitmap* entries: "1" bits encode offsets
(in word size increments) relative to the last address entry which need
to be relocated.

This is in contrast to the REL/RELA format, where each entry takes up
2/3 machine words. Certain kinds of relocations store useful data in
that space (like the name of the referenced symbol), so not everything
can be encoded in this format. But as position-independent executables
and shared libraries tend to have a lot of relative relocations, a
specialized encoding for them absolutely makes sense.

The authors of the format suggest an overall 5-20% reduction in the file
size of various programs. Due to our extensive use of dynamic linking
and us not stripping debug info, relative relocations don't make up such
a large portion of the binary's size, so the measurements will tend to
skew to the lower side of the spectrum.

The following measurements were made with the x86-64 Clang toolchain:

- The kernel contains 290989 relocations. Enabling RELR decreased its
  size from 30 MiB to 23 MiB.
- LibUnicodeData contains 190262 relocations, almost all of them
  relative. Its file size changed from 17 MiB to 13 MiB.
- /bin/WebContent contains 1300 relocations, 66% of which are relative
  relocations. With RELR, its size changed from 832 KiB to 812 KiB.

This change was inspired by the following blog post:
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
2022-02-11 18:07:53 +01:00
Andreas Kling
c482508aa1 LibELF: Use shared memory mapping when loading ELF objects
There's no reason to make a private read-only mapping just for reading
(and validating) the ELF headers, and copying out the data segments.
2022-01-15 19:51:15 +01:00
Idan Horowitz
cfb9f889ac LibELF: Accept Span instead of Pointer+Size in validate_program_headers 2022-01-13 22:40:25 +01:00
Idan Horowitz
3e959618c3 LibELF: Use StringBuilders instead of Strings for the interpreter path
This is required for the Kernel's usage of LibELF, since Strings do not
expose allocation failure.
2022-01-13 22:40:25 +01:00