Commit graph

39 commits

Author SHA1 Message Date
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Tim Schumacher
d0d494a151 LibELF: Drop the separate file name member from DynamicLoader 2022-10-31 19:23:02 +00:00
Tim Schumacher
e2c55ee0a8 LibC: Move dlfcn_integration.h to the bits directory 2022-09-05 10:12:02 +01:00
Tim Schumacher
27bfb81702 Everywhere: Refer to dlfcn*.h by its non-prefixed name 2022-09-05 10:12:02 +01:00
Tim Schumacher
3f59cb5e70 LibELF: Copy the entire TLS segment instead of each symbol one-by-one
This automatically fixes an issue where we were accidentally copying
garbage data from beyond the TLS segment as uninitialized data isn't
actually stored inside the image.
2022-07-20 18:24:13 +02:00
Idan Horowitz
753844ec96 LibELF: Take TLS segment alignment into account in DynamicLoader
Previously we would just tightly pack the different libraries' TLS
segments together, but that is incorrect, as they might require some
kind of minimum alignment for their TLS base address.

We now plumb the required TLS segment alignment down to the TLS block
linear allocator and align the base address down to the appropriate
alignment.
2022-07-05 11:26:10 +02:00
Tim Schumacher
6732fec8b8 LibELF: Warn on self-dlopening libraries while initializing 2022-06-24 11:28:05 +01:00
Tim Schumacher
082a7baa3b LibELF: Check if initializers ran instead of trusting s_global_objects
The original heuristic of "a library being in `s_global_objects` means
that it was fully initialized already" doesn't hold up anymore since we
changed the loading order. This was causing us to skip parts of the
initialization of dependency libraries when running dlopen (since it was
the only user of that setting).

Instead, set a flag after we run stage 4 (which is the "run the global
initializers" stage) and check that flag when determining unfinished
dependencies. This entirely replaces the `skip_global_objects` logic.
2022-06-24 11:28:05 +01:00
Tim Schumacher
c1d8612eb5 LibELF: Store DynamicLoader ELF images using an OwnPtr
This is preparation work for the next commit, where we will replace the
stored ELF image mid-load.
2022-06-21 22:38:15 +01:00
Tim Schumacher
89da0f2da5 LibELF: Name library maps with the full file path 2022-05-07 20:02:00 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Brian Gianforcaro
7d667b9f69 LibELF: Remove unused m_program_interpreter member from DynamicLoader
While profiling I realized that this member is unused, so the
StringBuilder and String allocation are completely un-necessary.
2022-03-31 10:18:07 +02:00
Tim Schumacher
7bd0a3e9ba DynamicLoader: Make the cached DynamicObject publicly accessible 2022-03-08 23:21:35 +01:00
Daniel Bertalan
3974cac148 LibELF: Implement support for DT_RELR relative relocations
The DT_RELR relocation is a relatively new relocation encoding designed
to achieve space-efficient relative relocations in PIE programs.

The description of the format is available here:
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/Pi9aSwwABgAJ

It works by using a bitmap to store the offsets which need to be
relocated. Even entries are *address* entries: they contain an address
(relative to the base of the executable) which needs to be relocated.
Subsequent even entries are *bitmap* entries: "1" bits encode offsets
(in word size increments) relative to the last address entry which need
to be relocated.

This is in contrast to the REL/RELA format, where each entry takes up
2/3 machine words. Certain kinds of relocations store useful data in
that space (like the name of the referenced symbol), so not everything
can be encoded in this format. But as position-independent executables
and shared libraries tend to have a lot of relative relocations, a
specialized encoding for them absolutely makes sense.

The authors of the format suggest an overall 5-20% reduction in the file
size of various programs. Due to our extensive use of dynamic linking
and us not stripping debug info, relative relocations don't make up such
a large portion of the binary's size, so the measurements will tend to
skew to the lower side of the spectrum.

The following measurements were made with the x86-64 Clang toolchain:

- The kernel contains 290989 relocations. Enabling RELR decreased its
  size from 30 MiB to 23 MiB.
- LibUnicodeData contains 190262 relocations, almost all of them
  relative. Its file size changed from 17 MiB to 13 MiB.
- /bin/WebContent contains 1300 relocations, 66% of which are relative
  relocations. With RELR, its size changed from 832 KiB to 812 KiB.

This change was inspired by the following blog post:
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
2022-02-11 18:07:53 +01:00
Gunnar Beutner
371c852fc0 LibELF: Swap the arguments for negative_offset_from_tls_block_end
Now that m_tls_offset points to the start of the TLS block the argument
order makes more sense this way.
2021-07-04 01:07:28 +02:00
Gunnar Beutner
5f6ee4c539 LibELF: Save the negative TLS offset in m_tls_offset
This makes it unnecessary to track the symbol size which just isn't
available for unexported symbols (e.g. for 'static __thread').
2021-07-04 01:07:28 +02:00
Gunnar Beutner
158355e0d7 Kernel+LibELF: Add support for validating and loading ELF64 executables 2021-06-28 22:29:28 +02:00
Andrew Kaster
7b4dc590e7 AK+Userland: Use akaster@serenityos.org for my copyright headers 2021-05-30 14:35:34 +01:00
Itamar
101ac45c1a LibELF: Change TLS offset calculation
This changes the TLS offset calculation logic to be based on the
symbol's size instead of the total size of the TLS.

Because of this change, we no longer need to pipe "m_tls_size" to so
many functions.

Also, After this patch, the TLS data of the main program exists at the
"end" of the TLS block (Highest addresses).

This fixes a part of #6609.
2021-04-30 18:47:39 +02:00
Itamar
6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
Itamar
db76702d71 LibELF: Rename tls_size to tls_size_of_current_object 2021-04-30 18:47:39 +02:00
Itamar
1c24388d74 LibELF: Extract TLS offset calculation logic to separate function 2021-04-30 18:47:39 +02:00
Gunnar Beutner
f40ee1b03f LibC+LibELF: Implement more fully-features dlfcn functionality
This implements more of the dlfcn functionality. Most notably:

* It's now possible to dlopen() libraries which were already
  loaded at program startup time. This does not cause those
  libraries to be loaded twice.
* Errors are reported via dlerror() rather than by crashing
  the program.
* Calls to the dl*() functions are thread-safe.
2021-04-25 10:14:50 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Gunnar Beutner
1dab5ca5fd LibELF: Fix support for relocating weak symbols
Having unresolved weak symbols is allowed and we should initialize
them to zero.
2021-04-19 12:00:40 +02:00
Gunnar Beutner
6cb28ecee8 LibC+LibELF: Implement support for the dl_iterate_phdr helper
This helper is used by libgcc_s to figure out where the .eh_frame sections
are located for all loaded shared objects.
2021-04-18 10:55:25 +02:00
Gunnar Beutner
cd7512a2ad LibELF: Add support for loading objects with multiple data and text segments
This enables loading executables with multiple data and text segments. Also
it fixes loading executables where the text segment has a non-zero offset.

Example:

  $ echo "main () {}" > test.c
  $ gcc -Wl,-z,separate-code -o test test.c
  $ objdump -p test
  test:     file format elf32-i386

  Program Header:
      PHDR off    0x00000034 vaddr 0x00000034 paddr 0x00000034 align 2**2
           filesz 0x000000e0 memsz 0x000000e0 flags r--
    INTERP off    0x00000114 vaddr 0x00000114 paddr 0x00000114 align 2**0
           filesz 0x00000013 memsz 0x00000013 flags r--
      LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
           filesz 0x000003c4 memsz 0x000003c4 flags r--
      LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
           filesz 0x00000279 memsz 0x00000279 flags r-x
      LOAD off    0x00002000 vaddr 0x00002000 paddr 0x00002000 align 2**12
           filesz 0x00000004 memsz 0x00000004 flags r--
      LOAD off    0x00002004 vaddr 0x00003004 paddr 0x00003004 align 2**12
           filesz 0x00000100 memsz 0x00000124 flags rw-
   DYNAMIC off    0x00002014 vaddr 0x00003014 paddr 0x00003014 align 2**2
           filesz 0x000000c8 memsz 0x000000c8 flags rw-
2021-04-14 13:12:52 +02:00
Andreas Kling
79889ef052 LibELF: Consolidate main executable loading a bit
Merge the load_elf() and commit_elf() functions into a single
load_main_executable() function that takes care of both things.

Also split "stage 3" into two separate stages, keeping the lazy
relocations in stage 3, and adding a stage 4 for calling library
initialization functions.

We also make sure to map the main executable before dealing with
any of its dependencies, to ensure that non-PIE executables get
loaded at their desired address.
2021-02-26 14:49:55 +01:00
Andreas Kling
f23b29f605 LibELF: Move DynamicObject::lookup_symbol() to DynamicLoader
Also simplify it by removing an unreachable code path.
2021-02-21 00:29:52 +01:00
Andreas Kling
01f1e480e5 LibELF: Fix various clang-tidy warnings
Remove a bunch of unused code, unnecessary const, and make some
non-object-specific member functions static.
2021-02-21 00:02:21 +01:00
Andreas Kling
0c0127dc3f LibELF: Use StringView instead of "const char*" in dynamic linker code
There's no reason to use C strings more than absolutely necessary.
2021-02-20 22:29:12 +01:00
Andreas Kling
08476e7fe7 DynamicLoader: Always make .data segment read+write
Let's just ignore the program header and always go with read+write.
Nothing else makes sense anyway.
2021-02-20 19:02:48 +01:00
Andreas Kling
fa4c249425 LibELF+Userland: Enable RELRO for all userland executables :^)
The dynamic loader will now mark RELRO segments read-only after
performing relocations. This is pretty cool!

Note that this only applies to main executables so far,.
RELRO support for shared libraries will require some reorganizing
of the dynamic loader.
2021-02-18 18:55:19 +01:00
Andreas Kling
0d3866e84c DynamicLoader: Some ELF data segments were allocated too small
For a data segment that starts at a non-zero offset into a 4KB page and
crosses a 4KB page boundary, we were failing to pad the VM allocation,
which would cause the memcpy() to fail.

Make sure we round the segment bases down, and segment ends up, and the
issue goes away.
2021-02-18 18:14:59 +01:00
Andreas Kling
e313323317 LibELF: Split the DynamicLoader's loading mechanism into two steps
load_from_image() becomes map() and link(). This allows us to map
an object before mapping its dependencies.

This solves an issue where fixed-position executables (like GCC)
would clash with the ASLR placement of their own shared libraries.
2021-01-31 11:46:00 +01:00
Andreas Kling
68576bcf1b LibELF: Call mmap() before constructing the DynamicLoader object
Refactor DynamicLoader construction with a try_create() helper so that
we can call mmap() before making a loader. This way the loader doesn't
need to have an "mmap failed" state.

This patch also takes care of determining the ELF file size in
try_create() instead of expecting callers to provide it.
2021-01-31 11:06:00 +01:00
Andreas Kling
adcc1c1eff LibELF: Cache the DynamicObject in DynamicLoader
This avoids reparsing the same dynamic library file multiple times.
2021-01-25 18:57:06 +01:00
Andreas Kling
41d8734288 LibELF: Use Optional<SymbolLookupResult> as a return type
Instead of storing a "found" state inside the result object.
2021-01-25 18:57:06 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibELF/DynamicLoader.h (Browse further)