Commit graph

112 commits

Author SHA1 Message Date
Gunnar Beutner
811f9d562d LibELF: Make sure the mmap() regions are large enough
Sometimes we'd end up requesting a smaller range for .text and .data
than was actually necessary.
2021-06-29 20:03:36 +02:00
Gunnar Beutner
0cb937416b Meta: Install 64-bit libgcc_s.so for x86_64 targets 2021-06-28 22:29:28 +02:00
Gunnar Beutner
158355e0d7 Kernel+LibELF: Add support for validating and loading ELF64 executables 2021-06-28 22:29:28 +02:00
Andrew Kaster
7b4dc590e7 AK+Userland: Use akaster@serenityos.org for my copyright headers 2021-05-30 14:35:34 +01:00
Nicholas Baron
aa4d41fe2c
AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the best
fitting one. All this will require is duplicating the implementation and
simplifying for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
Gunnar Beutner
0ab37dbd03 LibELF: Propagate ELF image validation errors to the caller
With this fixed dlopen() no longer crashes when given an invalid
ELF image and instead returns an error code that can be retrieved
with dlerror().

Fixes #6995.
2021-05-10 21:27:11 +02:00
Gunnar Beutner
a050b43290 LibELF: Implement x86_64 relocation support
There are definitely some relocations missing and this is untested
for now.
2021-05-03 08:42:39 +02:00
Itamar
101ac45c1a LibELF: Change TLS offset calculation
This changes the TLS offset calculation logic to be based on the
symbol's size instead of the total size of the TLS.

Because of this change, we no longer need to pipe "m_tls_size" to so
many functions.

Also, After this patch, the TLS data of the main program exists at the
"end" of the TLS block (Highest addresses).

This fixes a part of #6609.
2021-04-30 18:47:39 +02:00
Itamar
6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
Itamar
db76702d71 LibELF: Rename tls_size to tls_size_of_current_object 2021-04-30 18:47:39 +02:00
Itamar
1c24388d74 LibELF: Extract TLS offset calculation logic to separate function 2021-04-30 18:47:39 +02:00
Gunnar Beutner
f40ee1b03f LibC+LibELF: Implement more fully-features dlfcn functionality
This implements more of the dlfcn functionality. Most notably:

* It's now possible to dlopen() libraries which were already
  loaded at program startup time. This does not cause those
  libraries to be loaded twice.
* Errors are reported via dlerror() rather than by crashing
  the program.
* Calls to the dl*() functions are thread-safe.
2021-04-25 10:14:50 +02:00
Gunnar Beutner
f74b8a2d1f LibELF: Avoid calculating symbol hashes when we don't need them 2021-04-23 23:35:36 +02:00
Andreas Kling
b91c49364d AK: Rename adopt() to adopt_ref()
This makes it more symmetrical with adopt_own() (which is used to
create a NonnullOwnPtr from the result of a naked new.)
2021-04-23 16:46:57 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Gunnar Beutner
6c729993a8 LibELF: Allow shared objects which don't have a text segment
Shared objects without a text segment are perfectly OK. For
example libicudata.so has only data segments:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .hash         00000014  00000094  00000094  00000094  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .dynsym       00000020  000000a8  000000a8  000000a8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .dynstr       0000002a  000000c8  000000c8  000000c8  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .rodata       01b562d0  00000100  00000100  00000100  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .eh_frame     00000000  01b563d0  01b563d0  01b563d0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynamic      00000070  01b573d0  01b573d0  01b563d0  2**2
2021-04-19 20:39:22 +02:00
Gunnar Beutner
c32b58873a LibELF: Fix calculation for TLS relocations
The calculation for TLS relocations was incorrect which would
result in overlapping TLS variables when more than one shared
object used TLS variables.

This bug can be reproduced with a shared library and a program
like this:

    $ cat tlstest.c
    #include <string.h>
    __thread char tls_val[1024];
    void set_val() { memset(tls_val, 0, sizeof(tls_val)); }

    $ gcc -g -shared -o usr/lib/libtlstest.so tlstest.c

    $ cat test.c
    void set_val();
    int main() { set_val(); }
    $ gcc -g -o tls test.c -ltlstest

Due to the way the TLS relocations are done this program would
clobber libc's TLS variables (e.g. errno).
2021-04-19 12:14:43 +02:00
Gunnar Beutner
1dab5ca5fd LibELF: Fix support for relocating weak symbols
Having unresolved weak symbols is allowed and we should initialize
them to zero.
2021-04-19 12:00:40 +02:00
Gunnar Beutner
97d7450571 LibELF: Remove VERIFY() calls and let control flow return to the caller
This way we get better error messages for unresolved symbols because
the caller logs the file and symbol names.
2021-04-19 12:00:40 +02:00
Gunnar Beutner
6cb28ecee8 LibC+LibELF: Implement support for the dl_iterate_phdr helper
This helper is used by libgcc_s to figure out where the .eh_frame sections
are located for all loaded shared objects.
2021-04-18 10:55:25 +02:00
Gunnar Beutner
cd7512a2ad LibELF: Add support for loading objects with multiple data and text segments
This enables loading executables with multiple data and text segments. Also
it fixes loading executables where the text segment has a non-zero offset.

Example:

  $ echo "main () {}" > test.c
  $ gcc -Wl,-z,separate-code -o test test.c
  $ objdump -p test
  test:     file format elf32-i386

  Program Header:
      PHDR off    0x00000034 vaddr 0x00000034 paddr 0x00000034 align 2**2
           filesz 0x000000e0 memsz 0x000000e0 flags r--
    INTERP off    0x00000114 vaddr 0x00000114 paddr 0x00000114 align 2**0
           filesz 0x00000013 memsz 0x00000013 flags r--
      LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
           filesz 0x000003c4 memsz 0x000003c4 flags r--
      LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
           filesz 0x00000279 memsz 0x00000279 flags r-x
      LOAD off    0x00002000 vaddr 0x00002000 paddr 0x00002000 align 2**12
           filesz 0x00000004 memsz 0x00000004 flags r--
      LOAD off    0x00002004 vaddr 0x00003004 paddr 0x00003004 align 2**12
           filesz 0x00000100 memsz 0x00000124 flags rw-
   DYNAMIC off    0x00002014 vaddr 0x00003014 paddr 0x00003014 align 2**2
           filesz 0x000000c8 memsz 0x000000c8 flags rw-
2021-04-14 13:12:52 +02:00
Andreas Kling
ef1e5db1d0 Everywhere: Remove klog(), dbg() and purge all LogStream usage :^)
Good-bye LogStream. Long live AK::Format!
2021-03-12 17:29:37 +01:00
Andreas Kling
79889ef052 LibELF: Consolidate main executable loading a bit
Merge the load_elf() and commit_elf() functions into a single
load_main_executable() function that takes care of both things.

Also split "stage 3" into two separate stages, keeping the lazy
relocations in stage 3, and adding a stage 4 for calling library
initialization functions.

We also make sure to map the main executable before dealing with
any of its dependencies, to ensure that non-PIE executables get
loaded at their desired address.
2021-02-26 14:49:55 +01:00
Andreas Kling
7db8ccc0e4 LibC+DynamicLoader: Move "transactional memory" GCC stubs to LibC
Instead of having a special case in the dynamic loader where we ignore
TM-related GCC symbols, just stub them out in LibC like we already do
for various other things we don't support.
2021-02-24 14:54:26 +01:00
Brian Gianforcaro
069fd58381 LibELF: Convert more string literals to StringView literals.
Most of these won't have perf impact, but the optimization is
practically free, so no harm in fixing these up.
2021-02-24 14:45:34 +01:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling
d6af3302e8 LibELF: Don't recompute the same ELF hashes over and over
When performing a global symbol lookup, we were recomputing the symbol
hashes once for every dynamic object searched. The hash function was
at the very top of a profile (15%) of program startup.

With this change, the hash function is no longer visible among the top
stacks in the profile. :^)
2021-02-23 19:43:44 +01:00
Andreas Kling
af6a633468 LibELF: Remove an ungodly amount of DYNAMIC_LOAD_DEBUG logging
This logging mode was unusable anyway since it spams way too much.
The dynamic loader is in a pretty good place now anyway, so I think
it's okay for us to drop some of the bring-up debug logging. :^)

Also, we have to be careful with dbgln_if(FOO_DEBUG, "{}", foo())
where foo() is something expensive, since it might get evaluated
even if !FOO_DEBUG.
2021-02-23 19:43:44 +01:00
Andreas Kling
f23b29f605 LibELF: Move DynamicObject::lookup_symbol() to DynamicLoader
Also simplify it by removing an unreachable code path.
2021-02-21 00:29:52 +01:00
Andreas Kling
a43910acc3 LibELF: Make SymbolLookupResult::address a VirtualAddress
Let's use a stronger type than void* for this since we're talking
specifically about a virtual address and not necessarily a pointer
to something actually in memory (yet).
2021-02-21 00:02:21 +01:00
Andreas Kling
1997d5de1a LibELF: Make symbol lookup functions return Optional<Symbol>
It was very confusing how these functions used the "undefined" state
of Symbol to signal lookup failure. Let's use Optional<T> to make things
a bit more understandable.
2021-02-21 00:02:21 +01:00
Andreas Kling
46c3ff2acf LibELF: Remove "always bind now" global flag
This looked like someone's forgotten debug mechanism.
2021-02-21 00:02:21 +01:00
Andreas Kling
ba1eea9898 LibELF+DynamicLoader: Rename DynamicObject::construct() => create() 2021-02-21 00:02:21 +01:00
Andreas Kling
01f1e480e5 LibELF: Fix various clang-tidy warnings
Remove a bunch of unused code, unnecessary const, and make some
non-object-specific member functions static.
2021-02-21 00:02:21 +01:00
Andreas Kling
0c0127dc3f LibELF: Use StringView instead of "const char*" in dynamic linker code
There's no reason to use C strings more than absolutely necessary.
2021-02-20 22:29:12 +01:00
Andreas Kling
08476e7fe7 DynamicLoader: Always make .data segment read+write
Let's just ignore the program header and always go with read+write.
Nothing else makes sense anyway.
2021-02-20 19:02:48 +01:00
Andreas Kling
fa4c249425 LibELF+Userland: Enable RELRO for all userland executables :^)
The dynamic loader will now mark RELRO segments read-only after
performing relocations. This is pretty cool!

Note that this only applies to main executables so far,.
RELRO support for shared libraries will require some reorganizing
of the dynamic loader.
2021-02-18 18:55:19 +01:00
Andreas Kling
0d3866e84c DynamicLoader: Some ELF data segments were allocated too small
For a data segment that starts at a non-zero offset into a 4KB page and
crosses a 4KB page boundary, we were failing to pad the VM allocation,
which would cause the memcpy() to fail.

Make sure we round the segment bases down, and segment ends up, and the
issue goes away.
2021-02-18 18:14:59 +01:00
Andreas Kling
40a5487bab LibELF: Unmap and close the main executable after dynamic load
We don't need to keep the whole main executable in memory after
completing the dynamic loading process. We can also close the fd.
2021-02-13 13:46:20 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
AnotherTest
53ce923e10 Everywhere: Fix obvious dbgln() bugs
This will allow compiletime dbgln() checks to pass
2021-02-08 18:08:55 +01:00
Andreas Kling
4df3a34bc2 LibELF: Only set up PLT trampoline for objects with a PLT 2021-02-05 12:10:45 +01:00
Andreas Kling
349cf6ad67 LibELF: Randomize the VM reservation (so we don't break ASLR) 2021-02-04 00:04:26 +01:00
Andreas Kling
3a3270eb68 LibELF: Make a dummy VM reservation before mapping dynamic objects
Using the text segment for the VM reservation ran into trouble when
there was a discrepancy between the p_filesz and p_memsz.

Simplify this mechanism and avoid trouble by making the reservation
as a MAP_PRIVATE | MAP_NORESERVE throwaway mapping instead.

Fixes #5225.
2021-02-03 23:42:18 +01:00
Andreas Kling
1d394b8a76 LibELF: Close dynamic objects after mapping and linking them
Oops, we were leaving the file descriptors open.
2021-02-01 20:10:03 +01:00
Andreas Kling
e313323317 LibELF: Split the DynamicLoader's loading mechanism into two steps
load_from_image() becomes map() and link(). This allows us to map
an object before mapping its dependencies.

This solves an issue where fixed-position executables (like GCC)
would clash with the ASLR placement of their own shared libraries.
2021-01-31 11:46:00 +01:00
Andreas Kling
36525c0572 LibELF: Assert on multiple calls to DynamicLoader::load_from_image()
It would be a mistake to recreate the cached DynamicObject.
2021-01-31 11:32:16 +01:00
Andreas Kling
2b862e4569 LibELF: Don't validate ELF twice in DynamicLoader
Validation was happening in two steps, some in the constructor, and then
some later on, in load_from_image().

This made no sense so just move all the validation to the constructor.
2021-01-31 11:29:23 +01:00
Andreas Kling
68576bcf1b LibELF: Call mmap() before constructing the DynamicLoader object
Refactor DynamicLoader construction with a try_create() helper so that
we can call mmap() before making a loader. This way the loader doesn't
need to have an "mmap failed" state.

This patch also takes care of determining the ELF file size in
try_create() instead of expecting callers to provide it.
2021-01-31 11:06:00 +01:00
Jorropo
8f8bbd1bcd
DynamicLoader: load_program_headers use variables to store regions (#5173)
Previously regions were stored in a vector and then a pointer to
regions in this vector were taken and stored. The problem is the vector
were still appended after pointers were taken, if enough regions were
present the vector would grow so large that it needed a resize, this
cause his memory to moved and now the previous pointers are now
pointing to old memory we just freed.

Fixes #5160
2021-01-30 09:21:54 +01:00