105 commits

Author SHA1 Message Date
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Tim Schumacher
35e5024b7d DynamicLinker: Replace $ORIGIN with the executable path 2022-03-08 23:21:35 +01:00
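A minimal sketch of the substitution, assuming the conventional ELF meaning of $ORIGIN (the directory containing the executable). The helper name `expand_origin` is hypothetical and not taken from the SerenityOS sources.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every "$ORIGIN" token in a search path with
// the directory that contains the executable.
std::string expand_origin(std::string path, const std::string& executable_path)
{
    // Directory part of the executable path; "." if it has no slash at all.
    std::size_t slash = executable_path.rfind('/');
    std::string origin_dir = (slash == std::string::npos) ? "." : executable_path.substr(0, slash);
    if (origin_dir.empty())
        origin_dir = "/"; // the executable sits directly in the root directory

    const std::string token = "$ORIGIN";
    std::size_t pos = 0;
    while ((pos = path.find(token, pos)) != std::string::npos) {
        path.replace(pos, token.size(), origin_dir);
        pos += origin_dir.size(); // continue after the inserted text
    }
    return path;
}
```

With an executable at /usr/bin/app, the entry `$ORIGIN/../lib` would expand to `/usr/bin/../lib`.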
Tim Schumacher
e7f861f34c DynamicLinker: Implement support for RPATH and RUNPATH 2022-03-08 23:21:35 +01:00
Sam Atkins
45cf40653a Everywhere: Convert ByteBuffer factory methods from Optional -> ErrorOr
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
2022-01-24 22:36:09 +01:00
Jesse Buhagiar
48c9350036 LibELF: Add LD_LIBRARY_PATH envvar support :^)
The dynamic linker now supports having custom library paths
as specified by the user.
2022-01-05 15:01:14 +02:00
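A rough sketch of how a colon-separated LD_LIBRARY_PATH value could be turned into candidate file paths. The function names, the decision to skip empty entries, and the choice to try user-specified directories before the defaults are assumptions made for illustration, not details confirmed by the commit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split "dir1:dir2:..." into directories, skipping empty entries (assumption).
std::vector<std::string> split_search_path(const std::string& value)
{
    std::vector<std::string> dirs;
    std::size_t start = 0;
    while (start <= value.size()) {
        std::size_t end = value.find(':', start);
        if (end == std::string::npos)
            end = value.size();
        if (end > start)
            dirs.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return dirs;
}

// Build the list of paths to try for one library name: user-specified
// directories first, then the default directories (ordering is an assumption).
std::vector<std::string> candidate_paths(const std::string& library_name,
    const std::string& ld_library_path,
    const std::vector<std::string>& default_dirs)
{
    std::vector<std::string> candidates;
    for (const auto& dir : split_search_path(ld_library_path))
        candidates.push_back(dir + "/" + library_name);
    for (const auto& dir : default_dirs)
        candidates.push_back(dir + "/" + library_name);
    return candidates;
}
```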
Sam Atkins
c67c1b583a LibELF: Cast unused smart-pointer return value to void 2021-12-05 15:31:03 +01:00
Andreas Kling
8b1108e485 Everywhere: Pass AK::StringView by value 2021-11-11 01:27:46 +01:00
Rodrigo Tobar
4b091a7cc2 LibELF: Fix dynamic linking of dlopen()-ed libs
Consider the situation where two shared libraries, libA and libB, both
depend (as in having a NEEDED dtag) on libC. libA is first
dlopen()-ed, which causes libC to be mapped and linked. When libB is
dlopen()-ed the DynamicLinker would re-map and re-link libC though,
causing any previous references to its old location to be invalid. And
if libA's PLT has been patched to point to libC's symbols, then any
further invocations to libA will cause the code to jump to a virtual
address that isn't mapped anymore, therefore causing a crash. This
situation was reported in #10014, although the setup was more convoluted
in the ticket.

This commit fixes the issue by distinguishing between a main program
loading being performed by Loader.so, and a dlopen() call. The main
difference between these two cases is that in the former the
s_globals_objects map is always empty, while in the latter it might
already contain dependencies for the library being dlopen()-ed. Hence,
when collecting dependencies to map and link, dlopen() should skip those
that are present in the global map to avoid the issue described above.

With this patch the original issue seen in #10014 is gone, with all
python3 modules (so far) loading correctly.

A unit test reproducing a simplified issue is also included in this
commit. The unit test includes the building of two dynamic libraries A
and B with both depending on libline.so (and B also depending on A); the
test then dlopen()s libA, invokes one of its functions, then does the same
with libB.
2021-10-06 12:33:21 +02:00
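The dependency collection described in this commit might be sketched as follows. The graph type, the function names, and the use of plain library names as keys are all hypothetical; the sketch only illustrates skipping dependencies that are already present in the global map.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: library name -> names listed in its NEEDED entries.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

static void collect(const std::string& name, const DependencyGraph& needed,
    const std::set<std::string>& already_loaded,
    std::set<std::string>& seen, std::vector<std::string>& to_load)
{
    // Skip anything that is already mapped and linked, or already queued.
    if (already_loaded.count(name) || !seen.insert(name).second)
        return;
    auto it = needed.find(name);
    if (it != needed.end())
        for (const auto& dependency : it->second)
            collect(dependency, needed, already_loaded, seen, to_load);
    to_load.push_back(name); // dependencies come before their dependents
}

// Libraries that still have to be mapped and linked for a dlopen() of `root`.
std::vector<std::string> libraries_to_load(const std::string& root,
    const DependencyGraph& needed, const std::set<std::string>& already_loaded)
{
    std::set<std::string> seen;
    std::vector<std::string> to_load;
    collect(root, needed, already_loaded, seen, to_load);
    return to_load;
}
```

In the unit-test scenario above, the first dlopen() of libA would queue libline.so and libA, and the later dlopen() of libB would queue only libB, leaving the existing libline.so mapping untouched.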
Ali Mohammad Pur
97e97bccab Everywhere: Make ByteBuffer::{create_*,copy}() OOM-safe 2021-09-06 01:53:26 +02:00
Peter Bindels
ca9c53c1a8 LibELF/DynamicLinker: Evaluate symbols in library insertion order (#8802)
When loading libraries, it is required that each library uses the same
instance of each symbol, and that they use the one from the executable
if any. This is barely noticeable if done incorrectly; except that it
completely breaks RTTI on Clang. This switches the hash map to be
ordered; tested to work for Clang by @Bertaland
2021-07-16 11:55:01 +02:00
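A small sketch of why lookup order matters: if objects are searched in the order they were loaded, with the executable first, every library resolves a symbol to the same definition. `LoadedObject` and `find_defining_object` are hypothetical names, and a plain vector stands in for the ordered hash map mentioned in the commit.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a loaded object and the symbols it defines.
struct LoadedObject {
    std::string name;
    std::map<std::string, std::uintptr_t> symbols;
};

// Search objects in load order and return the first one defining `symbol`,
// or nullptr if none does. The executable is expected to be first.
const LoadedObject* find_defining_object(
    const std::vector<LoadedObject>& objects_in_load_order, const std::string& symbol)
{
    for (const auto& object : objects_in_load_order) {
        if (object.symbols.count(symbol))
            return &object;
    }
    return nullptr;
}
```

With an unordered container the winner for a duplicated symbol could differ between lookups, which is the kind of inconsistency that breaks RTTI comparisons.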
Gunnar Beutner
06883ed8a3 Kernel+Userland: Make the stack alignment comply with the System V ABI
The System V ABI for both x86 and x86_64 requires that the stack pointer
is 16-byte aligned on entry. Previously we did not align the stack
pointer properly.

As far as "main" was concerned the stack alignment was correct even
without this patch due to how the C++ _start function and the kernel
interacted, i.e. the kernel misaligned the stack as far as the ABI
was concerned but that misalignment (read: it was properly aligned for
a regular function call - but misaligned in terms of what the ABI
dictates) was actually expected by our _start function.
2021-07-10 01:41:57 +02:00
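As an illustration of the 16-byte requirement, a stack pointer can be rounded down to the alignment boundary, since the stack grows toward lower addresses. The function names are hypothetical and the sketch assumes a power-of-two alignment; it is not the kernel's actual code.

```cpp
#include <cstdint>

// Round a stack pointer down to `alignment` bytes (must be a power of two).
constexpr std::uintptr_t align_stack_down(std::uintptr_t stack_pointer, std::uintptr_t alignment = 16)
{
    return stack_pointer & ~(alignment - 1);
}

// True if the stack pointer already satisfies the alignment.
constexpr bool is_stack_aligned(std::uintptr_t stack_pointer, std::uintptr_t alignment = 16)
{
    return (stack_pointer & (alignment - 1)) == 0;
}
```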
Daniel Bertalan
64b1740913 LibELF: Fix syscall regions for .text segments with a non-zero offset
Previously, we assumed that the `.text` segment was loaded at vaddr 0 in
all dynamic libraries, so we used the dynamic object's base address with
`msyscall`. This did not work with the LLVM toolchain, as it likes to
shuffle these segments around.

This now also correctly handles the case where, for some reason, there
are multiple text segments.
2021-07-07 22:26:53 +02:00
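One way to picture the fix: instead of assuming the text segment starts at vaddr 0, compute one region per executable segment from the object's base address plus that segment's own vaddr. The `Segment` and `Region` types and the function name are hypothetical, and the actual `msyscall` call is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a loadable segment.
struct Segment {
    std::uintptr_t vaddr;
    std::size_t memory_size;
    bool executable;
};

struct Region {
    std::uintptr_t start;
    std::size_t size;
};

// One region per executable segment, placed at base + vaddr rather than at
// the base address itself. Handles any number of text segments.
std::vector<Region> syscall_regions(std::uintptr_t base_address, const std::vector<Segment>& segments)
{
    std::vector<Region> regions;
    for (const auto& segment : segments) {
        if (segment.executable)
            regions.push_back({ base_address + segment.vaddr, segment.memory_size });
    }
    return regions;
}
```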
Gunnar Beutner
5f6ee4c539 LibELF: Save the negative TLS offset in m_tls_offset
This makes it unnecessary to track the symbol size which just isn't
available for unexported symbols (e.g. for 'static __thread').
2021-07-04 01:07:28 +02:00
Brian Gianforcaro
179d8f6815 LibELF: Use StringView literal to avoid string allocations 2021-07-02 10:51:20 +04:30
Max Wipfli
fc6d051dfd AK+Everywhere: Add and use static APIs for LexicalPath
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will receive a StringView const& pointing
to a possibly freed buffer.

To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above have
been changed to use the static APIs.
2021-06-30 11:13:54 +02:00
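The lifetime argument can be illustrated with a free function that returns an owned string by value, so nothing refers to a temporary after the call returns. This is a simplified stand-in using std::string, not the AK::LexicalPath API; `basename_of` is a hypothetical name and trailing slashes are not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical static-style helper: returns the last path component as an
// owned std::string, so the result stays valid on its own.
std::string basename_of(const std::string& path)
{
    std::size_t slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;
    return path.substr(slash + 1);
}
```

A getter that returned a view into a temporary path object would leave the caller holding a reference into storage that is destroyed at the end of the full expression; returning by value sidesteps that.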
Gunnar Beutner
89a38b72b7 LibC+LibELF: Implement dladdr()
This implements the dladdr() function which lets the caller look up
the symbol name, symbol address as well as library name and library
base address for an arbitrary address.
2021-06-06 22:16:11 +02:00
Nicholas Baron
aa4d41fe2c AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the best
fitting one. All this will require is duplicating the implementation and
simplifying for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
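The idea of two constrained implementations can be sketched with overloads selected by the callback's return type. This version uses C++14 SFINAE rather than the constraints used in the commit, and the names `for_each_value`, `ResultOf`, and the local `IterationDecision` enum are illustrative assumptions.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

enum class IterationDecision { Continue, Break };

// Return type of calling F with an int argument.
template<typename F>
using ResultOf = decltype(std::declval<F&>()(0));

// Chosen when the callback returns IterationDecision: iteration can stop early.
template<typename Callback,
    std::enable_if_t<std::is_same<ResultOf<Callback>, IterationDecision>::value, int> = 0>
IterationDecision for_each_value(const std::vector<int>& values, Callback callback)
{
    for (int value : values) {
        if (callback(value) == IterationDecision::Break)
            return IterationDecision::Break;
    }
    return IterationDecision::Continue;
}

// Chosen when the callback returns void: every element is always visited,
// and the callback never has to write `return IterationDecision::Continue;`.
template<typename Callback,
    std::enable_if_t<std::is_void<ResultOf<Callback>>::value, int> = 0>
void for_each_value(const std::vector<int>& values, Callback callback)
{
    for (int value : values)
        callback(value);
}
```

Inside a void callback, a bare `return;` then behaves like `continue;` in an ordinary for-loop.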
Jean-Baptiste Boric
eecf7a2097 LibC: Move mman.h to sys/mman.h
POSIX mandates that it is placed there.
2021-05-14 22:24:02 +02:00
Itamar
8a01167c7d AK: Add missing GenericTraits<NonnullRefPtr>
This enables us to use keys of type NonnullRefPtr in HashMaps and
HashTables.

This commit also includes fixes in various places that used
HashMap<T, NonnullRefPtr<U>>::get() and expected to get an
Optional<NonnullRefPtr<U>> and now get an Optional<U*>.
2021-05-08 18:10:56 +02:00
Itamar
7bd796b7e3 LibELF: Perform verification of TLS data in dlopen
When loading a library at runtime with dlopen(), we now check that:
1. The library's TLS size does not overflow the size of the allocated
TLS block.
2. The library's TLS data is all zeroed.

We check for both of these cases because we currently do not support
them correctly. When we do add support for them, we can remove these
checks.
2021-04-30 18:47:39 +02:00
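The two checks listed above could look roughly like this. The function name and parameters are hypothetical, and the bookkeeping of how much of the TLS block is already in use is an assumption about how the size check might be expressed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if a library's TLS image passes both checks:
// 1. it fits in the space left in the allocated TLS block, and
// 2. its initial data consists only of zero bytes.
bool tls_is_acceptable_for_dlopen(const std::vector<std::uint8_t>& tls_image,
    std::size_t tls_bytes_in_use, std::size_t tls_block_size)
{
    if (tls_bytes_in_use > tls_block_size)
        return false;
    if (tls_image.size() > tls_block_size - tls_bytes_in_use)
        return false;
    return std::all_of(tls_image.begin(), tls_image.end(),
        [](std::uint8_t byte) { return byte == 0; });
}
```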
Itamar
101ac45c1a LibELF: Change TLS offset calculation
This changes the TLS offset calculation logic to be based on the
symbol's size instead of the total size of the TLS.

Because of this change, we no longer need to pipe "m_tls_size" to so
many functions.

Also, after this patch, the TLS data of the main program exists at the
"end" of the TLS block (highest addresses).

This fixes a part of #6609.
2021-04-30 18:47:39 +02:00
Itamar
6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
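A caller that has to pass a page-aligned size can round the raw TLS size up to the next page boundary. The sketch below assumes a 4 KiB page size and uses a hypothetical helper name; it does not show the sys$allocate_tls call itself.

```cpp
#include <cstddef>

// Assumed page size for this sketch.
constexpr std::size_t page_size = 4096;

// Round a size up to the next multiple of the page size.
constexpr std::size_t round_up_to_page(std::size_t size)
{
    return (size + page_size - 1) & ~(page_size - 1);
}
```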
Itamar
db76702d71 LibELF: Rename tls_size to tls_size_of_current_object 2021-04-30 18:47:39 +02:00
Itamar
2c9541315d LibELF: Fix TLS offset calculation for libraries
This fixes a regression that was introduced in f40ee1b and caused the
tls_offset of all objects other than the main program to be 0.

After this fix map_library's is_program argument is no longer used, so
it was removed.
2021-04-30 18:47:39 +02:00
Gunnar Beutner
f40ee1b03f LibC+LibELF: Implement more fully-featured dlfcn functionality
This implements more of the dlfcn functionality. Most notably:

* It's now possible to dlopen() libraries which were already
  loaded at program startup time. This does not cause those
  libraries to be loaded twice.
* Errors are reported via dlerror() rather than by crashing
  the program.
* Calls to the dl*() functions are thread-safe.
2021-04-25 10:14:50 +02:00
Gunnar Beutner
f74b8a2d1f LibELF: Avoid calculating symbol hashes when we don't need them 2021-04-23 23:35:36 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Gunnar Beutner
6cb28ecee8 LibC+LibELF: Implement support for the dl_iterate_phdr helper
This helper is used by libgcc_s to figure out where the .eh_frame sections
are located for all loaded shared objects.
2021-04-18 10:55:25 +02:00
Andreas Kling
94b247c5a9 LibELF: Make get_library_name() take String instead of StringView 2021-04-17 01:27:31 +02:00
Gunnar Beutner
960079b020 LibELF: Add support for loading libraries from /usr/local 2021-04-16 19:04:24 +02:00
Gunnar Beutner
f2ff8f2658 LibELF: Improve error messages for missing shared libraries 2021-04-14 13:13:06 +02:00
Gunnar Beutner
cd7512a2ad LibELF: Add support for loading objects with multiple data and text segments
This enables loading executables with multiple data and text segments. Also
it fixes loading executables where the text segment has a non-zero offset.

Example:

  $ echo "main () {}" > test.c
  $ gcc -Wl,-z,separate-code -o test test.c
  $ objdump -p test
  test:     file format elf32-i386

  Program Header:
      PHDR off    0x00000034 vaddr 0x00000034 paddr 0x00000034 align 2**2
           filesz 0x000000e0 memsz 0x000000e0 flags r--
    INTERP off    0x00000114 vaddr 0x00000114 paddr 0x00000114 align 2**0
           filesz 0x00000013 memsz 0x00000013 flags r--
      LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
           filesz 0x000003c4 memsz 0x000003c4 flags r--
      LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
           filesz 0x00000279 memsz 0x00000279 flags r-x
      LOAD off    0x00002000 vaddr 0x00002000 paddr 0x00002000 align 2**12
           filesz 0x00000004 memsz 0x00000004 flags r--
      LOAD off    0x00002004 vaddr 0x00003004 paddr 0x00003004 align 2**12
           filesz 0x00000100 memsz 0x00000124 flags rw-
   DYNAMIC off    0x00002014 vaddr 0x00003014 paddr 0x00003014 align 2**2
           filesz 0x000000c8 memsz 0x000000c8 flags rw-
2021-04-14 13:12:52 +02:00
Andreas Kling
ef1e5db1d0 Everywhere: Remove klog(), dbg() and purge all LogStream usage :^)
Good-bye LogStream. Long live AK::Format!
2021-03-12 17:29:37 +01:00
Andreas Kling
79889ef052 LibELF: Consolidate main executable loading a bit
Merge the load_elf() and commit_elf() functions into a single
load_main_executable() function that takes care of both things.

Also split "stage 3" into two separate stages, keeping the lazy
relocations in stage 3, and adding a stage 4 for calling library
initialization functions.

We also make sure to map the main executable before dealing with
any of its dependencies, to ensure that non-PIE executables get
loaded at their desired address.
2021-02-26 14:49:55 +01:00
Brian Gianforcaro
069fd58381 LibELF: Convert more string literals to StringView literals.
Most of these won't have perf impact, but the optimization is
practically free, so no harm in fixing these up.
2021-02-24 14:45:34 +01:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling
d6af3302e8 LibELF: Don't recompute the same ELF hashes over and over
When performing a global symbol lookup, we were recomputing the symbol
hashes once for every dynamic object searched. The hash function was
at the very top of a profile (15%) of program startup.

With this change, the hash function is no longer visible among the top
stacks in the profile. :^)
2021-02-23 19:43:44 +01:00
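The optimization can be sketched by hashing a symbol name once and carrying the results along while each dynamic object is searched. The sketch assumes the two standard ELF hash functions (System V and GNU); the `HashedSymbolName` type is a hypothetical name, not the one used in LibELF.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Standard System V ELF hash.
std::uint32_t sysv_hash(const std::string& name)
{
    std::uint32_t hash = 0;
    for (unsigned char c : name) {
        hash = (hash << 4) + c;
        std::uint32_t top = hash & 0xf0000000u;
        if (top)
            hash ^= top >> 24;
        hash &= ~top;
    }
    return hash;
}

// Standard GNU hash (the DJB "times 33" hash with seed 5381).
std::uint32_t gnu_hash(const std::string& name)
{
    std::uint32_t hash = 5381;
    for (unsigned char c : name)
        hash = hash * 33 + c;
    return hash;
}

// Compute both hashes once; pass this to every per-object lookup instead of
// re-hashing the name for each dynamic object searched.
struct HashedSymbolName {
    std::string name;
    std::uint32_t sysv;
    std::uint32_t gnu;

    explicit HashedSymbolName(std::string symbol_name)
        : name(std::move(symbol_name))
        , sysv(sysv_hash(name))
        , gnu(gnu_hash(name))
    {
    }
};
```

A global lookup would construct one `HashedSymbolName` and hand it to each object's hash-table probe, so the per-character work is done once regardless of how many objects are searched.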
Andreas Kling
a43910acc3 LibELF: Make SymbolLookupResult::address a VirtualAddress
Let's use a stronger type than void* for this since we're talking
specifically about a virtual address and not necessarily a pointer
to something actually in memory (yet).
2021-02-21 00:02:21 +01:00
Andreas Kling
01f1e480e5 LibELF: Fix various clang-tidy warnings
Remove a bunch of unused code, unnecessary const, and make some
non-object-specific member functions static.
2021-02-21 00:02:21 +01:00
Andreas Kling
0c0127dc3f LibELF: Use StringView instead of "const char*" in dynamic linker code
There's no reason to use C strings more than absolutely necessary.
2021-02-20 22:29:12 +01:00
Andreas Kling
713b3b36be DynamicLoader+Userland: Enable RELRO for shared libraries as well :^)
To support this, I had to reorganize the "load_elf" function into two
passes. First we map all the dynamic objects, to get their symbols
into the global lookup table. Then we link all the dynamic objects.

So many read-only GOT's! :^)
2021-02-19 00:03:03 +01:00
Andreas Kling
40a5487bab LibELF: Unmap and close the main executable after dynamic load
We don't need to keep the whole main executable in memory after
completing the dynamic loading process. We can also close the fd.
2021-02-13 13:46:20 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
Andreas Kling
e87eac9273 Userland: Add LibSystem and funnel all syscalls through it
This achieves two things:

- Programs can now intentionally perform arbitrary syscalls by calling
  syscall(). This allows us to work on things like syscall fuzzing.

- It restricts the ability of userspace to make syscalls to a single
  4KB page of code. In order to call the kernel directly, an attacker
  must now locate this page and call through it.
2021-02-05 12:23:39 +01:00
Andreas Kling
c9cd5ff6bb LibELF: Remove dynamic loader syscall exception for libkeyboard.so
LibKeyboard no longer needs to make syscalls so remove the exception
we were making for it. :^)
2021-02-03 23:15:53 +01:00
Andreas Kling
db1c6cf9cf LibC+LibELF: Run clang-format 2021-02-03 10:21:04 +01:00
Andreas Kling
603d36c599 LibELF: Make syscall region exceptions for UE and libkeyboard.so
These two are currently making some syscalls so we'll have to make
exceptions for them until we can clean them up.
2021-02-02 20:13:44 +01:00
Andreas Kling
df7ddfb803 LibELF: Mark libc.so and libpthread.so as syscall regions
Also, before calling the main program entry function, inform the kernel
that no more syscall regions can be registered.

This effectively bans syscalls from everywhere except LibC and
LibPthread. Pretty neat! :^)
2021-02-02 20:13:44 +01:00
Andreas Kling
e313323317 LibELF: Split the DynamicLoader's loading mechanism into two steps
load_from_image() becomes map() and link(). This allows us to map
an object before mapping its dependencies.

This solves an issue where fixed-position executables (like GCC)
would clash with the ASLR placement of their own shared libraries.
2021-01-31 11:46:00 +01:00
Andreas Kling
68576bcf1b LibELF: Call mmap() before constructing the DynamicLoader object
Refactor DynamicLoader construction with a try_create() helper so that
we can call mmap() before making a loader. This way the loader doesn't
need to have an "mmap failed" state.

This patch also takes care of determining the ELF file size in
try_create() instead of expecting callers to provide it.
2021-01-31 11:06:00 +01:00