The DT_RELR relocation is a relatively new relocation encoding designed
to achieve space-efficient relative relocations in PIE programs.
The description of the format is available here:
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/Pi9aSwwABgAJ
It works by using a bitmap to store the offsets which need to be
relocated. Even entries are *address* entries: they contain an address
(relative to the base of the executable) which needs to be relocated.
Subsequent odd entries are *bitmap* entries: "1" bits encode offsets
(in word size increments) relative to the last address entry which need
to be relocated.
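A minimal decoder sketch may make the encoding clearer. This assumes
64-bit entries and a raw `base` pointer; it is not the exact LibELF
code:

#include <cstddef>
#include <cstdint>

// Decode a DT_RELR table: `relr` holds `relr_count` entries, and `base`
// is the object's load base. Relocating a word means adding `base` to it.
void apply_relr(uint8_t* base, uint64_t const* relr, size_t relr_count)
{
    uint64_t* next_word = nullptr;
    for (size_t i = 0; i < relr_count; ++i) {
        uint64_t entry = relr[i];
        if ((entry & 1) == 0) {
            // Address entry (low bit clear): relocate the word at this
            // offset and remember where the following bitmaps start.
            next_word = reinterpret_cast<uint64_t*>(base + entry);
            *next_word++ += reinterpret_cast<uint64_t>(base);
        } else {
            // Bitmap entry (low bit set): bit N covers the N-th word
            // after the last address entry; each entry spans 63 words.
            uint64_t* word = next_word;
            for (uint64_t bits = entry >> 1; bits != 0; bits >>= 1, ++word) {
                if (bits & 1)
                    *word += reinterpret_cast<uint64_t>(base);
            }
            next_word += 63;
        }
    }
}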
This is in contrast to the REL/RELA format, where each entry takes up
2/3 machine words. Certain kinds of relocations store useful data in
that space (like the name of the referenced symbol), so not everything
can be encoded in this format. But as position-independent executables
and shared libraries tend to have a lot of relative relocations, a
specialized encoding for them absolutely makes sense.
The authors of the format suggest an overall 5-20% reduction in the
file size of various programs. Because we make extensive use of dynamic
linking and don't strip debug info, relative relocations make up a
smaller portion of our binaries' size, so our measurements tend to skew
toward the lower end of that range.
The following measurements were made with the x86-64 Clang toolchain:
- The kernel contains 290989 relocations. Enabling RELR decreased its
size from 30 MiB to 23 MiB.
- LibUnicodeData contains 190262 relocations, almost all of them
relative. Its file size changed from 17 MiB to 13 MiB.
- /bin/WebContent contains 1300 relocations, 66% of which are relative
relocations. With RELR, its size changed from 832 KiB to 812 KiB.
This change was inspired by the following blog post:
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
The dynamic loader was mistakenly assuming that there are only two types
of program load headers: text (RX) and data (RW).
Now that we're linking with `-z separate-code`, we will also get some
read-only data (R) segments. These can be memory-mapped directly without
making a private per-process copy.
To solve this, the code now separates the headers into map/copy
categories instead of text/data: writable segments get copied, while
non-writable segments get memory-mapped. :^)
GNU ld sometimes generates zero-sized PT_LOAD headers when running with
the "-z separate-code" option. Let's not choke on such headers, we can
just ignore them and move along.
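A rough sketch of the resulting loop over the program headers, with
hypothetical helper names, combining both changes above:

#include <elf.h>
#include <vector>

// Hypothetical helpers standing in for the loader's real mapping logic.
static void map_segment(Elf64_Phdr const&);  // mmap() the file range directly
static void copy_segment(Elf64_Phdr const&); // anonymous mmap() + copy

static void load_segments(std::vector<Elf64_Phdr> const& program_headers)
{
    for (auto const& phdr : program_headers) {
        if (phdr.p_type != PT_LOAD)
            continue;
        // GNU ld with "-z separate-code" may emit zero-sized PT_LOAD
        // headers; skip them instead of choking.
        if (phdr.p_memsz == 0)
            continue;
        if (phdr.p_flags & PF_W)
            copy_segment(phdr); // writable: private per-process copy
        else
            map_segment(phdr);  // RX text / R rodata: map the file directly
    }
}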
These integer => pointer => integer conversions were technically prone
to UB: the values were used as offsets (for which zero is perfectly
valid), but we computed them with pointer arithmetic. This made Clang
insert pointer-overflow UBSAN checks, which trigger when the result
is zero.
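An illustrative fix, with made-up names; the point is to keep the
subtraction in the integer domain, where a zero result is well-defined:

#include <cstdint>

// Before: `address - base` as pointer arithmetic, which UBSAN flags
// when the result wraps to zero. After: convert first, then subtract.
uintptr_t offset_between(void const* base, void const* address)
{
    return reinterpret_cast<uintptr_t>(address) - reinterpret_cast<uintptr_t>(base);
}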
It's perfectly acceptable for the segment's vaddr to not be page-aligned
as long as the segment itself is page-aligned. We'll just map a few more
bytes at the start of the segment that will be unused by the library.
We didn't notice this problem because GCC either always uses 0 for the
.text segment's vaddr or at least aligns the vaddr to the page size.
LibELF would also fail to load really small libraries (i.e. smaller than
4096 bytes).
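A sketch of the alignment math described above (assuming 4 KiB pages;
the names are illustrative):

#include <cstddef>
#include <cstdint>

constexpr size_t page_size = 4096;

struct Mapping {
    uintptr_t start; // page-aligned mapping start
    size_t size;     // page-aligned mapping size
};

// For a segment whose vaddr isn't page-aligned, round the mapping start
// down to a page boundary; the first (vaddr - start) bytes of the
// mapping are simply unused by the library.
Mapping mapping_for_segment(uintptr_t vaddr, size_t size_in_memory)
{
    uintptr_t start = vaddr & ~(page_size - 1);
    size_t padding = vaddr - start;
    size_t size = (padding + size_in_memory + page_size - 1) & ~(page_size - 1);
    return { start, size };
}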
My previous patch (1f93ffcd) broke loading objects whose first PT_LOAD
entry had a non-zero vaddr.
On top of that, the calculations for the relro and dynamic sections were
also incorrect.
By constraining the two implementations, we let the compiler select the
best-fitting one. All this requires is duplicating the implementation
and simplifying it for the `void` case.
This constraining also informs both the caller and the compiler by
passing the callback parameter types as part of the constraint
(e.g. `IterationFunction<int>`).
Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
callback to do its job incompletely.
This also enables a possible idiom: inside such a lambda, a bare
`return;` behaves like `continue;` in a for-loop.
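A self-contained sketch of the idea using standard C++20 concepts
(AK's actual definitions may differ in detail):

#include <concepts>

enum class IterationDecision { Continue, Break };

// The callback either returns an IterationDecision...
template<typename Func, typename... Args>
concept IterationFunction = requires(Func f, Args... args) {
    { f(args...) } -> std::same_as<IterationDecision>;
};

// ...or returns void, and thus cannot stop the iteration early.
template<typename Func, typename... Args>
concept VoidFunction = requires(Func f, Args... args) {
    { f(args...) } -> std::same_as<void>;
};

template<IterationFunction<int> Callback>
void for_each_value(Callback callback)
{
    for (int i = 0; i < 3; ++i) {
        if (callback(i) == IterationDecision::Break)
            break;
    }
}

template<VoidFunction<int> Callback>
void for_each_value(Callback callback)
{
    // In the lambda passed here, a bare `return;` acts like `continue;`.
    for (int i = 0; i < 3; ++i)
        callback(i);
}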
With this fixed, dlopen() no longer crashes when given an invalid
ELF image and instead returns an error code that can be retrieved
with dlerror().
Fixes #6995.
This changes the TLS offset calculation logic to be based on the
symbol's size instead of the total size of the TLS.
Because of this change, we no longer need to pipe "m_tls_size" to so
many functions.
Also, after this patch, the TLS data of the main program lives at the
"end" of the TLS block (highest addresses).
This fixes a part of #6609.
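A loose sketch of the scheme (x86 TLS variant 2, where offsets are
negative relative to the thread pointer; the names are made up):

#include <cstddef>

// Offsets are handed out symbol by symbol, growing downward from the
// end of the TLS block, so only the symbol's own size is needed.
static size_t s_tls_bytes_allocated = 0;

ptrdiff_t allocate_tls_offset(size_t symbol_size)
{
    s_tls_bytes_allocated += symbol_size;
    return -static_cast<ptrdiff_t>(s_tls_bytes_allocated);
}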
Previously, TLS data was always zero-initialized.
To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.
The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.
We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
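On the userspace side, the flow looks roughly like this (assumed
shapes, not the exact LibC/Kernel prototypes):

#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t page_size = 4096;

// Stand-in for the real sys$allocate_tls wrapper.
extern void* allocate_tls(void const* initial_data, size_t size);

void* set_up_master_tls(std::vector<uint8_t> tls_image)
{
    // The size must be page-aligned; the TLS data lives in its own
    // pages anyway, so rounding up wastes nothing.
    size_t aligned_size = (tls_image.size() + page_size - 1) & ~(page_size - 1);
    tls_image.resize(aligned_size, 0); // tail stays zero-filled
    // The kernel copies this buffer into the process's master TLS region.
    return allocate_tls(tls_image.data(), aligned_size);
}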
This implements more of the dlfcn functionality. Most notably:
* It's now possible to dlopen() libraries which were already
loaded at program startup time. This does not cause those
libraries to be loaded twice.
* Errors are reported via dlerror() rather than by crashing
the program.
* Calls to the dl*() functions are thread-safe.
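For example (the library name is illustrative):

#include <dlfcn.h>
#include <cstdio>

int main()
{
    // Opening a library that was already loaded at startup returns a
    // handle to the existing copy instead of loading it a second time.
    if (void* handle = dlopen("libc.so", RTLD_NOW))
        dlclose(handle);
    // A failed dlopen() reports the error via dlerror() instead of
    // crashing the program.
    if (!dlopen("no-such-library.so", RTLD_NOW))
        fprintf(stderr, "dlopen: %s\n", dlerror());
    return 0;
}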
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
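After the conversion, a typical file header looks something like this
(illustrative; SerenityOS code is BSD-2-Clause licensed):

/*
 * Copyright (c) 2021, the SerenityOS developers.
 *
 * SPDX-License-Identifier: BSD-2-Clause
 */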
The calculation for TLS relocations was incorrect, which would result
in overlapping TLS variables when more than one shared object used
them.
This bug can be reproduced with a shared library and a program
like this:
$ cat tlstest.c
#include <string.h>
__thread char tls_val[1024];
void set_val() { memset(tls_val, 0, sizeof(tls_val)); }
$ gcc -g -shared -o usr/lib/libtlstest.so tlstest.c
$ cat test.c
void set_val();
int main() { set_val(); }
$ gcc -g -o tls test.c -ltlstest
Due to the way the TLS relocations were done, this program would
clobber libc's TLS variables (e.g. errno).
Merge the load_elf() and commit_elf() functions into a single
load_main_executable() function that takes care of both things.
Also split "stage 3" into two separate stages, keeping the lazy
relocations in stage 3, and adding a stage 4 for calling library
initialization functions.
We also make sure to map the main executable before dealing with
any of its dependencies, to ensure that non-PIE executables get
loaded at their desired address.
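In outline, with hypothetical helper names (the real DynamicLinker
code differs in detail):

#include <string>

// Stand-ins for the loader's real per-object steps.
void map_object(std::string const& path);     // stage 1: mapping
void map_dependencies(std::string const& path);
void perform_relocations();                   // stage 2: non-lazy relocations
void setup_plt_trampolines();                 // stage 3: lazy (PLT) relocations
void call_initializers();                     // stage 4: library init functions

void load_main_executable(std::string const& path)
{
    // Map the main executable first, so a non-PIE binary claims its
    // fixed load address before any dependency can occupy it.
    map_object(path);
    map_dependencies(path);
    perform_relocations();
    setup_plt_trampolines();
    call_initializers();
}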
Instead of having a special case in the dynamic loader where we ignore
TM-related GCC symbols, just stub them out in LibC like we already do
for various other things we don't support.
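The stubs amount to no-op definitions of the transactional-memory
entry points that GCC emits references to (a sketch; the exact set of
symbols in LibC may differ):

#include <cstddef>

extern "C" {
// Referenced by GCC's startup code for registering "TM clone" tables.
void _ITM_registerTMCloneTable(void*, size_t) { }
void _ITM_deregisterTMCloneTable(void*) { }
}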
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
When performing a global symbol lookup, we were recomputing the symbol
hashes once for every dynamic object searched. The hash function was
at the very top of a profile (15%) of program startup.
With this change, the hash function is no longer visible among the top
stacks in the profile. :^)
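The fix amounts to hashing once per lookup and reusing the result for
every searched object. A sketch with the two standard hash functions
(the surrounding lookup structure is illustrative):

#include <cstdint>
#include <string_view>

// Both hashes, computed once per global symbol lookup.
struct SymbolHashes {
    uint32_t sysv { 0 };
    uint32_t gnu { 5381 };
};

SymbolHashes compute_hashes(std::string_view name)
{
    SymbolHashes hashes;
    for (unsigned char ch : name) {
        // Classic SysV ELF hash.
        hashes.sysv = (hashes.sysv << 4) + ch;
        uint32_t top = hashes.sysv & 0xf0000000u;
        hashes.sysv = (hashes.sysv ^ (top >> 24)) & ~top;
        // GNU hash (djb2-style).
        hashes.gnu = hashes.gnu * 33 + ch;
    }
    return hashes;
}

// Before: each object recomputed the hashes. After: hash once, then let
// each object pick whichever hash its hash table format needs:
//
//   auto hashes = compute_hashes(name);
//   for (auto& object : global_objects)
//       if (auto symbol = object.lookup(name, hashes))
//           return symbol;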