We currently only supported Variant II which is used by x86-64.
Variant I is used by both AArch64 (when using the traditional
non-TLSDESC model) and RISC-V, although with small differences.
The TLS layout for Variant I is essentially flipped. The static TLS
blocks are after the thread pointer for Variant I, while on Variant II
they are before it.
Some code using ELF TLS already worked on AArch64 and RISC-V even though
we only support Variant II. This is because only the local-exec model
directly uses TLS offsets, other models use relocations or
__tls_get_addr().
This is a prerequisite for upstreaming our LLVM patches, as our current
hack forcing `-ftls-model=initial-exec` in the Clang driver is not
acceptable upstream.
Currently, our kernel-managed TLS implementation limits us to only
having a single block of storage for all thread-local variables that's
initialized at load time. This PR merely implements the dynamic TLS
interface (`__tls_get_addr` and TLSDESC) on top of our static TLS
infrastructure. The current model's limitations still stand:
- a single static TLS block is reserved at load time, `dlopen()`-ing
shared libraries that define thread-local variables might cause us to
run out of space.
- the initial TLS image is not changeable post-load, so `dlopen()`-ing
libraries with non-zero-initialized TLS variables is not supported.
The way we repurpose `ti_module` to mean "offset within static TLS
block" instead of "module index" is not ABI-compliant.