mirror of
https://github.com/LadybirdBrowser/ladybird.git
synced 2024-11-22 07:30:19 +00:00
Documentation: Begin document on execution and navigation in LibWeb
This commit is contained in:
parent
965bd00cf3
commit
08cacea7d5
Notes:
sideshowbarker
2024-07-16 22:22:13 +09:00
Author: https://github.com/ADKaster Commit: https://github.com/SerenityOS/serenity/commit/08cacea7d5 Pull-request: https://github.com/SerenityOS/serenity/pull/22107 Reviewed-by: https://github.com/linusg
1 changed file with 156 additions and 0 deletions
Documentation/Browser/BrowsingContextsAndNavigables.md

# LibWeb: Browsing Contexts and Navigables

**NOTE: This document is a work in progress!**

## Introduction: How does code execute, really?

Before we can dive into how LibWeb and Ladybird implement the HTML web page navigation operations,
we need to cover some fundamental specification concepts. We start with how code actually
executes in a (possibly virtual) machine. Next we'll look at what that means for the ECMAScript
Specification (JavaScript), and finally how the ECMAScript code execution model ties into the
HTML specification to model how to display web content in a browser tab.

### Native Code Execution: A Primer

Systems programmers who work in a popular systems language like C, C++, or Rust usually model
the execution of a native program in terms of *threads* and *processes*. In a "hosted"
environment, the execution of one's userspace program generally
starts with an underlying operating system creating a process for the application to run in.
This process will contain a memory space for program data and code to live in, and an initial,
or main, thread to start execution on. In order for the operating system to change which
thread is executing on a particular CPU core, it needs to save and restore the *Execution Context*
for that thread. The Execution Context for a native thread generally consists of a set of
CPU registers, any floating point state, a program counter that tracks which instruction should
be loaded next, and a stack pointer that points to the local data the thread was using to track
its function call stack and local variables. The programmer can also request additional threads
through a system call, providing a set of thread attributes and a function to call as the entry
point.

For traditional compiled programs, the concept of accessing variables and functions is split into
two phases. At compile time, local variables and arguments are folded into stack slots and
allocated into registers. Exported variables and functions are written into the executable object
file (ELF, Mach-O, PE, etc.) and are visible to external tools as symbols, as referenced by a
symbol table contained within the object file format. Normally local variable and argument
names and locations are lost in the compile and link steps, but the compiler can be configured to
emit extra debug information to allow debuggers to access and modify them at runtime. At run time,
supporting something like the dynamic imports of interpreted languages requires the programmer to
call a platform-specific function to load the new module (e.g. ``dlopen`` or ``LoadLibrary``).
Even after the module is opened, in order to actually refer to any exported symbols from that module the
programmer has to retrieve the address of each symbol through another platform-specific function
(e.g. ``dlsym`` or ``GetProcAddress``), once per symbol.

### ECMAScript Execution Model: Realms and Agents

The ECMAScript specification has analogs for almost all of these concepts in the section on
[Executable Code and Execution Contexts](https://tc39.es/ecma262/#sec-executable-code-and-execution-contexts).

Working in the other direction from the native code explanation, ECMAScript describes the accessibility
and scopes of functions, variables, and arguments in terms of [*Environment Records*](https://tc39.es/ecma262/#sec-environment-records).
Note that these Environment Records are not actually visible to executing code, and are simply a mechanism
used by the specification authors to model the language. Every function and module has its own kind
of Environment Record that contains the variables, functions, catch clause bindings, and other
language constructs that affect which names are visible at any location in the code. These Environment Records
are nested in a tree-like structure that somewhat matches the Abstract Syntax Tree (AST).

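The nesting described above can be observed indirectly through ordinary scoping rules. The following is a minimal sketch rather than anything taken from LibJS; the function and variable names are made up for illustration, and the comments name the kind of Environment Record the specification would use to model each scope.

```javascript
// Top-level binding. In a classic script this lives in the declarative part
// of the Global Environment Record; in a module or a CommonJS file it lives
// in the module or function scope instead.
const greeting = "hello";

function makeCounter(start) {
  // Function Environment Record: holds `start` and `count`.
  let count = start;

  return function increment() {
    // Nested Function Environment Record whose outer record is makeCounter's.
    {
      // Declarative Environment Record created for this block.
      const step = 1;
      count += step; // `count` is resolved one level out.
    }
    return `${greeting} ${count}`; // `greeting` is resolved at the root.
  };
}

function describeFailure() {
  try {
    throw new Error("boom");
  } catch (error) {
    // The catch clause binding `error` gets its own Environment Record.
    return error.message;
  }
}

const counter = makeCounter(10);
const first = counter();
const second = counter();
```

Each name lookup walks outward from the innermost record until a binding is found, which is why `increment` keeps seeing the same `count` across calls.
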
The root of the tree of Environment Records is the Global Environment Record, which corresponds to the
Global Object and its properties. In JavaScript, there is always a ``this`` value representing the current
object context. At global scope, the Global Object normally takes that responsibility. In a REPL, that might
be some REPL-specific global object that has global functions to call for doing things like loading
from the filesystem, or it might be as complex as the global objects of Node or Bun. In a browser context,
the Global Object is normally the Window, unless there's a Worker involved. For historical reasons the
global ``this`` binding for Window contexts is actually a WindowProxy that wraps the Window. This concept
is quite different from a native executable, where there's no actual object representing the global scope,
simply symbols that the linker and loader make available to each module.

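As a small illustration of the Global Object acting as the root scope, the sketch below uses only standard ECMAScript features. The property name `demoSetting` is invented for the example. In a browser window `globalThis` evaluates to the WindowProxy; in Node it is Node's own global object.

```javascript
// A function created with the Function constructor is a sloppy-mode function
// in the global scope, so calling it without a receiver yields the global
// `this` value of the current realm.
const globalFromSloppyFunction = new Function("return this")();

// A property placed on the global object...
globalThis.demoSetting = 42;

// ...is visible as a bare identifier, because identifier lookup that reaches
// the Global Environment Record falls back to the global object's properties.
const seenAsIdentifier = demoSetting;
```
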
While the Global Object and its Global Environment represent the root of the tree of identifiers visible
to the executing JavaScript code, the Global Object isn't sufficient to model all the state around
a conceptual thread of execution in ECMAScript. This is where the two concepts of [*Realms*](https://tc39.es/ecma262/#sec-code-realms)
and [*Execution Contexts*](https://tc39.es/ecma262/#sec-execution-contexts) come into play.
A [*Realm Record*](https://tc39.es/ecma262/#realm-record) is a container that holds a global object,
its associated Global Environment, a set of intrinsic objects, and any extra state defined by the *host*
(also called an *embedder* in some specification documents) that needs to be associated with the realm.
In LibWeb, the Host Defined slot holds an object that has the HTML Environment Settings Object for each realm,
as well as all the prototypes, constructors, and namespaces that need to be exposed on the Global Object
for Web APIs. On top of the Realm abstraction, ECMAScript uses the Execution Context to model the state
of execution of one particular script or module. Each Execution Context belongs to an [*execution context stack*](https://tc39.es/ecma262/#execution-context-stack),
with the topmost context named the [*running execution context*](https://tc39.es/ecma262/#running-execution-context).
An Execution Context has information about the current function, the script or module that the current code block belongs to,
additional Environment Records required to access names in the current scope, any running generator state,
and most importantly to the thread analogy, the state needed to suspend and resume execution of that script.
As with Environment Records, Realms and Execution Contexts are not directly accessible to running JavaScript code.

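Execution Contexts cannot be inspected from script, but the suspend and resume behavior they model is visible through generators. The following is a minimal sketch using only standard generator semantics; the generator name and the strings are arbitrary.

```javascript
function* conversation() {
  // Execution is suspended at each `yield`; local state such as `reply`
  // survives until the generator is resumed.
  const reply = yield "waiting for input";
  yield `got ${reply}`;
}

const suspended = conversation();

// Runs the body up to the first `yield`, then suspends.
const firstStep = suspended.next();

// Resumes where execution stopped; "ping" becomes the value of the first
// `yield` expression, and the body runs to the second `yield`.
const secondStep = suspended.next("ping");

// Resumes again and runs to the end of the function body.
const lastStep = suspended.next();
```

Between the calls to `next()`, the code that called the generator keeps running while the generator's saved state waits to be resumed.
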
The final missing piece for the JavaScript execution model is how these stacks of Execution Contexts
are actually scheduled to run by the ECMAScript implementation. In the most common case, this means directly
mapping the ECMAScript model to the earlier native concepts of threads and processes in a way that
allows for flexibility in the implementation strategies. The last thing that the specification authors want
to do is constrain implementations so much that innovation and experimentation become impossible.
The method for this mapping is the two related specification mechanisms [*Agents*](https://tc39.es/ecma262/#sec-agents)
and [*Agent Clusters*](https://tc39.es/ecma262/#sec-agent-clusters). The Execution Context stack mentioned
above actually belongs to an Agent, which holds said stack, a set of metadata about the memory model,
and a shared reference to an [*executing thread*](https://tc39.es/ecma262/#executing-thread).
According to ECMAScript, there should always be at least one Execution Context on the stack, to allow concepts
such as the running execution context to always refer to the topmost Execution Context of the [*surrounding agent*](https://tc39.es/ecma262/#surrounding-agent).
However, the HTML specification opts to remove the default execution context from the execution context stack
at creation, and instead manually pushes and pops execution contexts for script, module, and callback execution.
The relationship between Realms and Agents is not one-to-one but many-to-one: a single Agent can contain
several Realms. In the ECMAScript specification, this manifests
as a part of the [*ShadowRealm proposal*](https://tc39.es/proposal-shadowrealm/), while the Web platform
requires multiple Realms per Agent to specify the historical behavior of ``<iframe>`` and related elements.

An Agent holds a stack of Execution Contexts, with the topmost entry being the running execution context.
Each Execution Context holds a Realm and a specific script's context, including the current function and
any state required to pause and resume the execution for that context. The Realm holds the Global
Object for the Execution Context, and any ECMAScript or host-specific intrinsics required to create the
desired environment for code to run in. More loosely, an Agent is a specification artefact that somewhat
maps the execution of a JavaScript script or module to a native thread of execution. But the specification
does so in a way that allows a host/embedder to choose to switch out which Agent is currently executing
its running execution context on that native thread, and which Realm within that Agent owns the running execution
context.

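The relationships summarized above can be sketched as a toy data model. This is purely illustrative: `ToyRealm`, `ToyExecutionContext`, and `ToyAgent` are hypothetical names, not LibJS or LibWeb types, and real engines track far more state. The sketch starts with an empty stack and pushes and pops contexts manually, loosely mirroring how the HTML specification handles script and callback execution.

```javascript
// Hypothetical toy types; none of these exist in LibJS or LibWeb.
class ToyRealm {
  constructor(name) {
    this.name = name;
    this.globalObject = { realmName: name }; // stand-in for a Window
    this.hostDefined = null;                 // stand-in for host-defined state
  }
}

class ToyExecutionContext {
  constructor(realm, scriptOrModule) {
    this.realm = realm;
    this.scriptOrModule = scriptOrModule;
  }
}

class ToyAgent {
  constructor() {
    this.executionContextStack = [];
  }

  // The running execution context is the topmost stack entry, if any.
  get runningExecutionContext() {
    const stack = this.executionContextStack;
    return stack.length > 0 ? stack[stack.length - 1] : null;
  }

  get currentRealm() {
    const running = this.runningExecutionContext;
    return running ? running.realm : null;
  }

  // Push a context, run the callback, and always pop afterwards.
  run(context, callback) {
    this.executionContextStack.push(context);
    try {
      return callback();
    } finally {
      this.executionContextStack.pop();
    }
  }
}

// One Agent, two Realms: the many-to-one relationship in miniature.
const agent = new ToyAgent();
const pageRealm = new ToyRealm("page");
const frameRealm = new ToyRealm("iframe");
const observedRealms = [];

agent.run(new ToyExecutionContext(pageRealm, "page.js"), () => {
  observedRealms.push(agent.currentRealm.name);
  agent.run(new ToyExecutionContext(frameRealm, "frame.js"), () => {
    observedRealms.push(agent.currentRealm.name);
  });
  observedRealms.push(agent.currentRealm.name);
});
```

The current realm changes as contexts are pushed and popped, even though everything runs within a single Agent.
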
SharedArrayBuffers and Atomics add a special kind of wrinkle to the ECMAScript specification. Defining
how they work required the formalization of a memory model, similar to what C++11, C11, and Java 5 had
to do before them. The Agent Cluster is the formalism that ties the memory model back to the execution
model. As described in the specification, an Agent Cluster is a set of Agents that can communicate
via shared memory. The exact mechanism is unspecified, but the hard rule is that all Agents within
a particular Agent Cluster must observe the same order of reads and writes to SharedArrayBuffers,
including those performed through the ECMAScript Atomics object.

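For readers who have not used these APIs, here is a minimal single-threaded sketch of the standard `SharedArrayBuffer` and `Atomics` operations. It does not demonstrate cross-Agent communication, which would require handing the buffer to a Worker; it only shows two views over the same memory and the atomic operations the memory model orders. Note that browsers only expose `SharedArrayBuffer` on cross-origin isolated pages.

```javascript
// Room for two 32-bit integers in memory that could be shared across Agents.
const sharedMemory = new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT);

// Two views over the same underlying memory, standing in for the views that
// two different Agents in one Agent Cluster would each hold.
const viewA = new Int32Array(sharedMemory);
const viewB = new Int32Array(sharedMemory);

Atomics.store(viewA, 0, 41);

// Atomics.add returns the value that was stored before the addition.
const previousValue = Atomics.add(viewB, 0, 1);
const currentValue = Atomics.load(viewA, 0);

// compareExchange writes 7 only if the slot currently holds 0, and returns
// the value it found there.
const foundValue = Atomics.compareExchange(viewA, 1, 0, 7);
const exchangedValue = Atomics.load(viewB, 1);
```
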
The net result of all this memory model and atomic specification language is that, loosely, an Agent models
a "candidate execution" of some code module that can execute on a thread, and any suspended execution
contexts from things like generators or async functions that are part of that module and its dynamic imports.
An Agent Cluster models the interaction of agents that share the ability to communicate via shared memory.
The simplest reading of this is that the specification authors had in mind the type of memory sharing
that threads within the same process have in native code execution. So an Agent Cluster loosely models
a collection of Agents (read: threads) that execute independently of each other within the same
implementation-defined manner for sharing memory between different threads (read: process).

### HTML Execution Model: Global Scopes

The Document Object Model (DOM) specifications are written in such a way that implementers can
create language bindings for any language to access the page. However, experience has shown that the
most popular way to script web content in modern web browsers is through JavaScript bindings. As such,
the HTML specification is specifically tailored to meet the constraints of JavaScript execution in its
scripting APIs and related concepts. Great care is taken to ensure that JavaScript written by different
authors cannot interfere with each other, and that arbitrary scripts cannot exfiltrate information about
the page content to third-party destinations.

The HTML specification therefore has a section on [Agents and Agent Clusters](https://html.spec.whatwg.org/multipage/webappapis.html#agents-and-agent-clusters)
at the start of the section on how scripting behaves on the Web platform.

TODO: Finish this section

## HTML Navigation: Juggling Origins

### Global Scopes, Browsing Contexts, Browsing Context Groups, Navigables, and Traversable Navigables

TODO:

- Agents defined by the HTML Spec
- Global Objects (Global Scopes) defined by the HTML Spec
- Agents and Browsing Context Groups
- Navigables and their relationship to Browsing Contexts
- Walk through construction of a browser tab, its traversable navigable, and its navigation both same and
  cross-origin
- Walk through construction of a browser tab with a nested browsing context and what happens when the
  nested context within its navigable container navigates on its own