At a later point this will indicate whether some FunctionObject "has a
[[Construct]] internal method" (separate from the current FunctionObject
call() / construct()), to help with a more spec-compliant implementation
of [[Call]] and [[Construct]].
This means that it is no longer relevant to just NativeFunction.
The old name is the result of the perhaps somewhat confusingly named
abstract operation OrdinaryFunctionCreate(), which creates an "ordinary
object" (https://tc39.es/ecma262/#ordinary-object) in contrast to an
"exotic object" (https://tc39.es/ecma262/#exotic-object).
However, the term "Ordinary Function" is not used anywhere in the spec,
instead the created object is referred to as an "ECMAScript Function
Object" (https://tc39.es/ecma262/#sec-ecmascript-function-objects), so
let's call it that.
The "ordinary" vs. "exotic" distinction is important because there are
also "Built-in Function Objects", which can be either implemented as
ordinary ECMAScript function objects, or as exotic objects (our
NativeFunction).
More work needs to be done to move a lot of infrastructure to
ECMAScriptFunctionObject in order to make FunctionObject nothing more
than an interface for objects that implement [[Call]] and optionally
[[Construct]].
This is much more common across the whole codebase and even these two
files. The same is used to return an empty JS::Value in an exception
check, for example.
When appending two strings together to form a new string, if both of the
strings are already UTF-16, create the new string as UTF-16 as well.
This shaves about 0.5 seconds off the following test262 test:
RegExp/property-escapes/generated/General_Category_-_Decimal_Number.js
This commit does not go out of its way to reduce copying of the string
data yet, but is a minimum set of changes to compile LibJS after making
PrimitiveString hold a Utf16String.
This is a tiny difference and only changes anything for primitives in
strict mode. However this is tested in test262 and can be noticed by
overriding toString of primitive values.
This does now require one to wrap an object in a Value to call invoke
but all code using invoke has been migrated.
Although this is not spec-compliant, we don't have a way to represent
objects larger than `NumericLimits<size_t>::max()`. Since this abstract
operation is only used when dealing with object size, we don't lose any
functionality by taking that limit into account too.
This fixes a UBSAN error when compiling with Clang.
If we call these two functions on a negative value, undefined behavior
occurs due to casting a negative double to an unsigned integer. These
functions are defined to perform modular arithmetic, so negative values
can be fixed up by adding 2^8/2^16.
The reason why this step is not mentioned in ECMA-262 is that it defines
modular arithmetic so that `x mod m` had the same sign as `m`, while
LibM's `fmod(x, m)` copies `x`'s sign.
This issue was found by UBSAN with the Clang toolchain.
Problem:
- New `all_of` implementation takes the entire container so the user
does not need to pass explicit begin/end iterators. This is unused
except is in tests.
Solution:
- Make use of the new and more user-friendly version where possible.
LibJS parses JavaScript as UTF-8, so when creating a string, we must
transcode it to UTF-16 to handle encoded surrogate pairs.
For example, consider the following string:
"\ud83d\ude00"
The UTF-8 encoding of this surrogate pair is:
0xf0 0x9f 0x98 0x80
However, LibJS will currently store the two surrogates individually as
UTF-8 encoded bytes, rather than combining the pair:
0xed 0xa0 0xb8, 0xed 0xb8 0x80
These are not equivalent. So, as String.prototype becomes UTF-16 aware,
this encoding will no longer work for abstractions like strict equality.
Previously, in LibGFX's `Point` class, calculated distances were passed
to the integer `abs` function, even if the stored type was a float. This
caused the value to unexpectedly be truncated. Luckily, this API was not
used with floating point types, but that can change in the future, so
why not fix it now :^)
Since we are in C++, we can use function overloading to make things
easy, and to automatically use the right version.
This is even better than the LibC/LibM functions, as using a bit of
hackery, they are able to be constant-evaluated. They use compiler
intrinsics, so they do not depend on external code and the compiler can
emit the most optimized code by default.
Since we aren't using the C++ standard library's trick of importing
everything into the `AK` namespace, this `abs` function cannot be
exported to the global namespace, as the names would clash.
These were an ad-hoc way to implement special behaviour when reading or
writing to specific object properties. Because these were effectively
replaced by the abillity to override the internal methods of Object,
they are no longer needed.
It's way too easy to get this wrong: for the IsArray abstract operation,
Value::is_array() needs to be called. Since we have RTTI, the virtual
Object::is_array() method is not needed anymore - if we need to know
whether something is *actually* a JS::Array (we currently check in more
cases than we should, I think) and not a Proxy with an Array target, we
should do that in a way that doesn't look like an abstract operation.
This is a huge patch, I know. In hindsight this perhaps could've been
done slightly more incremental, but I started and then fixed everything
until it worked, and here we are. I tried splitting of some completely
unrelated changes into separate commits, however. Anyway.
This is a rewrite of most of Object, and by extension large parts of
Array, Proxy, Reflect, String, TypedArray, and some other things.
What we already had worked fine for about 90% of things, but getting the
last 10% right proved to be increasingly difficult with the current code
that sort of grew organically and is only very loosely based on the
spec - this became especially obvious when we started fixing a large
number of test262 failures.
Key changes include:
- 1:1 matching function names and parameters of all object-related
functions, to avoid ambiguity. Previously we had things like put(),
which the spec doesn't have - as a result it wasn't always clear which
need to be used.
- Better separation between object abstract operations and internal
methods - the former are always the same, the latter can be overridden
(and are therefore virtual). The internal methods (i.e. [[Foo]] in the
spec) are now prefixed with 'internal_' for clarity - again, it was
previously not always clear which AO a certain method represents,
get() could've been both Get and [[Get]] (I don't know which one it
was closer to right now).
Note that some of the old names have been kept until all code relying
on them is updated, but they are now simple wrappers around the
closest matching standard abstract operation.
- Simplifications of the storage layer: functions that write values to
storage are now prefixed with 'storage_' to make their purpose clear,
and as they are not part of the spec they should not contain any steps
specified by it. Much functionality is now covered by the layers above
it and was removed (e.g. handling of accessors, attribute checks).
- PropertyAttributes has been greatly simplified, and is being replaced
by PropertyDescriptor - a concept similar to the current
implementation, but more aligned with the actual spec. See the commit
message of the previous commit where it was introduced for details.
- As a bonus, and since I had to look at the spec a whole lot anyway, I
introduced more inline comments with the exact steps from the spec -
this makes it super easy to verify correctness.
- East-const all the things.
As a result of all of this, things are much more correct but a bit
slower now. Retaining speed wasn't a consideration at all, I have done
no profiling of the new code - there might be low hanging fruits, which
we can then harvest separately.
Special thanks to Idan for helping me with this by tracking down bugs,
updating everything outside of LibJS to work with these changes (LibWeb,
Spreadsheet, HackStudio), as well as providing countless patches to fix
regressions I introduced - there still are very few (we got it down to
5), but we also get many new passing test262 tests in return. :^)
Co-authored-by: Idan Horowitz <idan.horowitz@gmail.com>
This is not exactly compliant with the specification, but our current
bound function implementation isn't either, so its not currently
possible to implement it the way the specification requires.
The unsigned shift right implementation was already doing this, but
the spec requires a mod32 of rhs before the shift for the signed shift
right implementation as well. Caught by UBSAN and oss-fuzz.
If the value we get after fmod in Value::to_u32 is negative, UBSAN
complains that -N is out of bounds for u32. An extra static cast to i64
makes it stop complaining. An alternative implementation could add 2^32
if the fmod'd value is negative. Caught by UBSAN and oss-fuzz.
This was a standalone function previously (get_method()), but instead of
passing a Value to it, we can just make it a method.
Also add spec step comments and fix the receiver value by using GetV().
Like Get(), but with any value instead of an object - it's calling
ToObject() for us and passes the value to [[Get]]() as the receiver.
This will be used in GetMethod() (and a couple of other places, which
can be updated over time).
I also tried something new here: adding the three steps from the spec as
inline comments :^)
Requires a bunch of find-and-replace updates across LibJS, but
constructing a PropertyName from a nullptr Symbol* should not be
possible - let's enforce this at the compiler level instead of using
VERIFY() (and already dereference Symbol pointers at the call site).
Value.{cpp,h} has become a dumping ground, let's change that.
Things that are directly related to Values (e.g. bitwise/binary ops,
equality related functions) can remain, but everything else that's not a
Value or Object method and globally required (not just a static function
somewhere) is being moved.
Also convert to east-const while we're here.
I haven't touched IteratorOperations.{cpp,h}, it seems fine to still
have those separately.
As mentioned on Discord earlier, we'll add these to all new functions
going forward - this is the backfill. Reasons:
- It makes you look at the spec, implementing based on MDN or V8
behavior is a no-go
- It makes finding the various functions that are non-compliant easier,
in the future everything should either have such a comment or, if it's
not from the spec at all, a comment explaining why that is the case
- It makes it easier to check whether a certain abstract operation is
implemented in LibJS, not all of them use the same name as the spec.
E.g. RejectPromise() is Promise::reject()
- It makes it easier to reason about vm.arguments(), e.g. when the
function has a rest parameter
- It makes it easier to see whether a certain function is from a
proposal or Annex B
Also:
- Add arguments to all functions and abstract operations that already
had a comment
- Fix some outdated section numbers
- Replace some ecma-international.org URLs with tc39.es
The second argument (the default constructor) and the return value have
to be constructors (as a result functions), so we can require that
explicitly by using appropriate types.
We already have two separate implementations of this, so let's do it
properly. The optional value type check is done by a callback function
that returns Result<void, ErrorType> - value type accepted or message
for TypeError, that is.