Before, TEST_MAIN used to return the return value of TestSuite::main()
function (which returns the number of test cases that did not pass, so
it can be >=256) directly.
The run-tests utility determines the success / failure of a test suite
binary by examining its (or i.e. TEST_MAIN's) exit status.
But as exit status values are supposed to be between 0 and 255, values
>=256 will get wrapped around (modulo 256), converting a return value of
256 to 0.
So, in a rare case where exactly 256 test cases are failing in your test
suite, run-tests utility will display that the test suite passed without
any failures.
Now, TEST_MAIN just returns 0 if all of the test cases pass and returns
1 otherwise.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
NumericLimits<u32>::max + 1 overflowing to 0 caused us to call
AK::get_random_uniform(0) which doesn't make sense (the argument is an
_exclusive_ bound).
The original value 15 was too little: it made our
`weighted_boolean_fair_false` test fail every now and then.
This is because a fair coin (P(false) = 0.5) will hit the same value 15
times in a row with a probability (1/2)^15: around once in a 32k tries.
With the bumped up value, this is now once in 1 billion tries. Should
lower the test flakiness enough (if our random number generator is
truly uniform), while 30 tries is still an OK amount of computation for
randomized tests to do, compared to 15.
MAX_GENERATED_VALUES_PER_TEST is now the --randomized_runs flag:
$ ./Build/lagom/bin/TestGenerator --randomized_runs 1000
It's sometimes useful to try larger numbers for it instead of the
default of 100.
MAX_GEN_ATTEMPTS_PER_VALUE is now a constexpr. It's not usually needed
to tweak this value; we can recompile with a different value on the rare
occasion.
unsigned_int(0) doesn't need to draw bits from RandomnessSource.
An expression for getting INT_MAX for u32 didn't need to be
special-cased over the general formula.
This is a follow-up on a few comments
Tests defined like
RANDOMIZED_TEST_CASE(test_name)
{
GEN(dice, Gen::unsigned_int(1,6));
EXPECT(dice >= 1 && dice <= 6);
}
will be run many times (100x by default, can be overriden with
MAX_GENERATED_VALUES_PER_TEST), each time generating different random
values, and if any of the test runs fails, we'll shrink the generated
values and report the final minimal ones to the user.
ShrinkCommands are recipes for how a RandomRun should be shrunk. They
are not related to a specific RandomRun, although we'll take the length
of a specific RandomRun into account when generating the ShrinkCommands.
ShrinkCommands will later be interpreted by the shrink_with_command()
function.
Generators are callable as-is:
u32 my_int = Gen::unsigned_int(); // -> 1, 5, 8, 3, 2
But there is little visibility in the test fail message into what went
wrong. Showing what values were generated helps a lot, and that's what
this macro does:
GEN(my_int, Gen::unsigned_int());
expands into the above declaration and (crucially) a conditional
warnln() call looking like "my_int = {}". It will only run if error
reporting is enabled (see Test::can_report()), so it will only give the
final shrunk value instead of spamming the output with each generated
value.
REJECT and ASSUME are useful for filtering out unwanted generated
values. While this is not ideal, it is ocassionally useful and so we
include it for convenience.
The main loop of RANDOMIZED_TEST_CASE runs the test case 100 times, each
time trying to generate a different set of values. Inside that loop, if
it sees a REJECT (ASSUME is implemented in terms of REJECT), it retries
up to 15 times before giving up (perhaps it's impossible or just very
improbable to generate a value that will survive REJECT or ASSUME).
REJECT("Reason for rejecting") will just outright fail, while
ASSUME(bool) is more of an equivalent of a .filter() method from
functional languages.
This will be very useful as we add the randomized test cases and their
two loops ("generate+test many times" and "shrink once failure is
found"), because without this failure reporting we'd get many FAIL error
messages while still searching for the minimal one.
So, inside randomized test cases we want to only turn the error
reporting on for one last time after all the generating and shrinking.
These functions all plug into RandomnessSource and produce random values
of various types. They are to be used either inside other generator
definitions or inside the GEN(...) macro when used in tests.
This will be a foundational part of bootstrapping generators: this is
the way they'll get prerecorded values from / record random values into
RandomRuns. (Generators don't get in contact with RandomRuns
themselves, they just interact with the RandomnessSource.)
To be used later by generators and during shrinking.
The generators used in randomized tests will record the drawn random
bits into a RandomRun. This is a layer of indirection that will help us
automatically shrink any generated value without any user input. (Having
to provide manually defined shrinkers is a downside to many randomized /
property-based testing libraries like QuickCheck for Haskell.)
This will be used in the randomized tests a lot more than it is in the
unit tests / benchmarks; randomized tests will run the test function
multiple times, check the result and optionally start shrinking the
failing input. Generators will also be able to fail, resulting in some
of the new TestResult variants.
Previously VERIFY et al. was redefined inside tests to not abort and
instead fail the test. This wouldn't apply to non-header code though,
and was not helpful, as it prevented you from easily attaching gdb near
the abort.
After this removal tests can still use the EXPECT family of macros, but
VERIFY will behave like it does in the rest of the codebase (abort
etc.).
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>
Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
These syscalls are not necessary on their own, and they give the false
impression that a caller could set or get the thread name of any process
in the system, which is not true.
Therefore, move the functionality of these syscalls to be options in the
prctl syscall, which makes it abundantly clear that these operations
could only occur from a running thread in a process that sees other
threads in that process only.
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
The AST interpreter is still available behind a new `--ast` flag.
We switch to testing with bytecode in the big run-tests battery on
SerenityOS. Lagom CI continues running both AST and BC.
This is meant to be used in a similar manner to skipping tests, with the
extra advantage that if the test begins passing unexpectedly, the test
will fail.
Being notified of unexpected passes allows for the test to be updated to
the correct expectation.
Since https://reviews.llvm.org/D131441, libc++ must be included before
LibC. As clang includes libc++ as one of the system includes, LibC
must be included after those, and the only correct way to do that is
to install LibC's headers into the sysroot.
Targets that don't link with LibC yet require its headers for one
reason or another must add install_libc_headers as a dependency to
ensure that the correct headers have been (re)installed into the
sysroot.
LibC/stddef.h has been dropped since the built-in stddef.h receives
a higher include priority.
In addition, string.h and wchar.h must
define __CORRECT_ISO_CPP_STRING_H_PROTO and
_LIBCPP_WCHAR_H_HAS_CONST_OVERLOADS respectively in order to tell
libc++ to not try to define methods implemented by LibC.
The JS::VM now owns the one Bytecode::Interpreter. We no longer have
multiple bytecode interpreters, and there is no concept of a "current"
bytecode interpreter.
If you ask for VM::bytecode_interpreter_if_exists(), it will return null
if we're not running the program in "bytecode enabled" mode.
If you ask for VM::bytecode_interpreter(), it will return a bytecode
interpreter in all modes. This is used for situations where even the AST
interpreter switches to bytecode mode (generators, etc.)
Don't try to implement this AO in bytecode. Instead, the bytecode
Interpreter class now has a run() API with the same inputs as the AST
interpreter. It sets up the necessary environments etc, including
invoking the GlobalDeclarationInstantiation AO.
Neither Azure Pipelines' log viewer, nor macOS Terminal.app support full
24-bit RGB color codes, causing the text output to be displayed
incorrectly. Fix this by using one of the 16 standard colors.
Most terminal emulators use a relatively dark shade for red by default,
as seen in the "ANSI escape code" Wikipedia article, so change the
foreground color to white to preserve contrast.