Previously the CMake options for -fsanitize=address, thread and
undefined were gated behind clang, which was unecessary. Only
-fsanitize=fuzzer is clang-only.
Since the operations are already complicated and will become even more
so soon, let's split them into their own files. We can also integrate
the NumberTheory operations that would better fit there into this class
as well.
This commit doesn't change behaviors, but moves the allocation of some
variables into caller classes.
Previously the directions omitted that you have to specify
`-CMAKE_CXX_COMPILER` when building the Fuzzers. This
would cause all kinds of weird problems at compilation and
link time. You can't specify one or the other, they must
both be pointing at clang in order for things to work as
experted. Fix this by updating the documentation to specify
that the user should specify both the C and CXX compiler explicitly
to be safe, as well as forcing the cmake clang argument handling
to modify the CXX compiler variable instead of the C version.
Clang's default constexpr-steps limit is 1048576, which is not enough
for LibGfx's generation of the unicode bidirectional class lookup table
while GCC doesn't have any limit at all, so this patch increases the
limit to an arbitrarily larger value.
As many macros as possible are moved to Macros.h, while the
macros to create a test case are moved to TestCase.h. TestCase is now
the only user-facing header for creating a test case. TestSuite and its
helpers have moved into a .cpp file. Instead of requiring a TEST_MAIN
macro to be instantiated into the test file, a TestMain.cpp file is
provided instead that will be linked against each test. This has the
side effect that, if we wanted to have test cases split across multiple
files, it's as simple as adding them all to the same executable.
The test main should be portable to kernel mode as well, so if
there's a set of tests that should be run in self-test mode in kernel
space, we can accomodate that.
A new serenity_test CMake function streamlines adding a new test with
arguments for the test source file, subdirectory under /usr/Tests to
install the test application and an optional list of libraries to link
against the test application. To accomodate future test where the
provided TestMain.cpp is not suitable (e.g. test-js), a CUSTOM_MAIN
parameter can be passed to the function to not link against the
boilerplate main function.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Move LibCompress unit tests to LibCompress/Tests directory and register
them with CMake's add_test. This allows us to run these tests with
ninja test instead of running a separate executable.
Also split the existing tests in 3 test files that better follow the
source code structure (inspired by AK tests).
This allows us to remove the FAIL_REGEX logic from the CTest invocation
of AK and LibRegex tests, as they will return a non-zero exit code on
failure :^).
Also means that running a failing TestSuite-enabled test with the
run-test-and-shutdown script will actually print that the test failed.
These tests were never built for the serenity target. Move their Lagom
build steps to the Lagom CMakeLists.txt, and add serenity build steps
for them. Also, fix the build errors when building them with the
serenity cross-compiler :^)
This is already set in the root CMakeLists.txt as well as here for Clang
(-Wno-user-defined-literals), but was forgotten for GCC which made an
Lagom-only build (cmake ../Meta/Lagom [...]) fail.
This is basically just for consistency, it's quite strange to see
multiple AK container types next to each other, some with and some
without the namespace prefix - we're 'using AK::Foo;' a lot and should
leverage that. :^)
A new operator, operator""sv was added as of C++17 to support
string_view literals. This allows string_views to be constructed
from string literals and with no runtime cost to find the string
length.
See: https://en.cppreference.com/w/cpp/string/basic_string_view/operator%22%22sv
This change implements that functionality in AK::StringView.
We do have to suppress some warnings about implementing reserved
operators as we are essentially implementing STL functions in AK
as we have no STL :).
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
Arbitrarily split up to make git bisect easier.
These unnecessary #include's were found by combining an automated tool (which
determined likely candidates) and some brain power (which decided whether
the #include is also semantically superfluous).
This caused some confusion: Apparently, clang has no trouble overriding Shell's
main, and this issue only surfaced when I tried to build the fuzzers with
wrong configuration (i.e., without the clang-injected 'main').
The diff is suggested by, and work of, @alimpfard.
This was done with the help of several scripts, I dump them here to
easily find them later:
awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in
for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
do
find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
done
# Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
-fsanitize=fuzzer was being added to LINKER_FLAGS from Lagom/CMakeLists,
which we don't want with FuzzilliJs as we want to define the functions
it provides ourselves.
There's no guarantee that the last executed command will have a zero
exit code, and so the shell exit code may or may not be zero, even if
all the tests pass.
Also changes the `test || echo fail && exit` to
`if not test { echo fail && exit }`, since that's nicer-looking.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
Loader.so now just performs the initial self relocations and static
LibC initialisation before handing over to ELF::DynamicLinker::linker_main
to handle the rest of the process.
As a trade-off, ELF::DynamicLinker needs to be explicitly excluded from
Lagom unless we really want to try writing a cross platform dynamic loader
This commit gets rid of ELF::Loader entirely since its very ambiguous
purpose was actually to load executables for the kernel, and that is
now handled by the kernel itself.
This patch includes some drive-by cleanup in LibDebug and CrashDaemon
enabled by the fact that we no longer need to keep the ref-counted
ELF::Loader around.
Problem:
- `(void)` simply casts the expression to void. This is understood to
indicate that it is ignored, but this is really a compiler trick to
get the compiler to not generate a warning.
Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.
Note:
- Functions taking a `(void)` argument list have also been changed to
`()` because this is not needed and shows up in the same grep
command.
There are cases where Lagom will build with GCC but not Clang.
This often goes unnoticed for a while as we don't often build with
Clang.
However, this is now important to test in CI because of the
OSS-Fuzz integration.
Note that this only tests the build, it does not run any tests.
Note that it also only builds LagomCore, Lagom and the fuzzers.
It does not build the other programs that use Lagom.
We added OSS-Fuzz integration in #4154, but documentation about it
is spread across several pull requests, IRC, and issues. Let's collect
the important bits in the ReadMe.
This commit is a mix of several commits, squashed into one because the
commits before 'Move regex to own Library and fix all the broken stuff'
were not fixable in any elegant way.
The commits are listed below for "historical" purposes:
- AK: Add options/flags and Errors for regular expressions
Flags can be provided for any possible flavour by adding a new scoped enum.
Handling of flags is done by templated Options class and the overloaded
'|' and '&' operators.
- AK: Add Lexer for regular expressions
The lexer parses the input and extracts tokens needed to parse a regular
expression.
- AK: Add regex Parser and PosixExtendedParser
This patchset adds a abstract parser class that can be derived to implement
different parsers. A parser produces bytecode to be executed within the
regex matcher.
- AK: Add regex matcher
This patchset adds an regex matcher based on the principles of the T-REX VM.
The bytecode pruduced by the respective Parser is put into the matcher and
the VM will recursively execute the bytecode according to the available OpCodes.
Possible improvement: the recursion could be replaced by multi threading capabilities.
To match a Regular expression, e.g. for the Posix standard regular expression matcher
use the following API:
```
Pattern<PosixExtendedParser> pattern("^.*$");
auto result = pattern.match("Well, hello friends!\nHello World!"); // Match whole needle
EXPECT(result.count == 1);
EXPECT(result.matches.at(0).view.starts_with("Well"));
EXPECT(result.matches.at(0).view.end() == "!");
result = pattern.match("Well, hello friends!\nHello World!", PosixFlags::Multiline); // Match line by line
EXPECT(result.count == 2);
EXPECT(result.matches.at(0).view == "Well, hello friends!");
EXPECT(result.matches.at(1).view == "Hello World!");
EXPECT(pattern.has_match("Well,....")); // Just check if match without a result, which saves some resources.
```
- AK: Rework regex to work with opcodes objects
This patchsets reworks the matcher to work on a more structured base.
For that an abstract OpCode class and derived classes for the specific
OpCodes have been added. The respective opcode logic is contained in
each respective execute() method.
- AK: Add benchmark for regex
- AK: Some optimization in regex for runtime and memory
- LibRegex: Move regex to own Library and fix all the broken stuff
Now regex works again and grep utility is also in place for testing.
This commit also fixes the use of regex.h in C by making `regex_t`
an opaque (-ish) type, which makes its behaviour consistent between
C and C++ compilers.
Previously, <regex.h> would've blown C compilers up, and even if it
didn't, would've caused a leak in C code, and not in C++ code (due to
the existence of `OwnPtr` inside the struct).
To make this whole ordeal easier to deal with (for now), this pulls the
definitions of `reg*()` into LibRegex.
pros:
- The circular dependency between LibC and LibRegex is broken
- Eaiser to test (without accidentally pulling in the host's libc!)
cons:
- Using any of the regex.h functions will require the user to link -lregex
- The symbols will be missing from libc, which will be a big surprise
down the line (especially with shared libs).
Co-Authored-By: Ali Mohammad Pur <ali.mpfard@gmail.com>
This was broken with the JS::Parser::Error position changes, but I don't
actually see a reason to do anything with the parser errors here, so
let's remove it and consider simply not crashing a success. :^)
It's a thin userland wrapper around adjtime(2). It can be used
to view current pending time adjustments, and root can use it to
smoothly adjust the system time.
As far as I can tell, other systems don't have a userland utility
for this, but it seems useful. Useful enough that I'm adding it to
the lagom build so I can use it on my linux box too :)
This broke in case of unterminated regular expressions, causing goofy location
numbers, and 'source_location_hint' to eat up all memory:
Unexpected token UnterminatedRegexLiteral. Expected statement (line: 2, column: 4294967292)
While this _does_ add a point of failure, it'll be a pretty bad day when
google goes down.
And this is unlikely to put a (positive) dent in their incoming
requests, so let's just roll with it until we have our own TLS server.
If a buffer smaller than Elf32_Ehdr was passed to Image, header()
would do an out-of-bounds read.
Make parse() check for that. Make most Image methods assert that the image
is_valid(). For that to work, set m_valid early in Image::parse()
instead of only at its end.
Also reorder a few things so that the fuzzer doesn't hit (valid)
assertions, which were harmless from a security PoV but which still
allowed userspace to crash the kernel with an invalid ELF file.
Make dbgprintf()s configurable at run time so that the fuzzer doesn't
produce lots of logspam.
Useful for sanitizer fuzzer builds.
clang doesn't have a -fconcepts switch (I'm guessing it just enables
concepts automatically with -std=c++2a, but I haven't checked),
and at least the version on my system doesn't understand
-Wno-deprecated-move, so pass these two flags only to gcc.
In return, disable -Woverloaded-virtual which fires in many places.
The preceding commits fixed the handful of -Wunused-private-field
warnings that clang emitted.
This moves most of the work from run-tests.sh to test-js.cpp. This way,
we have a lot more control over how the test suite runs, as well as how
it outputs. This should result in some cool functionality!
This commit also refactors test-common.js to mimic the jest library.
This should allow tests to be much more expressive :)
- Parsing invalid JSON no longer asserts
Instead of asserting when coming across malformed JSON,
JsonParser::parse now returns an Optional<JsonValue>.
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
parse() call. Examples of things that were valid and no longer are:
- undefineddfz
- {"foo": 1}abcd
- [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
Rather than printing them to stderr directly the parser now keeps a
Vector<Error>, which allows the "owner" of the parser to consume them
individually after parsing.
The Error struct has a message, line number, column number and a
to_string() helper function to format this information into a meaningful
error message.
The Function() constructor will now include an error message when
throwing a SyntaxError.
Note: clang only (see https://llvm.org/docs/LibFuzzer.html)
- add FuzzJs which will run the LibJS parser on random javascript inputs
- added a basic dictionary of javascript tokens
To use fuzzer:
CC=/usr/bin/clang CXX=/usr/bin/clang++ cmake -DENABLE_FUZZER_SANITIZER=1 ..
Fuzzers/FuzzJs -dict=../Fuzzers/FuzzJs.dict
Adding the ability to turn on Clang analyzer support in the Lagom build.
Right now the following are working warning free on the LibJS test suite:
-DENABLE_MEMORY_SANITIZER:BOOL=ON
-DENABLE_ADDRESS_SANITIZER:BOOL=ON
The following analyzer produces errors when running the LibJS test suite:
-DENABLE_UNDEFINED_SANITIZER:BOOL=ON
I've been wanting to do this for a long time. It's time we start being
consistent about how this stuff works.
The new convention is:
- "LibFoo" is a userspace library that provides the "Foo" namespace.
That's it :^) This was pretty tedious to convert and I didn't even
start on LibGUI yet. But it's coming up next.
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
This is more of a meta thing, since it's not seeing active development,
but is just a way for me to build some Serenity parts and include them
in other projects. Move it out of the root to keep things tidy.