When creating uninitialized storage for variables, we need to make sure
that the alignment is correct. Fixes a KUBSAN failure when running
kernels compiled with Clang.
In `Syscalls/socket.cpp`, we can simply use local variables, as
`sockaddr_un` is a POD type.
Along with moving the `alignas` specifier to the correct member,
`AK::Optional`'s internal buffer has been made non-zeroed by default.
GCC emitted bogus uninitialized memory access warnings, so we now use
`__builtin_launder` to tell the compiler that we know what we are doing.
This might disable some optimizations, but judging by how GCC failed to
notice that the memory's initialization is dependent on `m_has_value`,
I'm not sure that's a bad thing.
This implements StringUtils::find_any_of() and uses it in
String::find_any_of() and StringView::find_any_of(). All uses of
find_{first,last}_of have been replaced with find_any_of(), find() or
find_last(). find_{first,last}_of have subsequently been removed.
This adds the String::find_last() as wrapper for StringUtils::find_last,
which is another step in harmonizing the String and StringView APIs
where possible.
This also inlines the find() methods, as they are simple wrappers around
StringUtils functions without any additional logic.
This implements the StringView::find_all() method by re-implemeting the
current method existing for String in StringUtils, and using that
implementation for both String and StringView.
The rewrite uses memmem() instead of strstr(), so the String::find_all()
argument type has been changed from String to StringView, as the null
byte is no longer required.
This removes StringView::find_first_of(char) and find_last_of(char) and
replaces all its usages with find and find_last respectively. This is
because those two methods are functionally equivalent.
find_{first,last}_of should only be used if searching for multiple
different characters, which is never the case with the char argument.
This also adds the [[nodiscard]] to the remaining find_{first,last}_of
methods.
This patch reimplements the StringView::find methods in StringUtils, so
they can also be used by String. The methods now also take an optional
start parameter, which moves their API in line with String's respective
methods.
This also implements a StringView::find_ast(char) method, which is
currently functionally equivalent to find_last_of(char). This is because
find_last_of(char) will be removed in a further commit.
This patch refactors StringImpl::to_{lower,upper}case to use the new
static methods StringImpl::create_{lower,upper}cased if they have to use
to create a new StringImpl. This allows implementing StringView's
to_{lower,upper}case_string using the same methods.
It also replaces the usage of hand-written to_ascii_lowercase() and
similar methods with those from CharacterTypes.h.
This moves the path canonicalization from the LexicalPath constructor to
canonicalized_path. This allows canonicalized path to no longer
construct a LexicalPath object and initialize all its member variables.
This replaces the current LexicalPath::append() API with a new method
that returns a new LexicalPath object and doesn't touch the this-object.
With this, LexicalPath is now immutable. It also adds a
LexicalPath::parent() method and the relevant test cases.
This changes the m_parts, m_dirname, m_basename, m_title and m_extension
member variables to StringViews onto the m_string String. It also
removes the m_is_absolute member in favour of computing if a path is
absolute in the is_absolute() getter. Due to this, the canonicalize()
method has been completely rewritten.
The parts() getter still returns a Vector<String>, although it is no
longer a const reference as m_parts is no longer a Vector<String>.
Rather, it is constructed from the StringViews in m_parts upon request.
The parts_view() getter has been added, which returns Vector<StringView>
const&. Most previous users of parts() have been changed to use
parts_view(), except where Strings are required.
Due to this change, it's is now no longer allow to create temporary
LexicalPath objects to call the dirname, basename, title, or extension
getters on them because the returned StringViews will point to possible
freed memory.
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will recieve a StringView const& pointing
to a possible freed buffer.
To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above haven
been changed to use the static APIs.
Since this is always set to true on the non-default constructor and
subsequently never modified, it is somewhat pointless. Furthermore,
there are arguably no invalid relative paths.
This attribute tells compilers that the pointer returned by a function
is never null, which lets it optimize away null checks in some places.
This seems like a nice addition to `NonnullOwnPtr` and `NonnullRefPtr`.
Using this attribute causes extra UBSan checks to be emitted. To offset
its performance loss, some additional methods were marked ALWAYS_INLINE,
which lets the compiler optimize duplicate checks
This removes JsonObject::get_or(), which is inefficient because it has
to copy the returned value. It was only used in a few cases, some of
which resulted in copying JsonObjects, which can become quite large.
This adds methods to JsonObject to check if a key exists with a certain
type. This simplifies code that validates whether a JsonObject has the
expected structure.
This adds a static JsonValue* s_null_value, which allows
JsonObject::get to return a reference instaed of copying the return
value. Since JsonValue is only 16 bytes, this seems like a reasonable
compromise.
This changes JsonObject to use the new OrderedHashMap instead of an
extra vector for tracking the insertion order.
This also adds a default value for the KeyTraits template argument in
OrderedHashMap. Furthermore, it fixes two cases where code iterating
over a JsonObject relied on the value argument being copied before
invoking the callback.
It was previously stored as a StringView, which prevented us from
using temporary strings in the 'extra' argument.
The performance hit doesn't really matter because ScopeLogger is used
exclusively for debugging.
Also add some tests to ensure that they _remain_ constexpr.
In general, any runtime assertions, weirdo C casts, pointer aliasing,
and such shenanigans should be gated behind the (helpfully newly added)
AK::is_constant_evaluated() function when the intention is to write
constexpr-capable code.
a.k.a. deliver promises of constexpr-ness :P
The existing InputBitStream methods only read in little endian, as this
is what the rest of the system requires. Two new methods allow the input
bitstream to read bits in big endian as well, while using the existing
state infrastructure.
Note that it can lead to issues if little endian and big endian reads
are used out of order without aligning to a byte boundary first.
Clang enforces the ordering that attributes specified with the
`[[attr_name]]` syntax must comes before those defines as
`__attribute__((attr_name))`. We don't want to deal with that, so we
should stick to a single syntax (for functions, at least).
This commit favors the latter, as it's used more widely in the code
(for declaring more "exotic" options), and changing those would be a
larger effort than modifying this single file.
These functions abstract away the need to call the proper new operator
("throwing" or "non-throwing") and manually adopt the resulting raw
pointer. Modelled after the existing `NonnullOwnPtr<T> make()`
functions, these forward their parameters to the object's constructor.
Note: These can't be used in the common "factory method" idiom, as
private constructors can't be called from a standalone function.
The naming is consistent with AK's and Shell's previous implementation
of these:
- `make` creates a `NonnullOwnPtr<T>` and aborts if the allocation could
not be performed.
- `try_make` creates an `OwnPtr<T>`, which may be null if the allocation
failed.
- `create` creates a `NonnullRefPtr<T>`, and aborts on allocation
failure.
- `try_create` creates a `RefPtr<T>`, which may be null if the
allocation was not successful.
In standard C++, operators `new` and `new[]` are guaranteed to return a
valid (non-null) pointer and throw an exception if the allocation
couldn't be performed. Based on this, compilers did not check the
returned pointer before attempting to use them for object construction.
To avoid this, the allocator operators were changed to be `noexcept` in
PR #7026, which made GCC emit the desired null checks. Unfortunately,
this is a non-standard feature which meant that Clang would not accept
these function definitions, as it did not match its expected
declaration.
To make compiling using Clang possible, the special "nothrow" versions
of `new` are implemented in this commit. These take a tag type of
`std::nothrow_t` (used for disambiguating from placement new/etc.), and
are allowed by the standard to return null. There is a global variable,
`std::nothrow`, declared with this type, which is also exported into the
global namespace.
To perform fallible allocations, the following syntax should be used:
```cpp
auto ptr = new (nothrow) T;
```
As we don't support exceptions in the kernel, the only way of uphold the
"throwing" new's guarantee is to abort if the allocation couldn't be
performed. Once we have proper OOM handling in the kernel, this should
only be used for critical allocations, where we wouldn't be able to
recover from allocation failures anyway.
While Clang claims to implement GCC's atomics libcall API, a small
incompatibility caused our builds to fail on Clang.
Clang requires requires the operands to its fixed-size functions to be
integer types, while GCC will take any type with the same size and
alignment as the various integer primitives. This was problematic, as
atomic `enum class`es would not compile.
Furthermore, Clang does not like if only one operand pointer is marked
volatile. Because it only affects the standalone atomic functions, that
will be fixed in a later commit.
As an added benefit, the code is more type-safe, as it won't let us
perform arithmetic on non-integer types. Types with overloaded
arithmetic types won't cause unexpected behavior anymore.
The constructors for the various atomic types can now be used in
constant expressions.
It was really annoying to `static_cast` the arguments to be the same
type, so instead of doing that, just convert the second one to the first
one, and let the compiler warn about sign differences and truncation.
Problem:
- Now that a generic free-function form of `find_if` is implemented
the code in `any_of` is redundant.
Solution:
- Follow the "don't repeat yourself" mantra and make the code DRY by
implementing `any_of` in terms of `find_if`.
This will just hexdump the given value.
Note that not all formatters respect this, the ones that do are:
- (Readonly)Bytes: formatter added in this commit
- StringView / char const*
- integral types
Components are a group of build targets that can be built and installed
separately. Whether a component should be built can be configured with
CMake arguments: -DBUILD_<NAME>=ON|OFF, where <NAME> is the name of the
component (in all caps).
Components can be marked as REQUIRED if they're necessary for a
minimally functional base system or they can be marked as RECOMMENDED
if they're not strictly necessary but are useful for most users.
A component can have an optional description which isn't used by the
build system but may be useful for a configuration UI.
Components specify the TARGETS which should be built when the component
is enabled. They can also specify other components which they depend on
(with DEPENDS).
This also adds the BUILD_EVERYTHING CMake variable which lets the user
build all optional components. For now this defaults to ON to make the
transition to the components-based build system easier.
The list of components is exported as an INI file in the build directory
(e.g. Build/i686/components.ini).
Fixes#8048.
All usages of AK::InlineLinkedList have been converted to
AK::IntrusiveList. So it's time to retire our old friend.
Note: The empty white space change in AK/CMakeLists.txt is to
force CMake to re-glob the header files in the AK directory so
incremental build will work when folks git pull this change locally.
Otherwise they'll get errors, because CMake will attempt to install
a file which no longer exists.
The insert_before method on AK::InlineLinkedList is used, so in order to
achieve feature parity, we need to implement it for AK::IntrusiveList as
well.
Some of the code assumed that chars were always signed while that is
not the case on ARM hosts.
Also, some of the code tried to use EOF (-1) in a way similar to what
fgetc() does, however instead of storing the characters in an int
variable a char was used.
While this seemed to work it also meant that character 0xFF would be
incorrectly seen as an end-of-file.
Careful reading of fgetc() reveals that fgetc() stores character
data in an int where valid characters are in the range of 0-255 and
the EOF value is explicitly outside of that range (usually -1).
Doing these as custom classes might be faster, especially when writing
them in SSE, but this would cause a lot of Code duplication and due to
the nature of constexprs and the intelligence of the compiler they might
be using SSE/MMX either way
It's prone to finding "technically uninitialized but can never happen"
cases, particularly in Optional<T> and Variant<Ts...>.
The general case seems to be that it cannot infer the dependency
between Variant's index (or Optional's boolean state) and a particular
alternative (or Optional's buffer) being untouched.
So it can flag cases like this:
```c++
if (index == StaticIndexForF)
new (new_buffer) F(move(*bit_cast<F*>(old_buffer)));
```
The code in that branch can _technically_ make a partially initialized
`F`, but that path can never be taken since the buffer holding an
object of type `F` and the condition being true are correlated, and so
will never be taken _unless_ the buffer holds an object of type `F`.
This commit also removed the various 'diagnostic ignored' pragmas used
to work around this warning, as they no longer do anything.
This commit makes it possible to instantiate `Vector<T&>` and use it
to store references to `T` in a vector.
All non-pointer observers are made to return the reference, and the
pointer observers simply yield the underlying pointer.
Note that the 'find_*' methods act on the values and not the pointers
that are stored in the vector.
This commit also makes errors in various vector methods much more
readable by directly using requires-clauses on them.
And finally, it should be noted that Vector cannot hold temporaries :^)
The methods of this class were all over the place, this commit reorders
them to place them in a more logical order:
- Constructors/Destructor
- Observers
- Comparisons and const existence checks
- Mutators: insert
- Mutators: append
- Mutators: prepend
- Mutators: assignment
- Mutators: remove
- OOM-safe mutators: insert
- OOM-safe mutators: append
- OOM-safe mutators: prepend
- OOM-safe size management
- Size management
- Iterators
This fixes a bug where a Utf8View was created with data from a temporary
string, which was immediately deleted. This lead to a use-after-free
issue. This also changes most occurences for StringBuilder::to_string in
URLParser to use ::string_view(), as the value is passed as StringView
const& most of the time anyways.
This fixes oss-fuzz issue 34973.
When a code point is invalid, the full string was outputted to the debug
output. For large strings, this can make the system quite slow.
Furthermore, one of the cases incorrectly assumed the data to be null
terminated. This patch modifies the debug statements not to print the
full string.
This fixes oss-fuzz issue 35050.
Other software might not expect these to be defined and behave
differently if they _are_ defined, e.g. scummvm which checks if
the TODO macro is defined and fails to build if it is.
This commit initializes the LibVideo library and implements parsing
basic Matroska container files. Currently, it will only parse audio
and video tracks.
Previously, AK::Function would accept _any_ callable type, and try to
call it when called, first with the given set of arguments, then with
zero arguments, and if all of those failed, it would simply not call the
function and **return a value-constructed Out type**.
This lead to many, many, many hard to debug situations when someone
forgot a `const` in their lambda argument types, and many cases of
people taking zero arguments in their lambdas to ignore them.
This commit reworks the Function interface to not include any such
surprising behaviour, if your function instance is not callable with
the declared argument set of the Function, it can simply not be
assigned to that Function instance, end of story.
This changes URL parser to use the 0xFFFFFFFF constant instead of 0 to
indicate end of file. This fixes a bug where inputs containing null
bytes would terminate the parser early, because they were interpreted
as end of file.
This patch adds a state_name method to URLParser to convert a state to a
string. With this, the debugging statements now display the state names.
Furthermore, this fixes a bug where non-ASCII code points were
formatted as characters, which fails an assertion in the formatting
system.
Because non-ASCII code points have negative byte values, trimming away
control characters requires checking for negative bytes values.
This also adds a test case with a URL containing non-ASCII code points.