Utf8View wraps a StringView and implements begin() and end() that
return a Utf8CodepointIterator, which parses UTF-8-encoded Unicode
codepoints and returns them as 32-bit integers.
This is the first step towards supporting emojis in Serenity ^)
https://github.com/SerenityOS/serenity/issues/490
When printing hex numbers, we were printing the wrong thing sometimes. This
was because we were dividing the digit to print by 15 instead of 16. Also,
dividing by 16 is the same as shifting four bits to the right, which is a
bit closer to our actual intention in this case, so let's use a shift
instead.
This way, primitive JsonValue serialization is still handled by
JsonValue::serialize(), but JsonArray and JsonObject serialization
always goes through serializer classes. This is no less efficient
if you have the whole JSON in memory already.
We use consumable annotations to catch bugs where you get the .value()
of an Optional before verifying that it's okay.
The bug here was that only has_value() would set the consumed state,
even though operator bool() does the same job.
Normally you want to access the T&, but sometimes you need to grab at
the NonnullPtr, for example when moving it out of the Vector before
a call to remove(). This is generally a weird pattern, but I don't have
a better solution at the moment.
Using the new get_process_name() syscall, we can automatically prefix
all userspace debug logging.
Hopefully this is more helpful than annoying. We'll find out! :^)
Add the concept of a PeekType to Traits<T>. This is the type we'll
return (wrapped in an Optional) from HashMap::get().
The PeekType for OwnPtr<T> and NonnullOwnPtr<T> is const T*,
which means that HashMap::get() will return an Optional<const T*> for
maps-of-those.
Okay, so, OwnPtr<T>::release_nonnull() returns a NonnullOwnPtr<T>.
It assumes that the OwnPtr is non-null to begin with.
Note that this removes the value from the OwnPtr, as there can only be
a single owner.
The printf formatting mini-language actually allows you
to pass a '*' character in place of the fill width specification,
in which case it eats one of the passed in arguments and uses it
as width, so implement that.
This replaces Optional<T>(U&&) which clang-tidy complained may hide the
regular copy and move constructors. That's a good point, clang-tidy,
and I appreciate you pointing that out!
This allows us to take advantage of the now-optimized (to do memmove())
Vector::append(const T*, int count) for collecting these strings.
This is a ~15% speedup on the load_4chan_catalog benchmark.
This can definitely be improved with better trivial type detection and
by using the TypedTransfer template in more places.
It's a bit annoying that we can't get <type_traits> in Vector.h since
it's included in the toolchain compilation before we have libstdc++.
This has several significant changes to the networking stack.
* Significant refactoring of the TCP state machine. Right now it's
probably more fragile than it used to be, but handles quite a lot
more of the handshake process.
* `TCPSocket` holds a `NetworkAdapter*`, assigned during `connect()` or
`bind()`, whichever comes first.
* `listen()` is now virtual in `Socket` and intended to be implemented
in its child classes
* `listen()` no longer works without `bind()` - this is a bit of a
regression, but listening sockets didn't work at all before, so it's
not possible to observe the regression.
* A file is exposed at `/proc/net_tcp`, which is a JSON document listing
the current TCP sockets with a bit of metadata.
* There's an `ETHERNET_VERY_DEBUG` flag for dumping packet's content out
to `kprintf`. It is, indeed, _very debug_.
Keep a 256-entry string cache during parse to avoid creating some new
strings when possible. This cache is far from perfect but very cheap.
Since none of the strings are transient, this only costs us a couple of
pointers and a bit of ref-count manipulation.
The cache hit rate on 4chan_catalog.json is ~33% and the speedup on
the load_4chan_catalog benchmark is ~7%.
I was able to get parsing time down to about 1/3 of the original time
by using callgrind+kcachegrind. There's definitely more improvements
that can be made here, but I'm gonna be happy with this for now. :^)
- Return more specific types from parse_array() and parse_object().
- Don't create a throwaway String in extract_while().
- Use a StringView in parse_number() to avoid a throwaway String.
This is a shameless copy-paste of String::to_int(). We should find some
way to share this code between String and StringView instead of having
two duplicate copies like this.
Use AK::exchange() to switch out the internal storage. Also mark these
functions with [[nodiscard]] to provoke an compile-time error if they
are called without using the return value.
This gives us much better error messages when you try to use them.
Without this change, it would complain about the absence of functions
named ref() and deref() on RefPtr itself. With it, we instead get a
"hey, this function is deleted" error.
Change operator=(T&) to operator=T(const T&) also, to keep assigning
a const T& to a NonnullRefPtr working.
Instead of aborting the program when we hit an assertion, just print a
message and keep going.
This allows us to write tests that provoke assertions on purpose.
There was a bug in the "prepend_vector_object" test but it was masked
by us not printing failures. (The bug was that we were adding three
elements to the "objects" vector and then checking that another
vector called "more_objects" indeed had three elements. Oops!)
This is a complement to append() that works by constructing the new
element in-place via placement new and forwarded constructor arguments.
The STL calls this emplace_back() which looks ugly, so I'm inventing
a nice word for it instead. :^)
It doesn't seem sane to try to iterate over a HashTable while it's in
the middle of being cleared. Since this might cause strange problems,
this patch adds an assertion if an iterator is constructed during
clear() or rehash() of a HashTable.
An operation often has two pieces of underlying information:
* the data returned as a result from that operation
* an error that occurred while retrieving that data
Merely returning the data is not good enough. Result<> allows exposing
both the data, and the underlying error, and forces (via clang's
consumable attribute) you to check for the error before you try to
access the data.
Put simply, Error<> is a way of forcing error handling onto an API user.
Given a function like:
bool might_work();
The following code might have been written previously:
might_work(); // but what if it didn't?
The easy way to work around this is of course to [[nodiscard]] might_work.
But this doesn't work for more complex cases like, for instance, a
hypothetical read() function which might return one of _many_ errors
(typically signalled with an int, let's say).
int might_read();
In such a case, the result is often _read_, but not properly handled. Like:
return buffer.substr(0, might_read()); // but what if might_read returned an error?
This is where Error<> comes in:
typedef Error<int, 0> ReadError;
ReadError might_read();
auto res = might_read();
if (might_read.failed()) {
switch (res.value()) {
case EBADF:
...
}
}
Error<> uses clang's consumable attributes to force failed() to be
checked on an Error instance. If it's not checked, then you get smacked.
We had some kernel-specific gizmos in AK that should really just be in the
Kernel subdirectory instead. The only thing remaining after moving those
was mmx_memcpy() which I moved to the ARCH(i386)-specific section of
LibC/string.cpp.
So we already have ByteBuffer::wrap() which is like a StringView for random
data. This might not be the best abstraction actually, but this will be
immediately useful so let's add it.
Clang loses the typestate when passing NonnullRefPtr's via lambda captures.
This is unfortunate, but not much we can do about it. Allowing ptr() makes
it possible to use captured NonnullRefPtrs as you'd expect.
Add an "ElementType" typedef to NonnullOwnPtr and NonnullRefPtr to allow
clients to easily find the pointee type. Then use this to remove a template
argument from NonnullPtrVector. :^)
It's not possible to grow one of these vectors beyond what's already in them
since it's not possible to default-construct Nonnull{Own,Ref}Ptr.
Add Vector::shrink() which can be used when you want to shrink the Vector
and delete resize() from the specialized Vectors.
This works just like NonnullRefPtr, except for NonnullOwnPtr's instead.
NonnullOwnPtrVector<T> inherits from Vector<NonnullOwnPtr<T>>, and adds some
comforts on top, like making accessors return T& so we can chase dots (.)
instead of arrows (->) :^)
This is just like OwnPtr (also single-owner), except it cannot be null.
NonnullOwnPtr is perfect as the return type of functions that never need to
return nullptr.
It's also useful as an argument type to encode the fact that the argument
must not be nullptr.
The make<Foo>() helper is changed to return NonnullOwnPtr<Foo>.
Note: You can move() out of a NonnullOwnPtr, and after that the object is
in an invalid state. Internally it will be a nullptr at this point, so we'll
still catch misuse, but the only thing that should be done in this state
is running the destructor. I've used consumable annotations to generate some
warnings when using a NonnullOwnPtr after moving from it, but these only
work when compiling with clang, so be aware of that.
Restructure the makefile a little so it only builds objects once, and
then run them on make clean.
This is a little slower (since we're relinking tests each makeall), but
it also ensures that it will work.
And use it in the scheduler.
IntrusiveList is similar to InlineLinkedList, except that rather than
making assertions about the type (and requiring inheritance), it
provides an IntrusiveListNode type that can be used to put an instance
into many different lists at once.
As a proof of concept, port the scheduler over to use it. The only
downside here is that the "list" global needs to know the position of
the IntrusiveListNode member, so we have to position things a little
awkwardly to make that happen. We also move the runnable lists to
Thread, to avoid having to publicize the node.
The to_foo() functions are for converting when you might not be sure of the
underlying value type. The as_foo() family assumes that you know exactly
what the underlying value type is.
Meet TStyle. It allows you to write things like this:
dbg() << TStyle(TStyle::Red, TStyle::Bold) << "Hello, friends!";
Any style used will be reset along with the newline emitted when the dbg()
temporary goes out of scope. :^)
This can definitely be improved, but I think it's a decent place to start.
This is the same as calling FileSystemPath(foo).string(). The majority of
clients only care about canonicalizing a path, so let's have an easy way
to express that.
We shouldn't allow constructing e.g an OwnPtr from a RefPtr, and similar
conversions. Instead just delete those functions so the compiler whines
loudly if you try to use them.
This patch also deletes constructing OwnPtr from a WeakPtr, even though
that *may* be a valid thing to do, it's sufficiently weird that we can
make the client jump through some hoops if he really wants it. :^)
This patch removes copy_ref() from RefPtr and NonnullRefPtr. This means that
it's now okay to simply copy these smart pointers instead:
- RefPtr = RefPtr // Okay!
- RefPtr = NonnullRefPtr // Okay!
- NonnullRefPtr = NonnullRefPtr // Okay!
- NonnullRefPtr = RefPtr // Not okay, since RefPtr can be null.
I had a silly ambition that we would avoid unnecessary ref count churn by
forcing explicit use of "copy_ref()" wherever a copy was actually needed.
This was making RefPtr a bit clunky to work with, for no real benefit.
This patch adds the missing copy construction/assignment stuff to RefPtr.
You can currently use this to detect the CPU architecture like so:
#if ARCH(I386)
...
#elif ARCH(X86_64)
...
#else
...
#endif
This will be helpful for separating out architecture-specific code blocks.
Instead of computing the path length inside the syscall handler, let the
caller do that work. This allows us to implement to new variants of open()
and creat(), called open_with_path_length() and creat_with_path_length().
These are suitable for use with e.g StringView.
This makes me wonder if the open() syscall should take characters+length
and we'd compute the length at the LibC layer instead. That way we could
also provide an optional non-POSIX open() that takes the length directly..
This allows you to do things like:
vector.insert_before_matching(value, [](auto& entry) {
return value < entry;
});
Basically it scans until it finds an element that matches the condition
callback and then inserts the new value before the matching element.
The first implementation class is DebugLogStream, which can be used like so:
dbg() << "Hello friends, I am " << m_years << " years old!";
Note that it will automatically print a newline when the object created by
dbg() goes out of scope.
This API will grow and evolve, so let's see what we end up with :^)
Instead of manually doing String::format("%d"/"%u") everywhere, let's have
a String API for this. It's just a wrapper around format() for now, but it
could be made more efficient in the future.
Solve this by adding find() overloads to HashTable and SinglyLinkedList
that take a templated functor for comparing the values.
This allows HashMap to call HashTable::find() without having to create
a temporary Entry for use as the table key. :^)
This is prep work for supporting HashMap with NonnullRefPtr<T> as values.
It's currently not possible because many HashTable functions require being
able to default-construct the value type.
Update ProcessManager, top and WSCPUMonitor to handle the new format.
Since the kernel is not allowed to use floating-point math, we now compile
the JSON classes in AK without JsonValue::Type::Double support.
To accomodate large unsigned ints, I added a JsonValue::Type::UnsignedInt.
The LibC build is a bit complicated, since the toolchain depends on it.
During the toolchain bootstrap, after we've built parts of GCC, we have
to stop and build Serenity's LibC, so that the rest of GCC can use it.
This means that during that specific LibC build, we don't yet have access
to things like std::initializer_list.
For now we solve this by defining SERENITY_LIBC_BUILD during the LibC
build and excluding the Vector/initializer_list support inside LibC.
Get rid of the ConstIterator classes for these containers and use templated
FooIterator<T, ...> and FooIterator<const T, ...> helpers.
This makes the HashTable class a lot easier to read.
This means you can now do this:
void harmonize(NonnullRefPtrVector<Voice>& voices)
{
for (auto& voice : voices) {
voice.sing(); // Look, no "->"!
}
}
Pretty dang cool :^)
This is a slot-in convenience replacement for Vector<NonnullRefPtr<T>> that
makes accessors return T& instead of NonnullRefPtr<T>&.
Since NonnullRefPtr guarantees non-nullness, this allows you to access these
vector elements using dot (.) rather than arrow (->). :^)
This avoids putting pressure on kmalloc() during backtrace symbolication.
Since we dump backtrace for every process that exits, this is actually a
decent performance improvement for things like GCC that chain a lot of
processes together.
This parser assumes that the JSON is well-formed and will choke horribly
on invalid input.
Since we're primarily interested in parsing our own output right now, this
is less of a problem. Longer-term we're gonna need something better. :^)
- Delete the default constructor instead of just making it private.
It's never valid to create an empty NonnullRefPtr.
- Add copy assignment operators. I originally omitted these to force use
of .copy_ref() at call sites, but the hassle/gain ratio is minuscule.
- Allow calling all the assignment operators in all consumable states.
This codifies that it's okay to overwrite a moved-from NonnullRefPtr.
It's kinda funny how I can make a mistake like this in Serenity and then
get so used to it by spending lots of time using this API that I start to
believe that this is how printf() always worked..
There's no need for a member char* m_characters if we always store them
in the inline buffer. So with this patch, we now do.
After that, rearrange the members a bit for ideal packing. :^)
We'll now try to detect crashes that were due to dereferencing nullptr,
uninitialized malloc() memory, or recently free()'d memory.
It's not perfect but I think it's pretty good. :^)
Also added some color to the most important parts of the crash log,
and added some more modes to /bin/crash for exercising this code.
Fixes#243.
This patch adds JsonValue, JsonObject and JsonArray. You can use them to
build up a JsonObject and then serialize it to a string via to_string().
This patch only implements encoding, no decoding yet.
The underlying data structure is a singly-linked list of Vector<T>.
We never shift any of the vector contents around, but we batch the memory
allocations into 1000-element segments.
Put together a pretty well-performing queue using a Vector and an offset.
By using the new Vector::shift_left(int) instead of Vector::take_first()
we can avoid shifting the vector contents every time and instead only
do it every so often.
Maybe this could be generalized into a separate class, I'm not sure if it's
the best algorithm though, it's just what I came up with right now. :^)
StringView character buffer is not guaranteed to be null-terminated;
in particular it will not be null-terminated when making a substring.
This means it is not correct to check whether we've reached the end
of a StringView by comparing the next character to null; instead, we
need to do an explicit length (or pointer) comparison.
Without this function, comparing a String to a const char* will instantiate
a temporary String which is obviously not great.
Also add some missing null checks to StringView::operator==(const char*).
String&& is just not very practical. Also return const String& when the
returned string is a member variable. The call site is free to make a copy
if he wants, but otherwise we can avoid the retain count churn.
This is useful when you want to ensure some little thing happens when you
exit a certain scope.
This patch makes use of it in LibC's netdb code to make sure we close the
connection to the LookupServer.
This is a small change to the existing split() functionality to support
the case of splitting a string and stopping at a certain number of
tokens. This is useful for parsing e.g. key/value pairs, where the value
may contain the delimiter you're splitting on.