This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000Hz)
and is independent from the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
Make messages that should be fatal actually fail the build.
- FATAL is not a valid mode keyword. The full list is available in the
docs: https://cmake.org/cmake/help/v3.19/command/message.html
- SEND_ERROR doesn't immediately stop processing, FATAL_ERROR does.
We should immediately stop if the Toolchain is not present.
- The app icon size validation was just a WARNING that is easy to
overlook. We should promote it to a FATAL_ERROR so that people will
not overlook the issue when adding a new application. We can only make
the small icon message FATAL_ERROR, as there is currently one
violation of the medium app icon validation.
Note that the changes to IPv4Socket::create are unfortunately needed as
the return type of TCPSocket::create and IPv4Socket::create don't match.
- KResultOr<NonnullRefPtr<TCPSocket>>
vs
- KResultOr<NonnullRefPtr<Socket>>
To handle this we are forced to manually decompose the KResultOr<T> and
return the value() and error() separately.
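The manual decomposition can be sketched as follows. This is a minimal illustration, not the kernel code: ResultOr, the std::shared_ptr usage and both create functions are hypothetical stand-ins for KResultOr, NonnullRefPtr and the real socket factories.

```cpp
#include <cerrno>
#include <memory>
#include <utility>
#include <variant>

// Hypothetical stand-ins for the kernel socket types.
struct Socket {
    virtual ~Socket() = default;
};
struct TCPSocket : Socket { };

// Hypothetical, simplified analogue of KResultOr<T>: holds an errno value or a T.
template<typename T>
class ResultOr {
public:
    ResultOr(int error) : m_storage(error) { }
    ResultOr(T value) : m_storage(std::move(value)) { }

    bool is_error() const { return std::holds_alternative<int>(m_storage); }
    int error() const { return std::get<int>(m_storage); }
    T release_value() { return std::move(std::get<T>(m_storage)); }

private:
    std::variant<int, T> m_storage;
};

ResultOr<std::shared_ptr<TCPSocket>> create_tcp_socket(bool simulate_oom)
{
    if (simulate_oom)
        return ENOMEM;
    return std::make_shared<TCPSocket>();
}

// ResultOr<shared_ptr<TCPSocket>> does not convert to ResultOr<shared_ptr<Socket>>,
// so the error and the value are forwarded separately.
ResultOr<std::shared_ptr<Socket>> create_ipv4_socket(bool simulate_oom)
{
    auto tcp_result = create_tcp_socket(simulate_oom);
    if (tcp_result.is_error())
        return tcp_result.error();
    return std::shared_ptr<Socket>(tcp_result.release_value());
}
```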
The make<T> factory function allocates internally and immediately
dereferences the pointer, and always returns a NonnullOwnPtr<T> making
it impossible to propagate an error on OOM.
Modify the API so it's possible to propagate error on OOM failure.
NonnullOwnPtr<T> is not appropriate for the ThreadTracer::create() API,
so switch to OwnPtr<T>, use adopt_own_if_nonnull() to handle creation.
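A rough userspace analogue of this pattern, assuming std::unique_ptr in place of OwnPtr<T> and a nothrow new in place of adopt_own_if_nonnull(). Tracer, try_create_tracer and start_tracing are hypothetical names used only for illustration.

```cpp
#include <cerrno>
#include <memory>
#include <new>
#include <utility>

// Hypothetical tracer type; the real ThreadTracer is not reproduced here.
struct Tracer {
    explicit Tracer(int pid) : tracer_pid(pid) { }
    int tracer_pid;
};

// Nullable factory: a failed allocation yields an empty pointer instead of
// a dereference of null.
std::unique_ptr<Tracer> try_create_tracer(int pid)
{
    return std::unique_ptr<Tracer>(new (std::nothrow) Tracer(pid));
}

// The caller turns the empty pointer into an error code it can propagate.
int start_tracing(int pid, std::unique_ptr<Tracer>& out)
{
    auto tracer = try_create_tracer(pid);
    if (!tracer)
        return -ENOMEM;
    out = std::move(tracer);
    return 0;
}
```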
Currently, when passing buffers into VirtIOQueues, we use scatter-gather
lists, which contain an internal vector of buffers. This vector is
allocated, filled and then destroyed whenever we try to provide buffers
into a virtqueue, which would happen a lot in performance critical code
(the main transport mechanism for certain paravirtualized devices).
This commit moves it over to using VirtIOQueueChains and building the
chain in place in the VirtIOQueue. Also included are a bunch of fixups
for the VirtIO Console device, making it use an internal VM::RingBuffer
instead.
We want to move this out of the AHCI subsystem into the VM system,
since other parts of the kernel may need to perform scatter-gather IO.
We rename the current VM::ScatterGatherList impl that's used in the
virtio subsystem to VM::ScatterGatherRefList, since its distinguishing
feature from the AHCI scatter-gather list is that it doesn't own its
buffers.
For Kernel OOM hardening to work correctly, we need to be able to
call a "nothrow" version of operator new. Unfortunately the default
"throwing" version of operator new assumes that the allocation will
never return on failure and will always throw an exception. This isn't
true in the Kernel, as we don't have exceptions. So if we call the
normal/throwing new and kmalloc returns NULL, the generated code will
happily go and dereference that NULL pointer by invoking the constructor
before we have a chance to handle the failure.
To fix this we declare operator new as noexcept in the Kernel headers,
which will allow the caller to actually handle allocation failure.
The delete implementations need to match the prototype of the new which
allocated them, so we need to define delete as noexcept as well. GCC then
errors out declaring that you should implement sized delete as well, so
this change provides those stubs in order to compile cleanly.
Finally the new operator definitions have been standardized as being
declared with [[nodiscard]] to avoid potential memory leaks. So let's
declare the kernel versions that way as well.
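The effect of the noexcept declaration can be shown in a small userspace sketch. It uses class-scoped operators instead of the global kernel ones, and the Node type and the g_simulate_oom switch are hypothetical. When the allocation function is noexcept and returns null, the new-expression evaluates to null and the constructor is not run.

```cpp
#include <cstddef>
#include <cstdlib>

static bool g_simulate_oom = false;  // hypothetical switch to force a failed allocation
static int g_constructor_calls = 0;  // counts how often the constructor actually ran

struct Node {
    Node() { ++g_constructor_calls; }

    // noexcept tells the compiler that failure is reported by returning null,
    // so it emits a null check before invoking the constructor.
    [[nodiscard]] static void* operator new(std::size_t size) noexcept
    {
        if (g_simulate_oom)
            return nullptr;
        return std::malloc(size);
    }

    // Matching unsized and sized deallocation functions.
    static void operator delete(void* ptr) noexcept { std::free(ptr); }
    static void operator delete(void* ptr, std::size_t) noexcept { std::free(ptr); }
};
```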
This avoids some of the shortest-lived allocations in the kernel:
StringImpl::create_uninitialized(unsigned long, char*&)
StringImpl::create(char const*, unsigned long, ShouldChomp)
StringBuilder::to_string() const
String::vformatted(StringView, TypeErasedFormatParams)
void Kernel::KBufferBuilder::appendff<unsigned int>(...)
JsonObjectSerializer<Kernel::KBufferBuilder>::add(..., unsigned int)
Kernel::procfs$all(Kernel::InodeIdentifier, ...) const
Kernel::procfs$all(Kernel::InodeIdentifier, Kernel::KBufferBuilder&)
This avoids allocations for initializing the Function<T>
for the NetworkAdapter::for_each callback argument.
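One common way to avoid such an allocation is to take the callback as a template parameter instead of a type-erased function object. The sketch below assumes that approach; it is not necessarily what this patch does, and Adapter and AdapterList are hypothetical.

```cpp
#include <utility>
#include <vector>

// Hypothetical adapter type for illustration only.
struct Adapter {
    int id;
};

class AdapterList {
public:
    void add(Adapter adapter) { m_adapters.push_back(adapter); }

    // The callable is invoked directly, so no type-erased wrapper
    // (and no heap allocation for its captures) is needed.
    template<typename Callback>
    void for_each(Callback&& callback) const
    {
        for (auto const& adapter : m_adapters)
            callback(adapter);
    }

private:
    std::vector<Adapter> m_adapters;
};
```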
Applying this patch decreases CPU utilization for NetworkTask
from 40% to 28% when receiving TCP packets at a rate of 100Mbit/s.
We already have another limit for the total number of packet buffers
allowed (max_packet_buffers). This second limit caused us to
repeatedly allocate and then free buffers.
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture. The following changes have been made:
- The watch_file syscall is removed, and in its place the
create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
all watched files.
Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
If the HPET main counter does not support full 64 bits, we should
not expect the upper 32 bits to work. This is a problem when writing
to the upper 32 bits of the comparator value, which requires the
TimerConfiguration::ValueSet bit to be set, but if the counter is not
64-bit capable then the bit will not be cleared, leaving it in a bad state.
Fixes #6990
This matches what other operating systems like Linux do:
$ ip route get 0.0.0.0
local 0.0.0.0 dev lo src 127.0.0.1 uid 1000
cache <local>
$ ssh 0.0.0.0
gunnar@0.0.0.0's password:
$ ss -na | grep :22 | grep ESTAB
tcp ESTAB 0 0 127.0.0.1:43118 127.0.0.1:22
tcp ESTAB 0 0 127.0.0.1:22 127.0.0.1:43118
When we receive a TCP packet with a sequence number that is not what
we expected, we have lost one or more packets. We can signal this to
the sender by sending a TCP ACK with the previous ack number so that
they can resend the missing TCP fragments.
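A minimal sketch of that receive-side decision, assuming a simplified state with a single expected sequence number. ReceiveState, AckDecision and handle_segment are hypothetical names, and real TCP state handling is considerably more involved.

```cpp
#include <cstdint>

// Hypothetical, simplified receive state.
struct ReceiveState {
    uint32_t next_expected_sequence { 0 };
};

struct AckDecision {
    bool deliver_payload; // whether the payload may be passed on to userspace
    uint32_t ack_number;  // ack number to send back to the peer
};

AckDecision handle_segment(ReceiveState& state, uint32_t sequence_number, uint32_t payload_size)
{
    if (sequence_number != state.next_expected_sequence) {
        // Unexpected segment: drop it and repeat the previous ack number so the
        // sender can retransmit starting from what we are still missing.
        return { false, state.next_expected_sequence };
    }
    // In-order segment: accept it and advance (unsigned arithmetic wraps modulo 2^32).
    state.next_expected_sequence += payload_size;
    return { true, state.next_expected_sequence };
}
```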
Previously we'd process TCP packets in whatever order we received
them in. In the case where packets arrived out of order we'd end
up passing garbage to the userspace process.
This was most evident for TLS connections:
courage:~ $ git clone https://github.com/SerenityOS/serenity
Cloning into 'serenity'...
remote: Enumerating objects: 178826, done.
remote: Counting objects: 100% (1880/1880), done.
remote: Compressing objects: 100% (907/907), done.
error: RPC failed; curl 56 OpenSSL SSL_read: error:1408F119:SSL
routines:SSL3_GET_RECORD:decryption failed or bad record mac, errno 0
error: 1918 bytes of body are still expected
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
When the MSS option header is missing the default maximum segment
size is 536 which results in lots of very small TCP packets that
NetworkTask has to handle.
This adds the MSS option header to outbound TCP SYN packets and
sets it to an appropriate value depending on the interface's MTU.
Note that we do not currently do path MTU discovery so this could
cause problems when hops don't fragment packets properly.
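As an illustration of how the value might be derived, the sketch below assumes IPv4 and TCP headers without options (20 bytes each) and encodes the option in the standard kind 2, length 4 layout. The function names are hypothetical and not taken from the patch.

```cpp
#include <array>
#include <cstdint>

// Assumed header sizes: IPv4 and TCP headers without any options.
constexpr uint16_t ipv4_header_size = 20;
constexpr uint16_t tcp_header_size = 20;

// Largest TCP payload that fits in one IP packet for the given MTU.
constexpr uint16_t mss_for_mtu(uint16_t mtu)
{
    return mtu > ipv4_header_size + tcp_header_size
        ? static_cast<uint16_t>(mtu - ipv4_header_size - tcp_header_size)
        : 0;
}

// TCP MSS option: kind = 2, length = 4, followed by the MSS in network byte order.
std::array<uint8_t, 4> encode_mss_option(uint16_t mss)
{
    return { { 2, 4, static_cast<uint8_t>(mss >> 8), static_cast<uint8_t>(mss & 0xff) } };
}
```

For a 1500 byte Ethernet MTU this gives an MSS of 1460, compared to the default of 536 when the option is absent.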
This increases the default TCP window size to a more reasonable
value of 64k. This allows TCP peers to send us more packets before
waiting for corresponding ACKs.
This increases the buffer size for connection-oriented sockets
to 256kB. In combination with the other patches in this series
I was able to receive TCP packets at a rate of about 120Mbps.
The get_dir_entries syscall failed if the serialized form of all the
directory entries together was too large to fit in its temporary buffer.
Now the kernel uses a fixed-size buffer that is flushed to an output
buffer when it is full. If this flushing operation fails because there
is not enough space available, the syscall will return -EINVAL. That
error code is then used in userspace as a signal to allocate a larger
buffer and retry the syscall.
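The userspace side of this protocol could look roughly like the sketch below. The DirEntriesCall signature, the initial size, the doubling strategy and the size cap are all assumptions made for illustration. Only the "-EINVAL means retry with a larger buffer" rule comes from the description above.

```cpp
#include <cerrno>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical wrapper around the syscall: returns the number of bytes written,
// or a negative errno value on failure.
using DirEntriesCall = std::function<long(char* buffer, std::size_t size)>;

long read_all_dir_entries(DirEntriesCall const& call, std::vector<char>& buffer,
    std::size_t max_size = 1024 * 1024)
{
    if (buffer.empty())
        buffer.resize(4096);
    for (;;) {
        long nread = call(buffer.data(), buffer.size());
        if (nread != -EINVAL)
            return nread; // success, or an error unrelated to the buffer size
        if (buffer.size() >= max_size)
            return -ENOMEM; // assumed cap so the loop always terminates
        buffer.resize(buffer.size() * 2);
    }
}
```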
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.
This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:
courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.
This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
Previously we'd incorrectly use the default gateway's MAC address.
Instead we must use destination MAC addresses that are derived from
the multicast IPv4 address.
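The standard mapping places the low 23 bits of the IPv4 multicast address behind the fixed 01:00:5e prefix. A small sketch of that derivation follows; the MACAddress alias and the function name are hypothetical, and the address is taken in host byte order.

```cpp
#include <array>
#include <cstdint>

using MACAddress = std::array<uint8_t, 6>;

// ipv4_host_order is the multicast address in host byte order,
// e.g. 224.0.0.251 is 0xE00000FB.
MACAddress multicast_mac_for(uint32_t ipv4_host_order)
{
    return { { 0x01, 0x00, 0x5e,
        static_cast<uint8_t>((ipv4_host_order >> 16) & 0x7f), // top bit of this octet is dropped
        static_cast<uint8_t>((ipv4_host_order >> 8) & 0xff),
        static_cast<uint8_t>(ipv4_host_order & 0xff) } };
}
```

For the mDNS group 224.0.0.251 this yields 01:00:5e:00:00:fb.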
With this patch applied I can query mDNS on a real network.
The Kernel/.gitignore file is a remnant of the prior build system,
where the kernel.map was written directly to the Kernel folder.
The run.sh was also under Kernel so pcap files and others would get
dropped there when running the system under qemu.
None of these situations are possible now, so let's get rid of it.
Instead of reading in the entire contents of a directory into a large
buffer, we can iterate block by block. This only requires a small
buffer.
Because directory entries are guaranteed to never span multiple blocks
we do not have to handle any edge cases related to that.
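A sketch of block-wise iteration, assuming an ext2-style entry layout (inode, record length, name length, file type, then the name). DirEntryHeader, BlockReader and for_each_entry are hypothetical and do not mirror the kernel's actual interfaces.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

// Assumed ext2-style on-disk entry header; the name bytes follow it directly.
struct DirEntryHeader {
    uint32_t inode;
    uint16_t record_length;
    uint8_t name_length;
    uint8_t file_type;
};

using BlockReader = std::function<bool(std::size_t block_index, uint8_t* out)>;
using EntryCallback = std::function<void(uint32_t inode, std::string const& name)>;

bool for_each_entry(std::size_t block_count, std::size_t block_size,
    BlockReader const& read_block, EntryCallback const& callback)
{
    // Only one block is held in memory at a time.
    std::vector<uint8_t> block(block_size);
    for (std::size_t i = 0; i < block_count; ++i) {
        if (!read_block(i, block.data()))
            return false;
        std::size_t offset = 0;
        while (offset + sizeof(DirEntryHeader) <= block_size) {
            DirEntryHeader header;
            std::memcpy(&header, block.data() + offset, sizeof(header));
            // Entries never span blocks, so a record must end inside this block.
            if (header.record_length < sizeof(header) || offset + header.record_length > block_size)
                return false;
            if (sizeof(header) + header.name_length > header.record_length)
                return false;
            if (header.inode != 0) { // inode 0 marks an unused entry
                auto const* name = reinterpret_cast<char const*>(block.data() + offset + sizeof(header));
                callback(header.inode, std::string(name, header.name_length));
            }
            offset += header.record_length;
        }
    }
    return true;
}
```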
In some cases, the FADT could be at the end of a page, so if we don't
have two pages being mapped, we could easily read from a non-mapped
virtual address, which will trigger the UB sanitizer.
Also, we need to treat the FADT structure as volatile and const, as it
may change at any time, but we should never write to it.
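To illustrate why a single page may not be enough, the sketch below computes how many pages a structure touches given its physical address and length. It assumes 4 KiB pages; pages_needed is a hypothetical helper, not the kernel's mapping API.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t page_size = 4096; // assumed page size

// Number of pages that must be mapped to cover [physical_address, physical_address + length).
constexpr std::size_t pages_needed(uintptr_t physical_address, std::size_t length)
{
    std::size_t offset_in_page = physical_address % page_size;
    return (offset_in_page + length + page_size - 1) / page_size;
}
```

A table that starts 16 bytes before a page boundary and is longer than 16 bytes needs two pages, even though it is far smaller than one page.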