Small position independent code model (which we end up using after this
change) is suitable for us since the kernel is not expected to grow more
than 2Gb in size. This might be a bit risky since this model is not
mentioned anywhere except for System V ABI document but experiments show
that the kernel compiled with this change works just fine.
This otherwise caused a race condition between the signal dispatcher
(which sets sepc to the signal trampoline) and sepc being updated in the
trap handler.
We obviously have to keep the sepc set by the signal dispatcher and not
increment it afterwards.
While this clutters Process.cpp a tiny bit, I feel that it's worth it:
- 2x speed on the kcov_loop benchmark. Likely more during fuzzing.
- Overall code complexity is going down with this change.
- By reducing the code reachable from __sanitizer_cov_trace_pc code,
we can now instrument more code.
Sticking this to the function source has multiple benefits:
- We instrument more code, by not excluding entire files.
- NO_SANITIZE_COVERAGE can be used in Header files.
- Keeping the info with the source code, means if a function or
file is moved around, the NO_SANITIZE_COVERAGE moves with it.
This reverts commit 9dbec601b0.
For KCOV to be performant (or at least not even slower) we need to
mmap the PC buffer from both user and kernel space at the same time.
You can't mmap a character device, so this change didn't make sense.
Plus even if we did invent a new method to exfiltrate the coverage
information out of the kernel, it would be incompatible with existing
kernel fuzzers. That would be kind of annoying. 🙃
This simple delay loop uses the time CSR to wait for the given amount
of time. The tick frequency of the CSR is read from the
/cpus/timebase-frequency devicetree property.
Instead, rewrite the region page fault handling code to not use
PageFault::type() on RISC-V.
I split Region::handle_fault into having a RISC-V-specific
implementation, as I am not sure if I cover all page fault handling edge
cases by solely relying on MM's own region metadata.
We should probably also take the processor-provided page fault reason
into account, if we decide to merge these two implementations in the
future.
This commit also removes the unnecessary user_sp RegisterState member.
We never use the kernel stack pointer on entry, so we can simply always
store the stack pointer of the previous privilege mode in sp.
Also remove the sp member from mcontext, as RISC-V doesn't have a
dedicated stack pointer register.
sp is defined to be x2 (x[1] in our case) by the ABI.
I probably accidentally included sp while copying the struct from
aarch64.
sepc has to be incremented before the call to syscall_handler,
as we otherwise would return to the ecall instruction, resulting in an
infinite trap loop.
We can't increment it after syscall_handler, as sepc might get changed
while handling the syscall.
The value of this field is incremented by one, as a value of 0 for this
field means 1 entry supported.
A value of 0xffff for CAP.MQES would incorrectly by truncated to 0x0000,
if we don't increase the bit width of the return type.
RFC9293 states that from the TimeWait state the TCPSocket
should wait the MSL (2mins) for delayed segments to expire
so that their sequence numbers do not clash with a new
connection's sequence numbers using the same ip address
and port number. The wait also ensures the remote TCP peer
has received the ACK to their FIN segment.
Please be aware that I only have NIC with chip version 6 so
this is the only one that I have tested. Rest was implemented
via looking at Linux rtl8169 driver. Also thanks to IdanHo
for some initial work.
Either we mount from a loop device or other source, the user might want
to obfuscate the given source for security reasons, so this option will
ensure this will happen.
If passed during a mount, the source will be hidden when reading from
the /sys/kernel/df node.
Similarly to DevPtsFS, this filesystem is about exposing loop device
nodes easily in /dev/loop, so userspace doesn't need to do anything in
order to use new devices immediately.
Instead of returning a raw pointer, which could be technically invalid
when using it in the caller function, we return a valid RefPtr of such
device.
This ensures that the code in DevPtsFS is now safe from a rare race
condition in which the SlavePTY device is gone but we still have a
pointer to it.
This device is a block device that allows a user to effectively treat an
Inode as a block device.
The static construction method is given an OpenFileDescription reference
but validates that:
- The description has a valid custody (so it's not some arbitrary file).
Failing this requirement will yield EINVAL.
- The description custody points to an Inode which is a regular file, as
we only support (seekable) regular files. Failing this requirement
will yield ENOTSUP.
LoopDevice can be used to mount a regular file on the filesystem like
other supported types of (physical) block devices.
Instead of using C-arrays, and manually counting their lengths, use
AK::Array. And pass these arrays around as spans, instead of as pointer-
and-length pairs.
Such operation is almost equivalent to writing on an Inode, so lock the
Inode m_inode_lock exclusively.
All FileSystem Inode implementations then override a new method called
truncate_locked which should implement the actual truncating.
thread_context_first_enter reuses the context restoring code in the
trap handler, just like other arches already do.
The `ld x2, 1*8(sp)` is unnecessary in the trap handler, as the stack
pointer should be equal to the stack pointer slot in the RegisterState
if the trap is from supervisor mode (and we currently don't support
user traps).
This load will however make us unable to reuse that code for
thread_context_first_enter.
This commit adds two functions which save/restore the entire FPU state.
On RISC-V, you only need to save the floating pointer registers
themselves and the fcsr CSR, which contains the entire state of the F/D
extensions.