Commit graph

790 commits

Author SHA1 Message Date
Daniel Bertalan
8e3d1a42e3 Kernel+UE+LibC: Store address as void* in SC_m{re,}map_params
Most other syscalls pass address arguments as `void*` instead of
`uintptr_t`, so let's do that here too. Besides improving consistency,
this commit makes `strace` correctly pretty-print these arguments in
hex.
2021-12-23 23:08:10 +01:00
Daniel Bertalan
77f9272aaf Kernel+UE: Add MAP_FIXED_NOREPLACE mmap() flag
This feature was introduced in version 4.17 of the Linux kernel, and
while it's not specified by POSIX, I think it will be a nice addition to
our system.

MAP_FIXED_NOREPLACE provides a less error-prone alternative to
MAP_FIXED: while regular fixed mappings would cause any intersecting
ranges to be unmapped, MAP_FIXED_NOREPLACE returns EEXIST instead. This
ensures that we don't corrupt our process's address space if something
is already at the requested address.

Note that the more portable way to do this is to use regular
MAP_ANONYMOUS, and check afterwards whether the returned address matches
what we wanted. This, however, has a large performance impact on
programs like Wine which try to reserve large portions of the address
space at once, as the non-matching addresses have to be unmapped
separately.
2021-12-23 23:08:10 +01:00
Andreas Kling
1d08b671ea Kernel: Enter new address space before destroying old in sys$execve()
Previously we were assigning to Process::m_space before actually
entering the new address space (assigning it to CR3.)

If a thread was preempted by the scheduler while destroying the old
address space, we'd then attempt to resume the thread with CR3 pointing
at a partially destroyed address space.

We could then crash immediately in write_cr3(), right after assigning
the new value to CR3. I am hopeful that this may have been the bug
haunting our CI for months. :^)
2021-12-23 01:18:26 +01:00
Daniel Bertalan
ce1bf3724e Kernel: Replace intersecting ranges in mmap when MAP_FIXED is specified
This behavior is mandated by POSIX and is used by software like Wine
after reserving large chunks of the address range.
2021-12-22 00:02:36 -08:00
Martin Bříza
86b249f02f Kernel: Implement sysconf(_SC_SYMLOOP_MAX)
Not much to say here, this is an implementation of this call that
accesses the actual limit constant that's used by the VirtualFileSystem
class.

As a side note, this is required for my eventual Qt port.
2021-12-21 12:54:11 -08:00
Liav A
5a649d0fd5 Kernel: Return EINVAL when specifying -1 for setuid and similar syscalls
For setreuid and setresuid syscalls, -1 means to set the current
uid/euid/gid/egid value, to be more convenient for programming.
However, for other syscalls where we pass only one argument, there's no
justification to specify -1.

This behavior is identical to how Linux handles the value -1, and is
influenced by the fact that the manual pages for the group of one
argument syscalls that handle ID operations is ambiguous about this
topic.
2021-12-20 11:32:16 +01:00
Hendiadyoin1
18013f3c06 Kernel: Remove a redundant check in Process::remap_range_as_stack
We already VERIFY that we have carved something out, so we don't need to
check that again.
2021-12-18 10:31:18 -08:00
Hendiadyoin1
2d28b441bf Kernel: Collapse a redundant boolean conditional return statement in …
validate_mmap_prot
2021-12-18 10:31:18 -08:00
Hendiadyoin1
f38d32535c Kernel: Access OpenFileDescriptions::max_open() statically in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
c860e0ab95 Kernel: Add implicit auto qualifiers in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
f5b495d92c Kernel: Remove else after return in Process::do_write 2021-12-18 10:31:18 -08:00
Andreas Kling
32aa623eff Kernel: Fix 4-byte uninitialized memory leak in sys$sigaltstack()
It was possible to extract 4 bytes of uninitialized kernel stack memory
on x86_64 by looking in the padding of stack_t.
2021-12-18 11:30:10 +01:00
Andreas Kling
0ae8702692 Kernel: Make File::stat() & friends return Error<struct stat>
Instead of making the caller provide a stat buffer, let's just return
one as a value.
2021-12-18 11:30:10 +01:00
Andreas Kling
abf2204402 Kernel: Use copy_typed_from_user() in more places :^) 2021-12-18 11:30:10 +01:00
Andreas Kling
39d9337db5 Kernel: Make sys${ftruncate,pread} take off_t as const pointer
These syscalls don't write back to the off_t value (unlike sys$lseek)
so let's take Userspace<off_t const*> instead of Userspace<off_t*>.
2021-12-18 11:30:10 +01:00
Jean-Baptiste Boric
23257cac52 Kernel: Remove sys$select() syscall
Now that the userland has a compatiblity wrapper for select(), the
kernel doesn't need to implement this syscall natively. The poll()
interface been around since 1987, any code still using select()
should be slapped silly.

Note: the SerenityOS source tree mostly uses select() and not poll()
despite SerenityOS having support for poll() since early 2019...
2021-12-12 21:48:50 +01:00
Jean-Baptiste Boric
2177c2a30b Kernel: Split off sys$poll() into Syscalls/poll.cpp 2021-12-12 21:48:50 +01:00
Idan Horowitz
762e047ec9 Kernel+LibC: Implement sigtimedwait()
This includes a new Thread::Blocker called SignalBlocker which blocks
until a signal of a matching type is pending. The current Blocker
implementation in the Kernel is very complicated, but cleaning it up is
a different yak for a different day.
2021-12-12 08:34:19 +02:00
Idan Horowitz
81a76a30a1 Kernel: Preserve pending signals across execve(2)s
As required by posix. Also rename Thread::clear_signals to
Thread::reset_signals_for_exec since it doesn't actually clear any
pending signals, but rather does execve related signal book-keeping.
2021-12-12 08:34:19 +02:00
Idan Horowitz
0ca1231d8f Kernel: Inherit alternative signal stack on fork(2)
A child process created via fork(2) inherits a copy of its parent's
alternate signal stack settings.
2021-12-12 08:34:19 +02:00
Idan Horowitz
92a6c91f4e Kernel: Preserve signal mask across fork(2) and execve(2)
A child created via fork(2) inherits a copy of its parent's signal
mask; the signal mask is preserved across execve(2).
2021-12-12 08:34:19 +02:00
Ben Wiederhake
0e6e1092f0 Kernel: Make ptrace return an error on error
Returning 'result.error().code()' erroneously creates an
ErrorOr<FlatPtr> of the positive errno code, which breaks our
error-returning convention.

This seems to be due to a forgotten minus-sign during the refactoring in
9e51e295cf. This latent bug was never
discovered, because currently the error-handling paths are rarely
exercised.
2021-12-05 22:59:09 +01:00
Ben Wiederhake
0f8483f09c Kernel: Implement new ptrace function PT_PEEKBUF
This enables the tracer to copy large amounts of data in a much saner
way.
2021-12-05 22:59:09 +01:00
Ben Wiederhake
3e223185b3 Kernel+strace: Remove unnecessary indirection for PEEK
Also, remove incomplete, superfluous check.
Incomplete, because only the byte at the provided address was checked;
this misses the last bytes of the "jerk page".
Superfluous, because it is already correctly checked by peek_user_data
(which calls copy_from_user).

The caller/tracer should not typically attempt to read non-userspace
addresses, we don't need to "hot-path" it either.
2021-12-05 22:59:09 +01:00
Idan Horowitz
265764ff2f Kernel: Add support for the POLLWRBAND poll event 2021-12-05 12:53:29 +01:00
Idan Horowitz
f415218afe Kernel+LibC: Implement sigaltstack()
This is required for compiling wine for serenity
2021-12-01 21:44:11 +02:00
Idan Horowitz
d5d0eb45bf Kernel: Clear up some comments in the sys$mprotect implementation 2021-12-01 21:44:11 +02:00
Idan Horowitz
f27bbec7b2 Kernel: Move incorrect early return in sys$mprotect
Since we're iterating over multiple regions that interesect with the
requested range, just one of them having the requested access flags
is not enough to finish the syscall early.
2021-12-01 21:44:11 +02:00
Idan Horowitz
4ca39c7110 Kernel: Move the expand_range_to_page_boundaries helper to MemoryManager
This helper can (and will) be used in more parts of the kernel besides
the mmap-family of syscalls.
2021-12-01 21:44:11 +02:00
Idan Horowitz
fc13d0782f LibC: Make the madvise advice field a value instead of a bitfield
The advices are almost always exclusive of one another, and while POSIX
does not define madvise, most other unix-like and *BSD systems also only
accept a singular value per call.
2021-12-01 21:44:11 +02:00
Hendiadyoin1
c7b90fa7d3 Kernel: Don't rewrite the whole file on sys$msync 2021-12-01 09:47:46 +01:00
Hendiadyoin1
259f78545a Kernel: Allow flushing of partial regions in sys$msync 2021-12-01 09:47:46 +01:00
Hendiadyoin1
49d6ad6633 Kernel: Handle more error cases in sys$msync 2021-12-01 09:47:46 +01:00
Ben Wiederhake
33079c8ab9 Kernel+UE+LibC: Remove unused dbgputch syscall
Everything uses the dbgputstr syscall anyway, so there is no need to
keep supporting it.
2021-11-24 22:56:39 +01:00
Jelle Raaijmakers
46ad5f2a17 Kernel: Fix futex syscall return values
We were returning `int`s from two functions that caused `ErrorOr` to
not recognize the error codes as a special case. For example,
`ETIMEDOUT` was returned as the positive number 66 resulting in all
kinds of defective behavior.

As a result, SDL2's timer subsystem was not working at all, since the
`SDL_MUTEX_TIMEDOUT` value was never returned.
2021-11-24 19:44:57 +01:00
Andreas Kling
dd6e73176d Kernel: Make sys$mmap() interpret 0-alignment as page-sized alignment
This allows userspace to get a sane default behavior without having to
specify the page size.
2021-11-23 11:44:42 +01:00
Andreas Kling
f99af1bef0 Kernel: Make sure OpenFileDescription is kept alive while read() blocks
It's not safe to store OpenFileDescription in a raw pointer when
blocking, since another thread may decide to close the corresponding
file descriptor.
2021-11-21 20:22:48 +01:00
Andreas Kling
f2c3a41a8f Kernel: Make UserOrKernelBuffer::for_user_buffer() return ErrorOr<T>
This simplifies EFAULT propagation with TRY(). :^)
2021-11-21 20:22:48 +01:00
Itamar
38ddf301f6 Kernel+LibC: Fix ptrace for 64-bit
This makes the types used in the PT_PEEK and PT_POKE actions
suitable for 64-bit platforms as well.
2021-11-20 21:22:24 +00:00
Andreas Kling
e08d213830 Kernel: Use DistinctNumeric for filesystem ID's
This patch adds the FileSystemID type, which is a distinct u32.
This prevents accidental conversion from arbitrary integers.
2021-11-18 21:11:30 +01:00
Andreas Kling
32aa37d5dc Kernel+LibC: Add msync() system call
This allows userspace to trigger a full (FIXME) flush of a shared file
mapping to disk. We iterate over all the mapped pages in the VMObject
and write them out to the underlying inode, one by one. This is rather
naive, and there's lots of room for improvement.

Note that shared file mappings are currently not possible since mmap()
returns ENOTSUP for PROT_WRITE+MAP_SHARED. That restriction will be
removed in a subsequent commit. :^)
2021-11-17 19:34:15 +01:00
Andrew Kaster
f1d8978804 AK+Kernel: Remove implicit conversion from Userspace<T*> to FlatPtr
This feels like it was a refactor transition kind of conversion. The
places that were relying on it can easily be changed to explicitly ask
for the ptr() or a new vaddr() method on Userspace<T*>.

FlatPtr can still implicitly convert to Userspace<T> because the
constructor is not explicit, but there's quite a few more places that
are relying on that conversion.
2021-11-16 00:13:22 +01:00
Andrew Kaster
194456efdc Kernel: Remove unnecessary StringBuilder from sys$create_thread()
A series of refactors changed Threads to always have a name, and to
store their name as a KString. Before the refactors a StringBuilder was
used to format the default thread name for a non-main thread, but it is
since unused. Remove it and the AK/String related header includes from
the thread syscall implementation file.
2021-11-16 00:13:22 +01:00
Daniel Bertalan
648a139af3 Kernel+LibC: Pass off_t to pread() via a pointer
`off_t` is a 64-bit signed integer, so passing it in a register on i686
is not the best idea.

This fix gets us one step closer to making the LLVM port work.
2021-11-13 10:04:46 +01:00
Andreas Kling
88b6428c25 AK: Make Vector::try_* functions return ErrorOr<void>
Instead of signalling allocation failure with a bool return value
(false), we now use ErrorOr<void> and return ENOMEM as appropriate.
This allows us to use TRY() and MUST() with Vector. :^)
2021-11-10 21:58:58 +01:00
Ben Wiederhake
ad5061bb7a Kernel: Make (f)statvfs report filesystem ID correctly 2021-11-10 16:13:10 +01:00
Ben Wiederhake
631447da57 Kernel: Fix TOCTOU in fstatvfs
In particular, fstatvfs used to assume that a file that was earlier
opened using some path will forever be at that path. This is wrong, and
in the meantime new mounts and new filesystems could take up the
filename or directories, leading to a completely inaccurate result.
This commit improves the situation:
- All filesystem information is now always accurate.
- The mount flags *might* be erroneously zero, if the custody for the
  open file is not available. I don't know when that might happen, but
  it is definitely not the typical case.
2021-11-10 16:13:10 +01:00
Andreas Kling
79fa9765ca Kernel: Replace KResult and KResultOr<T> with Error and ErrorOr<T>
We now use AK::Error and AK::ErrorOr<T> in both kernel and userspace!
This was a slightly tedious refactoring that took a long time, so it's
not unlikely that some bugs crept in.

Nevertheless, it does pass basic functionality testing, and it's just
real nice to finally see the same pattern in all contexts. :^)
2021-11-08 01:10:53 +01:00
Brian Gianforcaro
9f6eabd73a Kernel: Move TTY subsystem to use KString instead of AK::String
This is minor progress on removing the `AK::String` API from the Kernel
in the interest of improving OOM safety.
2021-11-02 11:34:31 +01:00
Ben Wiederhake
c05c5a7ff4 Kernel: Clarify ambiguous {File,Description}::absolute_path
Found due to smelly code in InodeFile::absolute_path.

In particular, this replaces the following misleading methods:

File::absolute_path
This method *never* returns an actual path, and if called on an
InodeFile (which is impossible), it would VERIFY_NOT_REACHED().

OpenFileDescription::try_serialize_absolute_path
OpenFileDescription::absolute_path
These methods do not guarantee to return an actual path (just like the
other method), and just like Custody::absolute_path they do not
guarantee accuracy. In particular, just renaming the method made a
TOCTOU bug obvious.

The new method signatures use KResultOr, just like
try_serialize_absolute_path() already did.
2021-10-31 12:06:28 +01:00