Commit graph

927 commits

Author SHA1 Message Date
Timon Kruiper
a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Liav A
58acdce41f Kernel/FileSystem: Simplify even more the mount syscall
As with the previous commit, we put a distinction between filesystems
that require a file description and those which don't, but now in a much
more readable mechanism - all initialization properties as well as the
create static method are grouped to create the FileSystemInitializer
structure. Then when we need to initialize an instance, we iterate over
a table of these structures, checking for matching structure and then
validating the given arguments from userspace against the requirements
to ensure we can create a valid instance of the requested filesystem.
2022-05-29 19:31:02 +01:00
Liav A
4c588441e3 Kernel: Simplify mount syscall flow for regular calls
We do this by putting a distinction between two types of filesystems -
the first type is backed in RAM, and includes TmpFS, ProcFS, SysFS,
DevPtsFS and DevTmpFS. Because these filesystems are backed in RAM,
trying to mount them doesn't require source open file description.
The second type is filesystems that are backed by a file, therefore the
userspace program has to open them (hence it has a open file description
on them) and provide the appropriate source open file description.
By putting this distinction, we can early check if the user tried to
mount the second type of filesystems without a valid file description,
and fail with EBADF then.
Otherwise, we can proceed to either mount either type of filesystem,
provided that the fs_type is valid.
2022-05-29 19:31:02 +01:00
Peter Elliott
f6943c85b0 Kernel: Fix EINVAL when mmaping with address and no MAP_FIXED
The current behavior accidently trys to allocate 0 bytes when a non-null
address is provided and MAP_FIXED is specified. This is clearly a bug.
2022-05-23 00:13:26 +02:00
Ariel Don
8a854ba309 Kernel+LibC: Implement futimens(3)
Implement futimes() in terms of utimensat(). Now, utimensat() strays
from POSIX compliance because it also accepts a combination of a file
descriptor of a regular file and an empty path. utimensat() then uses
this file descriptor instead of the path to update the last access
and/or modification time of a file. That being said, its prior behavior
remains intact.

With the new behavior of utimensat(), `path` must point to a valid
string; given a null pointer instead of an empty string, utimensat()
sets `errno` to `EFAULT` and returns a failure.
2022-05-21 18:15:00 +02:00
Ariel Don
9a6bd85924 Kernel+LibC+VFS: Implement utimensat(3)
Create POSIX utimensat() library call and corresponding system call to
update file access and modification times.
2022-05-21 18:15:00 +02:00
Tim Schumacher
098af0f846 Kernel: Properly define IOV_MAX 2022-05-05 20:47:38 +02:00
Timon Kruiper
feba7bc8a8 Kernel: Move Kernel/Arch/x86/SafeMem.h to Kernel/Arch/SafeMem.h
The file does not contain any specific architectural code, thus it can
be moved to the Kernel/Arch directory.
2022-05-03 21:53:36 +02:00
Andrew Kaster
f08e91f67e Kernel: Don't check pledges or veil against code coverage data files
Coverage tools like LLVM's source-based coverage or GNU's --coverage
need to be able to write out coverage files from any binary, regardless
of its security posture. Not ignoring these pledges and veils means we
can't get our coverage data out without playing some serious tricks.

However this is pretty terrible for normal exeuction, so only skip these
checks when we explicitly configured userspace for coverage.
2022-05-02 01:46:18 +02:00
Andreas Kling
b85c8a0b80 Kernel: Add FIOCLEX and FIONCLEX ioctls
These allow you to turn the close-on-exec flag on/off via ioctl().
2022-04-26 14:32:12 +02:00
sin-ack
bc7c8879c5 Kernel+LibC+LibCore: Implement the unlinkat(2) syscall 2022-04-23 10:43:32 -07:00
Tim Schumacher
a1686db2de Kernel: Skip setting region name if none is given to mmap
This keeps us from accidentally overwriting an already set region name,
for example when we are mapping a file (as, in this case, the file name
is already stored in the region).
2022-04-12 01:52:21 +02:00
Idan Horowitz
e84bbfed44 Kernel: Remove big lock from sys$mkdir
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
165a23b68c Kernel: Remove big lock from sys$rename
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
5c064d3e8e Kernel: Remove big lock from sys$rmdir
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
d4ce43cf45 Kernel: Remove big lock from sys$statvfs
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
4ae93179f1 Kernel: Remove big lock from sys$symlink
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
1474b18070 Kernel: Remove big lock from sys$link
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
fa360f7d88 Kernel: Remove big lock from sys$unlink
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
5a96260e25 Kernel: Remove big lock from sys$setsockopt
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
c2372242b1 Kernel: Remove big lock from sys$getsockopt
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
849c227f72 Kernel: Remove big lock from sys$shutdown
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
e620487b66 Kernel: Remove big lock from sys$connect
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
9547a8e8a2 Kernel: Remove big lock from sys$close
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
0297349922 Kernel: Remove big lock from sys$chown
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
8458313e8a Kernel: Remove big lock from sys$fchown
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Idan Horowitz
f986a3b886 Kernel: Remove big lock from sys$bind
This syscall doesn't access any unprotected shared data.
2022-04-09 23:46:02 +02:00
Luke Wilde
1682b0b6d8 Kernel: Remove big lock from sys$set_coredump_metadata
The only requirement for this syscall is to make
Process::m_coredump_properties SpinlockProtected.
2022-04-09 21:51:16 +02:00
Jelle Raaijmakers
cc411b328c Kernel: Remove big lock from sys$accept4
The only thing we needed to check is whether `socket.accept()` returns
a socket, and if not, we go back to blocking again.
2022-04-09 17:53:18 +02:00
Andreas Kling
9b9b05eabf Kernel: Make sys$mmap() round requested VM size to page size multiple
This fixes an issue where File::mmap() overrides would fail because they
were expecting to be called with a size evenly divisible by PAGE_SIZE.
2022-04-05 22:26:37 +02:00
Andreas Kling
e3e1d79a7d Kernel: Remove unused ShouldDeallocateVirtualRange parameters
Since there is no separate virtual range allocator anymore, this is
no longer used for anything.
2022-04-05 01:15:22 +02:00
Andreas Kling
63ddbaf68a Kernel: Tweak broken dbgln_if() in sys$fork() after RegionTree changes 2022-04-04 11:05:49 +02:00
Andreas Kling
12b612ab14 Kernel: Mark sys$adjtime() as not needing the big lock
This syscall works on global kernel state and so doesn't need protection
from threads in the same process.
2022-04-04 00:42:18 +02:00
Andreas Kling
4306422f29 Kernel: Mark sys$clock_settime() as not needing the big log
This syscall ends up disabling interrupts while changing the time,
and the clock is a global resource anyway, so preventing threads in the
same process from running wouldn't solve anything.
2022-04-04 00:42:18 +02:00
Andreas Kling
55814f6e0e Kernel: Mark sys$sched_{set,get}param() as not needing the big lock
Both of these syscalls take the scheduler lock while accessing the
thread priority, so there's no reliance on the process big lock.
2022-04-04 00:42:18 +02:00
Andreas Kling
9250ac0c24 Kernel: Randomize non-specific VM allocations done by sys$execve()
Stuff like TLS regions, main thread stacks, etc. All deserve to be
randomized unless the ELF requires specific placement. :^)
2022-04-04 00:42:18 +02:00
Andreas Kling
36d829b97c Kernel: Mark sys$listen() as not needing the big lock
This syscall already performs the necessary locking and so doesn't
need to rely on the process big lock.
2022-04-03 22:22:22 +02:00
Andreas Kling
e103c5fe2d Kernel: Don't hog file descriptor table lock in sys$bind()
We don't need to hold the lock across the entire syscall. Once we've
fetched the open file description we're interested in, we can let go.
2022-04-03 22:20:34 +02:00
Andreas Kling
85ceab1fec Kernel: Don't hog file descriptor table lock in sys$listen()
We don't need to hold the lock across the entire syscall. Once we've
fetched the open file description we're interested in, we can let go.
2022-04-03 22:18:57 +02:00
Andreas Kling
bc4282c773 Kernel: Mark sys$sendfd() and sys$recvfd() as not needing the big lock
These syscalls already perform the necessary locking and don't rely on
the process big lock.
2022-04-03 22:06:03 +02:00
Andreas Kling
858b196c59 Kernel: Unbreak ASLR in the new RegionTree world
Functions that allocate and/or place a Region now take a parameter
that tells it whether to randomize unspecified addresses.
2022-04-03 21:51:58 +02:00
Andreas Kling
07f3d09c55 Kernel: Make VM allocation atomic for userspace regions
This patch move AddressSpace (the per-process memory manager) to using
the new atomic "place" APIs in RegionTree as well, just like we did for
MemoryManager in the previous commit.

This required updating quite a few places where VM allocation and
actually committing a Region object to the AddressSpace were separated
by other code.

All you have to do now is call into AddressSpace once and it'll take
care of everything for you.
2022-04-03 21:51:58 +02:00
Andreas Kling
ffe2e77eba Kernel: Add Memory::RegionTree to share code between AddressSpace and MM
RegionTree holds an IntrusiveRedBlackTree of Region objects and vends a
set of APIs for allocating memory ranges.

It's used by AddressSpace at the moment, and will be used by MM soon.
2022-04-03 21:51:58 +02:00
Andreas Kling
02a95a196f Kernel: Use AddressSpace region tree for range allocation
This patch stops using VirtualRangeAllocator in AddressSpace and instead
looks for holes in the region tree when allocating VM space.

There are many benefits:

- VirtualRangeAllocator is non-intrusive and would call kmalloc/kfree
  when used. This new solution is allocation-free. This was a source
  of unpleasant MM/kmalloc deadlocks.

- We consolidate authority on what the address space looks like in a
  single place. Previously, we had both the range allocator *and* the
  region tree both being used to determine if an address was valid.
  Now there is only the region tree.

- Deallocation of VM when splitting regions is no longer complicated,
  as we don't need to keep two separate trees in sync.
2022-04-03 21:51:58 +02:00
Andreas Kling
2617adac52 Kernel: Store AddressSpace memory regions in an IntrusiveRedBlackTree
This means we never need to allocate when inserting/removing regions
from the address space.
2022-04-03 21:51:58 +02:00
Tim Schumacher
4ba39c8d63 Kernel: Implement f_basetype in statvfs 2022-04-03 19:15:14 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Ali Mohammad Pur
d6ce3e63e2 Kernel: Disallow elevating pledge promises with no_error set
8233da3398 introduced a not-so-subtle bug
where an application with an existing pledge set containing `no_error`
could elevate its pledge set by pledging _anything_, this commit makes
sure that no new promise is accepted.
2022-03-29 12:11:56 +02:00
Ali Mohammad Pur
8233da3398 Kernel: Add a 'no_error' pledge promise
This makes pledge() ignore promises that would otherwise cause it to
fail with EPERM, which is very useful for allowing programs to run under
a "jail" so to speak, without having them termiate early due to a
failing pledge() call.
2022-03-26 21:34:56 +04:30
Liav A
b5ef900ccd Kernel: Don't assume paths of TTYs and pseudo terminals anymore
The obsolete ttyname and ptsname syscalls are removed.
LibC doesn't rely on these anymore, and it helps simplifying the Kernel
in many places, so it's an overall an improvement.

In addition to that, /proc/PID/tty node is removed too as it is not
needed anymore by userspace to get the attached TTY of a process, as
/dev/tty (which is already a character device) represents that as well.
2022-03-22 20:26:05 +01:00