Commit graph

1027 commits

Author SHA1 Message Date
Brian Gianforcaro
e8ec1e908d Kernel: Only instantiate main_program_metadata in the scope it's needed
pvs-studio flagged this as a potential perf optimization.
2021-09-16 17:17:13 +02:00
Andreas Kling
b6efd66d56 Kernel: Use move semantics in sys$sendfd()
Avoid an unnecessary NonnullRefPtr<OpenFileDescription> copy.
2021-09-15 21:09:47 +02:00
Liav A
8d0dbdeaac Kernel+Userland: Introduce a new way to reboot and poweroff the machine
This change removes the halt and reboot syscalls, and create a new
mechanism to change the power state of the machine.
Instead of how power state was changed until now, put a SysFS node as
writable only for the superuser, that with a defined value, can result
in either reboot or poweroff.
In the future, a power group can be assigned to this node (which will be
the GroupID responsible for power management).

This opens an opportunity to permit to shutdown/reboot without superuser
permissions, so in the future, a userspace daemon can take control of
this node to perform power management operations without superuser
permissions, if we enforce different UserID/GroupID on that node.
2021-09-12 11:52:16 +02:00
Liav A
9132596b8e Kernel: Move ACPI and BIOS code into the new Firmware directory
This will somwhat help unify them also under the same SysFS directory in
the commit.
Also, it feels much more like this change reflects the reality that both
ACPI and the BIOS are part of the firmware on x86 computers.
2021-09-12 11:52:16 +02:00
TheFightingCatfish
a81b21c1a7 Kernel+LibC: Implement fsync 2021-09-12 11:24:02 +02:00
Liav A
04ba31b8c5 Kernel+Userland: Remove loadable kernel moduless
These interfaces are broken for about 9 months, maybe longer than that.
At this point, this is just a dead code nobody tests or tries to use, so
let's remove it instead of keeping a stale code just for the sake of
keeping it and hoping someone will fix it.

To better justify this, I read that OpenBSD removed loadable kernel
modules in 5.7 release (2014), mainly for the same reason we do -
nobody used it so they had no good reason to maintain it.
Still, OpenBSD had LKMs being effectively working, which is not the
current state in our project for a long time.
An arguably better approach to minimize the Kernel image size is to
allow dropping drivers and features while compiling a new image.
2021-09-11 19:05:00 +02:00
Linus Groh
f646d49ac1 Kernel: Add _SC_HOST_NAME_MAX 2021-09-11 00:28:39 +02:00
Andreas Kling
dd82f68326 Kernel: Use KString all the way in sys$execve()
This patch converts all the usage of AK::String around sys$execve() to
using KString instead, allowing us to catch and propagate OOM errors.

It also required changing the kernel CommandLine helper class to return
a vector of KString for the userspace init program arguments.
2021-09-09 21:25:10 +02:00
Andreas Kling
524ef5e475 Kernel: Add KBuffer::bytes() and use it
(Instead of hand-wrapping { data(), size() } in a bunch of places.)
2021-09-08 20:16:00 +02:00
Liav A
3d5ddbab74 Kernel: Rename DevFS => DevTmpFS
The current implementation of DevFS resembles the linux devtmpfs, and
not the traditional DevFS, so let's rename it to better represent the
direction of the development in regard to this filesystem.

The abbreviation for DevTmpFS is still "dev", because it doesn't add
value as a commandline option to make it longer.

In quick summary - DevFS in unix OSes is simply a static filesystem, so
device nodes are generated and removed by the kernel code. DevTmpFS
is a "modern reinvention" of the DevFS, so it is much more like a TmpFS
in the sense that not only it's stored entirely in RAM, but the userland
is responsible to add and remove devices nodes as it sees fit, and no
kernel code is directly being involved to keep the filesystem in sync.
2021-09-08 00:42:20 +02:00
Andreas Kling
a01b19c878 Kernel: Remove KBuffer::try_copy() in favor of try_create_with_bytes()
These were already equivalent, so let's only have one of them.
2021-09-07 16:22:29 +02:00
Andreas Kling
b300f9aa2f Kernel: Convert KBuffer::copy() => KBuffer::try_copy()
This was a weird KBuffer API that assumed failure was impossible.
This patch converts it to a modern KResultOr<NonnullOwnPtr<KBuffer>> API
and updates the two clients to the new style.
2021-09-07 15:36:39 +02:00
Andreas Kling
250b52d6e5 Kernel: Make KBuffer::try_create_with_bytes() return KResultOr 2021-09-07 15:22:24 +02:00
Andreas Kling
ed5d04b0ea Kernel: Use KResultOr and TRY() for FIFO 2021-09-07 13:58:16 +02:00
Andreas Kling
213b8868af Kernel: Rename file_description(fd) => open_file_description(fd)
To go with the class rename.
2021-09-07 13:53:14 +02:00
Andreas Kling
4a9c18afb9 Kernel: Rename FileDescription => OpenFileDescription
Dr. POSIX really calls these "open file description", not just
"file description", so let's call them exactly that. :^)
2021-09-07 13:53:14 +02:00
Andreas Kling
dbd639a2d8 Kernel: Convert much of sys$execve() to using KString
Make use of the new FileDescription::try_serialize_absolute_path() to
avoid String in favor of KString throughout much of sys$execve() and
its helpers.
2021-09-07 13:53:14 +02:00
Andreas Kling
226383f45b LibELF: Use StringView to carry temporary strings in auxiliary vector
Let's not force clients to provide a String.
2021-09-07 13:53:14 +02:00
Andreas Kling
a27c6f5226 Kernel: Avoid unnecessary String allocation in sys$statvfs() 2021-09-07 13:53:14 +02:00
Andreas Kling
55b0b06897 Kernel: Store process names as KString 2021-09-07 13:53:14 +02:00
Andreas Kling
fe2e25edad Kernel: Add a comment explaining an alternate path in Process::exec()
I had to look at this for a moment before I realized that sys$execve()
and the spawning of /bin/SystemServer at boot are taking two different
paths out of exec().

Add a comment to help the next person looking at it. :^)
2021-09-07 01:34:26 +02:00
Andreas Kling
5d06ab6531 Kernel: Fix file description leak in sys$execve()
Before this patch, we were leaking a ref on the open file description
used for the interpreter (the dynamic loader) in sys$execve().

This surfaced when adapting the syscall to use TRY(), since we were now
correctly transferring ownership of the interpreter to Process::exec()
and no longer holding on to a local copy of it (in `elf_result`).

Fixing the leak uncovered another problem. The interpreter description
would now get destroyed when returning from do_exec(), which led to a
kernel panic when attempting to acquire a mutex.

This happens because we're in a particularly delicate state when
returning from do_exec(). Everything is primed for the upcoming context
switch into the new executable image, and trying to block the thread
at this point will panic the kernel.

We fix this by destroying the interpreter description earlier in
do_exec(), at the point where we no longer need it.
2021-09-07 01:18:02 +02:00
Andreas Kling
e226400dd8 Kernel: Don't seek the program executable description in sys$execve()
The dynamic loader doesn't care if the kernel has moved the file
cursor around before it gains control.
2021-09-07 01:18:02 +02:00
Andreas Kling
f4624e4ee1 Kernel: Hoist allocation of main program FD in sys$execve()
When executing a dynamically linked program, we need to pass the main
program executable via a file descriptor to the dynamic loader.

Before this patch, we were allocating an FD for this purpose long after
it was safe to do anything fallible. If we were unable to allocate an
FD we would simply panic the kernel(!)

We now hoist the allocation so it can fail before we've committed to
a new executable.
2021-09-07 01:18:02 +02:00
Andreas Kling
b141bfe53b Kernel: Reorganize ELF loading so it can use TRY()
Due to the use of ELF::Image::for_each_program_header(), we were
previously unable to use TRY() in the ELF loading code (since the return
statement inside TRY() would only return from the iteration callback.)
2021-09-07 01:18:02 +02:00
Andreas Kling
e6929835d2 Kernel: Make copy_time_from_user() helpers use KResultOr<Time>
...and use TRY() for smooth error propagation everywhere.
2021-09-07 01:18:02 +02:00
Andreas Kling
274d535d0e Kernel: Use TRY() in sys$module_load() and sys$module_unload() 2021-09-06 20:23:08 +02:00
Andreas Kling
56a2594de7 Kernel: Make KString factories return KResultOr + use TRY() everywhere
There are a number of places that don't have an error propagation path
right now, so I've added FIXME's about that.
2021-09-06 19:25:36 +02:00
Andreas Kling
f16b9a691f Kernel: Rename ProcessPagingScope => ScopedAddressSpaceSwitcher 2021-09-06 18:56:51 +02:00
Andreas Kling
cd8d52e6ae Kernel: Improve API names for switching address spaces
- enter_space => enter_address_space
- enter_process_paging_scope => enter_process_address_space
2021-09-06 18:56:51 +02:00
Andreas Kling
298cd57fe7 Kernel: Allocate signal trampoline before committing to a sys$execve()
Once we commit to a new executable image in sys$execve(), we can no
longer return with an error to whoever called us from userspace.
We must make sure to surface any potential errors before that point.

This patch moves signal trampoline allocation before the commit.
A number of other things remain to be moved.
2021-09-06 18:56:51 +02:00
Andreas Kling
6863d015ec Kernel: Use TRY() more in sys$execve()
I just keep finding more and more places to make use of this. :^)
2021-09-06 18:56:51 +02:00
Andreas Kling
009ea5013d Kernel: Use TRY() in find_elf_interpreter_for_executable() 2021-09-06 18:56:51 +02:00
Andreas Kling
511ebffd94 Kernel: Improve find_elf_interpreter_for_executable() parameter names 2021-09-06 18:56:51 +02:00
Andreas Kling
645e29a88b Kernel: Don't turn I/O errors during sys$execve() into ENOEXEC
Instead, just propagate whatever the real error was.
2021-09-06 13:06:05 +02:00
Andreas Kling
84addef10f Kernel: Improve arguments retrieval error propagation in sys$execve()
Instead of turning any arguments related error into an EFAULT, we now
propagate the innermost error during arguments retrieval.
2021-09-06 13:06:05 +02:00
Andreas Kling
6e3381ac32 Kernel: Use KResultOr and TRY() for {Shared,Private}InodeVMObject 2021-09-06 13:06:05 +02:00
Andreas Kling
e3a716ceff Kernel: Make Memory::Region::map() return KResult
..and use TRY() at the call sites to propagate errors. :^)
2021-09-06 13:06:05 +02:00
Andreas Kling
7981422500 Kernel: Make Threads always have a name
We previously allowed Thread to exist in a state where its m_name was
null, and had to work around that in various places.

This patch removes that possibility and forces those who would create a
thread (or change the name of one) to provide a NonnullOwnPtr<KString>
with the name.
2021-09-06 13:06:05 +02:00
Andreas Kling
cda2b9e71c Kernel: Improvements to Custody absolute path serialization
- Renamed try_create_absolute_path() => try_serialize_absolute_path()
- Use KResultOr and TRY() to propagate errors
- Don't call this when it's only for debug logging
2021-09-06 13:06:05 +02:00
Andreas Kling
e3b063581e Kernel: Use TRY() in sys$alarm() 2021-09-06 13:06:05 +02:00
Andreas Kling
d34f2b643e Kernel: Tidy up Plan9FS construction a bit 2021-09-06 13:06:05 +02:00
Andreas Kling
36725228fa Kernel: Tidy up Ext2FS construction a bit 2021-09-06 13:06:05 +02:00
Andreas Kling
47bfbe343b Kernel: Tidy up SysFS construction
- Use KResultOr and TRY() to propagate errors
- Check for OOM errors
- Move allocation out of constructors

There's still a lot more to do here, as SysFS is still quite brittle
in the face of memory pressure.
2021-09-06 13:06:05 +02:00
Andreas Kling
788b91a65c Kernel: Tidy up DevFS construction and handle OOM errorso
- Use KResultOr and TRY() to propagate errors
- Check for OOM
- Move allocations out of the DevFS constructor
2021-09-06 13:06:05 +02:00
Andreas Kling
efe4e230ee Kernel: Tidy up DevPtsFS construction and handle OOM errors
- Use KResultOr and TRY() to propagate errors
- Check for OOM when creating new inodes
2021-09-06 13:06:05 +02:00
Andreas Kling
f2512071f2 Kernel: Use TRY() in sys$getrandom() 2021-09-06 02:36:21 +02:00
Andreas Kling
a8516681b7 Kernel: Tidy up TmpFS and TmpFSInode construction
- Use KResultOr<NonnullRefPtr<T>>
- Propagate errors
- Use TRY() at call sites
2021-09-06 02:36:21 +02:00
Andreas Kling
a994f11f10 Kernel: Make AddressSpace::add_region() return KResultOr<Region*>
This allows us to use TRY() in a few places.
2021-09-06 02:02:06 +02:00
Andreas Kling
75564b4a5f Kernel: Make kernel region allocators return KResultOr<NOP<Region>>
This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.

This patch doesn't attempt to address that of course. That's work for
our future selves.
2021-09-06 01:55:27 +02:00
Andreas Kling
f4a9a0d561 Kernel: Make VirtualRangeAllocator return KResultOr<VirtualRange>
This achieves two things:
- The allocator can report more specific errors
- Callers can (and now do) use TRY() :^)
2021-09-06 01:55:27 +02:00
Andreas Kling
98dc08fe56 Kernel: Use KResultOr better in ProcessGroup construction
This allows us to use TRY() more.
2021-09-06 01:55:27 +02:00
Andreas Kling
12d9a6c1fa Kernel: Use TRY() in sys$waitid() 2021-09-05 18:42:32 +02:00
Andreas Kling
c076d765c4 Kernel: Simplify sys$inode_watcher_remove_watch() a little bit 2021-09-05 18:41:28 +02:00
Andreas Kling
d912bfdf38 Kernel: Use TRY() in sys$create_inode_watcher() and friends 2021-09-05 18:41:01 +02:00
Andreas Kling
a9204510a4 Kernel: Make file description lookup return KResultOr
Instead of checking it at every call site (to generate EBADF), we make
file_description(fd) return a KResultOr<NonnullRefPtr<FileDescription>>.

This allows us to wrap all the calls in TRY(). :^)

The only place that got a little bit messier from this is sys$mount(),
and there's a whole bunch of things there in need of cleanup.
2021-09-05 18:36:13 +02:00
Andreas Kling
2d2ea05c97 Kernel: Use TRY() in sys$sethostname() 2021-09-05 18:22:18 +02:00
Andreas Kling
963f847579 Kernel: Use TRY() in sys$mount() 2021-09-05 18:20:57 +02:00
Andreas Kling
3580c5a72e Kernel: Use TRY() in sys$umount() 2021-09-05 18:18:23 +02:00
Andreas Kling
76f2596ce8 Kernel: Use TRY() in sys$set_process_name() 2021-09-05 18:17:06 +02:00
Andreas Kling
53aa01384d Kernel: Use TRY() in sys$set_coredump_metadata() 2021-09-05 18:15:42 +02:00
Andreas Kling
bfe4c84541 Kernel: Use TRY() in sys$create_thread() 2021-09-05 18:15:05 +02:00
Andreas Kling
257fa80312 Kernel: Use TRY() in sys$set_thread_name() 2021-09-05 18:15:05 +02:00
Andreas Kling
afc5bbd56b Kernel: Use TRY() in sys$write() 2021-09-05 18:15:05 +02:00
Andreas Kling
17933b193a Kernel: Use TRY() in sys$perf_register_string() 2021-09-05 18:15:05 +02:00
Andreas Kling
77b7a44691 Kernel: Use TRY() in sys$recvfd() 2021-09-05 18:15:05 +02:00
Andreas Kling
ea911bc125 Kernel: Use TRY() in sys$pledge() 2021-09-05 18:15:05 +02:00
Andreas Kling
95e74d1776 Kernel: Use TRY() even more in sys$mmap() and friends :^) 2021-09-05 18:15:05 +02:00
Andreas Kling
cf2c04eb13 Kernel: Use TRY() in sys$dbgputstr() 2021-09-05 18:15:05 +02:00
Andreas Kling
6dddd500bf Kernel: Use TRY() in sys$map_time_page() 2021-09-05 18:15:05 +02:00
Andreas Kling
d53c60fd9f Kernel: Use TRY() in sys$setkeymap() 2021-09-05 18:15:05 +02:00
Andreas Kling
1f475f7bbc Kernel: Use TRY() in sys$realpath() 2021-09-05 18:15:05 +02:00
Andreas Kling
4ea3dc77f0 Kernel: Use TRY() in sys$statvfs() 2021-09-05 18:15:05 +02:00
Andreas Kling
7efa742a39 Kernel: Use TRY() in sys$stat() 2021-09-05 17:58:08 +02:00
Andreas Kling
de2c1bc5c3 Kernel: Use TRY() in sys$unlink() 2021-09-05 17:56:40 +02:00
Andreas Kling
4e4b7c272c Kernel: Use TRY() in sys$readlink() 2021-09-05 17:55:43 +02:00
Andreas Kling
e0cf9152ca Kernel: Use TRY() in sys$rmdir() 2021-09-05 17:54:33 +02:00
Andreas Kling
780737de7a Kernel: Use TRY() in sys$rename() 2021-09-05 17:53:59 +02:00
Andreas Kling
c5d046c23a Kernel: Use TRY() in sys$utime() 2021-09-05 17:53:28 +02:00
Andreas Kling
789db813d3 Kernel: Use copy_typed_from_user<T> for fetching syscall parameters 2021-09-05 17:51:37 +02:00
Andreas Kling
48a0b31c47 Kernel: Make copy_{from,to}_user() return KResult and use TRY()
This makes EFAULT propagation flow much more naturally. :^)
2021-09-05 17:38:37 +02:00
Andreas Kling
9903f5c6ef Kernel: Use TRY() in sys$read() and sys$readv() 2021-09-05 16:25:40 +02:00
Andreas Kling
c1e18befe8 Kernel: Reorder sys$pipe() to fail more nicely
Try to do both FD allocations up front instead of interleaved between
assigning them to the descriptor table. This prevents us from failing
in the middle of setting up the pipes.
2021-09-05 16:25:40 +02:00
Andreas Kling
4483110990 Kernel: Use TRY() in sys$open() 2021-09-05 16:25:40 +02:00
Andreas Kling
b01e4b171d Kernel: Use TRY() in sys$mknod() 2021-09-05 16:25:40 +02:00
Andreas Kling
2658ab4a80 Kernel: Use TRY() in sys$mkdir() 2021-09-05 16:25:40 +02:00
Andreas Kling
e899ea459c Kernel: Use TRY() in sys$lseek() 2021-09-05 16:25:40 +02:00
Andreas Kling
f605cd04b6 Kernel: Use TRY() in sys$fork()
There's a lot of work to do on improving error propagation in the
fork system call. This just scratches the surface.
2021-09-05 16:25:40 +02:00
Andreas Kling
29bafec43b Kernel: Use is_error() in sys$fcntl() 2021-09-05 16:25:40 +02:00
Andreas Kling
c767e20f91 Kernel: Use TRY() in sys$chown() 2021-09-05 16:25:40 +02:00
Andreas Kling
24e8ad5ade Kernel: Use TRY() in sys$chmod() 2021-09-05 16:25:40 +02:00
Andreas Kling
aa4d5817af Kernel: Use TRY() in sys$ptrace() 2021-09-05 16:25:40 +02:00
Andreas Kling
4a2b0f6bec Kernel: Use TRY() in sys$access() 2021-09-05 16:25:40 +02:00
Andreas Kling
f8fba5f017 Kernel: Use TRY() in sys$socket() and friends 2021-09-05 16:25:40 +02:00
Andreas Kling
83fed5b2de Kernel: Tidy up Memory::AddressSpace construction
- Return KResultOr<T> in places
- Propagate errors
- Use TRY()
2021-09-05 15:13:20 +02:00
Andreas Kling
f2f5df793a Kernel: Use TRY() in sys$chdir() 2021-09-05 14:41:13 +02:00
Andreas Kling
8cd4879946 Kernel: Use TRY() in sys$link() and sys$symlink() 2021-09-05 14:38:28 +02:00
Andreas Kling
c902b3cb0d Kernel: Use TRY() in sys$anon_create() 2021-09-05 14:36:40 +02:00
Andreas Kling
4012099338 Kernel: Tidy up AnonymousFile construction a bit
- Rename create() => try_create()
- Use adopt_nonnull_ref_or_enomem()
2021-09-05 14:33:25 +02:00
Andreas Kling
9a1fdb523f Kernel: Use TRY() in sys$unveil() 2021-09-05 14:30:08 +02:00
Andreas Kling
36efecf3c3 Kernel: Use try in sys$mmap() and friends :^) 2021-09-05 14:27:41 +02:00
Andreas Kling
6bf901b414 Kernel: Use TRY() in sys$execve()
There are more opportunities to use TRY() here, but it will require
improvements to error propagation first.
2021-09-05 14:20:03 +02:00
Andreas Kling
68a6d4c30a Kernel: Tidy up InodeWatcher construction
- Rename create() => try_create()
- Use adopt_nonnull_ref_or_enomem()
2021-09-05 01:10:56 +02:00
Andreas Kling
ac85fdeb1c Kernel: Handle ProcessGroup allocation failures better
- Rename create* => try_create*
- Don't null out existing process group on allocation failure
2021-09-04 23:11:04 +02:00
Andreas Kling
12f820eb08 Kernel: Make Process::try_create() propagate errors better 2021-09-04 23:11:04 +02:00
Andreas Kling
5e2e17c38c Kernel: Rename Process::create() => try_create() 2021-09-04 23:11:03 +02:00
Daniel Bertalan
d7b6cc6421 Everywhere: Prevent risky implicit casts of (Nonnull)RefPtr
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
  `Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
  in LibIMAP/Client.cpp)

This, of course, allows gross violations of the type system, and makes
the need to type-check less obvious before downcasting. Luckily, while
adding the `static_ptr_cast`s, only two truly incorrect usages were
found; in the other instances, our casts just needed to be made
explicit.
2021-09-03 23:20:23 +02:00
Brian Gianforcaro
668c429900 Kernel: Convert UserOrKernelBuffer callbacks to use AK::Bytes 2021-09-01 18:06:14 +02:00
Brian Gianforcaro
f3baa5d8c9 Kernel: Convert random bytes interface to use AK::Bytes 2021-09-01 18:06:14 +02:00
Andrew Kaster
fcdd7aa990 Kernel: Only unlock Mutex once in execve when PT_TRACE_ME is enabled
Fixes a regression introduced in 70518e6. Fixes #9704.
2021-09-01 13:36:26 +02:00
Andreas Kling
5046a1fe38 Kernel: Ignore zero-sized PT_LOAD headers when loading ELF images 2021-08-31 16:46:16 +02:00
Andreas Kling
68bf6db673 Kernel: Rename Spinlock::is_owned_by_current_thread()
...to is_owned_by_current_processor(). As Tom pointed out, this is
much more accurate. :^)
2021-08-29 22:19:42 +02:00
Andreas Kling
0b4671add7 Kernel: {Mutex,Spinlock}::own_lock() => is_locked_by_current_thread()
Rename these API's to make it more clear what they are checking.
2021-08-29 12:53:11 +02:00
Andreas Kling
48a1a3c0ce Kernel: Rename LocalSocket::create_connected_pair() => try_*() 2021-08-29 01:33:15 +02:00
Andreas Kling
ae197deb6b Kernel: Strongly typed user & group ID's
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.

This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
2021-08-29 01:09:19 +02:00
Andreas Kling
59335bd8ea Kernel: Rename FileDescription::create() => try_create() 2021-08-29 01:09:19 +02:00
Andrew Kaster
54161bf5b4 Kernel: Acquire reference to waitee before trying to block in sys$waitid
Previously, we would try to acquire a reference to the all processes
lock or other contended resources while holding both the scheduler lock
and the thread's blocker lock. This could lead to a deadlock if we
actually have to block on those other resources.
2021-08-28 20:53:38 +02:00
Andrew Kaster
dea62fe93c Kernel: Guard the all processes list with a Spinlock rather than a Mutex
There are callers of processes().with or processes().for_each that
require interrupts to be disabled. Taking a Mutexe with interrupts
disabled is a recipe for deadlock, so convert this to a Spinlock.
2021-08-28 20:53:38 +02:00
Andrew Kaster
70518e69f4 Kernel: Unlock ptrace lock before entering a critical section in execve
While it might not be as bad to release a mutex while interrupts are
disabled as it is to acquire one, we don't want to mess with that.
2021-08-28 20:53:38 +02:00
Andrew Kaster
8e70b85215 Kernel: Don't disable interrupts in validate_inode_mmap_prot
There's no need to disable interrupts when trying to access an inode's
shared vmobject. Additionally, Inode::shared_vmobject() acquires a Mutex
which is a recipe for deadlock if acquired with interrupts disabled.
2021-08-28 20:53:38 +02:00
Brian Gianforcaro
485f51690d Kernel: Always observe the return value of Region::map and remap
We have seen cases where the map fails, but we return the region
to the caller, causing them to page fault later on when they touch
the region.

The fix is to always observe the return code of map/remap.
2021-08-25 00:18:42 +02:00
Andreas Kling
6c16bedd69 Kernel: Remove unnecessary FutexQueue::did_remove()
This was only ever called immediately after FutexQueue::try_remove()
to VERIFY() that the state looks exactly like it should after returning
from try_remove().
2021-08-23 00:02:09 +02:00
Andreas Kling
85546af417 Kernel: Rename Thread::BlockCondition to BlockerSet
This class represents a set of Thread::Blocker objects attached to
something that those blockers are waiting on.
2021-08-23 00:02:09 +02:00
Andreas Kling
bcd2025311 Everywhere: Core dump => Coredump
We all know what a coredump is, and it feels more natural to refer to
it as a coredump (most code already does), so let's be consistent.
2021-08-23 00:02:09 +02:00
Andreas Kling
1b9916439f Kernel: Make Processor::platform_string() return StringView 2021-08-23 00:02:09 +02:00
Andreas Kling
c922a7da09 Kernel: Rename ScopedSpinlock => SpinlockLocker
This matches MutexLocker, and doesn't sound like it's a lock itself.
2021-08-22 03:34:10 +02:00
Andreas Kling
55adace359 Kernel: Rename SpinLock => Spinlock 2021-08-22 03:34:10 +02:00
Idan Horowitz
cf271183b4 Kernel: Make Process::current() return a Process& instead of Process*
This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
2021-08-19 23:49:53 +02:00
Andreas Kling
961f727448 Kernel: Consolidate a bunch of i386/x86_64 code paths
Add some arch-specific getters and setters that allow us to merge blocks
that were previously specific to either ARCH(I386) or ARCH(X86_64).
2021-08-19 23:22:02 +02:00
Andreas Kling
4226b662cd Kernel+Userland: Remove global futexes
We only ever use private futexes, so it doesn't make sense to carry
around all the complexity required for global (cross-process) futexes.
2021-08-17 01:21:47 +02:00
sin-ack
0a18425cbb Kernel: Make Memory::Region allocation functions return KResultOr
This makes for some nicer handling of errors compared to checking an
OwnPtr for null state.
2021-08-15 15:41:02 +02:00
sin-ack
4bfd6e41b9 Kernel: Make Kernel::VMObject allocation functions return KResultOr
This makes for nicer handling of errors compared to checking whether a
RefPtr is null. Additionally, this will give way to return different
types of errors in the future.
2021-08-15 15:41:02 +02:00
Andreas Kling
1b739a72c2 Kernel+Userland: Remove chroot functionality
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
2021-08-15 12:44:35 +02:00
Andreas Kling
0f6f863382 Kernel: Convert remaining users of copy_string_from_user()
This patch replaces the remaining users of this API with the new
try_copy_kstring_from_user() instead. Note that we still convert to a
String for continued processing, and I've added FIXME about continuing
work on using KString all the way.
2021-08-15 12:44:35 +02:00
sin-ack
748938ea59 Kernel: Handle allocation failure in ProcFS and friends
There were many places in which allocation failure was noticed but
ignored.
2021-08-15 02:27:13 +02:00
Andreas Kling
7676edfb9b Kernel: Stop allowing implicit conversion from KResult to int
This patch removes KResult::operator int() and deals with the fallout.
This forces a lot of code to be more explicit in its handling of errors,
greatly improving readability.
2021-08-14 15:19:00 +02:00
Andreas Kling
d30d776ca4 Kernel: Make FileSystem::initialize() return KResult
This forced me to also come up with error codes for a bunch of
situations where we'd previously just panic the kernel.
2021-08-14 15:19:00 +02:00
Brian Gianforcaro
296452a981 Kernel: Make cloning of FileDescriptions OOM safe 2021-08-13 11:09:25 +02:00
Brian Gianforcaro
ed6d842f85 Kernel: Fix OOB read in sys$dbgputstr(..) during fuzzing
The implementation uses try_copy_kstring_from_user to allocate a kernel
string using, but does not use the length of the resulting string.
The size parameter to the syscall is untrusted, as try copy kstring will
attempt to perform a `safe_strlen(..)` on the user mode string and use
that value for the allocated length of the KString instead. The bug is
that we are printing the kstring, but with the usermode size argument.

During fuzzing this resulted in us walking off the end of the allocated
KString buffer printing garbage (or any kernel data!), until we stumbled
in to the KSym region and hit a fatal page fault.

This is technically a kernel information disclosure, but (un)fortunately
the disclosure only happens to the Bochs debug port, and or the serial
port if serial debugging is enabled. As far as I can tell it's not
actually possible for an untrusted attacker to use this to do something
nefarious, as they would need access to the host. If they have host
access then they can already do much worse things :^).
2021-08-13 11:08:11 +02:00
Brian Gianforcaro
40a942d28b Kernel: Remove char* versions of path argument / kstring copy methods
The only two paths for copying strings in the kernel should be going
through the existing Userspace<char const*>, or StringArgument methods.

Lets enforce this by removing the option for using the raw cstring APIs
that were previously available.
2021-08-13 11:08:11 +02:00
Brian Gianforcaro
5121e58d4a Kernel: Fix sys$dbgputstr(...) to take a char* instead of u8*
We always attempt to print this as a string, and it's defined as such in
LibC, so fix the signature to match.
2021-08-13 11:08:11 +02:00
Liav A
01b79910b3 Kernel/Process: Move protected values to the end of the object
The compiler can re-order the structure (class) members if that's
necessary, so if we make Process to inherit from ProcFSExposedComponent,
even if the declaration is to inherit first from ProcessBase, then from
ProcFSExposedComponent and last from Weakable<Process>, the members of
class ProcFSExposedComponent (including the Ref-counted parts) are the
first members of the Process class.

This problem made it impossible to safely use the current toggling
method with the write-protection bit on the ProcessBase members, so
instead of inheriting from it, we make its members the last ones in the
Process class so we can safely locate and modify the corresponding page
write protection bit of these values.

We make sure that the Process class doesn't expand beyond 8192 bytes and
the protected values are always aligned on a page boundary.
2021-08-12 20:57:32 +02:00
Andreas Kling
a29788e58a Kernel: Don't record sys$perf_event() if profiling is not enabled
If you want to record perf events, just enable profiling. This allows us
to add random perf events to programs without littering the file system
with perfcore files.
2021-08-12 00:03:40 +02:00
Andreas Kling
1e90a3a542 Kernel: Make sys$perf_register_string() generate the string ID's
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.

Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)
2021-08-12 00:03:39 +02:00
Andreas Kling
4657c79143 Kernel+LibC: Add sys$perf_register_string()
This syscall allows userspace to register a keyed string that appears in
a new "strings" JSON object in profile output.

This will be used to add custom strings to profile signposts. :^)
2021-08-12 00:03:39 +02:00
Gunnar Beutner
3322efd4cd Kernel: Fix kernel panic when blocking on the process' big lock
Another thread might end up marking the blocking thread as holding
the lock before it gets a chance to finish invoking the scheduler.
2021-08-10 22:33:50 +02:00
Andreas Kling
fdfc66db61 Kernel+LibC: Allow clock_gettime() to run without syscalls
This patch adds a vDSO-like mechanism for exposing the current time as
an array of per-clock-source timestamps.

LibC's clock_gettime() calls sys$map_time_page() to map the kernel's
"time page" into the process address space (at a random address, ofc.)
This is only done on first call, and from then on the timestamps are
fetched from the time page.

This first patch only adds support for CLOCK_REALTIME, but eventually
we should be able to support all clock sources this way and get rid of
sys$clock_gettime() in the kernel entirely. :^)

Accesses are synchronized using two atomic integers that are incremented
at the start and finish of the kernel's time page update cycle.
2021-08-10 19:21:16 +02:00
Andreas Kling
fa64ab26a4 Kernel+UserspaceEmulator: Remove unused sys$gettimeofday()
Now that LibC uses clock_gettime() to implement gettimeofday(), we can
get rid of this entire syscall. :^)
2021-08-10 13:01:39 +02:00
Andreas Kling
0a02496f04 Kernel/SMP: Change critical sections to not disable interrupts
Leave interrupts enabled so that we can still process IRQs. Critical
sections should only prevent preemption by another thread.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling
9babb92a4b Kernel/SMP: Make entering/leaving critical sections multi-processor safe
By making these functions static we close a window where we could get
preempted after calling Processor::current() and move to another
processor.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling
c94c15d45c Everywhere: Replace AK::Singleton => Singleton 2021-08-08 00:03:45 +02:00
Andreas Kling
15d033b486 Kernel: Remove unused Process pointer in Memory::AddressSpace
Nobody was using the back-pointer to the process, so let's lose it.
2021-08-08 00:03:45 +02:00
Idan Horowitz
9d21c79671 Kernel: Disable big process lock for sys$sync
This syscall doesn't touch any intra-process shared resources and only
calls VirtualFileSystem::sync, which is self-locking.
2021-08-07 15:30:26 +02:00
sin-ack
0d468f2282 Kernel: Implement a ISO 9660 filesystem reader :^)
This commit implements the ISO 9660 filesystem as specified in ECMA 119.
Currently, it only supports the base specification and Joliet or Rock
Ridge support is not present. The filesystem will normalize all
filenames to be lowercase (same as Linux).

The filesystem can be mounted directly from a file. Loop devices are
currently not supported by SerenityOS.

Special thanks to Lubrsi for testing on real hardware and providing
profiling help.

Co-Authored-By: Luke <luke.wilde@live.co.uk>
2021-08-07 15:21:58 +02:00
Andreas Kling
5acb7e4eba Kernel: Remove outdated FIXME about ProcessHandle
ProcessHandle hasn't been a thing since Process became ref-counted.
2021-08-07 12:29:26 +02:00
Jean-Baptiste Boric
08891e82a5 Kernel: Migrate process list locking to ProtectedValue
The existing recursive spinlock is repurposed for profiling only, as it
was shared with the process list.
2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
8554b66d09 Kernel: Make process list a singleton 2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric
626b99ce1c Kernel: Migrate hostname locking to ProtectedValue 2021-08-07 11:48:00 +02:00
Andreas Kling
f770b9d430 Kernel: Fix bad search-and-replace renames
Oops, I didn't mean to change every *Range* to *VirtualRange*!
2021-08-07 00:39:06 +02:00
Idan Horowitz
ad419a669d Kernel: Disable big process lock for sys$sysconf
This syscall only reads constant kernel globals, and as such does not
need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
efeb01e35f Kernel: Disable big process lock for sys$get_stack_bounds
This syscall only reads from the shared m_space field, but that field
is only over written to by Process::attach_resources, before the
process was initialized (aka, before syscalls can happen), by
Process::finalize which is only called after all the process' threads
have exited (aka, syscalls can not happen anymore), and by
Process::do_exec which calls all other syscall-capable threads before
doing so. Space's find_region_containing already holds its own lock,
and as such there's no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
d40038a04f Kernel: Disable big process lock for sys$gettimeofday
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::now so there's no need
to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
3ba2449058 Kernel: Disable big process lock for sys$clock_nanosleep
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::current_time so there's
no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
fbd848e6eb Kernel: Disable big process lock for sys$clock_gettime()
This syscall doesn't touch any intra-process shared resources and
reads the time via the atomic TimeManagement::current_time, so it
doesn't need to hold any lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
1a08694dfc Kernel: Disable big process lock for sys$getkeymap
This syscall only reads non process-related global values, and as such
doesn't need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
48325e2959 Kernel: Disable big process lock for sys$getrandom
This syscall doesn't touch any intra-process shared resources and
already holds the global kernel RNG lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
b1f4f6ee15 Kernel: Disable big process lock for sys$dbgputch
This syscall doesn't touch any intra-process shared resources and
already holds the global logging lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
c7ad4c6c32 Kernel: Disable big process lock for sys$dbgputstr
This syscall doesn't touch any intra-process shared resources and
already holds the global logging lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz
00818b8447 Kernel: Disable big process lock for sys$dump_backtrace()
This syscall only dumps the current thread's backtrace and as such
doesn't touch any shared intra-process resources.
2021-08-06 23:36:12 +02:00
Idan Horowitz
da0b7d1737 Kernel: Disable big process lock for sys$beep()
The PCSpeaker is global and not locked anyways, so there's no need for
mutual exclusion between threads in the same process.
2021-08-06 23:36:12 +02:00
Idan Horowitz
c3f668a758 Kernel: Make Process's m_promises & m_execpromises fields atomic
This is essentially free on x86 and allows us to not hold the big
process lock just to check the required promises for a syscall.
2021-08-06 23:36:12 +02:00
Andreas Kling
2cd8b21974 Kernel: Add convenience values to the Memory::Region::Access enum
Instead of `Memory::Region::Access::Read | Memory::Region::AccessWrite`
you can now say `Memory::Region::Access::ReadWrite`.
2021-08-06 22:25:00 +02:00
Andreas Kling
47bdd7c3a0 Kernel: Rename a very long enum to ShouldDeallocateVirtualRange
ShouldDeallocateVirtualMemoryVirtualRange was a bit on the long side.
2021-08-06 21:45:05 +02:00
Andreas Kling
208147c77c Kernel: Rename Process::space() => Process::address_space()
We commonly talk about "a process's address space" so let's nudge the
code towards matching how we talk about it. :^)
2021-08-06 14:05:58 +02:00
Andreas Kling
b7476d7a1b Kernel: Rename Memory::Space => Memory::AddressSpace 2021-08-06 14:05:58 +02:00
Andreas Kling
cd5faf4e42 Kernel: Rename Range => VirtualRange
...and also RangeAllocator => VirtualRangeAllocator.

This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
2021-08-06 14:05:58 +02:00
Andreas Kling
93d98d4976 Kernel: Move Kernel/Memory/ code into Kernel::Memory namespace 2021-08-06 14:05:58 +02:00
Andreas Kling
a1d7ebf85a Kernel: Rename Kernel/VM/ to Kernel/Memory/
This directory isn't just about virtual memory, it's about all kinds
of memory management.
2021-08-06 14:05:58 +02:00
Andreas Kling
3377cc74df Kernel: Use try_copy_kstring_from_user() in sys$mount() 2021-08-06 00:37:47 +02:00
Andreas Kling
33adc3a42d Kernel: Store coredump metadata properties as KStrings
This patch also replaces the HashMap previously used to store coredump
properties with a plain AK::Array.
2021-08-06 00:37:47 +02:00
Andreas Kling
95669fa861 Kernel: Use try_copy_kstring_from_user() in sys$link() 2021-08-06 00:37:47 +02:00
Andreas Kling
d5d8fba579 Kernel: Store Thread name as a KString 2021-08-06 00:37:47 +02:00
Brian Gianforcaro
187c086270 Kernel: Handle OOM from KBuffer creation in sys$module() 2021-08-03 18:54:23 +02:00
Brian Gianforcaro
8d3b819daf Kernel: Handle OOM from DoubleBuffer creation in FIFO creation 2021-08-03 18:54:23 +02:00
Brian Gianforcaro
fc91eb365d Kernel: Do not cancel stale timers when servicing sys$alarm
The sys$alarm() syscall has logic to cache a m_alarm_timer to avoid
allocating a new timer for every call to alarm. Unfortunately that
logic was broken, and there were conditions in which we could have
a timer allocated, but it was no longer on the timer queue, and we
would attempt to cancel that timer again resulting in an infinite
loop waiting for the timers callback to fire.

To fix this, we need to track if a timer is currently in use or not,
allowing us to avoid attempting to cancel inactive timers.

Luke and Tom did the initial investigation, I just happened to have
time to write a repro and attempt a fix, so I'm adding them as the
as co-authors of this commit.

Co-authored-by: Luke <luke.wilde@live.co.uk>
Co-authored-by: Tom <tomut@yahoo.com>
2021-08-03 18:44:01 +02:00
Brian Gianforcaro
0fc853f5ba Kernel: Remove ThreadTracer.h include from Process.h / Thread.h
This isn't needed for Process / Thread as they only reference it
by pointer and it's already part of Kernel/Forward.h. So just include
it where the implementation needs to call it.
2021-08-01 08:10:16 +02:00
Brian Gianforcaro
ed996fcced Kernel: Remove unused header includes 2021-08-01 08:10:16 +02:00
Andreas Kling
b807f1c3fc Kernel: Fail madvise() volatile change with EINVAL for non-purgeable mem
AnonymousVMObject::set_volatile() assumes that nobody ever calls it on
non-purgeable objects, so let's make sure we don't do that.

Also return EINVAL instead of EPERM for non-anonymous VM objects so the
error codes match.
2021-07-28 20:42:49 +02:00
Brian Gianforcaro
ddc950ce42 Kernel: Avoid file descriptor leak in Process::sys$socketpair on error
Previously it was possible to leak the file descriptor if we error out
after allocating the first descriptor. Now we perform both fd
allocations back to back so we can handle the potential error when
processing the second fd allocation.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
4b2651ddab Kernel: Track allocated FileDescriptionAndFlag elements in each Process
The way the Process::FileDescriptions::allocate() API works today means
that two callers who allocate back to back without associating a
FileDescription with the allocated FD, will receive the same FD and thus
one will stomp over the other.

Naively tracking which FileDescriptions are allocated and moving onto
the next would introduce other bugs however, as now if you "allocate"
a fd and then return early further down the control flow of the syscall
you would leak that fd.

This change modifies this behavior by tracking which descriptions are
allocated and then having an RAII type to "deallocate" the fd if the
association is not setup the end of it's scope.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
ba03b6ad02 Kernel: Make Process::FileDescriptions::allocate return KResultOr<int>
Modernize more error checking by utilizing KResultOr.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro
d2cee9cbf6 Kernel: Remove unused fd allocation from Process::sys$connect(..) 2021-07-28 19:07:00 +02:00
Andreas Kling
a085168c52 Kernel: Rename Space::create => Space::try_create() 2021-07-27 14:54:35 +02:00
Andreas Kling
4648bcd3d4 Kernel: Remove unnecessary weak pointer from Region to owning Process
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
2021-07-25 17:28:06 +02:00
Andreas Kling
09bc4cee15 Kernel: Remove unused madvise(MADV_GET_VOLATILE)
This was used to query the volatile state of a memory region, however
nothing ever actually used it.
2021-07-25 17:28:06 +02:00
Andreas Kling
2d1a651e0a Kernel: Make purgeable memory a VMObject level concept (again)
This patch changes the semantics of purgeable memory.

- AnonymousVMObject now has a "purgeable" flag. It can only be set when
  constructing the object. (Previously, all anonymous memory was
  effectively purgeable.)

- AnonymousVMObject now has a "volatile" flag. It covers the entire
  range of physical pages. (Previously, we tracked ranges of volatile
  pages, effectively making it a page-level concept.)

- Non-volatile objects maintain a physical page reservation via the
  committed pages mechanism, to ensure full coverage for page faults.

- When an object is made volatile, it relinquishes any unused committed
  pages immediately. If later made non-volatile again, we then attempt
  to make a new committed pages reservation. If this fails, we return
  ENOMEM to userspace.

mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
2021-07-25 17:28:05 +02:00
Brian Gianforcaro
9d8482c3e8 Kernel: Use StringView when parsing pledges in sys$pledge(..)
This ensures no potential allocation as in some cases the pledge char*
could be promoted to AK::String by the compiler to execute the
comparison.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
e4b86aa5d8 Kernel: Fix bug where we half apply pledges in sys$pledge(..)
This bug manifests it self when the caller to sys$pledge() passes valid
promises, but invalid execpromises. The code would apply the promises
and then return an error for the execpromises. This leaves the user in
a confusing state, as the promises were silently applied, but we return
an error suggesting the operation has failed.

Avoid this situation by tweaking the implementation to only apply the
promises / execpromises after all validation has occurred.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
36ff717c54 Kernel: Migrate sys$pledge to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
baec9e2d2d Kernel: Migrate sys$unveil to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro
2e7728bb05 Kernel: Use StringView literals for fs_type match in sys$mount(..) 2021-07-23 19:02:25 +02:00
Andreas Kling
082ed6f417 Kernel: Simplify VMObject locking & page fault handlers
This patch greatly simplifies VMObject locking by doing two things:

1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock

Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking for regions that pointed to itself.)

With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.

Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.

Instead, page fault handlers now use VMObject's own m_lock as well.
2021-07-23 03:24:44 +02:00
Gunnar Beutner
31f30e732a Everywhere: Prefix hexadecimal numbers with 0x
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.
2021-07-22 08:57:01 +02:00
Peter Elliott
3fa2816642 Kernel+LibC: Implement fcntl(2) advisory locks
Advisory locks don't actually prevent other processes from writing to
the file, but they do prevent other processes looking to acquire and
advisory lock on the file.

This implementation currently only adds non-blocking locks, which are
all I need for now.
2021-07-20 17:44:30 +04:30
Brian Gianforcaro
8f01a8b741 Kernel: Disable big process lock for sys$yield() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
5c10fb4007 Kernel: Disable big process lock for sys$gettid()
This syscall reads a read only value from the current thread, and hence
has no need for the big process lock.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
638598b15d Kernel: Disable big process lock for sys$getpid() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
bfd4635274 Kernel: Disable big process lock for sys$uname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
10ce896d4f Kernel: Disable big process lock in sys$gethostname() sys$sethostname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro
9201a06027 Kernel: Annotate all syscalls with VERIFY_PROCESS_BIG_LOCK_ACQUIRED
Before we start disabling acquisition of the big process lock for
specific syscalls, make sure to document and assert that all the
lock is held during all syscalls.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
308396bca1 Kernel: No lock validate_user_stack variant, switch to Space as argument
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro
1cffecbe8d Kernel: Push ARCH specific ifdef's down into RegisterState functions
The non CPU specific code of the kernel shouldn't need to deal with
architecture specific registers, and should instead deal with an
abstract view of the machine. This allows us to remove a variety of
architecture specific ifdefs and helps keep the code slightly more
portable.

We do this by exposing the abstract representation of instruction
pointer, stack pointer, base pointer, return register, etc on the
RegisterState struct.
2021-07-19 08:46:55 +02:00
Andreas Kling
9457d83986 Kernel: Rename Locker => MutexLocker 2021-07-18 01:53:04 +02:00
Andreas Kling
cee9528168 Kernel: Rename Lock to Mutex
Let's be explicit about what kind of lock this is meant to be.
2021-07-17 21:10:32 +02:00
Brian Gianforcaro
c0987453e6 Kernel: Remove double RedBlackTree lookup in VM/Space region removal
We should never request a regions removal that we don't currently
own. We currently assert this everywhere else by all callers.

Instead lets just push the assert down into the RedBlackTree removal
and assume that we will always successfully remove the region.
2021-07-17 16:22:59 +02:00
Tom
704e1c2e3d Kernel: Rename functions to be less confusing
Thread::yield_and_release_relock_big_lock releases the big lock, yields
and then relocks the big lock.

Thread::yield_assuming_not_holding_big_lock yields assuming the big
lock is not being held.
2021-07-16 20:30:04 +02:00
Timothy
9715311837 AK+Kernel: Implement and use EnumBits has_any_flag()
This duplicates the old functionality of has_flag and will return true
when any flags present in the mask are also in the value.
2021-07-16 11:49:50 +02:00
Idan Horowitz
be475cd6a8 Kernel: Handle OOM when adding memory regions to Spaces :^) 2021-07-15 00:49:41 +02:00
Andreas Kling
859e5741ff Kernel: Fix Process use-after-free in Thread finalization
We leak a ref() onto every user process when constructing them,
either via Process::create_user_process(), or via Process::sys$fork().

This ref() is balanced by a corresponding unref() in
Thread::WaitBlockCondition::finalize().

Since kernel processes don't have a leaked ref() on them, this led to
an extra Process::unref() on kernel processes during finalization.
This happened during every boot, with the `init_stage2` process.

Found by turning off kfree() scrubbing. :^)
2021-07-14 22:36:29 +02:00
Brian Gianforcaro
84b4b9447d Kernel: Move new process registration out of Space spinlock scope
There appears to be no reason why the process registration needs
to happen under the space spin lock. As the first thread is not started
yet it should be completely uncontested, but it's still bad practice.
2021-07-12 10:20:21 +02:00
Andreas Kling
cd7a49b90d Kernel: Make Region splitting OOM-safe
Region allocation failures during splitting are now propagated all the
way out to where we can return ENOMEM for them.
2021-07-11 18:52:27 +02:00
Andreas Kling
88d490566f Kernel: Rename various *VMObject::create*() => try_create()
try_*() implies that it can fail (and they all return RefPtr with
nullptr signalling failure.)
2021-07-11 17:55:29 +02:00
Andreas Kling
af8c74a328 Kernel: Make SharedInodeVMObject allocation OOM-safe 2021-07-11 17:52:07 +02:00
Andreas Kling
07c4c89297 Kernel: Make VirtualFileSystem::sync() static 2021-07-11 00:26:17 +02:00
Andreas Kling
0d39bd04d3 Kernel: Rename VFS => VirtualFileSystem 2021-07-11 00:25:24 +02:00
Andreas Kling
d53d9d3677 Kernel: Rename FS => FileSystem
This matches our common naming style better.
2021-07-11 00:20:38 +02:00
Ralf Donau
8ee3a5e09e Kernel: Logic fix in the pledge syscall
Pledge should check m_has_promises. Calling pledge("", nullptr)
does not fail on an already pledged process anymore.
2021-07-10 21:59:29 +02:00
Gunnar Beutner
06883ed8a3 Kernel+Userland: Make the stack alignment comply with the System V ABI
The System V ABI for both x86 and x86_64 requires that the stack pointer
is 16-byte aligned on entry. Previously we did not align the stack
pointer properly.

As far as "main" was concerned the stack alignment was correct even
without this patch due to how the C++ _start function and the kernel
interacted, i.e. the kernel misaligned the stack as far as the ABI
was concerned but that misalignment (read: it was properly aligned for
a regular function call - but misaligned in terms of what the ABI
dictates) was actually expected by our _start function.
2021-07-10 01:41:57 +02:00
Ali Mohammad Pur
e37f9fa7db LibPthread+Kernel: Add pthread_kill() and the thread_kill syscall 2021-07-09 15:36:50 +02:00
Daniel Bertalan
d30dbf47f5 Kernel: Map non-page-aligned text segments correctly
`.text` segments with non-aligned offsets had their lengths applied to
the first page's base address. This meant that in some cases the last
PAGE_SIZE - 1 bytes weren't mapped. Previously, it did not cause any
problems as the GNU ld insists on aligning everything; but that's not
the case with the LLVM toolchain.
2021-07-07 22:26:53 +02:00
Max Wipfli
d5722eab36 Kernel: Custody::absolute_path() => try_create_absolute_path()
This converts most users of Custody::absolute_path() to use the new
try_create_absolute_path() API, and return ENOMEM if the KString
allocation fails.
2021-07-07 15:32:17 +02:00
Max Wipfli
ee342f5ec3 Kernel: Replace usage of LexicalPath with KLexicalPath
This replaces all uses of LexicalPath in the Kernel with the functions
from KLexicalPath. This also allows the Kernel to stop including
AK::LexicalPath.
2021-07-07 15:32:17 +02:00
Edwin Hoksberg
99328e1038 Kernel+KeyboardSettings: Remove numlock syscall and implement ioctl 2021-07-07 10:44:20 +02:00
Tom
864b50b5c2 Kernel: Do not hold spinlock while touching user mode futex values
The user_atomic_* functions are subject to the same rules as
copy_from/to/user, which may require preemption.
2021-07-07 10:05:55 +02:00
Tom
7593418b77 Kernel: Fix futex race that could lead to thread waiting forever
There is a race condition where we would remove a FutexQueue from
our futex map and in the meanwhile another thread started to queue
itself into that very same futex, leading to that thread to wait
forever as no other wake operation could discover that removed
FutexQueue.

This fixes the problem by:
* Tracking imminent waits, which prevents a new FutexQueue from being
  deleted that a thread will wait on momentarily
* Atomically marking a FutexQueue as removed, which prevents a thread
  from waiting on it before it is actually removed from the futex map.
2021-07-07 10:05:55 +02:00
Andreas Kling
565796ae4e Kernel+LibC: Remove sys$donate()
This was an old SerenityOS-specific syscall for donating the remainder
of the calling thread's time-slice to another thread within the same
process.

Now that Threading::Lock uses a pthread_mutex_t internally, we no
longer need this syscall, which allows us to get rid of a surprising
amount of unnecessary scheduler logic. :^)
2021-07-05 23:30:15 +02:00
ForLoveOfCats
ce6658acc1 KeyboardSettings+Kernel: Setting to enable Num Lock on login 2021-07-05 06:19:59 +02:00
Idan Horowitz
301c1a3a58 Everywhere: Fix incorrect usages of AK::Checked
Specifically, explicitly specify the checked type, use the resulting
value instead of doing the same calculation twice, and break down
calculations to discrete operations to ensure no intermediary overflows
are missed.
2021-07-04 20:08:28 +01:00
Gunnar Beutner
c51b49a8cb Kernel: Implement TLS support for x86_64 2021-07-04 01:07:28 +02:00
Gunnar Beutner
a09e6171a6 Kernel: Don't allow allocate_tls() if the process has multiple threads
We can't safely update the other threads' FS selector. This shouldn't
be a problem in practice because allocate_tls() is only used by the
loader.
2021-07-04 01:07:28 +02:00
Daniel Bertalan
b9f30c6f2a Everywhere: Fix some alignment issues
When creating uninitialized storage for variables, we need to make sure
that the alignment is correct. Fixes a KUBSAN failure when running
kernels compiled with Clang.

In `Syscalls/socket.cpp`, we can simply use local variables, as
`sockaddr_un` is a POD type.

Along with moving the `alignas` specifier to the correct member,
`AK::Optional`'s internal buffer has been made non-zeroed by default.
GCC emitted bogus uninitialized memory access warnings, so we now use
`__builtin_launder` to tell the compiler that we know what we are doing.
This might disable some optimizations, but judging by how GCC failed to
notice that the memory's initialization is dependent on `m_has_value`,
I'm not sure that's a bad thing.
2021-07-03 01:56:31 +04:30
Gunnar Beutner
16b9a2d2e1 Kernel+LibPthread: Add support for usermode threads on x86_64 2021-07-01 17:22:22 +02:00
Gunnar Beutner
93c741018e Kernel+LibPthread: Remove m_ prefix for public members 2021-07-01 17:22:22 +02:00
Gunnar Beutner
fe2716df21 Kernel: Disable __thread and TLS on x86_64 for now
They're not yet properly supported.
2021-06-30 15:13:30 +02:00
Gunnar Beutner
cafccb866c Kernel: Don't start usermode threads on x86_64 for now
Starting usermode threads doesn't currently work on x86_64. With this
stubbed out we can get text mode to boot though.
2021-06-30 15:13:30 +02:00
Max Wipfli
7405536a1a AK+Everywhere: Use mostly StringView in LexicalPath
This changes the m_parts, m_dirname, m_basename, m_title and m_extension
member variables to StringViews onto the m_string String. It also
removes the m_is_absolute member in favour of computing if a path is
absolute in the is_absolute() getter. Due to this, the canonicalize()
method has been completely rewritten.

The parts() getter still returns a Vector<String>, although it is no
longer a const reference as m_parts is no longer a Vector<String>.
Rather, it is constructed from the StringViews in m_parts upon request.
The parts_view() getter has been added, which returns Vector<StringView>
const&. Most previous users of parts() have been changed to use
parts_view(), except where Strings are required.

Due to this change, it's is now no longer allow to create temporary
LexicalPath objects to call the dirname, basename, title, or extension
getters on them because the returned StringViews will point to possible
freed memory.
2021-06-30 11:13:54 +02:00
Max Wipfli
fc6d051dfd AK+Everywhere: Add and use static APIs for LexicalPath
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will recieve a StringView const& pointing
to a possible freed buffer.

To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above haven
been changed to use the static APIs.
2021-06-30 11:13:54 +02:00
Liav A
7c87891c06 Kernel: Don't copy a Vector<FileDescriptionAndFlags>
Instead of copying a Vector everytime we need to enumerate a Process'
file descriptions, we can just temporarily lock so it won't change.
2021-06-29 20:53:59 +02:00
Liav A
12b6e69150 Kernel: Introduce the new ProcFS design
The new ProcFS design consists of two main parts:
1. The representative ProcFS class, which is derived from the FS class.
The ProcFS and its inodes are much more lean - merely 3 classes to
represent the common type of inodes - regular files, symbolic links and
directories. They're backed by a ProcFSExposedComponent object, which
is responsible for the functional operation behind the scenes.
2. The backend of the ProcFS - the ProcFSComponentsRegistrar class
and all derived classes from the ProcFSExposedComponent class. These
together form the entire backend and handle all the functions you can
expect from the ProcFS.

The ProcFSExposedComponent derived classes split to 3 types in the
manner of lifetime in the kernel:
1. Persistent objects - this category includes all basic objects, like
the root folder, /proc/bus folder, main blob files in the root folders,
etc. These objects are persistent and cannot die ever.
2. Semi-persistent objects - this category includes all PID folders,
and subdirectories to the PID folders. It also includes exposed objects
like the unveil JSON'ed blob. These object are persistent as long as the
the responsible process they represent is still alive.
3. Dynamic objects - this category includes files in the subdirectories
of a PID folder, like /proc/PID/fd/* or /proc/PID/stacks/*. Essentially,
these objects are always created dynamically and when no longer in need
after being used, they're deallocated.
Nevertheless, the new allocated backend objects and inodes try to use
the same InodeIndex if possible - this might change only when a thread
dies and a new thread is born with a new thread stack, or when a file
descriptor is closed and a new one within the same file descriptor
number is opened. This is needed to actually be able to do something
useful with these objects.

The new design assures that many ProcFS instances can be used at once,
with one backend for usage for all instances.
2021-06-29 20:53:59 +02:00
Liav A
92c0dab5ab Kernel: Introduce the new SysFS
The intention is to add dynamic mechanism for notifying the userspace
about hotplug events. Currently, the DMI (SMBIOS) blobs and ACPI tables
are exposed in the new filesystem.
2021-06-29 20:53:59 +02:00
Gunnar Beutner
a5b4b95a76 Kernel: Report correct architecture for uname() 2021-06-29 20:03:36 +02:00
Gunnar Beutner
85561feb40 Kernel: Rename some variables to arch-independent names 2021-06-29 20:03:36 +02:00
Gunnar Beutner
90e3aa35ef Kernel: Fix correct argument order for userspace entry point 2021-06-29 20:03:36 +02:00
Gunnar Beutner
6dde7dac8f Kernel: Implement signal handling for x86_64 2021-06-29 20:03:36 +02:00
Gunnar Beutner
df9e73de25 Kernel: Add x86_64 support for fork() 2021-06-29 20:03:36 +02:00
Gunnar Beutner
2a78bf8596 Kernel: Fix the return type for syscalls
The Process::Handler type has KResultOr<FlatPtr> as its return type.
Using a different return type with an equally-sized template parameter
sort of works but breaks once that condition is no longer true, e.g.
for KResultOr<int> on x86_64.

Ideally the syscall handlers would also take FlatPtrs as their args
so we can get rid of the reinterpret_cast for the function pointer
but I didn't quite feel like cleaning that up as well.
2021-06-28 22:29:28 +02:00
Gunnar Beutner
d4c0d28035 Kernel: Properly set up the userland context for new processes on x86_64 2021-06-28 22:29:28 +02:00
Gunnar Beutner
158355e0d7 Kernel+LibELF: Add support for validating and loading ELF64 executables 2021-06-28 22:29:28 +02:00
Gunnar Beutner
f285241cb8 Kernel: Rename Thread::tss to Thread::regs and add x86_64 support
We're using software context switches so calling this struct tss is
somewhat misleading.
2021-06-27 15:46:42 +02:00
Gunnar Beutner
233ef26e4d Kernel+Userland: Add x86_64 registers to RegisterState/PtraceRegisters 2021-06-27 15:46:42 +02:00
Andreas Kling
f4090d46de Kernel: Don't kmalloc() for small (<=1024) dbgputstr() syscalls 2021-06-27 10:50:24 +02:00
Gunnar Beutner
f630299d49 Kernel: Add support for setting up a x86_64 GDT once in C++ land 2021-06-26 11:08:52 +02:00
Daniel Bertalan
f820917a76 Everywhere: Use nothrow new with adopt_{ref,own}_if_nonnull
This commit converts naked `new`s to `AK::try_make` and `AK::try_create`
wherever possible. If the called constructor is private, this can not be
done, so we instead now use the standard-defined and compiler-agnostic
`new (nothrow)`.
2021-06-24 17:35:49 +04:30
Max Wipfli
84c0f98fb2 Kernel: Reimplement the dbgputch and dbgputstr syscalls
This rewrites the dbgputch and dbgputstr system calls as wrappers of
kstdio.h.

This fixes a bug where only the Kernel's debug output was also sent to
a serial debugger, while the userspace's debug output was only sent to
the Bochs debugger.

This also fixes a bug where debug output from one process would
sometimes "interrupt" the debug output from another process in the
middle of a line.
2021-06-24 10:29:09 +02:00
Gunnar Beutner
38fca26f54 Kernel: Add stubs for missing x86_64 functionality
This adds just enough stubs to make the kernel compile on x86_64. Obviously
it won't do anything useful - in fact it won't even attempt to boot because
Multiboot doesn't support ELF64 binaries - but it gets those compiler errors
out of the way so more progress can be made getting all the missing
functionality in place.
2021-06-24 09:27:13 +02:00
Hendiadyoin1
925be2758e Kernel: Remove unused CPU.h includes
In most cases we did not need it at all, in other, we only needed one
header from it
2021-06-24 00:38:23 +02:00
Hendiadyoin1
7ca3d413f7 Kernel: Pull apart CPU.h
This does not add any functional changes
2021-06-24 00:38:23 +02:00
Gunnar Beutner
bf779e182e Kernel: Remove obsolete size_t casts 2021-06-17 19:52:54 +02:00
Gunnar Beutner
bc3076f894 Kernel: Remove various other uses of ssize_t 2021-06-16 21:29:36 +02:00
Jelle Raaijmakers
30abfc2b21 Kernel: Pass absolute path to shebang interpreter
When you invoke a binary with a shebang line, the `execve` syscall
makes sure to pass along command line arguments to the shebang
interpreter including the path to the binary to execute.

This does not work well when the binary lives in $PATH. For example,
given this script living in `/usr/local/bin/my-script`:

  #!/bin/my-interpreter
  echo "well hello friends"

When executing it as `my-script` from outside `/usr/local/bin/`, it is
executed as `/bin/my-interpreter my-script`. To make sure that the
interpreter can find the binary to execute, we need to replace the
first argument with an absolute path to the binary, so that the
resulting command is:

  /bin/my-interpreter /usr/local/bin/my-script
2021-06-13 21:19:51 +02:00
Jelle Raaijmakers
26250779d1 Kernel: Also move() the shebang path in execve 2021-06-13 21:19:51 +02:00
Max Wipfli
573664758a Kernel: Properly reset m_unveiled_paths on execve()
When a process executes another program, its unveil state is reset. For
this, we not only need to clear all nodes from m_unveiled_paths, but
also reset the metadata of m_unveiled_paths (the root node) itself.

This fixes the following bug:
1) A process unveils "/", then executes another program.
2) That other program also unveils some path.
3) "/" is now unveiled for the new program.
2021-06-08 12:15:04 +02:00
Max Wipfli
8930db0900 Kernel: Change unveil state to dropped even when node already exists
This also changes the UnveilState to Dropped when the path unveil() is
called for already has a node.

This fixes a bug where unveiling "/" would previously keep the
UnveilState as None, which meant that everything was still accessible
until unveil() was called with any non-root path (or nullptr).
2021-06-08 12:15:04 +02:00
Max Wipfli
2fcebfd6a8 Kernel: Update intermediate nodes when changing unveil permissions
When changing the unveil permissions of a preexisting node, we need to
make sure that any intermediate nodes that were created before and
should inherit permissions from the updated node are updated properly.

This fixes the following bug:
unveil("/home/anon/Documents", "r");
unveil("/home", "r");
Now there was a intermediate node for "/home/anon" which still had no
permission, even though it should have inherited the permissions from
"/home".
2021-06-08 12:15:04 +02:00
Max Wipfli
e8a317023d Kernel: Allow unveiling subfolders regardless of parent's permissions
This fixes a bug where unveiling a subdirectory of an already unveiled
path would sometimes be allowed and sometimes not (depending on what
other unveil calls have been made).

Now, it is always allowed to unveil a subdirectory of an already
unveiled directory, even if it has higher permissions.

This removes the need for the permissions_inherited_from_root flag in
UnveilMetadata, so it has been removed.
2021-06-08 12:15:04 +02:00
Max Wipfli
9d41dd2ed0 Kernel: Use LexicalPath to avoid two consecutive slashes in unveil path
This patch fixes a bug in the unveil syscall where an UnveilNode's path
would start with two slashes if it's parent node was "/".
2021-06-08 12:15:04 +02:00
Jelle Raaijmakers
d6a3f1fcd7 Kernel: Simplify execve shebang argument handling 2021-06-08 11:30:58 +02:00
Brian Gianforcaro
9fccbde371 Kernel: Switch Process to InstrusiveList from InlineLinkedList 2021-06-07 09:42:55 +02:00
Jelle Raaijmakers
f6d372b2ab Kernel: Process::exec(): Check if path is a regular file
https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html

  [EACCES] The new process image file is not a regular file and the
           implementation does not support execution of files of its
           type.

Let's check whether the passed `path` is indeed a regular file.
2021-06-04 23:45:17 +02:00
Jelle Raaijmakers
496988de47 LibC: Add POSIX timer constants 2021-06-04 10:39:41 +02:00
Gunnar Beutner
fe0ae3161a Kernel: Fix use-after-free in sys$mremap
Now that Region::name() has been changed to return a StringView we
can't rely on keeping a copy of the region's name past the region's
destruction just by holding a copy of the StringView.
2021-06-02 18:00:13 +02:00
Brian Gianforcaro
7c0b2eb0f5 Kernel: Handle OOM of file system in sys$mount 2021-06-01 23:14:40 +01:00
Brian Gianforcaro
d2d6ab40f9 Kernel: Make AnonymousFile::create API OOM safe 2021-06-01 23:14:40 +01:00
Nick Miller
10ba6f254c Kernel: Rename instances of IO port 0xe9 to BOCHS_DEBUG_PORT 2021-05-31 19:06:13 +01:00
Ali Mohammad Pur
2b5732ab77 AK+Kernel: Disallow implicitly lifting pointers to OwnPtr's
This doesn't really _fix_ anything, it just gets rid of the API and
instead makes the users explicitly use `adopt_own_if_non_null()`.
2021-05-31 17:09:12 +04:30
Gunnar Beutner
01c75e3a34 Kernel: Don't log profile data before/after the process/thread lifetime
There were a few cases where we could end up logging profiling events
before or after the associated process or thread exists in the profile:

After enabling profiling we might end up with CPU samples before we
had a chance to synthesize process/thread creation events.

After a thread exits we would still log associated kmalloc/kfree
events. Instead we now just ignore those events.
2021-05-30 19:03:03 +02:00
Andreas Kling
1123af361d Kernel: Convert Process::get_syscall_path_argument() to KString
This API now returns a KResultOr<NonnullOwnPtr<KString>> and allocation
failures should be propagated everywhere nicely. :^)
2021-05-29 20:18:57 +02:00
Gunnar Beutner
42d667645d Kernel: Make sure we free the thread stack on thread exit
This adds two new arguments to the thread_exit system call which let
a thread unmap an arbitrary VM range on thread exit. LibPthread
uses this functionality to unmap the thread stack.

Fixes #7267.
2021-05-29 15:53:08 +02:00
Gunnar Beutner
95c2166ca9 Kernel: Move sys$munmap functionality into a helper method 2021-05-29 15:53:08 +02:00
Brian Gianforcaro
65d5f81afc Kernel: Make PrivateInodeVMObject factory APIs OOM safe 2021-05-29 09:04:05 +02:00
Andreas Kling
9d801d2345 Kernel: Rename Custody::create() => try_create()
The try_ prefix indicates that this may fail. :^)
2021-05-28 11:23:00 +02:00
Andreas Kling
fc9ce22981 Kernel: Use KString for Region names
Replace the AK::String used for Region::m_name with a KString.

This seems beneficial across the board, but as a specific data point,
it reduces time spent in sys$set_mmap_name() by ~50% on test-js. :^)
2021-05-28 09:37:09 +02:00
Tim Schumacher
58bc10b947
Kernel: Make dup2() return the fd even if old & new are the same (#7506) 2021-05-27 21:14:57 +02:00
Gunnar Beutner
ad6587424f Kernel: Disable profiling if setting up the buffer or timer failed 2021-05-24 09:10:50 +02:00
Gunnar Beutner
0688e02339 Kernel: Make sure we only log profiling events when m_profiling is true
Previously the process' m_profiling flag was ignored for all event
types other than CPU samples.

The kfree tracing code relies on temporarily disabling tracing during
exec. This didn't work for per-process profiles and would instead
panic.

This updates the profiling code so that the m_profiling flag isn't
ignored.
2021-05-23 23:54:30 +01:00
Gunnar Beutner
7557f2db90 Kernel: Remove an allocation when blocking a thread
When blocking a thread with a timeout we would previously allocate
a Timer object. This removes the allocation for that Timer object.
2021-05-20 09:09:10 +02:00
Brian Gianforcaro
bb91bed576 Kernel: Make ProcessGroup::find_or_create API OOM safe
Make ProcessGroup::find_or_create & ProcessGroup::create OOM safe, by
moving to adopt_ref_if_nonnull.
2021-05-20 08:10:07 +02:00
Gunnar Beutner
7dc77bd833 Kernel: Avoid an allocation in sys$poll 2021-05-19 22:51:42 +02:00
Gunnar Beutner
277f333b2b Kernel: Add support for profiling kmalloc()/kfree() 2021-05-19 22:51:42 +02:00
Gunnar Beutner
572bbf28cc Kernel+LibC: Add support for filtering profiling events
This adds the -t command-line argument for the profile tool. Using this
argument you can filter which event types you want in your profile.
2021-05-19 22:51:42 +02:00
Justin
1c3badede3 Kernel: Add statvfs & fstatvfs Syscalls
These syscalls fill a statvfs struct with various data
about the mount on the VFS.
2021-05-19 21:33:29 +02:00
Hendiadyoin1
ef425a02f7 Kernel: Implement mprotect for multiple Regions 2021-05-18 16:50:52 +02:00
Sahan Fernando
d0f314b23c Kernel: Fix subtle race condition in sys$write implementation
There is a slight race condition in our implementation of write().
We call File::can_write() before attempting to write to it (blocking if
it returns false). If it returns true, we assume that we can write to
the file, and our code assumes that File::write() cannot possibly fail
by being blocked. There is, however, the rare case where another process
writes to the file and prevents further writes in between the call to
Files::can_write() and File::write() in the first process. This would
result in the first process calling File::write() when it cannot be
written to.

We fix this by adding a mechanism for File::can_write() to signal that
it was blocked, making it the responsibilty of File::write() to check
whether it can write and then finally making sys$write() check if the
write failed due to it being blocked.
2021-05-18 16:33:15 +02:00
Gunnar Beutner
89956cb0d6 Kernel+Userspace: Implement the accept4() system call
Unlike accept() the new accept4() system call lets the caller specify
flags for the newly accepted socket file descriptor, such as
SOCK_CLOEXEC and SOCK_NONBLOCK.
2021-05-17 13:32:19 +02:00
Liav A
e6f333ae00 Kernel: Print failed attempt to shutdown the machine
Because we don't parse ACPI AML yet, If we are not able to shut down
the machine with "hacky" emulation methods - halt and print this state
to the users so they know they can shutdown the machine by themselves.
2021-05-17 00:30:40 +01:00
Nicholas Baron
aa4d41fe2c
AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the best
fitting one. All this will require is duplicating the implementation and
simplifying for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
Andreas Kling
4d429ba9ea Kernel: Unbreak profiling all processes
Regressed in 8a4cc735b9.
We stopped generating "process created" when enabling profiling,
which led to Profiler getting confused about the missing events.
2021-05-15 21:25:54 +02:00
Liav A
8a4cc735b9 Kernel: Don't use the profile timer if we don't have a timer to assign 2021-05-15 18:08:41 +02:00
Gunnar Beutner
4ab9d8736b Kernel: Make perf_event() work for global profiles
Previously calls to perf_event() would end up in a process-specific
perfcore file even though global profiling was enabled. This changes
the behavior for perf_event() so that these events are stored into
the global profile instead.
2021-05-15 16:28:18 +02:00
Brian Gianforcaro
ede1483e48 Kernel: Make Process creation APIs OOM safe
This change looks more involved than it actually is. This simply
reshuffles the previous Process constructor and splits out the
parts which can fail (resource allocation) into separate methods
which can be called from a factory method. The factory is then
used everywhere instead of the constructor.
2021-05-15 09:01:32 +02:00
Andreas Kling
16221305ad LibELF: Remove sketchy use of "undefined" ELF::Image::Section
We were using ELF::Image::section(0) to indicate the "undefined"
section, when what we really wanted was just Optional<Section>.

So let's use Optional instead. :^)
2021-05-15 00:17:55 +02:00
Mart G
e7310ba45a Kernel+LibC: Add fstatat
The function fstatat can do the same thing as the stat and lstat
functions. However, it can be passed the file descriptor of a directory
which will be used when as the starting point for relative paths. This
is contrary to stat and lstat which use the current working directory as
the starting for relative paths.
2021-05-14 23:32:10 +02:00
Gunnar Beutner
8614d18956 Kernel: Use a separate timer for profiling the system
This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000Hz)
and is independent from the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
2021-05-14 00:35:57 +02:00
Andreas Kling
e46343bf9a Kernel: Make UserOrKernelBuffer R/W helpers return KResultOr<size_t>
This makes error propagation less cumbersome (and also exposed some
places where we were not doing it.)
2021-05-13 23:28:40 +02:00
Brian Gianforcaro
0d50d3ed1e Kernel: Make InodeWatcher::crate API OOM safe 2021-05-13 16:21:53 +02:00
Brian Gianforcaro
51ceb172b9 Kernel: Replace make<T>() with adopt_own_if_nonnull() in sys$module_load 2021-05-13 16:21:53 +02:00
Brian Gianforcaro
956314f0a1 Kernel: Make Process::start_tracing_from API OOM safe
Modify the API so it's possible to propagate error on OOM failure.
NonnullOwnPtr<T> is not appropriate for the ThreadTracer::create() API,
so switch to OwnPtr<T>, use adopt_own_if_nonnull() to handle creation.
2021-05-13 16:21:53 +02:00
sin-ack
fe5ca6ca27 Kernel: Implement multi-watch InodeWatcher :^)
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture.  The following changes have been made:

- The watch_file syscall is removed, and in its place the
  create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
  have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
  is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
  all watched files.

Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
2021-05-12 22:38:20 +02:00
Gunnar Beutner
22ebd754d3 Kernel: Fix loading ELF images without PT_INTERP
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.

This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:

courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.

This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
2021-05-10 20:39:08 +02:00
Brian Gianforcaro
0b7395848a Kernel: Plumb OOM propagation through Custody factory
Modify the Custody::create(..) API so it has the ability to propagate
OOM back to the caller.
2021-05-10 11:55:52 +02:00
Brian Gianforcaro
8bf4201f50 Kernel: Move process creation perf events to PerformanceManager 2021-05-07 15:35:23 +02:00
Brian Gianforcaro
ccdcb6a635 Kernel: Add PerformanceManager static class, move perf event APIs there
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
2021-05-07 15:35:23 +02:00
Brian Gianforcaro
11306d7121
Kernel: Modify TimeManagement::current_time(..) API so it can't fail. (#6869)
The fact that current_time can "fail" makes its use a bit awkward.
All callers in the Kernel are trusted besides syscalls, so assert
that they never get there, and make sure all current callers perform
validation of the clock_id with TimeManagement::is_valid_clock_id().

I have fuzzed this change locally for a bit to make sure I didn't
miss any obvious regression.
2021-05-05 18:51:06 +02:00
Brian Gianforcaro
234c6ae32d Kernel: Change Inode::{read/write}_bytes interface to KResultOr<ssize_t>
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
2021-05-02 13:27:37 +02:00
Sahan Fernando
bd563f0b3c Kernel: Make processes start with a 16-byte-aligned stack 2021-05-01 20:08:35 +02:00
Brian Gianforcaro
a678851b41 Kernel: Harden sys$setgroups Vector usage against OOM 2021-05-01 09:10:30 +02:00
Itamar
6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
Itamar
373e8bcbc7 Kernel: Give a name to the Master TLS region allocation 2021-04-30 18:47:39 +02:00
Gunnar Beutner
3c0355a398 Kernel: Accepted socket file descriptors should not inherit flags
For example Linux accepts an additional argument for flags in accept4()
that let the user specify what flags they want. However, by default
accept() should not inherit those flags from the listener socket.
2021-04-30 11:43:19 +02:00
Jesse Buhagiar
60cdbc9397 Kernel/LibC: Implement setreuid 2021-04-30 11:35:17 +02:00
Brian Gianforcaro
a8765fa673 Kernel: Harden sys$select Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of FD_SETSIZE, which should always be enough. However I'm planning on
removing the non-try variants of AK::Vector when compiling in kernel
mode in the future, so this will need to go eventually. I suppose it
also protects against some unforeseen bug where we we can append more
than FD_SETSIZE items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro
0ca668f59c Kernel: Harden sys$munmap Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of 2, which should be enough. However I'm planning on removing the
non-try variants of AK::Vector when compiling in kernel mode in the
future, so this will need to go eventually. I suppose it also protects
against some unforeseen bug where we we can append more than 2 items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro
569c5a8922 Kernel: Harden sys$purge Vector usage against OOM.
sys$purge() is a bit unique, in that it is probably in the systems
advantage to attempt to limp along if we hit OOM while processing
the vmobjects to purge. This change modifies the algorithm to observe
OOM and continue trying to purge any previously visited VMObjects.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro
b3096276bb Kernel: Harden sys$poll Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro
119b7be249 Kernel: Harden sys$execve Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro
454d2fd42a Kernel: Harden sys$readv / sys$writev Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro
cd29eb7867 Kernel: Harden sys$sendmsg / sys$recvmsg Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Gunnar Beutner
d9ee2c6a89 Kernel: Avoid overrunning the user-specified buffers in select() 2021-04-28 23:05:10 +02:00
Gunnar Beutner
aa792062cb Kernel+LibC: Implement the socketpair() syscall 2021-04-28 14:19:45 +02:00
Jelle Raaijmakers
b630e39fbb Kernel: Check futex command if realtime clock is used 2021-04-27 09:19:55 +02:00
Gunnar Beutner
659507696c Kernel: Fix incorrect argument for thread_exit events 2021-04-26 23:26:58 +02:00
Gunnar Beutner
1c02848e54 Kernel: Log thread exits for global profiles 2021-04-26 23:26:58 +02:00
Gunnar Beutner
afeee35cbf Kernel: Avoid calling characters() where not necessary 2021-04-26 23:26:58 +02:00
Gunnar Beutner
eb798d5538 Kernel+Profiler: Improve profiling subsystem
This turns the perfcore format into more a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.

Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.

Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().

This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.

The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
2021-04-26 17:13:55 +02:00
Linus Groh
dbe72fd962 Everywhere: Remove empty line after function body opening curly brace 2021-04-25 20:20:00 +02:00
Brian Gianforcaro
8d6e9fad40 Kernel: Remove the now defunct LOCKER(..) macro. 2021-04-25 09:38:27 +02:00
Jelle Raaijmakers
54f5b1346c Kernel: Support null act argument for sigaction syscall
Userspace can provide a null argument for the `act` argument to the
`sigaction` syscall to not set any new behavior. This is described
here:

https://pubs.opengroup.org/onlinepubs/007904875/functions/sigaction.html

Without this fix, the `copy_from_user(...)` invocation on `user_act`
fails and makes the syscall return early.
2021-04-24 23:00:28 +02:00
Andreas Kling
b91c49364d AK: Rename adopt() to adopt_ref()
This makes it more symmetrical with adopt_own() (which is used to
create a NonnullOwnPtr from the result of a naked new.)
2021-04-23 16:46:57 +02:00
Maciej Zygmanowski
6efcc2fc99 Kernel: Don't allow to kill kernel processes
The protection was only for SIGKILL before.
2021-04-23 13:26:02 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00