Commit graph

3483 commits

Author SHA1 Message Date
Andreas Kling
21ccbc2167 Kernel: Expose process executable paths in /proc/all 2020-12-27 01:16:56 +01:00
Andreas Kling
87492e723b Kernel: Lock target process when generating core dump
Dumping core can happen at the end of a profiling run, and in that case
we have to protect the target process and take the lock while iterating
over its region map.

Fixes #4509.
2020-12-27 01:16:56 +01:00
Tom
74fa894994 Kernel: Remove subheap from list before removing memory
When the ExpandableHeap calls the remove_memory function, the
subheap is assumed to be removed and freed entirely. remove_memory
may drop the underlying memory at any time, but it also may cause
further allocation requests. Not removing it from the list before
calling remove_memory could cause a memory allocation in that
subheap while remove_memory is executing, which then causes issues
once the underlying memory is actually freed.
2020-12-26 19:55:01 +01:00
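
The ordering described in the entry above can be illustrated with a small model. This is only a sketch under assumptions: `HeapModel`, `SubHeap`, and the re-entrant `remove_memory` callback are hypothetical stand-ins, not the kernel's ExpandableHeap code. The point is that a subheap is unlinked from the list before the callback runs, so an allocation triggered by the callback cannot land in it.

```c++
#include <cassert>
#include <functional>
#include <list>
#include <memory>

// Hypothetical stand-in for one subheap of an expandable heap.
struct SubHeap {
    int id;
    int live_allocations { 0 };
};

class HeapModel {
public:
    // May call allocate() re-entrantly, like the callback in the commit message.
    std::function<void(SubHeap&)> remove_memory;

    SubHeap& add_subheap(int id)
    {
        m_subheaps.push_back(std::make_unique<SubHeap>(SubHeap { id }));
        return *m_subheaps.back();
    }

    // Allocates from the first subheap that is still linked into the list.
    SubHeap* allocate()
    {
        if (m_subheaps.empty())
            return nullptr;
        SubHeap& target = *m_subheaps.front();
        ++target.live_allocations;
        return &target;
    }

    void release_subheap(SubHeap& subheap)
    {
        // Unlink first, so allocate() can no longer pick this subheap.
        std::unique_ptr<SubHeap> owned;
        for (auto it = m_subheaps.begin(); it != m_subheaps.end(); ++it) {
            if (it->get() == &subheap) {
                owned = std::move(*it);
                m_subheaps.erase(it);
                break;
            }
        }
        // Only now hand the memory back; re-entrant allocations go elsewhere.
        if (owned && remove_memory)
            remove_memory(*owned);
    }

private:
    std::list<std::unique_ptr<SubHeap>> m_subheaps;
};
```

If the unlink step came after the callback instead, a re-entrant `allocate()` could return memory from the subheap that is about to be freed.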
AnotherTest
7b5aa06702 Kernel: Allow 'elevating' unveil permissions if implicitly inherited from '/'
This can happen when an unveil follows another with a path that is a
sub-path of the other one:
```c++
unveil("/home/anon/.config/whoa.ini", "rw");
unveil("/home/anon", "r"); // this would fail, as "/home/anon" inherits
                           // the permissions of "/", which is None.
```
2020-12-26 16:10:04 +01:00
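
A minimal sketch of the idea behind the two unveil entries here (the prefix tree and the "elevation" of implicitly inherited permissions). Everything in it is an assumption made for illustration: the `UnveilTree` and `UnveilNode` names, the permission constants, and the rule that an explicitly unveiled path may only be narrowed. It is not the kernel's implementation.

```c++
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Illustrative permission bits, not the kernel's actual values.
constexpr unsigned perm_none = 0;
constexpr unsigned perm_read = 1;
constexpr unsigned perm_write = 2;

struct UnveilNode {
    unsigned perms { perm_none };
    bool explicitly_unveiled { false }; // false: node exists only as an ancestor
    std::map<std::string, std::unique_ptr<UnveilNode>> children;
};

// Splits "/home/anon" into {"home", "anon"}, skipping empty components.
static std::vector<std::string> split_path(const std::string& path)
{
    std::vector<std::string> parts;
    std::istringstream stream(path);
    std::string part;
    while (std::getline(stream, part, '/')) {
        if (!part.empty())
            parts.push_back(part);
    }
    return parts;
}

class UnveilTree {
public:
    // Returns false when the call would widen an explicitly unveiled path.
    bool unveil(const std::string& path, unsigned perms)
    {
        UnveilNode* node = &m_root;
        for (const auto& part : split_path(path)) {
            auto& child = node->children[part];
            if (!child)
                child = std::make_unique<UnveilNode>(); // implicit ancestor
            node = child.get();
        }
        if (node->explicitly_unveiled && (perms & ~node->perms) != 0)
            return false;
        // An implicit node only inherited its permissions, so it may be set.
        node->perms = perms;
        node->explicitly_unveiled = true;
        return true;
    }

    // Permissions of the deepest explicitly unveiled node along the path.
    unsigned permissions_for(const std::string& path) const
    {
        const UnveilNode* node = &m_root;
        unsigned effective = perm_none;
        for (const auto& part : split_path(path)) {
            auto it = node->children.find(part);
            if (it == node->children.end())
                break;
            node = it->second.get();
            if (node->explicitly_unveiled)
                effective = node->perms;
        }
        return effective;
    }

private:
    UnveilNode m_root;
};
```

With this model, unveiling `/home/anon/.config/whoa.ini` first creates `/home/anon` as an implicit node, and a later unveil of `/home/anon` with `"r"` succeeds because that node was never explicitly unveiled.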
AnotherTest
a9184fcb76 Kernel: Implement unveil() as a prefix-tree
Fixes #4530.
2020-12-26 11:54:54 +01:00
Lenny Maiorani
b2316701a8 Everywhere: void arguments to C functions
Problem:
- C functions with no arguments require a single `void` in the argument list.

Solution:
- Put the `void` in the argument list of functions in C header files.
2020-12-26 10:10:27 +01:00
Sahan Fernando
6b01d1cf14 LibC: Enable compiler warnings for printf format strings 2020-12-26 10:05:50 +01:00
Andreas Kling
1cfdaf96c4 Kernel: Reset the process dumpable flag on successful non-setid exec
Once we've committed to a new memory layout and non-setid credentials,
we can reset the dumpable flag.
2020-12-26 01:31:24 +01:00
Andreas Kling
82f86e35d6 Kernel+LibC: Introduce a "dumpable" flag for processes
This new flag controls two things:
- Whether the kernel will generate core dumps for the process
- Whether the EUID:EGID should own the process's files in /proc

Processes are automatically made non-dumpable when their EUID or EGID is
changed, either via syscalls that specifically modify those IDs, or via
sys$execve(), when a set-uid or set-gid program is executed.

A process can change its own dumpable flag at any time by calling the
new sys$prctl(PR_SET_DUMPABLE) syscall.

Fixes #4504.
2020-12-25 19:35:55 +01:00
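
The rules in the "dumpable" entries above can be summarized as a small state model. This is a sketch under assumptions, not kernel code: `ProcessModel` and its member names are hypothetical, "reset" on a non-setid exec is read here as "dumpable again", and an ID change only clears the flag when the value actually differs.

```c++
#include <cassert>

// Hypothetical model of the credential-related state of one process.
struct ProcessModel {
    unsigned euid { 100 };
    unsigned egid { 100 };
    bool dumpable { true };

    // Changing the effective IDs makes the process non-dumpable.
    void change_euid(unsigned new_euid)
    {
        if (new_euid != euid) {
            euid = new_euid;
            dumpable = false;
        }
    }

    void change_egid(unsigned new_egid)
    {
        if (new_egid != egid) {
            egid = new_egid;
            dumpable = false;
        }
    }

    // A set-uid/set-gid program clears the flag; a plain exec resets it.
    void exec(bool program_is_setid)
    {
        dumpable = !program_is_setid;
    }

    // Models a process changing its own flag (PR_SET_DUMPABLE in the message).
    void set_dumpable(bool value) { dumpable = value; }

    bool should_generate_core_dump() const { return dumpable; }
};
```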
Andreas Kling
3c9bd911b8 Kernel: Make /proc/PID directories owned by the EUID:EGID
This is instead of the UID:GID, since that was allowing some very bad
information leaks like spawning "su" as an unprivileged user and having
full /proc access to it.

Work towards #4504.
2020-12-25 19:35:55 +01:00
Andreas Kling
057c1d4798 Kernel: Fix build with E1000_DEBUG 2020-12-25 19:35:55 +01:00
Andreas Kling
ed5c26d698 AK: Remove custom %w format string specifier
This was a non-standard specifier alias for %04x. This patch replaces
all uses of it with new-style formatting functions instead.
2020-12-25 17:05:05 +01:00
Andreas Kling
cb2c8f71f4 AK: Remove custom %b format string specifier
This was a non-standard specifier alias for %02x. This patch replaces
all uses of it with new-style formatting functions instead.
2020-12-25 17:04:28 +01:00
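
For reference, the two removed aliases stood for the standard specifiers `%04x` and `%02x`. The sketch below shows what those produce using plain `snprintf`; the helper names `hex_word` and `hex_byte` are made up for this example, and it does not show the new-style formatting functions the commits switched to.

```c++
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// What the old %w alias expanded to: four zero-padded lowercase hex digits.
std::string hex_word(uint16_t value)
{
    char buffer[5]; // 4 digits + terminating NUL
    std::snprintf(buffer, sizeof(buffer), "%04x", static_cast<unsigned>(value));
    return buffer;
}

// What the old %b alias expanded to: two zero-padded lowercase hex digits.
std::string hex_byte(uint8_t value)
{
    char buffer[3]; // 2 digits + terminating NUL
    std::snprintf(buffer, sizeof(buffer), "%02x", static_cast<unsigned>(value));
    return buffer;
}
```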
Andreas Kling
89d3b09638 Kernel: Allocate new main thread stack before committing to exec
If the allocation fails (e.g. ENOMEM) we want to simply return an error
from sys$execve() and continue executing the current executable.

This patch also moves make_userspace_stack_for_main_thread() out of the
Thread class since it had nothing in particular to do with Thread.
2020-12-25 16:22:01 +01:00
Andreas Kling
2f1712cc29 Kernel: Move ELF auxiliary vector building out of Process class
Process had a couple of members whose only purpose was holding on to
some temporary data while building the auxiliary vector. Remove those
members and move the vector building to a free function in execve.cpp
2020-12-25 15:23:35 +01:00
Andreas Kling
40e9edd798 LibELF: Move AuxiliaryValue into the ELF namespace 2020-12-25 14:48:30 +01:00
Andreas Kling
6c9a6bea1e Kernel+LibELF: Abort ELF executable load sooner when something fails
Make it possible to bail out of ELF::Image::for_each_program_header()
and then do exactly that if something goes wrong during executable
loading in the kernel.

Also make the errors we return slightly more nuanced than just ENOEXEC.
2020-12-25 14:42:42 +01:00
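
The early-exit iteration described in the entry above can be sketched as a callback that returns a decision. The types here (`ProgramHeader` with two boolean fields, `LoadResult`, `load_program_headers`) are simplified, hypothetical stand-ins rather than the LibELF API, and the choice of `ENOMEM` versus `ENOEXEC` is only an example of "more nuanced" errors.

```c++
#include <cassert>
#include <cerrno>
#include <vector>

enum class IterationDecision { Continue, Break };

// Hypothetical, heavily simplified stand-in for an ELF program header.
struct ProgramHeader {
    bool well_formed;
    bool fits_in_memory;
};

// Visits headers in order and stops as soon as the callback asks to break.
template<typename Callback>
void for_each_program_header(const std::vector<ProgramHeader>& headers, Callback callback)
{
    for (const auto& header : headers) {
        if (callback(header) == IterationDecision::Break)
            return;
    }
}

struct LoadResult {
    int error;          // 0 on success, otherwise an errno value
    int headers_loaded; // headers processed before any failure
};

LoadResult load_program_headers(const std::vector<ProgramHeader>& headers)
{
    LoadResult result { 0, 0 };
    for_each_program_header(headers, [&](const ProgramHeader& header) {
        if (!header.well_formed) {
            result.error = ENOEXEC;
            return IterationDecision::Break;
        }
        if (!header.fits_in_memory) {
            result.error = ENOMEM;
            return IterationDecision::Break;
        }
        ++result.headers_loaded;
        return IterationDecision::Continue;
    });
    return result;
}
```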
Andreas Kling
791b32e3c6 Kernel: Remove an unnecessary cast in sys$execve() 2020-12-25 14:16:35 +01:00
Andreas Kling
9c640e67ac Kernel: Don't fetch full inode metadata in sys$execve()
We only need the size, so let's not fetch all the metadata.
2020-12-25 14:15:33 +01:00
Andreas Kling
c3eddbcb49 Kernel: Add back missing ELF::Image validity check
If the image is not a valid ELF we should just fail ASAP.
2020-12-25 14:13:44 +01:00
Andreas Kling
4986f268a5 Kernel: Convert dbg() => dbgln() in sys$execve() 2020-12-25 12:51:35 +01:00
Andreas Kling
73e151edd0 Kernel: Add formatter for VirtualAddress 2020-12-25 12:51:11 +01:00
Andreas Kling
09129782de Kernel: Simplify ELF loading logic in sys$execve() somewhat
Get rid of the lambda functions and put the logic inline in the program
header traversal loop instead. This makes the code quite a bit shorter
and hopefully makes it easier to see what's going on.
2020-12-25 02:33:57 +01:00
Andreas Kling
1e4c010643 LibELF: Remove ELF::Loader and move everyone to ELF::Image
This commit gets rid of ELF::Loader entirely since its very ambiguous
purpose was actually to load executables for the kernel, and that is
now handled by the kernel itself.

This patch includes some drive-by cleanup in LibDebug and CrashDaemon
enabled by the fact that we no longer need to keep the ref-counted
ELF::Loader around.
2020-12-25 02:14:56 +01:00
Andreas Kling
7551a66f73 Kernel+LibELF: Move sys$execve()'s loading logic from LibELF to Kernel
It was really weird that ELF loading was performed by the ELF::Loader
class instead of just being done by the kernel itself. This patch moves
all the layout logic from ELF::Loader over to sys$execve().

The kernel no longer cares about ELF::Loader and instead only uses an
ELF::Image as an interpreting wrapper around executables.
2020-12-25 01:22:55 +01:00
Andreas Kling
d7ad082afa Kernel+LibELF: Stop doing ELF symbolication in the kernel
Now that the CrashDaemon symbolicates crashes in userspace, let's take
this one step further and stop trying to symbolicate userspace programs
in the kernel at all.
2020-12-25 01:03:46 +01:00
Itamar
0cb636078a Kernel+LibELF: Allow Non ET_DYN executables to have an interpreter 2020-12-24 21:34:51 +01:00
Itamar
d64d0451e5 Kernel: Fix mmap with specific address for file backed mappings 2020-12-24 21:34:51 +01:00
Brendan Coles
b156c5a8eb ProcFS: pid_vm: Replace duplicated purgeable key with kernel+cacheable
ProcFS /proc/<pid>/vm map info no longer contains two `purgeable` keys.

The second `purgeable` key has been removed and replaced with keys for
`kernel` and `cacheable`.
2020-12-24 10:26:39 +01:00
Andreas Kling
51713901b1 Kernel: Tweak parameter name in Inode::read_entire()
This is a descriptION, not a descriptOR. :^)
2020-12-23 20:36:14 +01:00
Andreas Kling
1e21d49e86 Kernel: Fix wrong-looking overflow check in sys$execve()
This was harmless since sizeof(length) and sizeof(strings) are both 4
on x86 but let's check the right things regardless.
2020-12-23 20:34:22 +01:00
Andreas Kling
c6a0694f50 Kernel: Don't assert when reading from a listening-mode local socket
Instead just fail with EINVAL as a listening socket is never suitable
for reading from.

Fixes #4511.
2020-12-23 20:25:29 +01:00
Andreas Kling
23febb9d8e Kernel: Ptrace::handle_syscall() should return errors as KResult 2020-12-23 14:55:24 +01:00
Andreas Kling
eaa63fdda5 Kernel: Don't assert on PT_PEEK with kernelspace address
We were casting the address to Userspace<T> without validating it first
which is no good and will trap an assertion soon after.

Let's catch this sooner with an ASSERT in the Userspace<T> constructor
and update the PT_PEEK and PT_POKE handlers to avoid it.

Fixes #4505.
2020-12-23 14:50:20 +01:00
Andreas Kling
c25cf5fb56 Kernel: Panic if we're about to switch to a user thread with IOPL!=0
This is a crude protection against IOPL elevation attacks. If for
any reason we find ourselves about to switch to a user mode thread
with IOPL != 0, we'll now simply panic the kernel.

If this happens, it basically means that something tricked the kernel
into incorrectly modifying the IOPL of a thread, so it's no longer
safe to trust the kernel anyway.
2020-12-23 14:30:10 +01:00
Andreas Kling
c77dda6827 Kernel: Make KBuffer::try_create_with_bytes() actually copy the bytes
KBuffers created with this API were actually just zero-filled instead
of being populated with the provided bytes.

Fixes #4493.
2020-12-23 00:40:11 +01:00
Andreas Kling
6bfbc5f5f5 Kernel: Don't allow modifying IOPL via sys$ptrace() or sys$sigreturn()
It was possible to overwrite the entire EFLAGS register since we didn't
do any masking in the ptrace and sigreturn syscalls.

This made it trivial to gain IO privileges by raising IOPL to 3 and
then you could talk to hardware to do all kinds of nasty things.

Thanks to @allesctf for finding these issues! :^)

Their exploit/write-up: https://github.com/allesctf/writeups/blob/master/2020/hxpctf/wisdom2/writeup.md
2020-12-22 19:38:25 +01:00
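
The masking idea from the entry above can be sketched as follows. On x86, IOPL occupies bits 12 and 13 of EFLAGS. The set of flags treated as user-writable below is an assumption chosen for the example (the arithmetic status flags plus the direction flag), not necessarily the mask the kernel uses, and the function names are hypothetical.

```c++
#include <cassert>
#include <cstdint>

// EFLAGS bit positions (x86).
constexpr uint32_t flag_cf = 1u << 0;
constexpr uint32_t flag_pf = 1u << 2;
constexpr uint32_t flag_af = 1u << 4;
constexpr uint32_t flag_zf = 1u << 6;
constexpr uint32_t flag_sf = 1u << 7;
constexpr uint32_t flag_df = 1u << 10;
constexpr uint32_t flag_of = 1u << 11;
constexpr uint32_t iopl_mask = 3u << 12; // I/O privilege level, bits 12-13

// Assumption for this sketch: only these flags may be set from userspace.
constexpr uint32_t user_writable_mask
    = flag_cf | flag_pf | flag_af | flag_zf | flag_sf | flag_df | flag_of;

// Takes user-writable bits from the request, everything else from the
// thread's current value, so IOPL and similar fields cannot be overwritten.
constexpr uint32_t merge_user_eflags(uint32_t current, uint32_t requested)
{
    return (current & ~user_writable_mask) | (requested & user_writable_mask);
}

// The kind of check a "panic before switching to user mode" guard could make.
constexpr bool has_elevated_iopl(uint32_t eflags)
{
    return (eflags & iopl_mask) != 0;
}
```

A request that tries to raise IOPL to 3 keeps its carry flag but loses the IOPL bits after merging.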
Andreas Kling
b452dd13b6 Kernel: Allow sys$chmod() to modify the set-gid bit
We were incorrectly masking off the set-gid bit.

Fixes #4060.
2020-12-22 17:48:42 +01:00
Luke
72ce4abb99 Kernel/Net: Support all E1000 devices in the spec sheet
Since they're all covered by the same spec sheet, we can expect
the same code to cover most of the devices.

It can't currently differentiate between them, which would be nice to
add for determining what registers we can access.
2020-12-22 14:44:11 +01:00
Andreas Kling
2dfe5751f3 Kernel: Abort core dump generation if any substep fails
And make an effort to propagate errors out from the inner parts.
This fixes an issue where the kernel would infinitely loop in coredump
generation if the TmpFS filled up.
2020-12-22 10:09:41 +01:00
Luke
69d7a34bc2 Kernel/PCI: Add a bunch of debug output to accessors
This was useful for debugging this issue.
2020-12-22 09:24:48 +01:00
Luke
9ab9e548f4 Kernel/PCI: Create device configuration space mapping before creating a physical ID
When enumerating the hardware using MMIO mode, it would attempt to
create a physical ID first. To create a physical ID, it needs to
retrieve the capabilities of the device.

When enumerating the first device, there would be no device
configuration space mappings. Access::get_capabilities_pointer
calls PCI::read16, which in turn goes to MMIOAccess::read16_field.

MMIOAccess::read16_field attempts to get a device configuration space
and fully expects to get one. However, since this is the first device,
there are none and it crashes with an m_has_value assertion failure.

This fixes this by creating the device configuration space mapping
before creating the physical ID.

Testing with VMware Player 16.1.0.
2020-12-22 09:24:48 +01:00
Luke
0316f0627e Kernel/Net: E1000 interrupt rate register is 32-bit, not 16-bit
I looked at the spec sheet and noticed that it's 32-bit, not 16-bit.
This fixes E1000 causing an MMIO fault on VirtualBox.

Spec: https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf
Section 13.4.18
2020-12-22 09:03:46 +01:00
Tom
5f51d85184 Kernel: Improve time keeping and dramatically reduce interrupt load
This implements a number of changes related to time:
* If a HPET is present, it is now used only as a system timer, unless
  the Local APIC timer is used (in which case the HPET timer will not
  trigger any interrupts at all).
* If a HPET is present, the current time can now be as accurate as the
  chip can be, independently from the system timer. We now query the
  HPET main counter for the current time in CPU #0's system timer
  interrupt, and use that as a base line. If a high precision time is
  queried, that base line is used in combination with querying the HPET
  timer directly, which should give a much more accurate time stamp at
  the expense of more overhead. For faster time stamps, the more coarse
  value based on the last interrupt will be returned. This also means
  that any missed interrupts should not cause the time to drift.
* The default system interrupt rate is reduced to about 250 per second.
* Fix calculation of Thread CPU usage by using the amount of ticks they
  used rather than the number of times a context switch happened.
* Implement CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE and use it
  for most cases where precise timestamps are not needed.
2020-12-21 18:26:12 +01:00
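
The base-line scheme from the time keeping entry above can be sketched like this. It is a simplified model under assumptions: `TimeKeeper` is a hypothetical name, the counter period is expressed in whole nanoseconds, and counter wrap-around is ignored. The coarse time is whatever was recorded at the last timer interrupt, while the precise time adds the counter ticks elapsed since then.

```c++
#include <cassert>
#include <cstdint>

class TimeKeeper {
public:
    explicit TimeKeeper(uint64_t ns_per_counter_tick)
        : m_ns_per_counter_tick(ns_per_counter_tick)
    {
    }

    // Called from the system timer interrupt with the current main counter.
    // The elapsed time comes from the counter, so a missed interrupt only
    // delays the update and does not make the clock drift.
    void on_timer_interrupt(uint64_t counter_now)
    {
        m_base_time_ns += (counter_now - m_base_counter) * m_ns_per_counter_tick;
        m_base_counter = counter_now;
    }

    // Cheap: the value recorded at the last interrupt.
    uint64_t coarse_time_ns() const { return m_base_time_ns; }

    // More expensive: base line plus the ticks since the last interrupt.
    uint64_t precise_time_ns(uint64_t counter_now) const
    {
        return m_base_time_ns + (counter_now - m_base_counter) * m_ns_per_counter_tick;
    }

private:
    uint64_t m_ns_per_counter_tick;
    uint64_t m_base_time_ns { 0 };
    uint64_t m_base_counter { 0 };
};
```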
Liav A
469f20d4ee Kernel: Introduce the StorageManagement class
The StorageManagement class has 2 roles:
1. During boot, it should find all storage controllers in the machine,
and then determine what is the boot device.
2. Later on boot, it is a registrar of all storage controllers and
storage devices. Thus, it could be used to show information about these
devices when implemented.

This change allows the user to specify a boot device other than /dev/hda,
and if it is connected to the machine, the system will boot from it.
2020-12-21 00:19:21 +01:00
Liav A
78ae4b0530 Kernel: Change the indexing of storage devices in IDEController class
Previously, the indexing scheme was that 0 is Primary-Master, 1 is
Primary-Slave, 2 is Secondary-Master, 3 is Secondary-Slave.

Instead of merely matching between numbers to the channel & position,
the IDEController code will try to find all available drives connected to
the two channels, then it will create a Vector with nonnull RefPtr to
them. The given index is then used with this Vector.
2020-12-21 00:19:21 +01:00
Liav A
6a691306b5 Kernel: Add a method to gather the devices count of a Storage controller
Also, change device() method to be const.
2020-12-21 00:19:21 +01:00
Liav A
e3b3805abf Kernel: Add a method to check the type of a StorageController
Also, the device method in the StorageController class is public now.
2020-12-21 00:19:21 +01:00
Liav A
28599af387 Kernel: Allow to initialize an IDE device on the secondary channel
We now use major number 3, and the minor number is set to 0 or 2 if
initialized on the primary channel, otherwise 1 or 3 on the secondary
channel.
2020-12-21 00:19:21 +01:00
Liav A
0a2b00a1bf Kernel: Introduce the new Storage subsystem
This new subsystem is somewhat replacing the IDE disk code we had with a
new flexible design.

StorageDevice is a generic class that represents a generic storage
device. It is meant that specific storage hardware will override the
interface. StorageController is a generic class that represents
a storage controller that can be found in a machine.

The IDEController class governs two IDEChannels. An IDEChannel is
responsible to manage the master & slave devices of the channel,
therefore an IDEChannel is an IRQHandler.
2020-12-21 00:19:21 +01:00
Liav A
39c1783387 Kernel: Allow to install a real IRQ handler on a spurious one
IRQ 7 and 15 on the PIC architecture are used for spurious interrupts.
IRQ 7 could also be used for LPT connection, and IRQ 15 can be used for
the secondary IDE channel. Therefore, we need to allow to install a
real IRQ handler and check if a real IRQ was asserted. If so, we handle
them in the usual way.

A note on this fix - unregistering or registering a new IRQ handler
after we already registered one in the spurious interrupt handler is
not supported yet.
2020-12-21 00:19:21 +01:00
Liav A
cf0a12c68f Kernel: Add various methods to handle interrupts in the PCI subsystem
For now, we only are able to enable or disable pin based interrupts.
Later, when implemented, we could utilize MSI & MSI-X interrupts.
2020-12-21 00:19:21 +01:00
Liav A
97b36febd5 Kernel: Add a method to retrieve the Physical ID for a PCI address 2020-12-21 00:19:21 +01:00
Liav A
85b4256d10 PCI: Add list of capabilities for each device during first enumeration 2020-12-21 00:19:21 +01:00
Liav A
9d10eb473d Kernel: Add the DeviceController class in the PCI subsystem
Such device is not an IRQHandler by itself, but actually a controller of
many IRQ or MSI devices. The purpose of this class is to manage multiple
sources of interrupts.

For example, a generic ISA IDE controller controls 2 IRQ sources - 14
and 15. So, when we initialize the IDE controller, it will initialize
two IDE channels (also known as PATAChannels) to utilize IRQ 14 and 15,
respectively. NVMe with MSI-X support can theoretically handle up to
2048 interrupts.
2020-12-21 00:19:21 +01:00
Liav A
afba614d68 Kernel: Don't skip if found free page to allocate from a super region
This was a bad pattern that wasn't detected because we only had one
super physical region that was initialized by MemoryManager.
2020-12-21 00:15:58 +01:00
Lenny Maiorani
765936ebae Everywhere: Switch from (void) to [[maybe_unused]] (#4473)
Problem:
- `(void)` simply casts the expression to void. This is understood to
  indicate that it is ignored, but this is really a compiler trick to
  get the compiler to not generate a warning.

Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.

Note:
- Functions taking a `(void)` argument list have also been changed to
  `()` because this is not needed and shows up in the same grep
  command.
2020-12-21 00:09:48 +01:00
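
A short illustration of the change described in the entry above. The functions are invented for this example. The variable `divisor_ok` is only read inside `assert`, so in a build with `NDEBUG` defined it would otherwise trigger an unused-variable warning; the attribute states that intent directly instead of relying on a `(void)` cast.

```c++
#include <cassert>

int checked_divide(int dividend, int divisor)
{
    // Previously one might have written: (void)divisor_ok;
    [[maybe_unused]] bool divisor_ok = divisor != 0;
    assert(divisor_ok);
    return dividend / divisor;
}

// Parameters can carry the attribute too, for example in a stub.
int ignore_hint([[maybe_unused]] int hint, int value)
{
    return value;
}

// In C++ an empty parameter list already means "no arguments",
// so `()` is used here rather than `(void)`.
int answer()
{
    return 42;
}
```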
Andreas Kling
34e9df3c5e Kernel: Randomize memory location of the dynamic loader :^)
This should make it a little bit harder for those who would mess with
our loader.
2020-12-20 18:49:24 +01:00
Andreas Kling
02ef3f6343 Kernel: Ptrace should not assert on poke in non-mapped tracee memory 2020-12-20 18:49:24 +01:00
Andreas Kling
9bf02c32c0 Kernel: Activate SUID/SGID credentials earlier in sys$execve()
Switch on the new credentials before loading the new executable into
memory. This ensures that attempts to ptrace() the program from an
unprivileged process will fail.

This covers one bug that was exploited in the 2020 HXP CTF:
https://hxp.io/blog/79/hxp-CTF-2020-wisdom2/

Thanks to yyyyyyy for finding the bug! :^)
2020-12-20 18:49:18 +01:00
Andreas Kling
5505159a94 Kernel: Silence debug spam about select() being interrupted 2020-12-20 16:06:52 +01:00
Andreas Kling
e5eda151b4 Kernel: Silence debug spam when running dynamically linked programs 2020-12-20 16:06:39 +01:00
Andreas Kling
d893498e57 Kernel: Use fallible KBuffer API in PerformanceEventBuffer 2020-12-19 10:23:12 +01:00
Andreas Kling
3d02597316 Kernel: Avoid a heap allocation for every outgoing TCP packet 2020-12-18 19:22:26 +01:00
Andreas Kling
befabe31c9 Kernel/Net: Avoid a heap allocation for every outgoing UDP packet
We can use a stack buffer to build the UDP packet instead.
2020-12-18 19:22:26 +01:00
Andreas Kling
8cc81c2953 Kernel/Net: Make IPv4Socket::protocol_receive() take a ReadonlyBytes
The overrides of this function don't need to know how the original
packet was stored, so let's just give them a ReadonlyBytes view of
the raw packet data.
2020-12-18 19:22:26 +01:00
Andreas Kling
8e79bde2b7 Kernel: Move KBufferBuilder to the fallible KBuffer API
KBufferBuilder::build() now returns an OwnPtr<KBuffer> and can fail.
Clients of the API have been updated to handle that situation.
2020-12-18 19:22:26 +01:00
Andreas Kling
d936d86332 Kernel: Add KBuffer::try_create_with_bytes()
Here's another fallible KBuffer construction API that creates a KBuffer
and populates it with a range of bytes.
2020-12-18 19:22:26 +01:00
Andreas Kling
bcd2844439 TmpFS: Use fallible KBuffer API
If allocation fails, some TmpFS operations can now fail with ENOMEM.
2020-12-18 19:22:26 +01:00
Andreas Kling
47da86d136 Ext2FS: Fail the mount if BGD table cache allocation fails
Instead of asserting if we can't allocate enough memory for a BGD table
cache, just fail the mount instead.
2020-12-18 19:22:26 +01:00
Andreas Kling
8cde8ba511 Kernel: Add KBuffer::try_create_with_size()
We need to stop assuming that KBuffer allocation always succeeds.
This patch adds the following API:

- static OwnPtr<KBuffer> KBuffer::try_create_with_size(size_t);

All KBuffer clients should move towards using this (and handling any
failures with grace.)
2020-12-18 19:22:26 +01:00
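
The fallible construction pattern from the KBuffer entries above, sketched with standard C++ types. This is an analogy under assumptions: `Buffer` and `store_bytes` are hypothetical, `std::unique_ptr` stands in for `OwnPtr`, and `new (std::nothrow)` stands in for whatever allocator the kernel uses. A null result means allocation failed, and the caller turns that into an error instead of asserting.

```c++
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <utility>

class Buffer {
public:
    // Returns nullptr instead of crashing when memory cannot be allocated.
    static std::unique_ptr<Buffer> try_create_with_size(size_t size)
    {
        auto* data = new (std::nothrow) uint8_t[size](); // zero-filled
        if (!data)
            return nullptr;
        auto* buffer = new (std::nothrow) Buffer(data, size);
        if (!buffer) {
            delete[] data;
            return nullptr;
        }
        return std::unique_ptr<Buffer>(buffer);
    }

    // Same, but the new buffer is populated with a copy of the given bytes.
    static std::unique_ptr<Buffer> try_create_with_bytes(const uint8_t* bytes, size_t size)
    {
        auto buffer = try_create_with_size(size);
        if (buffer && size > 0)
            std::memcpy(buffer->data(), bytes, size);
        return buffer;
    }

    ~Buffer() { delete[] m_data; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    uint8_t* data() { return m_data; }
    size_t size() const { return m_size; }

private:
    Buffer(uint8_t* data, size_t size)
        : m_data(data)
        , m_size(size)
    {
    }

    uint8_t* m_data;
    size_t m_size;
};

// A caller that propagates allocation failure as -ENOMEM.
int store_bytes(const uint8_t* bytes, size_t size, std::unique_ptr<Buffer>& out)
{
    auto buffer = Buffer::try_create_with_bytes(bytes, size);
    if (!buffer)
        return -ENOMEM;
    out = std::move(buffer);
    return 0;
}
```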
Andreas Kling
4232874270 Kernel: Don't dump core when OOM-killing a process
Trying to generate a core dump under low memory conditions is not the
best idea.

Fixes #4428.
2020-12-18 11:22:21 +01:00
Liav A
5a146187cf Kernel: Workaround QEMU bug and initialize i8042 controller
ACPI 2 declared the third revision of FADT, that should have
IAPC_BOOT_ARCH flags in it, also to indicate if i8042 is present.
Q35 machine reports that it has FADT with revision 3, but the code
in QEMU simply ignores these flags and sets them to zero no matter
the revision of FADT.
2020-12-18 10:02:14 +01:00
Liav A
f36feb42bd Kernel: Return a correct name string of async write request 2020-12-17 19:36:56 +01:00
Tom
c4176b0da1 Kernel: Fix Lock race causing infinite spinning between two threads
We need to account for how many shared lock instances the current
thread owns, so that we can properly release such references when
yielding execution.

We also need to release the process lock when donating.
2020-12-16 23:38:17 +01:00
Andreas Kling
4befc2c282 Kernel: Avoid null dereference in sys$profiling_disable()
If we can't create a profiling coredump object, we shouldn't try to
call write() on it.
2020-12-15 11:25:51 +01:00
Andreas Kling
be0816507a Kernel: Remove harmless OOB ELF header access in core dump generation 2020-12-15 11:24:46 +01:00
Andreas Kling
28c042e46f Kernel: Make CoreDump::m_num_program_headers const
This makes it an error to assign to it after construction.
2020-12-15 11:24:46 +01:00
Andreas Kling
ff8bf4db8d Kernel: Don't take LexicalPath as argument
LexicalPath is a big and heavy class that's really meant as a helper
for extracting parts of a path, not for storage or passing around.
Instead, pass paths around as strings and use LexicalPath locally
as needed.
2020-12-15 11:17:01 +01:00
Itamar
1efbbf3ac3 Kernel: Don't generate a backtrace when a process exits with non-zero status 2020-12-14 23:05:53 +01:00
Itamar
5392f42731 Kernel: Generate coredumps for profiled processes
These coredumps will be used by the Profile Viewer to symbolicate the
profiling samples.
2020-12-14 23:05:53 +01:00
Itamar
39890af833 Kernel: Pass full path of output coredump file to CoreDump 2020-12-14 23:05:53 +01:00
Itamar
349c6780ce LibELF: Refactor coredump notes section structures 2020-12-14 23:05:53 +01:00
Itamar
345abc3132 Kernel: Move InodeWatcher::Event into Kernel/API/InodeWatcherEvent
This allows userspace code to parse these events.
2020-12-14 23:05:53 +01:00
Itamar
b4842d33bb Kernel: Generate a coredump file when a process crashes
When a process crashes, we generate a coredump file and write it in
/tmp/coredumps/.

The coredump file is an ELF file of type ET_CORE.
It contains a segment for every userspace memory region of the process,
and an additional PT_NOTE segment that contains the registers state for
each thread, and additional data about memory regions
(e.g. their names).
2020-12-14 23:05:53 +01:00
Itamar
efe4da57df Loader: Stabilize loader & Use shared libraries everywhere :^)
The dynamic loader is now stable enough to be used everywhere in the
system - so this commit does just that.
No More .a Files, Long Live .so's!
2020-12-14 23:05:53 +01:00
Itamar
9ca1a0731f Kernel: Support TLS allocation from userspace
This adds an allocate_tls syscall through which a userspace process
can request the allocation of a TLS region with a given size.

This will be used by the dynamic loader to allocate TLS for the main
executable & its libraries.
2020-12-14 23:05:53 +01:00
Itamar
5b87904ab5 Kernel: Add ability to load interpreter instead of main program
When the main executable needs an interpreter, we load the requested
interpreter program, and pass to it an open file descriptor to the main
executable via the auxiliary vector.

Note that we do not allocate a TLS region for the interpreter.
2020-12-14 23:05:53 +01:00
Andreas Kling
48589db3aa Kernel/Net: Socket connected state change should reevaluate blocks
This fixes an issue where TCP sockets could get into the Established
state too quickly and fail to unblock a subsequent sys$select() call.

This makes websites load *significantly* faster. :^)
2020-12-13 19:15:42 +01:00
Tom
1042762deb Kernel: Fix block recursion
Since the process lock is using the Lock class, re-locking the process
lock may cause another call to Thread::block. This caused some problems
with multiple blockers attempting to be used at the same time. To solve
this problem, remember if the process lock was held, and if it was then
relock after we're done with the blockers, just before returning.
2020-12-12 21:28:12 +01:00
Tom
c455fc2030 Kernel: Change wait blocking to Process-only blocking
This prevents zombies created by multi-threaded applications and brings
our model closer to what other OSs do.

This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
2020-12-12 21:28:12 +01:00
Tom
47ede74326 Kernel: Execute timer handlers outside of irq handler
This allows us to do things in timer handlers that involve e.g. scheduling,
such as using the Lock class or unblocking threads.
2020-12-12 21:28:12 +01:00
Tom
4bbee00650 Kernel: disown should unblock any potential waiters
This is necessary because if a process changes the state to Stopped
or resumes from that state, a wait entry is created in the parent
process. So, if a child process does this before disown is called,
we need to clear those entries to avoid leaking references/zombies
that won't be cleaned up until the former parent exits.

This also should solve an even more unlikely corner case where another
thread is waiting on a pid that is being disowned by another thread.
2020-12-12 21:28:12 +01:00
Tom
da5cc34ebb Kernel: Fix some issues related to fixes and block conditions
Fix some problems with join blocks where the joining thread block
condition was added twice, which led to a crash when trying to
unblock that condition a second time.

Deferred block condition evaluation by File objects was also not
properly keeping the File object alive, which led to some random
crashes and corruption problems.

Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.

Also, deliver signals even if there isn't going to be a context switch
to another thread.

Fixes #4336 and #4330
2020-12-12 21:28:12 +01:00
Andreas Kling
97d789c75b Kernel: Fix null dereference when execve'ing ELF without PT_TLS header
Fixes #4387.
2020-12-11 22:59:46 +01:00
Tom
03fcd02dfd Kernel: Fix leaking Timer instances
When a Timer is queued we add a reference, so whenever we remove
a timer or fire it we should drop that reference.

Fixes #4382
2020-12-11 19:33:15 +01:00
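
The reference-counting invariant from the entry above (one reference held while a timer is queued, dropped again when it is removed or fired) can be modelled with `std::shared_ptr`. This is an illustration only: `Timer` and `TimerQueue` here are hypothetical, and `std::shared_ptr` stands in for the kernel's own ref-counting, where the drop has to be done by hand.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct Timer {
    std::function<void()> callback;
};

class TimerQueue {
public:
    // Queuing a timer takes one reference to it.
    void add(std::shared_ptr<Timer> timer)
    {
        m_timers.push_back(std::move(timer));
    }

    // Removing a timer drops the queue's reference.
    bool cancel(const std::shared_ptr<Timer>& timer)
    {
        auto it = std::find(m_timers.begin(), m_timers.end(), timer);
        if (it == m_timers.end())
            return false;
        m_timers.erase(it);
        return true;
    }

    // Firing also drops the queue's reference once the callback has run.
    void fire_all()
    {
        std::vector<std::shared_ptr<Timer>> due;
        due.swap(m_timers);
        for (auto& timer : due) {
            if (timer->callback)
                timer->callback();
        }
    }

    size_t size() const { return m_timers.size(); }

private:
    std::vector<std::shared_ptr<Timer>> m_timers;
};
```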
Tom
766db673c1 Kernel: Flush TLBs concurrently
Instead of flushing the TLB on the current processor first and then
notifying the other processors to do the same, notify the others
first, and while waiting on the others flush our own.
2020-12-02 23:49:52 +01:00
Tom
5e08ae4e14 Kernel: Fix counting interrupts
Move counting interrupts out of the handle_interrupt method so that
it is done in all cases without the interrupt handler having to
implement it explicitly.

Also make the counter an atomic value as e.g. the LocalAPIC interrupts
may be triggered on multiple processors simultaneously.

Fixes #4297
2020-12-02 23:19:59 +01:00
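
A sketch of the structure described in the entry above: the counting lives in the common dispatch path, so individual handlers cannot forget it, and the counter is atomic because several processors may dispatch the same interrupt at once. The class names and the `dispatch()` entry point are assumptions for this example, not the kernel's interfaces.

```c++
#include <atomic>
#include <cassert>
#include <cstdint>

class InterruptHandler {
public:
    virtual ~InterruptHandler() = default;

    // Common entry point: count first, then delegate to the handler.
    void dispatch()
    {
        m_count.fetch_add(1, std::memory_order_relaxed);
        handle_interrupt();
    }

    uint64_t count() const { return m_count.load(std::memory_order_relaxed); }

protected:
    // Handlers implement only their own work and never touch the counter.
    virtual void handle_interrupt() = 0;

private:
    std::atomic<uint64_t> m_count { 0 };
};

// A handler that does nothing is still counted correctly.
class NullHandler final : public InterruptHandler {
    void handle_interrupt() override { }
};
```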
Tom
12cf6f8650 Kernel: Add CLOCK_REALTIME support to the TimerQueue
This allows us to use blocking timeouts with either monotonic or
real time for all blockers. Which means that clock_nanosleep()
now also supports CLOCK_REALTIME.

Also, switch alarm() to use CLOCK_REALTIME as per specification.
2020-12-02 13:02:04 +01:00
Tom
4c1e27ec65 Kernel: Use TimerQueue for SIGALRM 2020-12-02 13:02:04 +01:00