This eliminates the window between calling Processor::current and
the member function where a thread could be moved to another
processor. This is generally not as big of a concern as with
Processor::current_thread, but also slightly more light weight.
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.
There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
This file was useful for debugging a long time ago, but has bitrotted
at this point. Instead of updating it, let's just remove it since
nothing is using it.
This patch merges the profiling functionality in the kernel with the
performance events mechanism. A profiler sample is now just another
perf event, rather than a dedicated thing.
Since perf events were already per-process, this now makes profiling
per-process as well.
Processes with perf events would already write out a perfcore.PID file
to the current directory on death, but since we may want to profile
a process and then let it continue running, recorded perf events can
now be accessed at any time via /proc/PID/perf_events.
This patch also adds information about process memory regions to the
perfcore JSON format. This removes the need to supply a core dump to
the Profiler app for symbolication, and so the "profiler coredump"
mechanism is removed entirely.
There's still a hard limit of 4MB worth of perf events per process,
so this is by no means a perfect final design, but it's a nice step
forward for both simplicity and stability.
Fixes#4848Fixes#4849
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
When ProcFS could no longer allocate KBuffer objects to serve calls to
read, it would just return 0, indicating EOF. This then triggered
parsing errors because code assumed it read the file.
Because read isn't supposed to return ENOMEM, change ProcFS to populate
the file data upon file open or seek to the beginning. This also means
that calls to open can now return ENOMEM if needed. This allows the
caller to either be able to successfully open the file and read it, or
fail to open it in the first place.
There is a window between dropping the last reference and removing
a ProcFSInode from the lookup map. So, when looking up we need to
check if that Inode is being destructed.
This adds the ability for a Region to define volatile/nonvolatile
areas within mapped memory using madvise(). This also means that
memory purging takes into account all views of the PurgeableVMObject
and only purges memory that is not needed by all of them. When calling
madvise() to change an area to nonvolatile memory, return whether
memory from that area was purged. At that time also try to remap
all memory that is requested to be nonvolatile, and if insufficient
pages are available notify the caller of that fact.
This was a goofy kernel API where you could assign an icon_id (int) to
a process which referred to a global shbuf with a 16x16 icon bitmap
inside it.
Instead of this, programs that want to display a process icon now
retrieve it from the process executable instead.
This new flag controls two things:
- Whether the kernel will generate core dumps for the process
- Whether the EUID:EGID should own the process's files in /proc
Processes are automatically made non-dumpable when their EUID or EGID is
changed, either via syscalls that specifically modify those ID's, or via
sys$execve(), when a set-uid or set-gid program is executed.
A process can change its own dumpable flag at any time by calling the
new sys$prctl(PR_SET_DUMPABLE) syscall.
Fixes#4504.
This is instead of the UID:GID, since that was allowing some very bad
information leaks like spawning "su" as an unprivileged user and having
full /proc access to it.
Work towards #4504.
ProcFS /proc/<pid>/vm map info no longer contains two `purgeable` keys.
The second `purgeable` key has been removed and replaced with keys for
`kernel` and `cacheable`.
This implements a number of changes related to time:
* If a HPET is present, it is now used only as a system timer, unless
the Local APIC timer is used (in which case the HPET timer will not
trigger any interrupts at all).
* If a HPET is present, the current time can now be as accurate as the
chip can be, independently from the system timer. We now query the
HPET main counter for the current time in CPU #0's system timer
interrupt, and use that as a base line. If a high precision time is
queried, that base line is used in combination with quering the HPET
timer directly, which should give a much more accurate time stamp at
the expense of more overhead. For faster time stamps, the more coarse
value based on the last interrupt will be returned. This also means
that any missed interrupts should not cause the time to drift.
* The default system interrupt rate is reduced to about 250 per second.
* Fix calculation of Thread CPU usage by using the amount of ticks they
used rather than the number of times a context switch happened.
* Implement CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE and use it
for most cases where precise timestamps are not needed.
Problem:
- `(void)` simply casts the expression to void. This is understood to
indicate that it is ignored, but this is really a compiler trick to
get the compiler to not generate a warning.
Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.
Note:
- Functions taking a `(void)` argument list have also been changed to
`()` because this is not needed and shows up in the same grep
command.
Use the TimerQueue to expire blocking operations, which is one less thing
the Scheduler needs to check on every iteration.
Also, add a BlockTimeout class that will automatically handle relative or
absolute timeouts as well as overriding timeouts (e.g. socket timeouts)
more consistently.
Also, rework the TimerQueue class to be able to fire events from
any processor, which requires Timer to be RefCounted. Also allow
creating id-less timers for use by blocking operations.
This is a new "browse" permission that lets you open (and subsequently list
contents of) directories underneath the path, but not regular files or any other
types of files.
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
Since "rings" typically refer to code execution and user processes
can also execute in ring 0, rename these functions to more accurately
describe what they mean: kernel processes and user processes.
Add an ExpandableHeap and switch kmalloc to use it, which allows
for the kmalloc heap to grow as needed.
In order to make heap expansion to work, we keep around a 1 MiB backup
memory region, because creating a region would require space in the
same heap. This means, the heap will grow as soon as the reported
utilization is less than 1 MiB. It will also return memory if an entire
subheap is no longer needed, although that is rarely possible.
Unlike DirectoryEntry (which is used when constructing directories),
DirectoryEntryView does not manage storage for file names. Names are
just StringViews.
This is much more suited to the directory traversal API and makes
it easier to implement this in file system classes since they no
longer need to create temporary name copies while traversing.
Pledges and Veil state don't really make sense for kernel mode
processes, as they can do what ever they want since they are in
kernel mode. Make this clear in the system monitor UI by marking
these entries as null.
This would have caused an issue later when we enable -Wmissing-declarations, as
the compiler didn't see that Kernel::all_inodes() was being used elsewhere, too.
Also, this means that if the type changes later, there's not going to be weird
run-time issues, but rather a nice type error during compile time.
This enables a nice warning in case a function becomes dead code. Also, in case
of signal_trampoline_dummy, marking it external (non-static) prevents it from
being 'optimized away', which would lead to surprising and weird linker errors.
I found these places by using -Wmissing-declarations.
The Kernel still shows these issues, which I think are false-positives,
but don't want to touch:
- Kernel/Arch/i386/CPU.cpp:1081:17: void Kernel::enter_thread_context(Kernel::Thread*, Kernel::Thread*)
- Kernel/Arch/i386/CPU.cpp:1170:17: void Kernel::context_first_init(Kernel::Thread*, Kernel::Thread*, Kernel::TrapFrame*)
- Kernel/Arch/i386/CPU.cpp:1304:16: u32 Kernel::do_init_context(Kernel::Thread*, u32)
- Kernel/Arch/i386/CPU.cpp:1347:17: void Kernel::pre_init_finished()
- Kernel/Arch/i386/CPU.cpp:1360:17: void Kernel::post_init_finished()
No idea, not gonna touch it.
- Kernel/init.cpp:104:30: void Kernel::init()
- Kernel/init.cpp:167:30: void Kernel::init_ap(u32, Kernel::Processor*)
- Kernel/init.cpp:184:17: void Kernel::init_finished(u32)
Called by boot.S.
- Kernel/init.cpp:383:16: int Kernel::__cxa_atexit(void (*)(void*), void*, void*)
- Kernel/StdLib.cpp:285:19: void __cxa_pure_virtual()
- Kernel/StdLib.cpp:300:19: void __stack_chk_fail()
- Kernel/StdLib.cpp:305:19: void __stack_chk_fail_local()
Not sure how to tell the compiler that the compiler is already using them.
Also, maybe __cxa_atexit should go into StdLib.cpp?
- Kernel/Modules/TestModule.cpp:31:17: void module_init()
- Kernel/Modules/TestModule.cpp:40:17: void module_fini()
Could maybe go into a new header. This would also provide type-checking for new modules.
This compiles, and fixes two bugs:
- setpgid() confusion (see previous commit)
- tcsetpgrp() now allows to set a non-empty process group even if
the group leader has already died. This makes Serenity slightly
more POSIX-compatible.
This compiles, and contains exactly the same bugs as before.
The regex 'FIXME: PID/' should reveal all markers that I left behind, including:
- Incomplete conversion
- Issues or things that look fishy
- Actual bugs that will go wrong during runtime