We should consider whether the selected Thread is within the same jail
or not.
Therefore let's make it clear to callers with jail semantics if a called
method checks if the desired Thread object is within the same jail.
As for Thread::for_each_* methods, currently nothing in the kernel
codebase needs iteration with consideration for jails, so the old
Thread::for_each* were simply renamed to include "ignoring_jails" suffix
in their names.
Some syscalls could be simplified by using the non-static method
Process::get_thread_from_thread_list which should ensure that the
specified tid is of a Thread in the same Process of the current Thread.
These syscalls are not necessary on their own, and they give the false
impression that a caller could set or get the thread name of any process
in the system, which is not true.
Therefore, move the functionality of these syscalls to be options in the
prctl syscall, which makes it abundantly clear that these operations
could only occur from a running thread in a process that sees other
threads in that process only.
This new Kernel StdLib function will be used to copy contents of a
FixedStringBuffer with a null character to a user process.
The first user of this new function is the prctl option of
PR_GET_PROCESS_NAME which would copy a process name including a null
character to a user provided buffer.
When doing PR_{SET,GET}_PROCESS_NAME, it's not expected to pass a signed
integer for the buffer size (in arg2). Therefore, cast it immediately to
a size_t integer type, and let the FixedStringBuffer StdLib memory copy
functions in such cases to worry about possible overflows.
Now that support for 32-bit x86 has been removed, we don't have to worry
about the top half of `off_t`/`u64` values being chopped off when we try
to pass them in registers. Therefore, we no longer need the workaround
of pointers to stack-allocated values to syscalls.
Note that this changes the system call ABI, so statically linked
programs will have to be re-linked.
Using the kernel stack is preferable, especially when the examined
strings should be limited to a reasonable length.
This is a small improvement, because if we don't actually move these
strings then we don't need to own heap allocations for them during the
syscall handler function scope.
In addition to that, some kernel strings are known to be limited, like
the hostname string, for these strings we also can use FixedStringBuffer
to store and copy to and from these buffers, without using any heap
allocations at all.
Instead, use the FixedCharBuffer class to ensure we always use a static
buffer storage for these names. This ensures that if a Process or a
Thread were created, there's a guarantee that setting a new name will
never fail, as only copying of strings should be done to that static
storage.
The limits which are set are 32 characters for processes' names and 64
characters for thread names - this is because threads' names could be
more verbose than processes' names.
Since this is the block size that file system drivers *should* set,
let's name it the logical block size, just like most file systems such
as ext2 already do anyways.
Previously, we started parsing the ELF file again in a completely
different place, and without the partial mapping that we do while
validating.
Instead of doing manual parsing in two places, just capture the
requested stack size right after we validated it.
This is a preparation before we can create a usable mechanism to use
filesystem-specific mount flags.
To keep some compatibility with userland code, LibC and LibCore mount
functions are kept being usable, but now instead of doing an "atomic"
syscall, they do multiple syscalls to perform the complete procedure of
mounting a filesystem.
The FileBackedFileSystem IntrusiveList in the VFS code is now changed to
be protected by a Mutex, because when we mount a new filesystem, we need
to check if a filesystem is already created for a given source_fd so we
do a scan for that OpenFileDescription in that list. If we fail to find
an already-created filesystem we create a new one and register it in the
list if we successfully mounted it. We use a Mutex because we might need
to initiate disk access during the filesystem creation, which will take
other mutexes in other parts of the kernel, therefore making it not
possible to take a spinlock while doing this.
These 4 fields were made `Atomic` in
c3f668a758, at which time these were still
accessed unserialized and TOCTOU bugs could happen. Later, in
8ed06ad814, we serialized access to these
fields in a number of helper methods, removing the need for `Atomic`.
This has KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes being moved to the
Kernel/Library subdirectory.
Also, move the panic and assertions handling code to that directory.
After examination of all overriden Inode::traverse_as_directory methods
it seems like proper locking is already existing everywhere, so there's
no need to take the big process lock anymore, as there's no access to
shared process structures anyway.
"Wherever applicable" = most places, actually :^), especially for
networking and filesystem timestamps.
This includes changes to unzip, which uses DOSPackedTime, since that is
changed for the FAT file systems.
That's what this class really is; in fact that's what the first line of
the comment says it is.
This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
These 2 are an actual separate types of syscalls, so let's stop using
special flags for bind mounting or re-mounting and instead let userspace
calling directly for this kind of actions.
This is quite useful for userspace applications that can't cope with the
restriction, but it's still useful to impose other non-configurable
restrictions by using jails.
Instead of storing x86_64 register names in `SC_create_thread_params`,
let the Kernel figure out how to pass the parameters to
`pthread_create_helper`.
We have a problem with the original utimensat syscall because when we
do call LibC futimens function, internally we provide an empty path,
and the Kernel get_syscall_path_argument method will detect this as an
invalid path.
This happens to spit an error for example in the touch utility, so if a
user is running "touch non_existing_file", it will create that file, but
the user will still see an error coming from LibC futimens function.
This new syscall gets an open file description and it provides the same
functionality as utimensat, on the specified open file description.
The new syscall will be used later by LibC to properly implement LibC
futimens function so the situation described with relation to the
"touch" utility could be fixed.
When switching to the new address space, we also have to switch the
Process::m_master_tls_* variables as they may refer to a region in
the old address space.
This was causing `su` to not run correctly.
Regression from 65641187ff.
This replaces the previous owning address space pointer. This commit
should not change any of the existing functionality, but it lays down
the groundwork needed to let us properly access the region table under
the address space spinlock during page fault handling.
Instead of setting up the new address space on it's own, and only swap
to the new address space at the end, we now immediately swap to the new
address space (while still keeping the old one alive) and only revert
back to the old one if we fail at any point.
This is done to ensure that the process' active address space (aka the
contents of m_space) always matches actual address space in use by it.
That should allow us to eventually make the page fault handler process-
aware, which will let us properly lock the process address space lock.