(mode & S_IFDIR) is not enough to check if "mode" is a directory,
we have to check all the bits in the S_IFMT mask.
Use the is_directory() helper to fix this bug.
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.
There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
This way, if something goes wrong, we get to keep the actual error.
Also, KResults are nodiscard, so we have to deal with that in Ext2FS
instead of just silently ignoring I/O errors(!)
When freeing an inode, we were checking if it's a directory *after*
wiping the inode metadata. This caused us to forget updating the block
group descriptor with the new directory count.
Before this change, truncating an Ext2FS inode to a larger size than it
was before would give you uninitialized on-disk data.
Fix this by zeroing out all the new space when doing an inode resize.
This is pretty naively implemented via Inode::write_bytes() and there's
lots of room for cleverness here in the future.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
BlockBasedFileSystem::read_block method should get a reference of
a UserOrKernelBuffer.
If we need to force caching a block, we will call other method to do so.
clang trunk with -std=c++20 doesn't seem to properly look for an
aggregate initializer here when the type being constructed is a simple
aggregate (e.g. `struct Thing { int a; int b; };`). This template fails
to compile in a usage added 12/16/2020 in `AK/Trie.h`.
Both forms of initialization are supposed to call the
aggregate-initializers but direct-list-initialization delegating to
aggregate initializers is a new addition in c++20 that might not be
implemented yet.
If the inode's block list cache is empty, we forgot to assign the
result of computing the block list. The fact that this worked anyway
makes me wonder when we actually don't have a cache..
Thanks to szyszkienty for spotting this! :^)
e2fsck was complaining about blocks being allocated in an inode's list
of direct blocks while at the same time being free in the block bitmap.
It was easy to reproduce by creating a file with non-zero length and
then truncating it. This fixes the issue by clearing out the direct
block list when resizing a file to 0.
When computing the list of blocks to deallocate when freeing an inode,
we would stop collecting blocks after reaching the inode's block count.
Since we're getting rid of the inode, we need to also include the meta
blocks used by the on-disk block list itself.
The block group indices are 1-based for some reason. Because of that,
we were forgetting to check in the very last block group when doing
block allocation. This caused block allocation to fail even when the
superblock indicated that we had free blocks.
Fixes#3674.
First of all, this fixes a dumb info leak where we'd write kernel heap
addresses (StringImpl*) into userspace memory when reading a watcher.
Instead of trying to pass names to userspace, we now simply pass the
child inode index. Nothing in userspace makes use of this yet anyway,
so it's not like we're breaking anything. We'll see how this evolves.
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
A change introduced in 5e01234 made it the resposibility of each
filesystem to have the file types returned from
'traverse_as_directory' match up with the DT_* types.
However, this caused corruption of the Ext2FS file format because
the Ext2FS uses 'traverse_as_directory' internally when manipulating
the file system. The result was a mixture between EXT2_FT_* and DT_*
file types in the internal Ext2FS structures.
Starting with this commit, the conversion from internal filesystem file
types to the user facing DT_* types happens at a later stage,
in the 'FileDescription::get_dir_entries' function which is directly
used by sys$get_dir_entries.
This fixes an issue we had in the git port where git would not
recognize untracked files (for example in 'git status').
When git used readdir, the 'd_type' field in the dirent struct contained
bad values (Specifically, it contained the values defiend in
Kernel/FileSystem/ext2_fs.h instead of the ones in LibC/dirent.h).
After this fix, we can create a new git repository with 'git init', and
then stage and commit files as usual.
Unlike DirectoryEntry (which is used when constructing directories),
DirectoryEntryView does not manage storage for file names. Names are
just StringViews.
This is much more suited to the directory traversal API and makes
it easier to implement this in file system classes since they no
longer need to create temporary name copies while traversing.
Certain implementations of Inode::directory_entry_count were calling
functions which returned errors, but had no way of surfacing them.
Switch the return type to KResultOr<size_t> and start observing these
error paths.
We can now properly initialize all processors without
crashing by sending SMP IPI messages to synchronize memory
between processors.
We now initialize the APs once we have the scheduler running.
This is so that we can process IPI messages from the other
cores.
Also rework interrupt handling a bit so that it's more of a
1:1 mapping. We need to allocate non-sharable interrupts for
IPIs.
This also fixes the occasional hang/crash because all
CPUs now synchronize memory with each other.
FileBackedFileSystem is one that's backed by (mounted from) a file, in other
words one that has a "source" of the mount; that doesn't mean it deals in
blocks. The hierarchy now becomes:
* FS
* ProcFS
* DevPtsFS
* TmpFS
* FileBackedFS
* (future) Plan9FS
* BlockBasedFS
* Ext2FS
These APIs were clearly modeled after Ext2FS internals, and make perfect sense
in Ext2FS context. The new APIs are more generic, and map better to the
semantics exported to the userspace, where inode identifiers only appear in
stat() and readdir() output, but never in any input.
This will also hopefully reduce the potential for races (see commit c44b4d61f3).
Lastly, this makes it way more viable to implement a filesystem that only
synthesizes its inodes lazily when queried, and destroys them when they are no
longer in use. With inode identifiers being used to reference inodes, the only
choice for such a filesystem is to persist any inode it has given out the
identifier for, because it might be queried at any later time. With direct
references to inodes, the filesystem will know when the last reference is
dropped and the inode can be safely destroyed.
Allow file system implementation to return meaningful error codes to
callers of the FileDescription::read_entire_file(). This allows both
Process::sys$readlink() and Process::sys$module_load() to return more
detailed errors to the user.
For singly-indirect blocks, "callback" is just "add_block".
For doubly-indirect blocks, "callback" is the lambda function
iterating on singly-indirect blocks: so instead of adding itself to the
list, the doubly-indirect block will add all its childs, but they add
themselves again when they run the callback of singly-indirect blocks.
And nothing adds the doubly-indirect block itself :(
This leads to a double free of all child blocks of the doubly-indirect
block, which is the failed assert described in #1549.
Closes: #1549.
read_block() and write_block() now accept the count (how many bytes to read
or write) and offset (where in the block to start; defaults to 0). Using these
new APIs, we can avoid doing copies between intermediary buffers in a lot more
cases. Hopefully this improves performance or something.