Now the filesystem is generated on-the-fly instead of manually adding and
removing inodes as processes spawn and die.
The code is convoluted and bloated as I wrote it while sleepless. However,
it's still vastly better than the old ProcFS, so I'm committing it.
I also added /proc/PID/fd/N symlinks for each of a process's open fd's.
This exposed a serious race condition in page_in_from_inode().
Reordered the logic and added a paging lock to VMObject.
Now, only one process can page in from a VMObject at a time.
There are definitely ways to optimize this, for instance by making
the locking be per-page instead. It's not something that I'm going
to worry about right now though.
Previously, calling Region::commit() would make sure to allocate any missing
physical pages, but they would contain uninitialized data. This was very
obvious when allocating GraphicsBitmaps as they would have garbage pixels
rather than being filled with black.
The MM quickmap mechanism didn't work when operating on a non-active page
directory (which happens when we're in the middle of exec, for example.)
This patch changes quickmap to reside in the shared kernel page directory.
Also added some missing clobber lists to inline asm that I stumbled on.
Make PageDirectory retainable and have each Region co-own the PageDirectory
they're mapped into. When unmapped, Region has no associated PageDirectory.
This allows Region to automatically unmap itself when destroyed.
Only booleans are supported at first. More types can be added easily.
Use this to add /proc/sys/wm_flash_flush which when enabled flashes pending
screen flush rects in yellow before they happen.
Okay, now ProcFS doesn't crash due to the crappy buffer size estimates
not really working out. This thing has dogshit performance and I will
fix that separately.
This is a lot better than having them in kmalloc memory. I'm gonna need
a way to keep track of which process owns which bitmap eventually,
maybe through some sort of resource keying system. We'll see.
The old approach only worked because of an overpermissive accident.
There's now a concept of supervisor physical pages that can be allocated.
They all sit in the low 4 MB of physical memory and are identity mapped,
shared between all processes, and only ring 0 can access them.
This container is really just there to keep a retain on the individual
PhysicalPages for each page table. A HashMap does the job with far greater
space efficiency.
mmap() will now map uncommitted pages that get allocated and zeroed upon the
first access. I also made /proc/PID/vm show number of "committed" bytes in
each region. This is so cool! :^)
- Process::exec() needs to restore the original paging scope when called
on a non-current process.
- Add missing InterruptDisabler guards around g_processes access.
- Only flush the TLB when modifying the active page tables.
This is really sweet! :^) The four instances of /bin/sh spawned at
startup now share their read-only text pages.
There are problems and limitations here, and plenty of room for
improvement. But it kinda works.
All right, we can now mmap() a file and it gets magically paged in from fs
in response to an NP page fault. This is really cool :^)
I need to refactor this to support sharing of read-only file-backed pages,
but it's cool to just have something working.
First of all, change sys$mmap to take a struct SC_mmap_params since our
sycsall calling convention can't handle more than 3 arguments.
This exposed a bug in Syscall::invoke() needing to use clobber lists.
It was a bit confusing to debug. :^)
sys$fork() now clones all writable regions with per-page COW bits.
The pages are then mapped read-only and we handle a PF by COWing the pages.
This is quite delightful. Obviously there's lots of work to do still,
and it needs better data structures, but the general concept works.
This turned out way better than the old code. ELF loading is now quite
straightforward, and we don't need the weird concept of subregions anymore.
Next step is to respect the is_writable flag.
This was the fix:
-process.m_page_directory[0] = m_kernel_page_directory[0];
-process.m_page_directory[1] = m_kernel_page_directory[1];
+process.m_page_directory->entries[0] = m_kernel_page_directory->entries[0];
+process.m_page_directory->entries[1] = m_kernel_page_directory->entries[1];
I spent a good two hours scratching my head, not being able to figure out why
user process page directories felt they had ownership of page tables in the
kernel page directory.
It was because I was copying the entire damn kernel page directory into
the process instead of only sharing the two first PDE's. Dang!
This is quite cool! The syscall entry point plumbs the register dump
down to sys$fork(), which uses it to set up the child process's TSS
in order to resume execution right after the int 0x80 fork() call. :^)
This works pretty well, although there is some problem with the kernel
alias mappings used to clone the parent process's regions. If I disable
the MM::release_page_directory() code, there's no problem. Probably there's
a premature freeing of a physical page somehow.
This is way better than walking the region lists. I suppose we could
even let the hardware trigger a page fault and handle that. That'll
be the next step in the evolution here I guess.
I added an RAII helper called OtherTaskPagingScope. While present,
it switches the kernel over to using another task's page directory.
This is perfect for e.g walking the stack in /proc/PID/stack.