Kernel: Don't finalize a thread while it still has code running

After marking a thread for death we might end up finalizing the thread
while it still has code to run, e.g. via:

Thread::block -> Thread::dispatch_one_pending_signal
-> Thread::dispatch_signal -> Process::terminate_due_to_signal
-> Process::die -> Process::kill_all_threads -> Thread::set_should_die

This marks the thread for death. It isn't destroyed at this point
though.
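
A simplified sketch (hypothetical and condensed, not the verbatim
kernel code) of what set_should_die amounts to:

    void Thread::set_should_die()
    {
        // Only records the intent to die. The Thread object, its
        // kernel stack, and any frames still live on that stack
        // (such as Thread::block) stay around until the finalizer
        // reaps the thread.
        m_should_die = true;
    }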

The scheduler then gets invoked via:

Thread::block -> Thread::relock_process

At that point we still have a registered blocker in the stack frame
belonging to Thread::block. Thread::relock_process drops the critical
section, which allows the scheduler to run.
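
Schematically (a hypothetical, simplified shape of Thread::block;
register_blocker/unregister_blocker are made-up names), the blocker is
still registered when the critical section is dropped:

    void Thread::block(/* ... */)
    {
        SomeBlocker blocker;         // lives in this stack frame, on
                                     // the dying thread's kernel stack
        register_blocker(blocker);   // thread now points at the blocker
        // ... a pending signal handled here may mark us for death ...
        relock_process(previous_locked, lock_count_to_restore);
        // ^ dropping the critical section lets the scheduler run while
        //   'blocker' is still registered; the cleanup below may never
        //   execute if the thread is finalized first
        unregister_blocker(blocker);
    }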

When the thread is then scheduled out, the scheduler sets the thread
state to Thread::Dying, which allows the finalizer to destroy the
Thread object and its associated resources, including the kernel stack.
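
The finalizer side, roughly (finalize_one is a hypothetical name for
illustration; the real work happens in FinalizerTask):

    void finalize_one(Thread& thread)
    {
        VERIFY(thread.state() == Thread::Dying);
        thread.finalize();  // tears down the Thread and its kernel
                            // stack, even though Thread::block's frame
                            // (and the blocker in it) may still be
                            // live on that stack
    }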

This probably also affects objects other than blockers that rely on
their destructor being run. However, the problem was most noticeable
with blockers because they are allocated on the stack of the dying
thread and cause an access violation when another thread touches a
blocker that belonged to the now-dead thread.
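
A minimal userspace analogue (standalone and hypothetical, not kernel
code) of why that pattern breaks:

    #include <vector>

    struct Blocker {
        bool unblocked { false };
        ~Blocker() { /* would deregister itself here */ }
    };

    static std::vector<Blocker*> g_blockers; // shared list of blockers

    void doomed_frame()
    {
        Blocker blocker;                // lives on this "thread's" stack
        g_blockers.push_back(&blocker);
        // If the stack disappears without running ~Blocker and without
        // removing the pointer, g_blockers now dangles.
    }

    void another_thread()
    {
        for (auto* b : g_blockers)
            b->unblocked = true;        // the reported access violation
    }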

Fixes #7823.
Author: Gunnar Beutner 2021-06-06 11:40:11 +02:00, committed by Andreas Kling
parent cab2ee5ea2
commit 3c2a6a25da
2 changed files with 12 additions and 18 deletions

--- a/Kernel/Scheduler.cpp
+++ b/Kernel/Scheduler.cpp
@@ -175,8 +175,6 @@ bool Scheduler::pick_next()
 {
     VERIFY_INTERRUPTS_DISABLED();
 
-    auto current_thread = Thread::current();
-
     // Set the m_in_scheduler flag before acquiring the spinlock. This
     // prevents a recursive call into Scheduler::invoke_async upon
     // leaving the scheduler lock.
@@ -194,22 +192,6 @@ bool Scheduler::pick_next()
 
     ScopedSpinLock lock(g_scheduler_lock);
 
-    if (current_thread->should_die() && current_thread->state() == Thread::Running) {
-        // Rather than immediately killing threads, yanking the kernel stack
-        // away from them (which can lead to e.g. reference leaks), we always
-        // allow Thread::wait_on to return. This allows the kernel stack to
-        // clean up and eventually we'll get here shortly before transitioning
-        // back to user mode (from Processor::exit_trap). At this point we
-        // no longer want to schedule this thread. We can't wait until
-        // Scheduler::enter_current because we don't want to allow it to
-        // transition back to user mode.
-        if constexpr (SCHEDULER_DEBUG)
-            dbgln("Scheduler[{}]: Thread {} is dying", Processor::id(), *current_thread);
-        current_thread->set_state(Thread::Dying);
-    }
-
     if constexpr (SCHEDULER_RUNNABLE_DEBUG) {
         dump_thread_list();
     }

--- a/Kernel/Thread.cpp
+++ b/Kernel/Thread.cpp
@@ -241,6 +241,16 @@ void Thread::die_if_needed()
     u32 unlock_count;
     [[maybe_unused]] auto rc = unlock_process_if_locked(unlock_count);
 
     dbgln_if(THREAD_DEBUG, "Thread {} is dying", *this);
 
+    {
+        ScopedSpinLock lock(g_scheduler_lock);
+        // It's possible that we don't reach the code after this block if the
+        // scheduler is invoked and FinalizerTask cleans up this thread, however
+        // that doesn't matter because we're trying to invoke the scheduler anyway
+        set_state(Thread::Dying);
+    }
+
     ScopedCritical critical;
     // Flag a context switch. Because we're in a critical section,
@@ -316,6 +326,8 @@ LockMode Thread::unlock_process_if_locked(u32& lock_count_to_restore)
 void Thread::relock_process(LockMode previous_locked, u32 lock_count_to_restore)
 {
+    VERIFY(state() == Thread::Running);
+
     // Clearing the critical section may trigger the context switch
     // flagged by calling Scheduler::donate_to or Scheduler::yield
     // above. We have to do it this way because we intentionally