robert-w-gries / rxinu

Rust implementation of Xinu educational operating system

Random deadlocks caused by PIT interrupt during allocation

robert-w-gries opened this issue

Problem

Now that the timer IRQ handler is calling resched(), any code that is not wrapped in a call to the interrupts::disable_then_restore() function can be interrupted to schedule a new process.

Since we do not support preemption in our kernel yet, we essentially wrap all scheduling code in disable_then_restore(). This prevents our kernel from being interrupted while holding important locks, such as the PIC locks or the process table locks.
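The save, disable, run, restore pattern behind a helper like disable_then_restore() can be sketched as follows. This is a minimal user-space model, not the rxinu implementation: the INTERRUPTS_ENABLED flag and interrupts_enabled() are assumptions that stand in for the CPU interrupt flag, which a real kernel would read from the flags register and change with cli/sti.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Assumption: a stand-in for the CPU interrupt flag so the sketch runs in user space.
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

/// Reports whether (simulated) interrupts are currently enabled.
pub fn interrupts_enabled() -> bool {
    INTERRUPTS_ENABLED.load(Ordering::SeqCst)
}

/// Runs `f` with (simulated) interrupts disabled, then restores the
/// state that was in effect before the call.
pub fn disable_then_restore<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    // Record the previous state and disable interrupts in one step.
    let was_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
    let result = f();
    // Restore instead of unconditionally enabling, so that a nested call
    // does not turn interrupts back on inside an outer critical section.
    if was_enabled {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
    result
}
```

Restoring the saved state, rather than always re-enabling, is what lets one wrapped function call another without ending the outer critical section early.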

However, one source of deadlocks remains. When processes use the allocator API to create structures like Vec or String, the timer interrupt can fire while the linked-list-allocator is holding the Heap lock. If the code that runs next also allocates or frees memory, it spins on that lock and never acquires it.
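The failure mode can be modelled in user space with a tiny spinlock. This is only an illustration under stated assumptions: SpinFlag, lock_bounded(), and interrupted_while_holding() are hypothetical names, the spin is bounded so the demo terminates (a real spinlock would loop forever), and it assumes the second acquisition happens with interrupts disabled so the original holder never gets to run and release the lock.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical minimal spinlock flag, standing in for the heap lock.
pub struct SpinFlag {
    locked: AtomicBool,
}

impl SpinFlag {
    pub const fn new() -> Self {
        SpinFlag { locked: AtomicBool::new(false) }
    }

    /// Tries once to take the lock; returns true on success.
    pub fn try_lock(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Releases the lock.
    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }

    /// Spins up to `max_spins` times. Bounded only so this sketch terminates.
    pub fn lock_bounded(&self, max_spins: usize) -> bool {
        for _ in 0..max_spins {
            if self.try_lock() {
                return true;
            }
            std::hint::spin_loop();
        }
        false
    }
}

/// Models the sequence: a process takes the heap lock, the timer interrupt
/// reschedules, and the code that runs next tries to take the same lock.
/// Returns whether the second acquisition succeeded.
pub fn interrupted_while_holding(heap: &SpinFlag) -> bool {
    // The process enters the allocator and takes the lock.
    let held = heap.try_lock();
    assert!(held, "the lock should be free at the start of the demo");

    // Timer interrupt fires here; the newly scheduled code frees a Vec
    // and spins on the lock that the suspended process still holds.
    let acquired = heap.lock_bounded(1_000);

    // Never reached in the real deadlock; present only to reset the demo.
    heap.unlock();
    acquired
}
```

In this model the second acquisition can never succeed, because the only code that could release the lock is the suspended holder.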

We need to find a way to disable interrupts before using the allocator API, preferably without vendoring the linked-list-allocator code in this repo.

Possible Solutions

Naive solution

We can wrap every allocator call in a process with disable_then_restore(). This is impractical and poor design.

Use syscall for all memory allocation

I don't know if this is feasible, since the allocator API is invoked automatically whenever a Vec or String is created. It seems possible but undesirable, because Vec and String creation would need to be wrapped to go through the syscall.

Wrap the linked-list-allocator

We might be able to wrap the linked-list-allocator with our own allocator that just disables interrupts then calls the allocate/deallocate methods in linked-list-allocator. I think this is our best path forward.
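A rough sketch of the wrapping idea is shown here. It makes several assumptions: it uses today's stable GlobalAlloc trait rather than the nightly Alloc trait seen in the stack trace in this issue, the System allocator stands in for the linked-list-allocator heap, and interrupt control is simulated with a flag. InterruptSafeAlloc and Probe are hypothetical names introduced only for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

// Assumption: simulated interrupt flag; a kernel would use cli/sti instead.
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

fn disable_then_restore<T>(f: impl FnOnce() -> T) -> T {
    let was_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
    let result = f();
    if was_enabled {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
    result
}

/// Hypothetical wrapper: forwards every call to the inner allocator
/// while (simulated) interrupts are disabled.
pub struct InterruptSafeAlloc<A>(pub A);

unsafe impl<A: GlobalAlloc> GlobalAlloc for InterruptSafeAlloc<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        disable_then_restore(|| unsafe { self.0.alloc(layout) })
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        disable_then_restore(|| unsafe { self.0.dealloc(ptr, layout) })
    }
}

/// Hypothetical inner allocator for the demo: forwards to System and
/// records whether it was ever entered with interrupts still enabled.
pub struct Probe {
    pub saw_interrupts_enabled: AtomicBool,
}

impl Probe {
    pub const fn new() -> Self {
        Probe { saw_interrupts_enabled: AtomicBool::new(false) }
    }

    fn note_flag(&self) {
        if INTERRUPTS_ENABLED.load(Ordering::SeqCst) {
            self.saw_interrupts_enabled.store(true, Ordering::SeqCst);
        }
    }
}

unsafe impl GlobalAlloc for Probe {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        self.note_flag();
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.note_flag();
        unsafe { System.dealloc(ptr, layout) }
    }
}
```

In a kernel, a wrapper of this shape would presumably be registered as the global allocator in place of the bare heap (the static named HEAP_ALLOCATOR in the trace), so Vec and String would pick it up without any change at the call sites and without vendoring the allocator crate.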

Debugging

Frame #4 in the call stack below is where we trigger the deadlock in linked-list-allocator. You can tell it's a deadlock because interrupts stop firing and the same lock check keeps executing in a loop.

#4  0x0000000000131ac4 in linked_list_allocator::{{impl}}::dealloc (self=0x40004090, 
    ptr=0x400022d8 "\000", layout=...)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/linked_list_allocator-0.4.3/src/lib.rs:176
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout) {
        self.0.lock().deallocate(ptr, layout)
    }
(gdb) list
1486	
1487	#[inline]
1488	unsafe fn atomic_load<T>(dst: *const T, order: Ordering) -> T {
1489	    match order {
1490	        Acquire => intrinsics::atomic_load_acq(dst),
1491	        Relaxed => intrinsics::atomic_load_relaxed(dst),
1492	        SeqCst => intrinsics::atomic_load(dst),
1493	        Release => panic!("there is no such thing as a release load"),
1494	        AcqRel => panic!("there is no such thing as an acquire/release load"),
1495	        __Nonexhaustive => panic!("invalid memory ordering"),
(gdb) info stack
#0  core::sync::atomic::atomic_load<u8> (dst=0x147018 <rxinu::HEAP_ALLOCATOR> "\001\000", 
    order=core::sync::atomic::Ordering::Relaxed)
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/sync/atomic.rs:1491
#1  0x000000000014581c in core::sync::atomic::AtomicBool::load (
    self=0x147018 <rxinu::HEAP_ALLOCATOR>, order=core::sync::atomic::Ordering::Relaxed)
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/sync/atomic.rs:316
#2  0x0000000000132d5f in spin::mutex::Mutex<linked_list_allocator::Heap>::obtain_lock<linked_list_allocator::Heap> (self=0x147018 <rxinu::HEAP_ALLOCATOR>)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/spin-0.4.6/src/mutex.rs:167
#3  0x0000000000132d95 in spin::mutex::Mutex<linked_list_allocator::Heap>::lock<linked_list_allocator::Heap> (self=0x147018 <rxinu::HEAP_ALLOCATOR>)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/spin-0.4.6/src/mutex.rs:191
#4  0x0000000000131ac4 in linked_list_allocator::{{impl}}::dealloc (self=0x40004090, 
    ptr=0x400022d8 "\000", layout=...)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/linked_list_allocator-0.4.3/src/lib.rs:176
#5  0x0000000000129964 in rxinu::__rg_allocator_abi::__rg_dealloc (arg0=0x400022d8 "\000", 
    arg1=8192, arg2=8) at src/lib.rs:123
#6  0x000000000011456e in alloc::heap::{{impl}}::dealloc (self=0x40001b70, ptr=0x400022d8 "\000", 
    layout=...)
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/heap.rs:104
#7  0x0000000000111e70 in alloc::raw_vec::RawVec<usize, alloc::heap::Heap>::dealloc_buffer<usize,alloc::heap::Heap> (self=0x40001b70)
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:687
#8  0x0000000000112f05 in alloc::raw_vec::{{impl}}::drop<usize,alloc::heap::Heap> (self=0x40001b70)
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:696
#9  0x0000000000141595 in core::ptr::drop_in_place<alloc::raw_vec::RawVec<usize, alloc::heap::Heap>>
    ()
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#10 0x00000000001411cf in core::ptr::drop_in_place<alloc::vec::Vec<usize>> ()
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#11 0x0000000000141b44 in core::ptr::drop_in_place<core::option::Option<alloc::vec::Vec<usize>>> ()
    at /home/rob/.rustup/toolchains/nightly-2017-12-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:59
#12 0x0000000000121095 in rxinu::scheduling::cooperative_scheduler::{{impl}}::kill (
    self=0x148bc8 <<rxinu::scheduling::SCHEDULER as core::ops::deref::Deref>::deref::__stability::LAZY+8>, id=...) at src/scheduling/cooperative_scheduler.rs:84
#13 0x0000000000119703 in rxinu::scheduling::process::process_ret () at src/scheduling/process.rs:108
#14 0x0000000000148bc8 in <rxinu::scheduling::SCHEDULER as core::ops::deref::Deref>::deref::__stability::LAZY ()
#15 0x00000000001489f0 in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x0000000000000001 in ?? ()
#18 0x0000000000000000 in ?? ()