Mysterious endless spinlock in PIT init when compiling with FEATURES=vga
robert-w-gries opened this issue
When building with make run FEATURES=vga, we hit an endless spinlock in the PIT init function here:
PIT.lock()[0].write(PIT_SET);
(gdb)
<spin::mutex::Mutex<T>>::obtain_lock (self=0x146b98 <rxinu::device::pit::PIT>)
at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/spin-0.4.7/src/mutex.rs:169
169 cpu_relax();
It should be impossible for PIT to already be locked at this point, because this line in init() is the first place where PIT is locked.
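For context, a spin lock of this kind only ever succeeds when its flag reads as "unlocked", so any non-zero flag makes the very first lock() call spin forever, regardless of who set it. The following is a simplified model of that behavior, not the spin crate's actual code: ModelLock and lock_with_limit are hypothetical names, and an AtomicU8 stands in for the real AtomicBool so that an arbitrary byte value can be modeled.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical stand-in for the flag of a spin mutex.
/// A real `AtomicBool` must only hold 0 or 1, so `AtomicU8` is used here
/// to model a flag byte that something else has written to.
struct ModelLock {
    flag: AtomicU8,
}

impl ModelLock {
    const fn new(initial: u8) -> Self {
        ModelLock { flag: AtomicU8::new(initial) }
    }

    /// One acquisition attempt: succeeds only if the flag is exactly 0.
    fn try_lock(&self) -> bool {
        self.flag
            .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Bounded spin so the example terminates; a real spin lock would
    /// keep looping until the flag becomes 0.
    fn lock_with_limit(&self, max_spins: u32) -> bool {
        for _ in 0..max_spins {
            if self.try_lock() {
                return true;
            }
            std::hint::spin_loop();
        }
        false
    }
}

fn main() {
    let clean = ModelLock::new(0);
    let clobbered = ModelLock::new(0x40);
    println!("clean flag acquired: {}", clean.lock_with_limit(1000));
    println!("non-zero flag acquired: {}", clobbered.lock_with_limit(1000));
}
```

In this model the lock with a zero flag is acquired on the first attempt, while the lock whose flag starts at a non-zero byte is never acquired, even though nothing ever called lock() on it.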
Strangely, removing code from a loop lower down in rust_main fixes the deadlock:
diff --git a/src/lib.rs b/src/lib.rs
index dae5ba2..6085fc9 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -59,16 +59,6 @@ pub extern "C" fn rust_main(multiboot_information_address: usize) {
syscall::create(rxinu_main, String::from("rxinu_main"));
loop {
- #[cfg(feature = "serial")]
- {
- use device::uart_16550 as uart;
- uart::read(1024);
- }
- #[cfg(feature = "vga")]
- {
- use device::keyboard::ps2 as kbd;
- kbd::read(1024);
- }
}
}
This needs more investigation.
- Check if older versions of nightly compiler work with original code
- Check for bugs related to file size
I do not see the spinlock with rustc 1.25.0-nightly (27a046e93 2018-02-18).
I will need to see which version of rustc causes the issue.
This issue was introduced in nightly-2018-04-07.
I've discovered that the value of PIT.lock is changed during the lazy static initialization of IDT.
19 static ref IDT: [IdtEntry; 256] = {
20 use arch::x86::interrupts::exception::*;
21
22 let mut idt: [IdtEntry; 256] = [IdtEntry::MISSING; 256];
23
24 idt[0] = intr_handler_entry(divide_by_zero as usize);
25 idt[1] = intr_handler_entry(debug as usize);
26 idt[2] = intr_handler_entry(non_maskable as usize);
27 idt[3] = intr_handler_entry(break_point as usize);
28 idt[4] = intr_handler_entry(overflow as usize);
intr_handler_entry simply calls create_idt_entry, which returns an x86::IdtEntry:
fn intr_handler_entry(ptr: usize) -> IdtEntry {
create_idt_entry(ptr, PrivilegeLevel::Ring0)
}
It is after this call to create_idt_entry() that we see the value of PIT.lock increase by 0x40. Each successive call to create_idt_entry() increases the value of PIT.lock by 0x50. This behavior is inexplicable. Perhaps we are hitting undefined behavior?
Here are the gdb logs:
PIT.lock is changed to 0x40
Breakpoint 2, <rxinu::arch::x86::idt::IDT as core::ops::deref::Deref>::deref::__static_ref_initialize () at src/arch/x86/idt.rs:24
24 idt[0] = intr_handler_entry(divide_by_zero as usize);
(gdb) p rxinu::device::pit::PIT.lock
$2 = core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 0}}
(gdb)
<rxinu::arch::x86::idt::IDT as core::ops::deref::Deref>::deref::__static_ref_initialize () at src/arch/x86/idt.rs:25
25 idt[1] = intr_handler_entry(debug as usize);
(gdb) p rxinu::device::pit::PIT.lock
$8 = core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 0}}
(gdb) s
rxinu::arch::x86::idt::intr_handler_entry (ptr=1168704) at src/arch/x86/idt.rs:85
85 create_idt_entry(ptr, PrivilegeLevel::Ring0)
(gdb) p rxinu::device::pit::PIT.lock
$9 = core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 0}}
(gdb) s
rxinu::arch::x86::idt::create_idt_entry (ptr=1168704, privilege=x86::shared::PrivilegeLevel::Ring0) at src/arch/x86/idt.rs:105
105 VAddr::from_usize(ptr),
(gdb) p rxinu::device::pit::PIT.lock
$10 = core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 64}}
PIT.lock is increased by 0x50
(gdb) s
<rxinu::arch::x86::idt::IDT as core::ops::deref::Deref>::deref::__static_ref_initialize () at src/arch/x86/idt.rs:26
26 idt[2] = intr_handler_entry(non_maskable as usize);
(gdb) s
rxinu::arch::x86::idt::intr_handler_entry (ptr=1168784) at src/arch/x86/idt.rs:85
85 create_idt_entry(ptr, PrivilegeLevel::Ring0)
(gdb)
rxinu::arch::x86::idt::create_idt_entry (ptr=1168784, privilege=x86::shared::PrivilegeLevel::Ring0) at src/arch/x86/idt.rs:105
105 VAddr::from_usize(ptr),
(gdb) p/x rxinu::device::pit::PIT
$12 = spin::mutex::Mutex<[rxinu::syscall::io::port::Port<u8>; 2]> {lock: core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 0x90}}
[snip]
Time for assembly code. Here is the beginning of create_idt_entry() (remember that our PIT.lock value was changed before the call to VAddr::from_usize(ptr)):
0000000000125920 <_ZN5rxinu4arch3x863idt16create_idt_entry17hb6129a1fdda20043E>:
125920: 55 push %rbp
125921: 48 89 e5 mov %rsp,%rbp
125924: 48 83 ec 40 sub $0x40,%rsp
125928: 88 d0 mov %dl,%al
12592a: 48 89 f9 mov %rdi,%rcx
>>12592d: 48 89 75 e8 mov %rsi,-0x18(%rbp)
125931: 88 45 f6 mov %al,-0xa(%rbp)
125934: 48 8b 75 e8 mov -0x18(%rbp),%rsi
125938: 48 89 7d e0 mov %rdi,-0x20(%rbp)
12593c: 48 89 f7 mov %rsi,%rdi
12593f: 48 89 4d d8 mov %rcx,-0x28(%rbp)
125943: e8 d8 40 01 00 callq 139a20 <_ZN3x866shared6paging5VAddr10from_usize17h4eab7afc2319aa81E>
The value of PIT.lock is changed in instruction 0x12592d.
Value of registers before instruction 0x12592d is executed:
(gdb) info registers
rax 0x147c50 1342544
rbx 0x0 0
rcx 0xd500 54528
rdx 0x0 0
rsi 0x11d540 1168704
rdi 0x147c50 1342544
rbp 0x146bb0 0x146bb0
rsp 0x146bb0 0x146bb0
[snip]
rip 0x125924 0x125924 <rxinu::arch::x86::idt::create_idt_entry+4>
[snip]
Our rsi register has the value 0x11d540. This address is the function pointer to our debug exception handler.
(gdb) x 0x11d540
0x11d540 <rxinu::arch::x86::interrupts::exception::debug>: 0xe5894855
nightly-2018-04-07
Broken Behavior
When instruction 0x12592d is executed, we store our function pointer in a stack slot. This slot is located 0x18 (24) bytes below our base pointer, at -0x18(%rbp).
Strangely, we see that this store writes our exception::debug function pointer into the PIT struct.
(gdb) info registers
[snip]
rbp 0x146bb0 0x146bb0
(gdb) x 0x146bb0-0x18
0x146b98 <_ZN5rxinu6device3pit3PIT17h43d86267d481add5E>: 0x0011d540
(gdb) p/x rxinu::device::pit::PIT.lock
$5 = core::sync::atomic::AtomicBool {v: core::cell::UnsafeCell<u8> {value: 0x40}}
Since we store our function pointer (0x11d540) at the address of the static PIT: Mutex structure, the first field of Mutex, lock, is overwritten with 0x11d540.
However, the lock field of PIT is an AtomicBool backed by a single u8, so we only see the low byte, 0x40, as the lock value. We call create_idt_entry() over 30 times, and each time PIT.lock is overwritten with the low byte of the exception handler's function pointer.
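The truncation can be checked with a few lines of Rust. This is only an illustrative sketch: lowest_byte is a hypothetical helper, and the two addresses are the ptr values printed by gdb above (1168704 and 1168784). It also suggests that the apparent "increase by 0x50" is simply the distance between the two handler addresses.

```rust
/// Returns the byte that lands at the lowest address when `value` is
/// stored as a 64-bit little-endian word, as on x86_64.
fn lowest_byte(value: u64) -> u8 {
    value.to_le_bytes()[0]
}

fn main() {
    // Handler addresses as printed by gdb (ptr=1168704, ptr=1168784).
    let debug_handler: u64 = 1168704; // 0x11d540
    let nmi_handler: u64 = 1168784; // 0x11d590

    println!("debug handler low byte: {:#x}", lowest_byte(debug_handler));
    println!("nmi handler low byte:   {:#x}", lowest_byte(nmi_handler));
    println!("distance between handlers: {:#x}", nmi_handler - debug_handler);
}
```

The two low bytes are 0x40 and 0x90, matching the PIT.lock values seen in the gdb session.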
@robert-w-gries Do you have your current state online somewhere (e.g. in a branch)? I just tried to run the master branch but it doesn't have the required compiler_rt changes for the 2018-04-07 nightly.
Ok, I performed the compiler_rt changes and the build succeeded. When I run it, the output is:
warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
kernel start: 0x100000, kernel end: 0x153000
multiboot start: 0x298de8, multiboot end: 0x2994b0
mapping section at addr: 0x100000, size: 0x5000
mapping section at addr: 0x105000, size: 0x3d000
mapping section at addr: 0x142000, size: 0x3000
mapping section at addr: 0x145000, size: 0x9000
mapping section at addr: 0x14e000, size: 0x0
mapping section at addr: 0x14e000, size: 0x0
mapping section at addr: 0x14e000, size: 0x0
mapping section at addr: 0x14e000, size: 0x5000
guard page at 0x149000
[ OK ] PIC Driver
[ OK ] Serial Driver
[ OK ] PS/2 Driver
[ OK ] Programmable Interval Timer
It did not crash!
HEAP START = 0x40000000
HEAP END = 0x4007d000
In main process!
In rxinu_main::created_process!
You can now type...
typing
works
without problems!
Seems like everything is working fine.
> rustc --version --verbose
rustc 1.27.0-nightly (056f589fb 2018-04-07)
binary: rustc
commit-hash: 056f589fb8bcd70e7caa2bc7b3ede45624bb8e6d
commit-date: 2018-04-07
host: x86_64-unknown-linux-gnu
release: 1.27.0-nightly
LLVM version: 6.0
> cargo --version --verbose
cargo 1.26.0-nightly (b70ab13b3 2018-04-04)
release: 1.26.0
commit-hash: b70ab13b31628e91b05961d55c07abf20ad49de6
commit-date: 2018-04-04
> xargo --version
xargo 0.3.12
cargo 1.26.0-nightly (b70ab13b3 2018-04-04)
Hi Phil, did you build with the following command?
make run FEATURES=vga
Thanks for the repro! What system are you using?
I haven't had time to debug this issue since last night. I think the next step is figuring out why PIT is on the stack during IDT entry creation.
I've never seen a bug like this before.
It looks like PIT's memory location is wrong when using nightly-2018-04-07.
Keep in mind that mov $function_pointer, -0x18(%rbp=0x146bb0)
= mov $function_pointer, (0x146b98)
Broken behavior
Dumping the symbol table shows that PIT has an incorrect location:
$ objdump -t build/rxinu-x86-x86_64.bin | grep PIT
0000000000146b98 l O .data 0000000000000006 _ZN5rxinu6device3pit3PIT17h43d86267d481add5E
Breakpoint 1, rust_main (multiboot_information_address=2778056) at src/lib.rs:48
48 arch::interrupts::disable();
// (%rbp - 0x18) == (PIT mem location)
(gdb) x 0x146bb0-0x18
0x146b98 <_ZN5rxinu6device3pit3PIT17h43d86267d481add5E>: 0x00430000
(gdb) p/x &(rxinu::device::pit::PIT)
$1 = 0x146b98
Working behavior
$ objdump -t build/rxinu-x86-x86_64.bin | grep PIT
00000000001469e0 l O .data 0000000000000006 _ZN5rxinu6device3pit3PIT17hc39ac3f92415694dE
Breakpoint 1, rust_main (multiboot_information_address=2762152) at src/lib.rs:48
48 arch::interrupts::disable();
// stack is zero'd
(gdb) x 0x146b98
0x146b98: 0x00000000
(gdb) p/x &(rxinu::device::pit::PIT)
$1 = 0x1469e0
(gdb) x 0x1469e0
0x1469e0 <_ZN5rxinu6device3pit3PIT17hc39ac3f92415694dE>: 0x00430000
// continue until create_idt_entry(), we see that rbp-0x18 contains the function pointer value and does not overlap with PIT mem location
(gdb) x 0x146bb0-0x18
0x146b98: 0x0011d8d0
I ran out of avenues of investigation and started experimenting with my source code to see how removing lines/adding lines would affect the behavior, and I found more weirdness.
Diff
diff --git a/Cargo.toml b/Cargo.toml
index e5fe526..8a63030 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -14,7 +14,6 @@ bitflags = "1.0.1"
linked_list_allocator = "0.5"
multiboot2 = { git = "https://github.com/phil-opp/multiboot2-elf64" }
once = "0.3.2"
-rlibc = "1.0.0"
spin = "0.4.6"
volatile = "0.1.0"
diff --git a/Xargo.toml b/Xargo.toml
index a16927a..c5e15fe 100644
--- a/Xargo.toml
+++ b/Xargo.toml
@@ -8,4 +8,5 @@ alloc = {}
stage = 0
[dependencies.compiler_builtins]
+features = ["mem"]
stage = 1
diff --git a/src/lib.rs b/src/lib.rs
index dae5ba2..d7ae4b8 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -28,7 +28,6 @@ extern crate bit_field;
extern crate compiler_builtins;
extern crate linked_list_allocator;
extern crate multiboot2;
-extern crate rlibc;
extern crate spin;
extern crate volatile;
extern crate x86;
New Behavior
After this change, I no longer see the PIT.lock value change during create_idt_entry(). However, I now see that an instruction pointer to <x86::bits64::irq::IdtEntry::new+103> is stored in PIT after the instruction callq 13a480 <VAddr::as_usize> is executed:
(gdb) p/x &(rxinu::device::pit::PIT)
$1 = 0x146aa8
(gdb) watch *0x146aa8
Hardware watchpoint 2: *0x146aa8
(gdb) c
Continuing.
Hardware watchpoint 2: *0x146aa8
Old value = 4390912
New value = 1289367
(gdb) x 1289367
0x13ac97 <x86::bits64::irq::IdtEntry::new+103>: 0xc0458948
(gdb) info stack
#0 x86::shared::paging::VAddr::as_usize (self=0x0)
at /home/rob/.cargo/git/checkouts/rust-x86-24ca3b49ed2e6039/1e2efb6/src/shared/paging.rs:11
#1 0x000000000013ac97 in x86::bits64::irq::IdtEntry::new (handler=..., gdt_code_selector=...,
dpl=x86::shared::PrivilegeLevel::Ring0, ty=x86::bits64::irq::Type::InterruptGate, ist_index=0)
at /home/rob/.cargo/git/checkouts/rust-x86-24ca3b49ed2e6039/1e2efb6/src/bits64/irq.rs:71
I'm pretty sure this is a stack overflow of the startup stack in startup-common.nasm. The layout of the bss section is:
SYMBOL TABLE:
0000000000147000 l d .bss 0000000000000000 .bss
000000000014e028 l O .bss 0000000000000060 _ZN72_$LT$rxinu..scheduling..SCHEDULER$u20$as$u20$core..ops..deref..Deref$GT$5deref11__stability4LAZY17h9e7789cc392989d8E
000000000014e0b0 l O .bss 0000000000001010 _ZN70_$LT$rxinu..arch..x86..idt..IDT$u20$as$u20$core..ops..deref..Deref$GT$5deref11__stability4LAZY17haf31311466b783edE
000000000014f112 l O .bss 0000000000000001 _ZN5rxinu4arch3x866memory4init26assert_has_not_been_called6CALLED17hebe677239b36b4f4E
0000000000147000 l .bss 0000000000000000 stack_bottom
000000000014b000 l .bss 0000000000000000 stack_top
000000000014b000 l .bss 0000000000000000 p4_table
000000000014c000 l .bss 0000000000000000 p3_table
000000000014d000 l .bss 0000000000000000 p2_table
000000000014f0c0 g O .bss 0000000000000048 .hidden _ZN5rxinu4arch3x863gdt3GDT17hb0bd0fdb5e589c68E
000000000014e000 g O .bss 0000000000000028 .hidden _ZN5rxinu14HEAP_ALLOCATOR17hc6ce08083adf9802E
000000000014f118 g O .bss 0000000000000008 .hidden _ZN5rxinu6device3pit9PIT_TICKS17h5d06cc9ba9f087bbE
000000000014f108 g O .bss 000000000000000a .hidden _ZN5rxinu6device8keyboard5STATE17hd52ed47736ea8f64E
000000000014e088 g O .bss 0000000000000028 .hidden _ZN5rxinu6device8keyboard3ps212PS2_KEYBOARD17hef3b749fc73804f3E
Note that the symbol table is not ordered by address. The stack lives on the lower end of the section. So when it overflows, it overwrites the section below.
Sections:
Idx Name Size VMA LMA File off Algn
0 .rodata 00005000 0000000000100000 0000000000100000 00001000 2**4
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .text 0003f000 0000000000105000 0000000000105000 00006000 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .data 00003000 0000000000144000 0000000000144000 00045000 2**3
CONTENTS, ALLOC, LOAD, DATA
3 .bss 00009000 0000000000147000 0000000000147000 00048000 2**12
ALLOC
The section below is the .data section with the following contents:
SYMBOL TABLE:
0000000000144000 l d .data 0000000000000000 .data
0000000000146b98 l O .data 0000000000000006 _ZN5rxinu6device3pit3PIT17h70f10cd7544321a1E
0000000000144000 w O .data 0000000000000008 .hidden DW.ref.rust_eh_personality
0000000000145930 g O .data 0000000000000038 .hidden _ZN5rxinu6device10uart_165504COM117h2d2e3f7279eb3b13E
00000000001442f8 g O .data 0000000000000004 .hidden _ZN5rxinu6device19ps2_controller_804210CONTROLLER17ha01be719b3da9987E
0000000000145968 g O .data 0000000000000038 .hidden _ZN5rxinu6device10uart_165504COM217h7f32cb7f66847b02E
0000000000145468 g O .data 0000000000000030 .hidden vtable.A.llvm.5436011002358689358
0000000000146160 g O .data 0000000000000020 .hidden _ZN5rxinu6device3vga3VGA17h684b5db027fb2639E
00000000001442fc g O .data 0000000000000004 .hidden _ZN5rxinu6device19ps2_controller_80426DEVICE17hbfd41b6c4688d127E
00000000001462fe g O .data 0000000000000006 .hidden _ZN5rxinu6device8pic_82595SLAVE17hd4a0874b713c487bE
00000000001458a0 g O .data 0000000000000048 .hidden _ZN5rxinu4arch3x8610interrupts9exception8SECURITY17h21c4d10285713339E
0000000000145158 g O .data 0000000000000028 .hidden byte_str.6.llvm.8013824305494789554
00000000001462f8 g O .data 0000000000000006 .hidden _ZN5rxinu6device8pic_82596MASTER17h53ecb500c298b076E
0000000000145390 g O .data 0000000000000030 .hidden vtable.m.llvm.14348117055639440522
0000000000144418 g O .data 00000000000005e8 _ZN3x866shared3irq10EXCEPTIONS17hd0bf37a43cc76f39E
Near the top of this section, at 0x146b98, lives the PIT. If the stack pointer points to this value, this means that the stack grew past stack_bottom into the .data area.
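The layout argument can be written down as a small check. This is a sketch only: the Region enum and the classify and overflow_depth helpers are hypothetical, and the ranges are copied from the objdump output above (.data at 0x144000 with size 0x3000, stack_bottom at 0x147000, stack_top at 0x14b000).

```rust
use std::ops::Range;

// Address ranges taken from the objdump output above.
const DATA: Range<u64> = 0x144000..0x147000; // .data: VMA 0x144000, size 0x3000
const STACK: Range<u64> = 0x147000..0x14b000; // stack_bottom..stack_top

#[derive(Debug, PartialEq)]
enum Region {
    Stack,
    Data,
    Other,
}

/// Reports which of the two regions an address falls into.
fn classify(addr: u64) -> Region {
    if STACK.contains(&addr) {
        Region::Stack
    } else if DATA.contains(&addr) {
        Region::Data
    } else {
        Region::Other
    }
}

/// Number of bytes by which `addr` lies below stack_bottom, if any.
fn overflow_depth(addr: u64) -> Option<u64> {
    if addr < STACK.start {
        Some(STACK.start - addr)
    } else {
        None
    }
}

fn main() {
    // The stack slot -0x18(%rbp) with rbp = 0x146bb0, as seen in gdb.
    let slot: u64 = 0x146bb0 - 0x18;
    println!("slot {:#x} is in {:?}", slot, classify(slot));
    println!("bytes below stack_bottom: {:?}", overflow_depth(slot));
}
```

With these numbers the slot at 0x146b98 is classified as .data and lies 0x468 bytes below stack_bottom, which is consistent with the stack having grown out of its reserved area.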
Increasing the stack size in startup-common.nasm fixes the problem:
section .bss
align 4096
stack_bottom:
- resb 4096 * 4
+ resb 4096 * 5
stack_top:
A stack size of 5 pages seems to suffice, but I would use a larger size to be sure.
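As a rough cross-check of the 5-page figure, one can compute how many pages are needed to cover the lowest stack address observed in the gdb session. This is a sketch under assumptions: pages_needed is a hypothetical helper, and 0x146b98 is only the lowest slot seen in this particular trace, so the result is a lower bound, since deeper call chains may use more stack.

```rust
const PAGE_SIZE: u64 = 4096;

/// Number of whole pages needed so that a stack growing down from
/// `stack_top` still contains `lowest_written`.
fn pages_needed(stack_top: u64, lowest_written: u64) -> u64 {
    let bytes = stack_top - lowest_written;
    (bytes + PAGE_SIZE - 1) / PAGE_SIZE
}

fn main() {
    let stack_top: u64 = 0x14b000; // from the .bss symbol table
    let lowest_written: u64 = 0x146b98; // stack slot that overlapped PIT

    println!("bytes used: {:#x}", stack_top - lowest_written);
    println!("pages needed: {}", pages_needed(stack_top, lowest_written));
}
```

The observed depth is 0x4468 bytes, which is more than the original 4 pages (0x4000) and less than 5 pages (0x5000), so it agrees with 5 pages being just enough for this code path.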
So why does this problem happen only in some nightlies? Well, it seems that code generation changed in a way that increased stack usage. This could be a bug, or it could simply be that some optimization changed and is less optimal in this case (but probably better in other cases).
I hope this helps!
Thanks Phil, great investigation work! This has definitely been a learning experience for me 😄
You're welcome :)