icedland / iced

Blazing fast and correct x86/x64 disassembler, assembler, decoder, encoder for Rust, .NET, Java, Python, Lua

Weird Issue with Test and Thread Local Storage

0xcaff opened this issue

When the nightly thread_local feature (rust-lang/rust#29594) is used inside a test, iced appears to break: the decoder fails to construct.

Here's a minimal reproduction:

#![feature(thread_local)]

use std::slice;
use iced_x86::{Decoder, DecoderOptions};

#[thread_local]
static mut EXTERNAL_STACK_SEGMENT: [u8; 4096 * 1000] = [0; 4096 * 1000];

fn uses() {
    unsafe { EXTERNAL_STACK_SEGMENT[10] = 123; }
}

fn main() {
    func();
}

fn func() {
    let code_bytes =
        unsafe { slice::from_raw_parts(uses as *const u8, 1024) };

    let mut decoder = Decoder::new(64, code_bytes, DecoderOptions::NONE);
    let instruction = decoder.iter().take(12).collect::<Vec<_>>();

    for instr in instruction {
        println!("{}", instr);
    }
}

#[cfg(test)]
mod tests {
    use crate::{func};

    #[test]
    fn test() {
        func()
    }
}

cargo run

[martin@unknownb8098a43f382 iced_repro]$ cargo run
   Compiling iced_repro v0.1.0 (/home/martin/projects/iced_repro)
    Finished dev [unoptimized + debuginfo] target(s) in 0.38s
     Running `target/debug/iced_repro`
mov rax,fs:[0]
lea rax,[rax-3E8050h]
mov [rsp-8],rax
mov rax,[rsp-8]
mov byte ptr [rax+0Ah],7Bh
ret
nop
push rax
call 0000000000000030h
pop rax
ret
nop dword ptr [rax+rax]

As expected.

cargo test

[martin@unknownb8098a43f382 iced_repro]$ cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.00s
     Running unittests src/main.rs (target/debug/deps/iced_repro-744be85695f52bd9)

running 1 test

thread 'tests::test' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `--bin iced_repro`

Caused by:
  process didn't exit successfully: `/home/martin/projects/iced_repro/target/debug/deps/iced_repro-744be85695f52bd9` (signal: 6, SIGABRT: process abort signal)

Stepping through this, it seems like the crash happens even before attempting to parse the instructions.

Changing EXTERNAL_STACK_SEGMENT size

Changing

-static mut EXTERNAL_STACK_SEGMENT: [u8; 4096 * 1000] = [0; 4096 * 1000];
+static mut EXTERNAL_STACK_SEGMENT: [u8; 1000] = [0; 1000];

seems to change the generated code and no longer triggers the issue.

Before (broken):

const uint8_t data[31] = 
{
	0x64, 0x48, 0x8b, 0x04, 0x25, 0x00, 0x00, 0x00, 0x00, 0x48, 0x8d, 0x80, 0xb0, 0x7f, 0xc1, 0xff,
	0x48, 0x89, 0x44, 0x24, 0xf8, 0x48, 0x8b, 0x44, 0x24, 0xf8, 0xc6, 0x40, 0x0a, 0x7b, 0xc3
};

After (working):

const uint8_t data[10] = 
{
	0x64, 0xc6, 0x04, 0x25, 0xd2, 0xfb, 0xff, 0xff, 0x7b, 0xc3
};
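One way to read the two byte dumps is to pull out the 32-bit displacement each one encodes. The sketch below does only that arithmetic with the standard library. The helper name `disp32` is made up for illustration, and the reading of the leftover 80 bytes as other thread-local data is an assumption, not something confirmed in this thread.

```rust
/// Interpret four little-endian bytes as a signed 32-bit displacement.
/// (`disp32` is a hypothetical helper, not part of iced.)
fn disp32(bytes: [u8; 4]) -> i32 {
    i32::from_le_bytes(bytes)
}

fn main() {
    // Broken build: 48 8d 80 | b0 7f c1 ff  =>  lea rax,[rax+disp32]
    let big = disp32([0xb0, 0x7f, 0xc1, 0xff]);
    // Working build: 64 c6 04 25 | d2 fb ff ff | 7b  =>  mov byte ptr fs:[disp32],7Bh
    let small = disp32([0xd2, 0xfb, 0xff, 0xff]);

    println!("big   = {big}"); // big   = -4096080 (the -3E8050h in the listing)
    println!("small = {small}"); // small = -1070

    // In the working build the element index (10) is folded into the
    // displacement, so the array base sits 10 bytes lower.
    let small_base = small - 10;

    // Bytes between the start of the array and the thread pointer that are
    // not the array itself. Assumption: this is other TLS data.
    println!("extra (broken)  = {}", -big - 4096 * 1000); // extra (broken)  = 80
    println!("extra (working) = {}", -small_base - 1000); // extra (working) = 80
}
```

Both builds leave the same 80 bytes over, which suggests the only thing that changed between them is the size of the thread-local array: roughly 3.9 MiB in the broken build.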

Environment Information

rust-toolchain.toml

[toolchain]
channel = "nightly-2024-01-06"
components = ["rustfmt"]

Cargo.lock

# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3

[[package]]
name = "iced-x86"
version = "1.21.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c447cff8c7f384a7d4f741cfcff32f75f3ad02b406432e8d6c878d56b1edf6b"
dependencies = [
 "lazy_static",
]

[[package]]
name = "iced_repro"
version = "0.1.0"
dependencies = [
 "iced-x86",
]

[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"

This is all for the target x86_64-unknown-linux-gnu.

Hunches

My guess is that this has something to do with lazy_static.

Running with

RUST_MIN_STACK=8388608 cargo test

fixes the issue, so I believe this has nothing to do with iced. I guess the TLS block eats up the thread's stack space?

https://stackoverflow.com/questions/42955243/cargo-test-release-causes-a-stack-overflow-why-doesnt-cargo-bench/42960702#42960702
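If the guess above is right, the same workaround can be applied in code rather than through the environment. This is a minimal sketch, assuming the test body can be moved onto a thread you spawn yourself. `run_with_stack` is a hypothetical helper and the closure is a stand-in for the real work (for example, a call to `func()`). The 2 MiB figure is the default the standard library documents for spawned threads.

```rust
use std::thread;

// Size of the thread-local array from the reproduction.
const TLS_BYTES: usize = 4096 * 1000;
// Documented default stack size for threads spawned by std.
const DEFAULT_SPAWNED_STACK: usize = 2 * 1024 * 1024;
// The value passed via RUST_MIN_STACK above (8 MiB).
const RAISED_STACK: usize = 8_388_608;

/// Run `work` on a fresh thread with an explicit stack size and return its result.
/// (`run_with_stack` is a hypothetical helper, not an iced or libtest API.)
fn run_with_stack<F, T>(stack_bytes: usize, work: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(work)
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    // The thread-local array alone is larger than the default spawned-thread stack.
    println!("TLS array: {TLS_BYTES} bytes, default stack: {DEFAULT_SPAWNED_STACK} bytes");

    // Stand-in workload; in the reproduction this would be the decoding code.
    let total = run_with_stack(RAISED_STACK, || (1..=10u64).sum::<u64>());
    println!("total = {total}"); // total = 55
}
```

Unlike `RUST_MIN_STACK`, which affects every thread that does not request its own size, this raises the stack only for the one thread that needs it.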

It's possible that some of the iced init code called by lazy_static uses a lot of stack space. I haven't checked, since I have never tried the nightly thread_local feature. In any case, that init code will be rewritten in the next version.