rust3ds / ctru-rs

Rust wrapper for libctru

Home Page:https://rust3ds.github.io/ctru-rs/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Zombies ate my 3DS? (kernel panic after multiple 3dsx app exits)

AzureMarker opened this issue · comments

Fun title aside, some odd things happen if you run a 3dsx program a bunch of times. When exiting the 5th run, I get an ARM exception: prefetch abort (kernel panic). This might be related to the std threads, or it's always been there, but it's pretty weird.

The "Fault status" is "Debug event" and it's an exception on core 1 (sys core, which makes sense since the error says kernel panic).

I noticed in Luma3DS's process list that there's zombie 3dsx_app processes when you open/close a program a few times. It panics when there are 4 zombies and you close the 5th 3dsx_app process.

The program I can definitely reproduce this with is thread-info in #46, but I also get it when running the ctru-rs tests (also under std threads though). I haven't tried testing more programs yet, or without std threads.

commented

Hmm. Are we sure the threads are deallocated properly? Especially those on the system core. We should check for that and see what we get.

I did some testing and I only see this happening when svcGetThreadList is called (all others, like the hello world program, don't create zombies). I found a relevant system flaw documented here about this system call not decrementing a reference count: http://3dbrew.org/wiki/3DS_System_Flaws#ARM11_software.

Note that we're currently using this system call as part of pthread_getschedparam/pthread_setschedparam (and therefore std::os::horizon::thread::current_priority) via get_thread_handle here:
https://github.com/Meziu/pthread-3ds/blob/187d2ae5db7c4571db4dcec3ace3bc2a617361e9/src/lib.rs#L111

We have a few options:

  1. Find a way to decrement the reference count ourselves.
  2. Avoid using this system call (find another way to get the main thread's handle).
  3. Replace the standard pthread functions for priority with a single pthread_getpriority_np function that can just use CUR_THREAD_HANDLE (no searching for the specified thread's handle and having to handle the main thread case).

I opened rust3ds/pthread-3ds#12 to avoid calling svcGetThreadList during thread spawning (when creating the default pthread attributes value).

commented

Is this issue closed? I couldn’t replicate it myself, though it seems like you fixed it(?). Tell me if anything is needed.

I think it's still an issue for the APIs that call svcGetThreadList, which includes e.g. pthread_{get,set}schedparam.

Here's an idea: what if instead of using ctru_sys::Thread as pthread_t, we used a simple struct like this:

// pthread-3ds
struct PThread {
    ctru_thread: ctru_sys::Thread,
    thread_id: u32, // or whatever type this needs to be, I think it's u32
}

// libc
pub type pthread_t = *mut ::c_void; // really a *mut PThread

I think then we could use threadGetHandle where needed, or svcOpenThread as a fallback if ctru_thread.is_null() (main thread case), to get the the handle for using the priority APIs etc.

There would be a little bit of extra memory management and bookkeeping, but not too much different than what we already have.

Let me know what you think, I'm fairly sure this would work and could avoid some of these issues. I will try to get a proof of concept working when I have time as well.


Side note: I've noticed that the result of threadGetHandle seems to be different than the result of calling svcGetThreadId -> svcOpenThread to get that handle... so perhaps there is some discrepancy or internal handle state for which we would want a reference to both anyway.

I was writing a comment but forgot to finish it. I haven't read @ian-h-chamberlain's comment fully yet, but the issue isn't currently fixed because the thread list function is still used by std::os::horizon::thread::current_priority.

@ian-h-chamberlain That sounds like a good solution, do you want to make a PR?

Edit: not sure if we need to change libc. Check what the other platforms do. I think as long as the ABI is the same between long and *void it's fine. But feel free to make the change since it's technically more accurate.

@Meziu @ian-h-chamberlain let me know also if you're having trouble reproducing the issue. It should definitely reproduce if you call svcGetThreadList, even in just a temporary example. But the thread-info example from #46 is one of the ways I reproduced it.

You should be able to open Luma3DS's menu and look at the process list, and at the bottom see some 3dsx_app processes marked as Zombie.

Edit: not sure if we need to change libc. Check what the other platforms do. I think as long as the ABI is the same between long and *void it's fine. But feel free to make the change since it's technically more accurate.

Looks like it varies by platform, some use various int type,s, some use *c_void, so we're probably fine here.

I'll open a PR based on what I have now, although it's unfortunate that any example using svcGetThreadList will still zombie... :(

Yeah, I've been scouring the internet looking for anything else that references the issue, but I haven't found anything. Luckily only thread-info uses it explicitly, and I've just pushed a change to remove that usage.