ned14 / quickcpplib

Eliminate all the tedious hassle when making state-of-the-art C++ 14 - 23 libraries!

Signal guard breaks WinDbg Time Travel debugging

gix opened this issue

When TTD starts a process, it injects its CPU recorder thread before anything else. The loader then invokes __dyn_tls_init, which in turn invokes the _win32_set_terminate_handler_per_thread initializer, which uses std::get_terminate() before the runtime is initialized and crashes.

Not sure what the guarantees are about when thread local initializers are run and whether this is unsafe or not. It was surprising at least, since I'm not even using signal guards (with LLFIO).

The callstack looks like this, with __acrt_FlsSetValue calling a bogus function pointer:

__acrt_FlsSetValue(unsigned long fls_index=0xffffffff, void * fls_data=0xffffffff)
internal_get_ptd_head()
internal_getptd_noexit(const __crt_scoped_get_last_error_reset & last_error_reset={...}, const unsigned int global_state_index=0x00000000)
internal_getptd_noexit()
__acrt_getptd()
_get_terminate()
std::get_terminate()
_win32_set_terminate_handler_per_thread_t::_win32_set_terminate_handler_per_thread_t()
`dynamic initializer for '_win32_set_terminate_handler_per_thread''()
__dyn_tls_init(void * __formal=0x000b0000, unsigned long dwReason=0x00000002, void * __formal=0x00000000)
_LdrxCallInitRoutine@16()
LdrpCallInitRoutine()
LdrpCallTlsInitializers()
LdrpInitializeThread()
_LdrpInitialize()
LdrInitializeThunk()

This is a weird situation and I'm not sure how best to proceed. _win32_set_terminate_handler_per_thread is a bog-standard C++ static variable; if its initializer is being called, then the CRT has been initialised and it's doing ordinary static variable init. It should always be absolutely legal to call std::get_terminate() from within C++ static variable init. I'm therefore inclined to think this is actually a corner-case bug within Microsoft's stack, but that doesn't help you.

Best I can offer is a macro which disables the installation of the terminate and bad alloc handlers?

To explain why these need to be poked into every newly created thread: on MSVC those handlers are thread local. If one later set up a global signal guard for terminate or bad alloc, it could not catch signals raised on threads where the handlers had never been installed. So even if you never use signal guard, as soon as it is loaded into a process it has to run the above code.
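A minimal sketch of the per-thread install pattern described above. This is not quickcpplib's actual code, and every name in the `sketch` namespace is hypothetical. It assumes a `thread_local` object whose constructor saves the current handler with `std::get_terminate()` and installs a replacement with `std::set_terminate()`. MSVC runs such initializers from `__dyn_tls_init` when a thread attaches, as in the call stack; other implementations may defer them until the thread first uses the variable, which is why the sketch touches it explicitly.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <exception>
#include <thread>

// Hypothetical names throughout; an illustration, not quickcpplib's code.
namespace sketch {

// Counts how many threads have run the per-thread installer.
std::atomic<int> installs{0};

// The handler the library wants active on every thread.
[[noreturn]] void guarded_terminate() { std::abort(); }

// Constructed once per thread. On MSVC the terminate handler is per thread
// (per the discussion); on other runtimes it is typically process-wide.
struct per_thread_installer {
  std::terminate_handler previous;  // handler that was active before us
  per_thread_installer() : previous(std::get_terminate()) {
    std::set_terminate(guarded_terminate);
    assert(std::get_terminate() == guarded_terminate);
    installs.fetch_add(1);
  }
};

thread_local per_thread_installer installer;

// Reading the thread_local guarantees its constructor has run on this thread.
std::terminate_handler previous_handler() { return installer.previous; }

// Spawns a thread, touches the installer there, and reports how many
// installs that thread added.
int installs_added_by_new_thread() {
  const int before = installs.load();
  std::thread worker([] { (void) previous_handler(); });
  worker.join();
  return installs.load() - before;
}

}  // namespace sketch
```

Each call to `sketch::installs_added_by_new_thread()` should add exactly one install, because the constructor runs once per thread. Note that the constructor itself calls `std::get_terminate()`, which is the call that faults in the reported call stack.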

It looks like I can disable this already with LLFIO_FORCE_SIGNAL_DETECTION_OFF?

Wouldn't it be possible to detect the case of such an outlier thread? AFAICS thread local initializers are run after normal static initializers, so this thread local initializer could be skipped if the runtime is not initialized yet.

> It looks like I can disable this already with LLFIO_FORCE_SIGNAL_DETECTION_OFF?

Yes, however that is overkill. LLFIO doesn't use the terminate and bad alloc handlers. It only cares about trapping failures of writes into mmap regions.

> Wouldn't it be possible to detect the case of such an outlier thread? AFAICS thread local initializers are run after normal static initializers, so this thread local initializer could be skipped if the runtime is not initialized yet.

If static var init has begun, by definition the runtime is initialised.

Almost certainly, the FlsSetValue() call above is being passed the return value of FlsAlloc(), which returns -1 if it fails (FLS_OUT_OF_INDEXES, i.e. 0xffffffff, matching the fls_index in the call stack). The win32 error code returned will probably be useless; what you really need is the status code returned by the underlying RtlFlsAlloc from the NT kernel. It may explain why it is failing.

That might explain the cause of the problem, though it's almost certainly Microsoft code which needs to change to fix the problem. At least you'd have a repro to send to Microsoft.

I'll try to find some time soon to patch quickcpplib with a macro which works around your problem for now. It'll be at least the latter half of this week, sorry.

Ok, macro added.
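The thread does not name the macro that was added, so the following is only a sketch of how such a compile-time opt-out could look. `SKETCH_DISABLE_HANDLER_INSTALL` and everything in the `sketch_optout` namespace are hypothetical names, not quickcpplib's API.

```cpp
#include <cstdlib>
#include <exception>

// Hypothetical macro: build with -DSKETCH_DISABLE_HANDLER_INSTALL=1 to
// compile the per-thread installer out entirely.
#ifndef SKETCH_DISABLE_HANDLER_INSTALL
#define SKETCH_DISABLE_HANDLER_INSTALL 0
#endif

namespace sketch_optout {

constexpr bool install_enabled = (SKETCH_DISABLE_HANDLER_INSTALL == 0);

[[noreturn]] void guarded_terminate() { std::abort(); }

#if !SKETCH_DISABLE_HANDLER_INSTALL
// Only exists when the opt-out is not set, so no thread_local initializer
// is emitted at all in the disabled configuration.
struct installer_t {
  bool armed = false;
  installer_t() {
    std::set_terminate(guarded_terminate);
    armed = true;
  }
};
thread_local installer_t installer;
#endif

// Reports whether this sketch installed its handler on the calling thread.
bool handler_installed_here() {
#if SKETCH_DISABLE_HANDLER_INSTALL
  return false;
#else
  return installer.armed && std::get_terminate() == guarded_terminate;
#endif
}

}  // namespace sketch_optout
```

With the macro left at its default, `sketch_optout::handler_installed_here()` reports that the handler is installed. With it set to 1, nothing runs during thread-local initialization, which is the property needed to sidestep the early `std::get_terminate()` call.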

The fix_signal_guard branch will eventually let me fix this properly, though I haven't found the spare time to work on it recently :(