Wild Wild West: The Steel Assassin configuration window turns black
automatedbugreportingfacility opened this issue · comments
First of all, this is truly an impressive piece of software with great compatibility, I was surprised to find out that an obscure game from 1999 works almost perfectly! Here's one thing I've noticed so far:
When Wild Wild West: The Steel Assassin is launched, a configuration box is displayed. It looks correctly for a short while:
Then, Gdi::installHooks()
is called and the window turns to this:
Once I move the window off the screen and drag it back, it's redrawn correctly:
This bug occurs in the full version of the game, I couldn't reproduce it in the demo versions.
GPU: GeForce GTX 970 (driver: 552.22)
OS: Windows 10 22H2
Log: DDrawCompat-VRDMfc.zip
If the game is run from a non-administrative account, it is necessary to enable "Run this program as an administator" for C:\Program Files (x86)\Southpeak Interactive\west\launch32.exe to fix a DLLRegisterServer error. The binary loading ddraw.dll is \west\bin\VRDMfc.exe.
Please let me know if I should provide any more information.
Then, Gdi::installHooks() is called
More specifically, the window turns black after the Dll::createThread()
call in Gdi::GuiThread::installHooks()
.
If a debugger is being used, it might be necessary to ensure that the window of the debugger does not overlap with the game window. Presumably, that may cause the game window to redraw, and the glitch disappears.
Thanks for the fix, it seems to work well. I've tested all the various wrappers out there, and yours is currently the only one capable of handling this game -- great job!
Btw, I was looking into the bug a bit, and noticed that it was a race condition -- if the window thread handled WM_WINDOWPOSCHANGED
before WM_SYNCPAINT
, then things went wrong. If WM_SYNCPAINT
came in first, the window was drawn correctly, not sure why. I'll try to figure it out someday, but I'm a complete noob in this graphics stuff.
After overcoming some WinDbg idiosyncrasies, I managed to debug the thing, and I noticed that there might be an issue with handling of Window::invalidatedRegion
. Here's what happens:
The Gui thread starts the InitTopLevelWindow()
call, which is guarded by the critical section lock. In the meantime, the window thread receives a WM_WINDOWPOSCHANGED
message and starts waiting to acquire the lock. The Gui thread creates the Window object for the top-level window, initializes Window::invalidatedRegion
to a NULLREGION
in the constructor, and correctly updates it in updatePosition()
to be a SIMPLEREGION
whose size corresponds to the entire window rect. The Gui thread then sends a WM_SYNCPAINT
message to the window thread -- however, at this point, the window thread is already busy waiting for the lock in WM_WINDOWPOSCHANGED
. Then, initTopLevelWindow()
completes, the critical section lock held by the Gui thread is released, and the window thread starts its own round of updateAll()
. Now, an incorrect (I believe) thing happens: when it encounters the top-level window, it nukes its invalidatedRegion
in updatePosition()
. Then, the WM_WINDOWPOSCHANGED
handler completes, and RedrawWindow()
is called in response to the WM_SYNCPAINT
message sent from the Gui thread, but there's nothing left to redraw anymore.
While the fix should ensure that the init shouldn't glitch anymore, do you think it's possible that a similar race may occur with multiple window threads? I was also wondering if there was any way to validate that regions have been handled in RedrawWindow()
before throwing them away.
I think you are right. This might be a better fix:
ddraw.zip (diff.txt compared to v0.5.2)
I was also wondering if there was any way to validate that regions have been handled in RedrawWindow() before throwing them away.
I'm not sure I understand what you mean by this. There is no mention in the docs that the region must be kept because it might still be accessed after RedrawWindow returns. Then when would it be safe to release it? Pretty sure the function is already done with it by the time it returns.
Thanks, the fix looks robust. The only thing I noticed is that the handler for WM_SYNCPAINT
doesn't acquire any lock, so it may redraw a window at the same time as another thread is updating windows or also redrawing. Just a thing I noticed -- I have no idea if that poses any correctness risks.
I'm not sure I understand what you mean by this. There is no mention in the docs that the region must be kept because it might still be accessed after RedrawWindow returns. Then when would it be safe to release it? Pretty sure the function is already done with it by the time it returns.
I meant a way to verify that a RedrawWindow()
call had been made for a particular invalidated region before discarding it in updatePosition()
. But that's not necessary with the changes you've just made.
Btw, I accidentally found another bug affecting this game's configuration window -- do you want me to post here, open another ticket, or wait for v0.5.3 before reporting?
The only thing I noticed is that the handler for WM_SYNCPAINT doesn't acquire any lock, so it may redraw a window at the same time as another thread is updating windows or also redrawing.
That's how it would normally work too, so I don't think extra synchronization is needed. It's the application's responsibility to synchronize in these cases.
Btw, I accidentally found another bug affecting this game's configuration window -- do you want me to post here, open another ticket, or wait for v0.5.3 before reporting?
Either is fine with me. There is no need to wait for v0.5.3.
Okay, so the process implements a lockup detector that can be triggered as follows:
- Open launch32.exe;
- Wait 5 seconds;
- Open Notepad -- the game window should now be in the background;
- Type something in -- so that attempting to exit Notepad would not close it immediately;
- Press ALT + F4 -- it's important that this combination is pressed. A confirmation dialog should appear;
- Wait 2 seconds -- a game dialog with the text "Lockup detected. Do you want to abort this game?" should appear. If it doesn't, dismiss the confirmation dialog and repeat steps 5 & 6.
When the lockup dialog appears, a black rectangle is displayed in the top-left corner. This rectangle is not present when running the game without the wrapper in Windows 10 22H2 or Windows 98 SE.
You can find the lockup handler code in svpdrv.dll+0x138e5:
DirectDrawFlipToGDI();
ShowCursor(1);
pHStack_2f4 = (HWND)0x0;
tStack_2f0.x = 0;
tStack_2f0.y = 0;
GetClientRect(DAT_2701db40,&tStack_2dc);
ClientToScreen(DAT_2701db40,&tStack_2f0);
AllocBlankWin(&pHStack_2f4,s_Temp_check_for_close_2701ca04,0x90000000,tStack_2f0.x,
tStack_2f0.y,tStack_2dc.right,tStack_2dc.bottom,*DAT_2701ccf8);
iVar5 = MessageBoxA(pHStack_2f4,s_Lockup_detected._Do_you_want_to_a_2701c9d0,&DAT_2701d644,
0x50004);
I can't repro the issue here on Windows 11, even with the classic version of Notepad (Windows 11 uses a newer Windows Store version by default). After the confirmation dialog appears in Notepad, nothing happens with the game launcher. Tried it without DDrawCompat too.
I've actually been looking into it today, and AllocBlankWin()
does this:
tagWNDCLASSA local_2c;
local_2c.style = CS_BYTEALIGNCLIENT;
local_2c.lpfnWndProc = (WNDPROC)&LAB_27001000;
local_2c.cbClsExtra = 0;
local_2c.cbWndExtra = 4;
local_2c.hInstance = svpdrv.dll;
local_2c.hIcon = (HICON)0x0;
local_2c.hCursor = (HCURSOR)0x0;
local_2c.hbrBackground = (HBRUSH)0x0;
local_2c.lpszMenuName = (LPCSTR)0x0;
local_2c.lpszClassName = "VrdBlankClass";
RegisterClassA(&local_2c);
hWnd = CreateWindowExA(0, // dwExStyle
"VrdBlankClass", // lpClassName
"Temp check for close", // lpWindowName
WS_VISIBLE | WS_POPUP, // dwStyle
0, // X
0, // Y
800, // nWidth
600, // nHeight
(HWND)0x0, // hWndParent
(HMENU)0x0, // hMenu
svpdrv.dll, // hInstance
(LPVOID)0x0); // lpParam
SetWindowTextA(hWnd, "Temp check for close");
SetWindowPos(hWnd,
HWND_TOPMOST, // hWndInsertAfter
0, // X
0, // Y
0, // cx
0, // cy
SWP_NOCOPYBITS | SWP_NOREDRAW | SWP_NOMOVE | SWP_NOSIZE); // uFlags
I believe it's important that dwStyle
specifies WS_POPUP
-- the window is visible without it.
When DDrawCompat is used, the black window is drawn in updatePresentationWindowPos()
when SetWindowPos()
is called on the invisible pop-up window's child.
I have no idea why the lockup handler doesn't trigger for you, it works reliably on Windows 10 22H2 here. I've extracted the game code and put it in a test case: main.zip -- please let me know if it works for you. I also managed to force drawing the black rectangle without DDrawCompat, but only if I inject a Sleep()
call (not present in the game) before MessageBoxA()
and either try clicking multiple times within the invisible window area or right-clicking the window's tab on the taskbar.
Thanks, your code sample is enough. So the game creates a "visible" window but it doesn't actually paint anything on it, so it appears completely transparent under normal circumstances. The black box with DDrawCompat is visible because it creates a presentation window above all visible windows and redirects presentation to it. This is done to avoid flickering between graphics content that DDrawCompat can hook, and those that it can't (e.g. GDI calls made from kernel mode).
I'm not sure why the window is set to visible in this case. This would cause other issues, like not being able to click through it, and windows behind it not repainting (at least without DWM). Also, if you drag other windows over it (again without DWM, like in Win98), those other windows should smear their image remnants all over it. At least that's what I would expect. This is at least mostly correctly emulated by DDrawCompat.
Unfortunately it is not trivial to track which pixels on such a window have been actually painted before and which ones should still be transparent. In some cases it's even impossible, e.g. if the painting is done through DirectDraw's Lock method on the primary surface. I don't think there is much of a real use case for this anyway, so I'm not planning to improve this situation for now.
Thanks for looking into it. I tested on Windows 98 SE, and the smearing effect is there indeed. I also realized why the reproduction steps didn't work for you -- I missed step 0, which is setting the DWORD Logs
in HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Southpeak Interactive
to 1. I did that by setting "Alerts Enabled" from WESTGAME_CD/west/Setup/redist/DXTest.exe, and then forgot about that. Sorry!
In the process of testing a performance issue, I accidentally found a crash in DDrawCompat. It can be reproduced by opening launch32.exe and quickly pressing the "Hardware Configuration" button. It's intermittent, so it might be necessary to repeat those steps a few times. The stack trace is as follows:
[0x0] 0x0!+ 0x19f8c0 0x5d9acff2
[0x1] DDRAW!Compat::queryInterface+0x17 0x19f8c4 0x2150330d
[0x2] DDRAW!Compat::queryInterface+0x1f 0x19f8c4 0x2150330d
[0x3] DDRAW!CompatPtr<IDirectDraw7>::from+0x1f 0x19f8c4 0x2150330d
[0x4] DDRAW!?A0x8905f59c::onDirectDrawCreate+0x27 0x19f8c4 0x2150330d
[0x5] DDRAW!`anonymous namespace'::directDrawFunc<0,long (__stdcall*)(_GUID *,IDirectDraw * *,IUnknown *),_GUID *,IDirectDraw * *,IUnknown *>+0xc2 0x19f8c4 0x2150330d
[0x6] xvideoio!DirectDrawInit+0xbd 0x19f970 0x27014182
[0x7] svpdrv!VRDDirectDrawInit+0x35 0x19f988 0x29024ddd
[0x8] svprun!LoadProduction+0x33d 0x19f9e8 0x29024bc6
[0x9] svprun!LoadProduction+0x126 0x19fb1c 0x507f32
[0xa] vrdmfc + 0x7f32!vrdmfc+0x7f32 0x19fb48 0x77580a46
[0xb] ntdll!RtlpValidateHeap+0x21 0x19fbc0 0x0
DDRAW!CompatVtable<IDirectDrawVtbl>::s_origVtable
is 0x0, and so is EIP. The root cause is pretty clear after inspecting the stack trace of another thread:
[0x0] ntdll!NtCreateFile+0xc 0x26fdc10 0x75bb04cb
[0x1] KERNELBASE!CreateFileInternal+0x4eb 0x26fdc14 0x75baffce
[0x2] KERNELBASE!CreateFileW+0x5e 0x26fdcd8 0x75bacfa1
[0x3] KERNELBASE!CreateFileA+0x31 0x26fdd0c 0x5d76bd68
[0x4] ddraw_5d760000!HackMeBaby+0x468 0x26fdd3c 0x5d768b89 // Huh?
[0x5] ddraw_5d760000!InternalDirectDrawCreate+0x889 0x26fe0e0 0x5d78e0c9
[0x6] ddraw_5d760000!DirectDrawCreateEx+0x79 0x26fe7d0 0x5d9ae175
[0x7] DDRAW!`anonymous namespace'::installHooks+0x235 0x26fe7f8 0x5d9ad0c1
[0x8] DDRAW!`anonymous namespace'::directDrawFunc<16,long (__stdcall*)(int (__stdcall*)(_GUID *,char *,char *,void *,HMONITOR__ *),void *,unsigned long),int (__stdcall*)(_GUID *,char *,char *,void *,HMONITOR__ *),void *,unsigned long>+0x71 0x26fe844 0x502506
[0x9] vrdmfc + 0x2506!vrdmfc+0x2506 0x26fe8d0 0x508ab9
[0xa] vrdmfc + 0x8ab9!vrdmfc+0x8ab9 0x26fe8e8 0x0
directDrawFunc()
shouldn't proceed until the first thread to hit installHooks()
completes the job.
It can be reproduced by opening launch32.exe and quickly pressing the "Hardware Configuration" button. It's intermittent, so it might be necessary to repeat those steps a few times.
One more important detail here: CpuAffinity
must be set to all
. The game does some shenanigans with winmm::timeGetTime()
, and CpuAffinity=all
is required to fix the frame rate when videos are being played. Apparently, using more cores also makes the race condition described above more likely to occur.
Thanks, some synchronization is indeed missing there. This should fix that:
ddraw.zip (diff.txt compared to v0.5.2)
The previous patch is also included.
One more important detail here: CpuAffinity must be set to all. The game does some shenanigans with winmm::timeGetTime(), and CpuAffinity=all is required to fix the frame rate when videos are being played.
I can't tell any difference in the video playback rate. According to DDrawCompat's stats overlay, the videos play at a fixed 15 flips per second either way. Maybe you are referring to the blit rate? But that's probably just the cursor's refresh rate, so it doesn't seem that important during videos.
The fix works great, no more crashes, thanks!
As for performance, 2 aspects seem to have an impact on this game:
- There are 3 page handling settings available in "Hardware Configuration": "No page flipping", "Double buffer" (this is the default one), and "Triple buffer". When the game is run with "No page flipping", animation is a buttery smooth 60 FPS (or whatever the video/animation frame rate is). However, if "Double buffer" or "Triple buffer" is selected, animation becomes more unstable and stuttery. As a test case, select double buffering and "Force software 3D renderer" from hardware configuration, "New game", press Esc to skip the intro and enter the shooting range, and try running left and right. If the game is run without DDrawCompat (DWM8And16BitMitigation required, will glitch but should work), animation of the running character is smooth, not so much with DDrawCompat. Might just be some stupid thing the game does, but if you give me some guidance (i.e. what to use for profiling and what to look for), I could try to investigate.
- Even with "No page flipping" selected, the performance of cursor animation will degrade without
CpuAffinity=all
, it's not just very choppy during videos (although it's very prominent there), but also more stuttery during the game (which also plays some animations here and there). This is reflected in the lower blit rate, as you noticed. Incidentally, I played 2 games with DDrawCompat and both work better withCpuAffinity=all
(the other being Hooligans: Storm Over Europe, where it fixes a synchronization problem with video subtitles), but this sample size is obviously not big enough to draw any meaningful conclusions :)
The black launcher issue is now fixed in v0.5.3.