ssokolow / quicktile

Adds window-tiling hotkeys to any X11 desktop. (An analogue to WinSplit Revolution for people who don't want to use Compiz Grid)

Home Page: https://ssokolow.com/quicktile/

Race between get_monitor and window closing

When this happened, I had two windows open: mpv in the foreground and caja in the background. My window manager is xfwm4 4.12.4 and I have one monitor. I was holding the right arrow key to fast forward through a video. I don't have a binding for Right configured in quicktile, but I do have bindings for <Ctrl><Alt>Right and <Ctrl><Alt><Shift>Right (which results in a binding for Right being created via _vary_modmask). I fast forwarded past the end of the video, which resulted in mpv exiting and the window focus changing back to caja. I was still holding the right arrow key at that point. Immediately after mpv exited, this error appeared:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/quicktile/keybinder.py", line 138, cb_xevent(self=<quicktile.keybinder.KeyBinder object>, src=<Xlib.display._BaseDisplay object>, cond=1, handle=<Xlib.display._BaseDisplay object>)
                if xevent.type == X.KeyPress:
                    self.handle_keypress(xevent)
  variables: {  'self.handle_keypress': (  'local',
                              <bound method KeyBinder.handle_keypress of <quicktile.keybinder.KeyBinder object at 0x7fb5575d1390>>),
   'xevent': (  'local',
                <class 'Xlib.protocol.event.KeyPress'>(event_y = 1054, state = 0, type = 2, child = <<class 'Xlib.display.Window'> 0x007018d7>, detail = 114, window = <<class 'Xlib.display.Window'> 0x000001e4>, same_screen = 1, time = 1983878097, root_y = 1054, root_x = 656, root = <<class 'Xlib.display.Window'> 0x000001e4>, event_x = 656, sequence_number = 53))}
  File "/usr/lib/python2.7/site-packages/quicktile/keybinder.py", line 159, handle_keypress(self=<quicktile.keybinder.KeyBinder object>, xevent=<class 'Xlib.protocol.event.KeyPress'>(event_y =...0x000001e4>, event_x = 656, sequence_number = 53))
            # Call the associated callback
            self._keys[keysig]()
  variables: {  'keysig': ('local', (114, 0)),
   'self._keys': (  'local',
                    {  (111, 0): <function call at 0x7fb5572875f0>,
                       (111, 12): <function call at 0x7fb557287500>,
                       (111, 13): <function call at 0x7fb5572875f0>,
                       (111, 14): <function call at 0x7fb557287500>,
                       (111, 15): <function call at 0x7fb5572875f0>,
                       (111, 28): <function call at 0x7fb557287500>,
                       (111, 29): <function call at 0x7fb5572875f0>,
                       (111, 30): <function call at 0x7fb557287500>,
                       (111, 31): <function call at 0x7fb5572875f0>,
                       (113, 0): <function call at 0x7fb5572876e0>,
                       (113, 12): <function call at 0x7fb5572876e0>,
                       (113, 13): <function call at 0x7fb557287488>,
                       (113, 14): <function call at 0x7fb5572876e0>,
                       (113, 15): <function call at 0x7fb557287488>,
                       (113, 28): <function call at 0x7fb5572876e0>,
                       (113, 29): <function call at 0x7fb557287488>,
                       (113, 30): <function call at 0x7fb5572876e0>,
                       (113, 31): <function call at 0x7fb557287488>,
                       (114, 0): <function call at 0x7fb557287410>,
                       (114, 12): <function call at 0x7fb557287398>,
                       (114, 13): <function call at 0x7fb557287410>,
                       (114, 14): <function call at 0x7fb557287398>,
                       (114, 15): <function call at 0x7fb557287410>,
                       (114, 28): <function call at 0x7fb557287398>,
                       (114, 29): <function call at 0x7fb557287410>,
                       (114, 30): <function call at 0x7fb557287398>,
                       (114, 31): <function call at 0x7fb557287410>,
                       (116, 0): <function call at 0x7fb557287668>,
                       (116, 12): <function call at 0x7fb557287578>,
                       (116, 13): <function call at 0x7fb557287668>,
                       (116, 14): <function call at 0x7fb557287578>,
                       (116, 15): <function call at 0x7fb557287668>,
                       (116, 28): <function call at 0x7fb557287578>,
                       (116, 29): <function call at 0x7fb557287668>,
                       (116, 30): <function call at 0x7fb557287578>,
                       (116, 31): <function call at 0x7fb557287668>})}
  File "/usr/lib/python2.7/site-packages/quicktile/keybinder.py", line 225, call(func='right')
                       `WindowManager` instance"""
                    commands.call(func, winman)
  variables: {  'commands.call': (  'local',
                       <bound method CommandRegistry.call of <quicktile.commands.CommandRegistry object at 0x7fb5575b70d0>>),
   'func': ('local', 'right'),
   'winman': ('local', <quicktile.wm.WindowManager object at 0x7fb566d77590>)}
  File "/usr/lib/python2.7/site-packages/quicktile/commands.py", line 178, call(self=<quicktile.commands.CommandRegistry object>, command='right', winman=<quicktile.wm.WindowManager object>, *args=(), **kwargs={})
                              command, args, kwargs)
                cmd(winman, *args, **kwargs)
  variables: {  'args': ('local', None),
   'cmd': ('local', <function cycle_dimensions at 0x7fb5575be6e0>),
   'kwargs': ('local', None),
   'winman': ('local', <quicktile.wm.WindowManager object at 0x7fb566d77590>)}
  File "/usr/lib/python2.7/site-packages/quicktile/commands.py", line 124, wrapper(winman=<quicktile.wm.WindowManager object>, window=<wnck.Window object at 0x7fb557016550 (WnckWindow at 0x556a79f72dc0)>, *args=(), **kwargs={})
                    if not (windowless or self.get_window_meta(
                            window, state, winman)):
                        logging.debug("No window and windowless=False")
  variables: {  'state': (  'local',
               {  'cmd_name': 'right',
                  'config': <ConfigParser.RawConfigParser instance at 0x7fb5572f5c20>}),
   'window': (  'local',
                <wnck.Window object at 0x7fb557016550 (WnckWindow at 0x556a79f72dc0)>),
   'winman': ('local', <quicktile.wm.WindowManager object at 0x7fb566d77590>)}
  File "/usr/lib/python2.7/site-packages/quicktile/commands.py", line 64, get_window_meta(window=<wnck.Window object at 0x7fb557016550 (WnckWindow at 0x556a79f72dc0)>, state={'cmd_name': 'right', 'config': <ConfigParser.RawConfigParser instance>}, winman=<quicktile.wm.WindowManager object>)
            monitor_id, monitor_geom = winman.get_monitor(window)
            use_area, use_rect = winman.workarea.get(monitor_geom)
  variables: {  'monitor_geom': (None, None),
   'monitor_id': (None, None),
   'window': (  'local',
                <wnck.Window object at 0x7fb557016550 (WnckWindow at 0x556a79f72dc0)>),
   'winman.get_monitor': (  'local',
                            <bound method WindowManager.get_monitor of <quicktile.wm.WindowManager object at 0x7fb566d77590>>)}
  File "/usr/lib/python2.7/site-packages/quicktile/wm.py", line 276, get_monitor(self=<quicktile.wm.WindowManager object>, win=None)
            # TODO: How do I retrieve the root window from a given one?
            monitor_id = self.gdk_screen.get_monitor_at_window(win)
            monitor_geom = self.gdk_screen.get_monitor_geometry(monitor_id)
  variables: {  'monitor_id': (None, None),
   'self.gdk_screen.get_monitor_at_window': (  'local',
                                               <built-in method get_monitor_at_window of gtk.gdk.ScreenX11 object at 0x7fb5575ad820>),
   'win': ('local', None)}
TypeError: Gdk.Screen.get_monitor_at_window() argument 1 must be gtk.gdk.Window, not None

At that point, quicktile stopped responding to any of my keybindings, even though it was still running. After restarting quicktile, it began working again. Unfortunately, I only have the backtrace and not the debug log.

For reference, this is my quicktile config:

[general]
cfg_schema = 1
UseWorkarea = True
#ModMask = <Mod4>
ModMask = <Ctrl><Alt>
ColumnCount = 3
MovementsWrap = True

[keys]
<Shift>Left = left
<Shift>Right = right
<Shift>Up = top-right
<Shift>Down = bottom-right
Up = move-to-top
Left = move-to-left
Right = move-to-right
Down = move-to-bottom
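
As a side note, the modifier masks in the self._keys table of the traceback can be related to this config. The sketch below assumes the standard X11 modifier bit values and assumes that CapsLock (LockMask) and NumLock (Mod2Mask, the usual but not guaranteed mapping) are the modifiers being ignored. vary_modmask is a hypothetical helper written for illustration, not QuickTile's actual _vary_modmask. It yields the 12/14/28/30 and 13/15/29/31 groups seen in the table, but it does not produce the (keycode, 0) entries.

```python
from itertools import combinations

# Standard X11 modifier mask bits (as defined in X.h).
SHIFT, LOCK, CONTROL, MOD1, MOD2 = 1, 2, 4, 8, 16


def vary_modmask(modmask, ignored):
    """Return modmask OR-ed with every subset of the ignored bits.

    Hypothetical helper for illustration only.
    """
    variants = set()
    for size in range(len(ignored) + 1):
        for combo in combinations(ignored, size):
            # The bits are distinct, so summing them equals OR-ing them.
            variants.add(modmask | sum(combo))
    return sorted(variants)


base = CONTROL | MOD1                                 # ModMask = <Ctrl><Alt>
plain = vary_modmask(base, (LOCK, MOD2))              # e.g. "Right"
shifted = vary_modmask(base | SHIFT, (LOCK, MOD2))    # e.g. "<Shift>Right"
```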

This line in quicktile/wm.py seems to be the problem:

    def get_monitor(self, win):
        ...
        if not isinstance(win, gtk.gdk.Window):
            win = gtk.gdk.window_foreign_new(win.get_xid())

        # TODO: How do I retrieve the root window from a given one?
        monitor_id = self.gdk_screen.get_monitor_at_window(win)
        ...

Apparently gtk.gdk.window_foreign_new returned None even though the win passed in was not None (it was a wnck.Window).

Based on this information, I have a hypothesis as to what happened:

  1. Pressing the right arrow triggered a quicktile command.
  2. The mpv window hadn't yet closed, so all the pre-flight checks succeeded.
  3. The mpv window closed.
  4. WindowManager.get_monitor called gtk.gdk.window_foreign_new. Since the mpv window no longer existed, it returned None.
  5. self.gdk_screen.get_monitor_at_window(win) failed because win is None.

Basically it's a race between executing the quicktile command and closing the window. I believe what is needed to fix it is to check that the result of gtk.gdk.window_foreign_new is not None, and bail if it is. All callers of WindowManager.get_monitor will need to be changed to account for this.
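
A minimal sketch of that check, not a patch against the real code: the screen object and the foreign_new function are passed in as parameters so the example runs without PyGTK, and the Fake* classes, the XIDs, and the monitor geometry are made up for illustration. In QuickTile itself these would be self.gdk_screen and gtk.gdk.window_foreign_new.

```python
class FakeWnckWindow(object):
    """Stand-in for a wnck.Window (hypothetical, for illustration)."""
    def __init__(self, xid):
        self._xid = xid

    def get_xid(self):
        return self._xid


class FakeScreen(object):
    """Stand-in for the gtk.gdk.Screen held in self.gdk_screen."""
    def get_monitor_at_window(self, gdk_win):
        return 0

    def get_monitor_geometry(self, monitor_id):
        return (0, 0, 1920, 1080)  # made-up geometry


LIVE_XIDS = {1}  # made-up XIDs of windows that still exist


def fake_foreign_new(xid):
    """Mimic the observed behaviour: None for a window that is gone."""
    return object() if xid in LIVE_XIDS else None


def get_monitor(screen, win, foreign_new):
    """Return (monitor_id, geometry), or None if the window has vanished."""
    gdk_win = foreign_new(win.get_xid())
    if gdk_win is None:
        # The X11 window closed between the key press and this lookup.
        return None
    monitor_id = screen.get_monitor_at_window(gdk_win)
    return monitor_id, screen.get_monitor_geometry(monitor_id)


alive = get_monitor(FakeScreen(), FakeWnckWindow(1), fake_foreign_new)
gone = get_monitor(FakeScreen(), FakeWnckWindow(2), fake_foreign_new)
```

A caller such as get_window_meta would then have to test for None before unpacking the result, which is why every call site is affected.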

Thanks for reporting this. I'm still trying to get my hobby time budget back on track, so I'm not sure if I'll be able to fix this quickly, but I'll see what I can do.

Definitely a tricky bug to identify in testing if you don't know what you're looking for. To be honest, I've been wondering if maybe, as part of the port to GTK+ 3.x, I should also rewrite QuickTile in Rust to get monadic error handling, which would have kept this failure from being catastrophic by reminding me that the call is fallible.

Either way, I should really make some time to start writing integration tests for QuickTile. To be honest, the need to spin up something like Xnest, Xephyr, or Xvfb, a WM inside it, and then write a test harness which creates windows to be moved around has been a bit demotivating.

> All callers of WindowManager.get_monitor will need to be changed to account for this.

I think it'd provide both greater robustness and a nicer API for command authors if I just wrapped the dispatching of commands in a try/except block.
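
Roughly what I have in mind, as a sketch rather than the actual change: call_safely is a hypothetical name, and the real dispatch code in commands.py would need the same treatment in place. It only covers failures that surface as Python exceptions.

```python
import logging


def call_safely(cmd, *args, **kwargs):
    """Run a tiling command, logging and swallowing any exception.

    Returns True if the command completed, False if it raised.
    """
    try:
        cmd(*args, **kwargs)
    except Exception:
        # Log the full traceback but keep the key-event loop alive.
        logging.exception("Tiling command %r failed",
                          getattr(cmd, "__name__", cmd))
        return False
    return True


def broken_command():
    """Example command that fails the way get_monitor did."""
    raise TypeError("argument 1 must be gtk.gdk.Window, not None")


ok = call_safely(lambda: None)
failed = call_safely(broken_command)
```

With this in place, command authors don't need to guard every fallible call themselves; a failed command is logged and the next hotkey still works.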

I wasn't able to reproduce the specific error described, but, on one of the "hold the key combo while mpv exits" tests, I managed to trigger a new error handler that I thought was only reachable during QuickTile startup, followed by the underlying GTK killing QuickTile because it received an X11 BadDrawable error.

(I say "GTK killing QuickTile" because GTK is bypassing my exception handler and just exiting the program with its own error message.)

I'm looking into that now.

I added a generic "don't let exceptions in tiling commands kill things" handler for the error you described, though I can't test it because I can't reproduce the bug. I can't work around the bug I can reproduce on my end, because GTK's "helpful" error handling prevents Python exceptions from firing and I can find no evidence of an option to opt out of it.

All I can suggest is to file a bug with the GTK developers and tell them that there's a race condition in GTK, reproducible by launching ./quicktile.sh -b, focusing MPV, and holding down a tiling hotkey in QuickTile while the MPV window is in the process of going away of its own volition.

The relevant portion of the backtrace looks like this:

#6  0x00007ffff632635b in _XError () at /lib/x86_64-linux-gnu/libX11.so.6
#7  0x00007ffff63230c7 in  () at /lib/x86_64-linux-gnu/libX11.so.6
#8  0x00007ffff63242d3 in _XReply () at /lib/x86_64-linux-gnu/libX11.so.6
#9  0x00007ffff63081cf in XGetGeometry () at /lib/x86_64-linux-gnu/libX11.so.6
#10 0x00007ffff5d3793f in  () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#11 0x00007ffff5d0687c in gdk_window_get_geometry () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#12 0x00007ffff5cf0d22 in gdk_display_get_monitor_at_window () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#13 0x00007ffff5cff542 in gdk_screen_get_monitor_at_window () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#14 0x00007ffff7fbcff5 in  () at /lib/x86_64-linux-gnu/libffi.so.7

(i.e. the race condition appears to be "GTK crashes if a window goes away in the middle of it being the subject of a gdk_screen_get_monitor_at_window call".)

Since that particular bug kills QuickTile entirely, the only thing I can suggest to work around it would be to run QuickTile in a shell script that re-launches it every time it dies.

Something like this:

#!/bin/sh
while true; do
    ./quicktile -b
    sleep 1
done