boppreh / keyboard

Hook and simulate global keyboard events on Windows and Linux.

Support for key suppression

boppreh opened this issue · comments

Right now all events report after-the-fact. It should be possible to suppress keys, keeping them from being processed by applications down the line. This allows users to create global hotkeys without affecting the focused application, for example.

This is possible in Windows, but would require moving the hotkey processing back into the hook.

Linux seems to be trickier. Maybe relying on X, like https://github.com/alols/xcape ?

Finally, extreme care must be taken to not introduce input lag. Because all hotkeys will be processed in the main thread, blocking the event they analyze, it would be too easy to add precious milliseconds to every key press, which is not acceptable.

There are three major complications.

  1. Linux events are emitted after-the-fact, so there's nothing we can do to suppress them. Any solution here will require radically changing the way Linux events are captured, possibly requiring special code for each environment (console, X, Wayland).
  2. Even when we can technically suppress events (Windows makes it easy, for example), there's input lag. Python is not a fast language, and power users may be registering hundreds of hotkeys. We must ensure they will all be processed in no more than a couple of milliseconds, otherwise the user starts experiencing uncomfortable delays when typing.
  3. What happens if one hotkey is contained in another? If I register ctrl+a and ctrl+a, space, what should happen? Should we block the event to see if the next key is a space (tremendous input lag)? Should the second hotkey never be triggered (would require some sort of warning)? It's not an easy problem.

I am interested in resolving this issue for the Windows platform and may contribute if I have time. The main thing I would like to clarify before beginning is what the general architecture will be. Personally, I believe the most practical option would be to let the user register key combinations to be blocked (this could be done by simply adding suppress=True to any applicable functions). Done this way, suppression happens entirely within keyboard, which allows delays to be strictly controlled.

Moving to the delays specifically: suppression is something that may need to be done using cython. What I'm envisioning is something like this:

  1. User calls keyboard.add_hotkey(...suppress=True)
  2. Everything that currently happens still happens. In addition, keyboard stores the key combination in keys_to_suppress.
  3. Keyboard registers an os-specific cython module with the operating system that simply compares input to the keys_to_suppress and suppresses the input if there is a match.
  4. Keyboard calls the user function, which may or may not duplicate (pass through) the original request.

Regarding your third point, that would simply be up to the user. So if the user added suppress=False (which would be the default) to the ctrl+a, space, then input would not be blocked. But if the user added suppress=True, then it would. Input would only be blocked up until the point where it would not be able to satisfy keys_to_suppress.
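The proposal above can be sketched in plain Python. This is only an illustration of the idea, not code from the library: register_suppressed, hook and pressed are hypothetical names, and the sketch only handles single-step combinations written as 'ctrl+a'.

```python
# Hypothetical sketch of the keys_to_suppress idea; not the library's API.
keys_to_suppress = set()   # each entry is a frozenset of key names
pressed = set()            # names of the keys currently held down


def register_suppressed(combination):
    """Store a combination such as 'ctrl+a' for the hook to check."""
    keys = frozenset(k.strip().lower() for k in combination.split("+"))
    keys_to_suppress.add(keys)


def hook(name, is_up):
    """Return True to let the event through, False to suppress it.

    Only a set lookup happens here, so the cost per key press does not
    grow with the number of registered hotkeys.
    """
    name = name.lower()
    if is_up:
        pressed.discard(name)
        return True
    pressed.add(name)
    return frozenset(pressed) not in keys_to_suppress
```

The user callbacks would run outside this function, so a slow handler could not delay the decision to allow or suppress a key.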

Sorry, I'm on my phone and misclicked. Please ignore the previous message.

Hi xoviat. Thank you for your interest.

I'm not sure I understand why cython may be needed. The library actually had this feature at the beginning, via a "blocking=True" parameter, and only ctypes was needed. Are you suggesting using cython for performance?

I like your idea of the keys_to_suppress variable. I agree this looks like the correct answer to avoid putting all hot keys on the critical path.

I don't understand your comment about ctrl+a, space. If the user registers both ctrl+a and ctrl+a, space (in this order), both with suppress=True, and then types ctrl+a, space, what should happen?

Should both hotkeys be triggered? One could argue this is incorrect because the first one matched and requested suppress=True, which should suppress further hotkeys from accepting it.

Should only the first one trigger? But the user explicitly asked for the second hotkey to be triggered, and the user typed the key sequence. The library knew about the conflict at the time of registration, so an exception or at least a warning should be raised, instead of accepting a useless request. This is still a problem because users may be using the library in a background process, so they won't see the warning, and exceptions would only make everything stop working. I'm personally tending to this side, but I'm not happy.

What do you think?
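Detecting the conflict at registration time could look roughly like the following. It is a sketch under assumptions: parse_steps and find_prefix_conflicts are hypothetical helpers, and hotkeys are assumed to use the 'ctrl+a, space' notation from this thread, with ',' separating steps and '+' separating keys.

```python
def parse_steps(hotkey):
    """Split 'ctrl+a, space' into one frozenset of key names per step."""
    return tuple(
        frozenset(key.strip().lower() for key in step.split("+"))
        for step in hotkey.split(",")
    )


def find_prefix_conflicts(registered, new_hotkey):
    """Return the registered hotkeys that are a prefix of new_hotkey,
    or that new_hotkey is a prefix of (identical hotkeys included)."""
    new_steps = parse_steps(new_hotkey)
    conflicts = []
    for existing in registered:
        old_steps = parse_steps(existing)
        shorter = min(len(old_steps), len(new_steps))
        if old_steps[:shorter] == new_steps[:shorter]:
            conflicts.append(existing)
    return conflicts
```

Whether a non-empty result should raise an exception or only emit a warning is exactly the open question above; the check itself is cheap because it runs once per registration, not once per key press.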

I'm not sure I understand why cython may be needed.

If we're going to check all keys against the keys_to_suppress, then as you noted, we will need a very tight loop. Cython can significantly improve execution time. I am not saying that we will need this but it is an option if there is too much delay in the user input.

Should only the first one trigger?

My point was that this is not as critical as it seems because these events will not be happening within the loop that is suppressing the keys. In other words, if 'ctrl+a, space' is registered to be suppressed, then the entire sequence of keys needs to be put on hold regardless of other keys that have been registered.

With respect to program delay, that's not related specifically to this issue. I'm not specifically sure what would happen, but the answer is: whatever currently happens. The only difference that this change is going to make is whether other applications receive keyboard input. This change won't affect what keyboard sees. Personally, however, I do agree that an exception should be raised when a duplicate hotkey is registered because overusing exceptions is the Pythonic way to do things. Just look at removing a file that you aren't sure exists.

For a possibly more pertinent example, how will this be affected if the user has different hotkeys (with suppression) set up for "ctrl+a" and "a"?

I'll create a truth table so everyone can understand behavior clearly. The answer is: it will be suppressed under the behavior that I have described if keyboard currently calls the handler. The only relevant difference with this is timing.

The current behavior appears to be the following:

add_hotkey('a', print, args=['a was pressed'])
add_hotkey('Ctrl+a', print, args=['Ctrl+a was pressed'])
[Ctrl+a] a was pressed
[a] a was pressed

add_hotkey('Ctrl+a', print, args=['Ctrl+a was pressed'])
add_hotkey('Ctrl+a, space', print, args=['Ctrl+a, space was pressed'])
[Ctrl+a, space] Ctrl+a was pressed
[Ctrl+a] Ctrl+a was pressed

It appears that only the shortest combination is recognized and all others are ignored. So, under the implementation that I am proposing, any longer key combination would simply be suppressed and would not actually call a handler.

Update: looking at the API, setting blocking=False would call the other handlers, so whether other handlers are in fact called would depend on the value of this option.

Ctrl+a, suppress  Ctrl+a, blocking  a, suppress  a, blocking  Behavior
True              True              True         True         Suppress all; call a handler
True              True              True         False        Suppress all; call both
True              True              False        True         Suppress Ctrl+a; call a
True              False             True         True         Suppress all; call a handler
False             True              True         True         Suppress all; call a handler
True              True              False        False        Suppress Ctrl+a; call both
True              False             False        True         Suppress Ctrl+a; call a handler
False             False             True         True         Suppress all; call a handler
True              False             False        False        Suppress Ctrl+a; call both
False             False             False        False        Suppress none; call both

Note: "Suppress Ctrl+a" means wait for the entire key combination before allowing input.

For my own reference:

using System;
using System.Diagnostics;
using System.Windows.Forms;
using System.Runtime.InteropServices;

class InterceptKeys
{
    private const int WH_KEYBOARD_LL = 13;
    private const int WM_KEYDOWN = 0x0100;
    private static LowLevelKeyboardProc _proc = HookCallback;
    private static IntPtr _hookID = IntPtr.Zero;

    public static void Main()
    {
        _hookID = SetHook(_proc);
        Application.Run();
        UnhookWindowsHookEx(_hookID);
    }

    private static IntPtr SetHook(LowLevelKeyboardProc proc)
    {
        using (Process curProcess = Process.GetCurrentProcess())
        using (ProcessModule curModule = curProcess.MainModule)
        {
            return SetWindowsHookEx(WH_KEYBOARD_LL, proc,
                GetModuleHandle(curModule.ModuleName), 0);
        }
    }

    private delegate IntPtr LowLevelKeyboardProc(
        int nCode, IntPtr wParam, IntPtr lParam);

    private static IntPtr HookCallback(
        int nCode, IntPtr wParam, IntPtr lParam)
    {
        if (nCode >= 0 && wParam == (IntPtr)WM_KEYDOWN)
        {
            int vkCode = Marshal.ReadInt32(lParam);
            Console.WriteLine((Keys)vkCode);
        }
        return CallNextHookEx(_hookID, nCode, wParam, lParam);
    }

    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern IntPtr SetWindowsHookEx(int idHook,
        LowLevelKeyboardProc lpfn, IntPtr hMod, uint dwThreadId);

    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool UnhookWindowsHookEx(IntPtr hhk);

    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern IntPtr CallNextHookEx(IntPtr hhk, int nCode,
        IntPtr wParam, IntPtr lParam);

    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern IntPtr GetModuleHandle(string lpModuleName);
}

I really like the ideas here so far. Unfortunately I'm on vacation, with little to no computer or even Internet access until the 27th of December. So I can't contribute much until then.

I will still be able to answer questions, but with some hours of delay.

I believe that my implementation is almost complete. What is a bit disturbing though is that while running the tests, the _depressed variable in KeyTable was not correctly tracking the number of keys that were depressed at the time (it believed that significantly more keys were depressed than was actually the case), which does not bode well for multistep combinations. The code that I am using to track how many keys are depressed is the following:

    if is_up:
        depressed = self._depressed - 1
    else:
        depressed = self._depressed + 1

    if not depressed:
        key = self.SEQUENCE_END

is_up comes from simply checking whether the event is a KEY_UP event:

if allowed_keys.is_allowed(name, event_type == KEY_UP):
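One possible cause of the over-count, offered here as an assumption and not something confirmed in this thread, is keyboard auto-repeat: while a key is held, the OS can deliver several key-down events before the single key-up, so a counter that adds one per key-down drifts upward. A minimal sketch that tracks held keys by name instead of counting events (PressedKeys is a hypothetical helper, not part of KeyTable):

```python
class PressedKeys:
    """Hypothetical helper that tracks which keys are held, by name."""

    def __init__(self):
        self._held = set()

    def update(self, name, is_up):
        """Record one event and return how many keys are held afterwards."""
        if is_up:
            self._held.discard(name)   # releasing an unknown key is harmless
        else:
            self._held.add(name)       # a repeated key-down changes nothing
        return len(self._held)
```

With this shape, the end of a sequence would be detected when update returns 0, the same condition as `not depressed` in the snippet above.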

I'm back from the holidays. I have been following this pull request, and I have to say it's very impressive. I'll schedule some time to properly review it as soon as possible.

Thank you

The number of depressed keys is now updated from the high-level code. I should note that there is now a race condition (if the user releases two keys faster than the high level API can update the number of depressed keys). Although I don't anticipate it being an issue, I will note it here for archival purposes.

So, is this officially fixed?

Sort of. It works in many cases but not all. Covering the last few cases is not trivial.

Is Linux support implemented or even possible?

Linux support is not implemented, but that part should actually be trivial if you are familiar with the Linux API. All you need to do is call is_allowed and suppress the key if it returns false.

Could you point me towards the relevant files/lines?

@IamCarbonMan Implementing X support would be equivalent to creating a new backend, and the steps are documented in #11.

As for how the communication may work, there were some suggestions here and a similar tool here.

I tried tackling this problem last weekend, but didn't get very far.

Based on this, I'm assuming there's no way to remap keys in Linux yet? E.g., user presses X, OS reads Y. Any ETA on this?

Would be nice to have this on GNU/Linux. Currently, if I want to replace some keys with others, I call xmodmap from Python (using the sh module):

sh.xmodmap(e='clear Lock')
sh.xmodmap(e='keycode 66 = Escape')

which replaces caps lock with escape and translates to bash:

xmodmap -e 'clear Lock'
xmodmap -e 'keycode 66 = Escape'

I've been working on this for a long time to try to fix bugs, and the complexity is just crazy. Just figuring out the correct behavior is hard. To give a preview of the problem:

Let's say we register the shortcut alt+w with key suppression. It's a single modifier plus a single key, in a single step: one of the simplest possible examples. Now think about what happens inside the OS hook, where we have to decide to block or allow each event that is captured.

The user presses alt. We have to either allow or block this event. We allow it, because we don't know if the next key will be our shortcut or not.

Then comes a "w down". Hey, that's our shortcut! We block this event, and also the "w up" that follows.

Now comes the "alt up". If we block it, the system will be left with a held down alt, wreaking havoc. If we allow it, the underlying program will receive a press-and-release of alt alone, sending the focus to the menu bar.

That's not good, so we decide to block the initial "alt down".

Then instead of w comes a "tab down", for alt+tab. We have to let that through, but we blocked the previous alt. So we send a fake "alt down", then allow the "tab down". But when the "alt up" event comes, we have to remember if we blocked the initial event (yes) and if we had to fake it later (yes). In a sequence of events like alt+tab, alt+shift+m, alt+w, alt+tab the logic gets crazy quick.

Then you realize that pressing a then alt should result in both a and alt being allowed and the shortcut not triggered (try ctrl+a versus a+ctrl).

And if someone is playing a game where alt is bound to an important action, they will definitely notice that holding down the key does nothing until it's released.

And then you realize there are actually three alts: alt, left alt and right alt...
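To make the walkthrough above concrete, here is a heavily simplified sketch of the block-then-replay decisions for the single hotkey alt+w. It is not the library's implementation: AltWSuppressor and its action names are hypothetical, and it ignores sided alts, other modifiers and multi-step hotkeys.

```python
class AltWSuppressor:
    """Hypothetical decision logic for one suppressed hotkey, alt+w.

    handle() returns a list of actions for the hook to carry out:
    'block', 'allow', 'fake alt down' or 'trigger'.
    """

    def __init__(self):
        self.alt_hidden = False    # alt is held, but the system never saw it
        self.alt_visible = False   # alt is held and the system saw it
        self.used = False          # the hotkey fired while alt was hidden
        self.w_blocked = False     # the matching "w up" must be blocked too
        self.others_held = set()   # non-modifier keys the system knows about

    def handle(self, key, is_up):
        if key == "alt":
            return self._alt_up() if is_up else self._alt_down()
        return self._key_up(key) if is_up else self._key_down(key)

    def _alt_down(self):
        if self.alt_hidden:
            return ["block"]               # auto-repeat of a hidden alt
        if self.alt_visible:
            return ["allow"]
        if self.others_held:               # "a then alt": not our hotkey
            self.alt_visible = True
            return ["allow"]
        self.alt_hidden, self.used = True, False
        return ["block"]

    def _alt_up(self):
        if self.alt_hidden:
            self.alt_hidden = False
            if self.used:
                return ["block"]           # the system never saw this alt
            return ["fake alt down", "allow"]   # replay a lone alt press
        self.alt_visible = False
        return ["allow"]

    def _key_down(self, key):
        alt_down = self.alt_hidden or self.alt_visible
        if key == "w" and alt_down and not self.others_held:
            first = not self.w_blocked     # do not re-trigger on auto-repeat
            self.w_blocked = True
            self.used = True
            return ["trigger", "block"] if first else ["block"]
        self.others_held.add(key)
        if self.alt_hidden:                # e.g. alt+tab: reveal alt first
            self.alt_hidden, self.alt_visible = False, True
            return ["fake alt down", "allow"]
        return ["allow"]

    def _key_up(self, key):
        if key == "w" and self.w_blocked:
            self.w_blocked = False
            return ["block"]
        self.others_held.discard(key)
        return ["allow"]
```

Even this reduced version needs five pieces of state for one hotkey, and it still has the problem described above for games: a held alt does nothing until another key is pressed or alt is released.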

@xoviat That's a good idea, I forgot it existed. But I think it's a bit limiting (no sided modifiers, keys must have vk) and we still need to solve the problem for other platforms.

I finished the code for modifiers + key, now available on the branch suppress. The contrived logic was isolated into a 24-state finite state machine (https://github.com/boppreh/keyboard/blob/suppress/keyboard/__init__.py#L129). I've been dogfooding it and it's been reliable.

An excellent side effect of this implementation is that it exposes the internal suppression engine, so I added high-level functions for block_key, remap_hotkey, hook_blocking and a few others. Key/hotkey remapping was a much needed feature that was almost impossible to do reliably before, and it's now much easier to add your own suppression logic.

The bad news is that there's zero support for suppressing multi-step hotkeys or hotkeys with multiple non-modifiers (e.g. esc+a). I'm still deciding on the relative worth of those features, suggestions welcome.

After this, the other two big projects are adding device ID detection to Windows, and a X backend. I think I'll focus on X support due to the requests for key suppression on Linux.

b33886e in the suppress branch implements a prototype for multi-step blocking hotkeys (e.g. 'ctrl+j, e, b'). It was surprisingly easy to write because of the blocking hotkey functions that have been exposed. I'm now confident it's possible to add reliable multi-step blocking hotkeys in this branch.

The question now is how to implement blocking hotkeys with multiple non-modifiers together (e.g. esc+a). To be honest, I would consider losing them an acceptable sacrifice for the bug fixes. But hopefully it's still doable. If anyone is using them, feedback is welcome.

Just noticed that if two separate processes are using keyboard, the current (master-branch) implementation of suppress fails.

For example:  I have a hotkey win + + that maximizes a window and also applies a frame-less window style.  It works fine on its own, but if another keyboard process is running - the + key evades suppression and sends an = character to the active application (in addition to the 'maximize' functionality).

Is that something that's been taken into account for the suppress branch that's in the works?  Are there any known workarounds for the current implementation?

keyboard still can't suppress keys in Linux. It's the next-highest priority item in the roadmap, immediately after releasing a version with the new suppression system.

Great to hear someone is working on this tough problem. Currently I use a dumb workaround in my scripts. I simply disable my keyboard using xinput and reenable it afterwards.

I'm currently trying to implement this using an optional dependency on python-xlib. See #33.

Also see:

https://github.com/moses-palmer/pynput/blob/master/lib/pynput/keyboard/_xorg.py#L522-L528

def _suppress_start(self, display):
    display.screen().root.grab_keyboard(
        self._event_mask, Xlib.X.GrabModeAsync, Xlib.X.GrabModeAsync,
        Xlib.X.CurrentTime)


def _suppress_stop(self, display):
    display.ungrab_keyboard(Xlib.X.CurrentTime)

The Xlib.display.Window.grab_key or Xlib.display.Window.grab_keyboard APIs from http://python-xlib.sourceforge.net/doc/html/python-xlib_21.html may be useful in the linux implementation.

@adnion Are you able to share your workaround using xinput? I'd be interested to see if it works for me. Thanks

Of course, but be aware that only the configured global hotkeys work when your keyboard is disabled.

Disable Keyboard:

import os
import keyboard

os.system('xinput --set-prop "yourkeyboard" "Device Enabled" 0')
os.system('xinput_toggle.sh')
# add some global hotkeys below; do(action) is the script's own callback
keyboard.add_hotkey('a', do, args=["a"],  suppress=True, timeout=0, trigger_on_release=False)

xinput_toggle.sh

#!/usr/bin/bash
# toggles keyboard that has "Keyboard" in its name
SEARCH=Keyboard
ids=$(xinput --list | awk -v search="$SEARCH" \
    '$0 ~ search {match($0, /id=[0-9]+/);\
                  if (RSTART) \
                    print substr($0, RSTART+3, RLENGTH-3)\
                 }'\
     )
for i in $ids
do
# echo $i
 STATE=$(xinput list-props $i | grep "Device Enabled" | grep -o "[01]$")
 if [ "$STATE" = "1" ]; then
   xinput --disable $i
 else
   xinput --enable $i
 fi
done

Enable Keyboard:

os.system('xinput --set-prop "yourkeyboard" "Device Enabled" 1')
os.system('xinput_toggle.sh')
keyboard.unhook_all()
# add hotkey to trigger the script
# calls function do(action) with argument "powerswitch" if key 'alt gr + h' are pressed
keyboard.add_hotkey('alt gr + h', do, args=["powerswitch"],  suppress=True, timeout=0, trigger_on_release=False)

I use pyautogui.PAUSE=0.000001 to reduce input lag. I hope this helps 👍

Plover, a popular Python project, also appears to suppress X11 events rather than kernel events.
I am also eagerly awaiting kernel-level key suppression in this awesome library.

@wis Thanks. Confirmed. I hope this project can integrate Plover's KeyboardCapture, which successfully provides a middleware between user input and X11 keyboard events.

Do I understand correctly that add_hotkey with suppress=True blocks the default global keys? When I run such a script, actions like alt+tab or alt+shift don't work, even though they are not registered with add_hotkey. Even the mouse does strange things. When the script is closed, or with suppress=False, everything works correctly (Windows 7).

Suppression blocks only the hotkeys you manually added, and doesn't touch unrelated keys. Additionally, this topic is for a Linux issue, and there's no support for key suppression on Linux at the moment.

When I add ctrl+shift+a, alt+c, shift+5 and alt+x, space with suppress=True, default keys like alt+tab or alt+shift don't work on Windows.
Look at the video. In the first run, with suppress=True, the shift+alt hotkey doesn't change the language. In the second run, with suppress=False, shift+alt changes the language.

Ah, that looks like a bug! The current version has a few known shortcomings. The good news is that a replacement for the core engine is almost done, via the 'new_core' branch, and will be merged soon.