eset / ipyida

IPython console integration for IDA Pro

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Upgrade ipykernel to version 5

marc-etienne opened this issue · comments

Using ipykernel >= 5, the kernel dies after a few commands for some reason and the console displays the following error:

ERROR: execution aborted

Hi, I'm interested in working on this issue. Do you have any pointers as to what the problem could be?

Great! I do not have much information about the issue I'm afraid. I'm not sure if there's some new timeout or if the do_one_iteration() is not longer the way to go or if there's a protocol change between qtconsole and the new kernel that breaks compatibility. (I'd try that last point first I think)

Ok I'll poke around and have a look. Some initial testing I did seems to indicate it only happens after some kind of an exception occurs, but I'm not entirely certain that's always the case. Will follow up on your lead too.

Think I've figured out the problem. In ipykernel >= 5, do_one_iteration() is a coroutine, which returns a Future object instead of always completing an iteration before returning. Most of the time this isn't a problem, but in cases where the returned Future isn't completed, there's work left to be done which ends up being lost.

I've written a quick fix here; let me know if it works on your side. It's not a complete fix as I'm not handling cases where ipykernel < 5. In addition, since the code uses coroutines now, there might be a neater way of integrating it with IDA than using do_one_iteration(), but I've not had time to explore that yet.

That actually makes a lot of sense :) I will test it later on macOS and Windows but I'm pretty sure this is the problem.

Not sure if you've looked at this yet, but after my fix I'm noticing some random problems... occasionally the qtconsole seems to die after an exception. Will need to test and debug a bit more to see if there's something that's missing.

Ok, after a lot of debugging and testing I've more or less figured out the issue. The problem with ipykernel >= 5 is that it uses a coroutine based model, as mentioned above; specifically it uses tornado, which itself uses Python 3's asyncio package. So ultimately ipykernel's event loop uses asyncio's event loop, which means we need asyncio's event loop to be functioning properly for things to work.

My previous code sort of worked, because it was (I think) manually starting and stopping the asyncio event loop using run_sync() with a particular set of events to process, once so often via the timer. However, things break once Qt gets involved; any call to Qt from ipython which requires the Qt event loop to advance would result in a deadlock, as run_sync() would block while waiting for Qt, which was blocked waiting for IDAPython.

The solution is thus to get both event loops synchronized. This can be done with an external package, asyncqt (https://github.com/gmarull/asyncqt), which implements an event loop for asyncio that hooks onto the Qt event loop. Both asyncio and Qt events thus get handled properly and no deadlocks occur. The code to do this is in ida_plugin.py, and runs on startup.

A side-effect of this change is that we no longer need the timer code to poll for ipykernel events to handle. So the code ends up being a lot neater.

I've implemented the changes in my fork, you can have a look at the latest commit here: zerotypic@e384393
I've not tested on older versions of ipykernel, but I think I've more or less kept the old code intact so it should still be compatible. I've also not used IdaRichJupyterWidget for ipykernel >= 5, because the stock one seems to work with no problems.

Anyway, all the above is only correct if Qt's event loop is the top-level event loop of Hex-Rays. I'm assuming it is.

More generically, it might be useful to have IDAPython setup the asyncio event loop directly, instead of us doing it explicitly; that way all python code can make use of asyncio and have things work as expected. Perhaps @aundro can have a look at that?

Hi,

I am not very familiar with asyncio, but the proposed solution (zerotypic@e384393) seems rather elegant.
I do, however, have an objection to having IDAPython depend on third-party packages such as asyncqt.
Perhaps it's possible to start from asyncqt, and strip all the unnecessary bits so it's tailored to IDAPython…

In the meantime, I can confirm that Qt's event loop is the top-level event loop of IDA.

Hey!

Good job @zerotypic ! Sorry I've been pretty silent lately. I'm not back from conference and vacation. From what I see you nailed down the problem and your solution makes a lot of sense.

I'll test soon and merge this to the master branch. I'll also add the appropriate dependency in setup.py.

Thanks for your input @aundro :)

Thanks! No worries, take your time, I've only had the time to work on this a bit recently.

@aundro: Yeah, I was thinking that it would make the most sense for the event loops to be synced inside IDAPython. In fact there might even be a cleaner solution since you can modify the Qt side as well... you could see if there's any documentation on integrating asyncio with an embedded Python engine.

So, one of the cool things about syncing the event loops is that by doing so you can run Python code in the "background" without blocking the UI. This will probably be very useful to a lot of IDAPython projects. There might also be a way of getting ipython to not block the UI when running code as well, which would be a fix for #12 .

So just a note about some things I've discovered while playing with asyncio and IDAPython.

asyncqt's run_until_complete() function doesn't work. Reason being, the function tries to start the Qt event loop, but in the case of IDAPython, the event loop is already running. This is problematic because top-level code might want to block until a coroutine has completed running, and that's what you could use run_until_complete() for.

To circumvent this issue, you can run a coroutine as a task using create_task(), and then wait till the task is done. Here's an example of how to do it properly:

import asyncio
import time

from PyQt5.QtWidgets import QApplication

def run_task_till_done(coro):
    '''
    Runs a coroutine as a task, blocking till it is complete.
    Similar to asyncio.run_until_complete(), but works when the event loop
    has already been started elsewhere.
    '''
    app = QApplication.instance()
    t = asyncio.get_event_loop().create_task(coro)
    while not t.done():
        app.processEvents()
        time.sleep(0.01)
    return t.result()

I'm not totally certain we need the time.sleep(), but without it the loop consumes a lot of CPU.

I'm testing with asyncqt (from your up-to-date master branch) and it does seem to work very well.

I'm not sure what you mean by "asyncqt's run_until_complete() function doesn't work". Do you mean you can't run async code inside the console?

I tried the following which results in an error:

async def test():
    print("One")
    await asyncio.sleep(1)
    print("Two")

asyncio.run(test())
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-eaadc83a82ea> in <module>
----> 1 asyncio.run(test())

/.../python3.7/asyncio/runners.py in run(main, debug)
     32     if events._get_running_loop() is not None:
     33         raise RuntimeError(
---> 34             "asyncio.run() cannot be called from a running event loop")
     35 
     36     if not coroutines.iscoroutine(main):

RuntimeError: asyncio.run() cannot be called from a running event loop

The being said, non-async code seems to work fine. I'm not sure I want to support async code either, I can't find a useful use for it yet and I'm not sure how well IDA is supporting this anyway.

@marc-etienne ,

I'm not sure how well IDA is supporting this anyway

just for the record: I don't believe IDA[Python] does anything that would make async code impossible in and of itself. The problem I have with asyncqt, is the additional dependency on a third-party.

Yeah what I meant is that an event loop is already running, and IDAPython is effectively running asynchronously within the event loop. So if you want to call async code from within IDAPython, which in normal situations you would use something like run_until_complete(), it doesn't work, because that only works at the top level. So instead you need to run a task.

Anyway, don't have time to work on this now, things are starting to get crazy here due to COVID-19 (as they probably are where you folks are too). Will try to reply with more details sometime next month.

Just as a follow up I'm still testing and with the current solution IDA crashes systematically on exit with asyncioqt. We may need to release some resource or unhook something I don't know yet. I'll spend more time on this later as I'm busy with lot's of other things too.

I've not noticed this behaviour in Linux, what OS are you using?

I'm on macOS. I'll post crash trace here when I have a chance.

Hi guys,

I wanted to see if I could help track down the crash-on-close problem as I'm pretty familiar with the class of issue in stabilizing my own plugins. There's a few usual suspects, such as daemon python threads, timed/event based IDA callbacks executing after the python interpreter has been torn down, lingering Qt integrations, etc..

I'd be happy to help, but I can't seem to get @zerotypic's fork working quite right, let alone 'crashing' it on close :-). It think I am close, but maybe something atrophied in the past several months? across any number of the changes required by this branch...

Here's my setup:

Python 3.6.7 (x64)
IDA 7.5 SP1 (Windows)
asyncqt               0.8.0  (latest?)
ipykernel             5.3.2  (latest)
qtconsole             4.3.0  (pinned version, cuz the 'latest' version fails as imports moved around I think..)

The widget appears when Hotkey'd, but it seems dead / not connected to the backend.

ipy

I'm not familiar with IPython so much, but I am guessing you guys are familiar with where / what might be failing here. I put a number of prints throughout, but only spent a few minutes looking at what's wrong with my setup. As far as I can tell, the kernel is running but my messages aren't getting sent out? or the responses aren't getting received?

Additionally, I've attached a simple log of what happens whey I try sending print('foo') if it is any help...

log.txt

I've not looked at this for a while, so it could be changes in IDA or the other libraries like you suggested. I remember seeing behaviour like that while I was debugging, but I can't exactly remember what had happened. Possibly the kernel crashed. You could try connecting to the kernel from another IPython instance (e.g. from a separate terminal) and see if it's still alive. I'll take a closer look next week and let you know.

Did you try Ctrl-Enter? Maybe the IPython kernel didn't "guess" the command was complete.

I am not competent enough to add anything insightful at this point, but what I can say is this: if you can find a way to crash IDA that you think is due to IDA misbehaving / doing something inappropriate -- such as the timers issue @gaasedelen mentions (I know I fixed (at least) some of these in the past) -- please contact us on support@hex-rays.com (you'll need to have a license still under support.)

(We are not officially supporting IPython in IDA, but that doesn't mean we want to throw wrenches at it!)

Had there been any updates since? Many distributions migrated to ipykernel-5.x already.

Sorry, hadn't found the time to work on this yet. I think I should be able to in a few weeks though. In the meantime you can see if my fork works for you, as mentioned above.

Okay, I've finally had time to have a look today. Upgraded IDA to 7.5 SP 3 (7.5.201028), Python 3.6.9. Merged my fork with the latest changes from upstream. Everything seems to be working fine still; the IPython console works, can run commands, etc. I was wondering whether it might be a 64-bit issue (I usually work in 32-bit), so I tried it in ida64 but it seems fine as well.

I'm not getting a crash on close either. @marc-etienne if you have a stack trace could you send it my way?

I've not yet updated my Python packages... currently these are the versions they're at, for the packages that I think will matter:

ipykernel (5.1.3)
ipython (7.11.1)
ipython-genutils (0.2.0)
jupyter-client (5.3.4)
jupyter-console (6.1.0)
jupyter-core (4.6.1)
qtconsole (4.6.0)
tornado (6.0.3)

I'll try updating them and see if that leads to any problems. @gaasedelen, it may be an issue with the versions of the packages you're using; I think the kernel and QT console code were in quite a flux as they were migrating from one messaging system to another.

Okay... after upgrading everything to the latest possible versions, things still seem to be working. I did have trouble getting tab-completion to work, but it seems related to some libraries ipython is using, jedi and parso. Pinning parso==0.7.0 seems to solve the problem. These are the versions I'm currently at:

ipykernel 5.3.4
ipython 7.16.1
ipython-genutils 0.2.0
jedi 0.17.2
jupyter-client 6.1.7
jupyter-console 6.2.0
jupyter-core 4.6.3
parso 0.7.0
qtconsole 4.7.7
tornado 6.1

I suspect the problems you folks are seeing might be OS-related. I'm on Linux, and I can't really test on any other platform. Otherwise, maybe create a new venv for IDAPython with the latest packages and see if it works then.

(Oh, forgot to mention that I'm on the latest version of asyncqt as well, it just doesn't appear in the list as I have it 'installed' directly from a git repo.)

Oh, @gaasedelen, one thing you could try to see if it makes a difference: Modify _setup_asyncio_event_loop() in ida_plugin.py to add the last line in the code below:

            loop = asyncqt.QEventLoop(qapp)
            asyncio.set_event_loop(loop)
            loop.__dict__["_QEventLoop__is_running"] = True

This sets the __is_running in the event loop to True, which is correct, since the loop has already been started. It might be that in Windows, things don't work right because they think the event loop hasn't been started yet. In Linux this doesn't seem to be a problem for basic IPython usage; it does however become necessary if you want to run additional async code, like awaiting on a coroutine from the IPython console.

The right way to fix this would be to modify asyncqt so we can indicate that the event loop is already running. I'll be sumitting a PR to asyncqt for that; in the meantime we have to use the hack above.

I also took a deeper look at asyncqt's code, to see if we could extract a minimal version for IDAPython as @aundro suggested. Unfortunately it already seems pretty minimal to begin with; I don't really see a lot that can be stripped away. I think the best solution for now is for people to use asyncqt (or the qasync fork) if they want Python async code to work within IDAPython. Perhaps at some point aqyncqt's functionality will be added to PyQt, and then there won't be an additional dependency anymore.

Thanks for the updates! It shouldn't have been a deps conflict last time because I had set it up in a clean env. With how many deps this requires, it's kind of necessary :^)

I just updated my IDA install to 7.5 SP3, so I'll give it another shot later today with the packages/versions/notes you've added. Hopefully we can track down the issue if it's still bugging out.

I just want to confirm that this is working now.

I cloned your branch and 'manually' installed it. I created a fresh virtualenv for the plugin and did pip install ipykernel qtconsole asyncqt. It boots up and is responsive to my commands. I don't think I am getting the autocomplete issues that you were, though. Everything seem to be working okay afaict.

This is what my pip freeze looks like btw (env on Python 3.6.7 (x64), IDA 7.5 SP3, Windows)

(ipyida) C:\Users\user\Desktop>pip freeze
asyncqt==0.8.0
backcall==0.2.0
colorama==0.4.4
decorator==4.4.2
ipykernel==5.3.4
ipython==7.16.1
ipython-genutils==0.2.0
jedi==0.17.2
jupyter-client==6.1.7
jupyter-core==4.6.3
parso==0.7.1
pickleshare==0.7.5
prompt-toolkit==3.0.8
Pygments==2.7.2
python-dateutil==2.8.1
pywin32==228
pyzmq==19.0.2
qtconsole==4.7.7
QtPy==1.9.0
six==1.15.0
tornado==6.1
traitlets==4.3.3
wcwidth==0.2.5

I've been meaning to swap the IDA console out with the various IDA / iPython projects for years... so I'll try to start 'daily driving' this one and let you know if I encounter any other issues.

FYI, I submitted some PRs to qasync (https://github.com/CabbageDevelopment/qasync) which have already been merged and published, so if you change the code to use qasync instead of asyncqt you can just use the version 0.10.0 available on PyPI. I've added a 'dev' branch to my fork of ipyida where I've switched to using qasync, and added a few more features as well.

Since qasync seems to be actively maintained, I think we should switch to using that.

FWIW, the timer/callback code used by the plugin around the IDB open/close notifications is almost certainly responsible for the 'crash on close' issues cited earlier in the thread. Around IDA 7.2 (?) Arnaud/Hex-Rays fixed a bunch of stuff with the lifetime of the python interpreter, or kernel callbacks that would go back into it.

Those crashes will probably only manifest on builds older than IDA 7.2. On that note, I think it's probably best that this plugin considers deprecating support for IDA 6.X all together ... don't be enablers of legacy!! :P

The biggest benefit is removing PySide from the equation and consolidating on the 'new' 7.0 API. It also ensures everyone should be running 64 bit python / procs and will hopefully yield less chance of the deps atrophying or misbehaving in the long run.

WDYT?

I also played around with the code a bit more today. Really I think the goal should be to make this a native replacement for the existing console, so that's kind of what I was playing with.

I ripped out some of the load/unload code and switched the plugin over to a idaapi.PLUGIN_FIX style one. Refactor a few things and tell it where to dock and you can pretty easily get there:

ipython

I'm happy to contribute some of these changes and refactors, but I think it's best your iPy5 branch lands first, and maybe the deprecation work is completed too.

@gaasedelen, I'm pretty much trying to do the same thing (console replacement). If you like, and don't mind, you could contribute to the dev branch of my fork, and we could add improvements there. We'll merge back upstream when things are more stable, and when there's some consensus around whether or not to support older versions. (Given that IDA is commercial software, and isn't cheap, I can understand that there might be people out there stuck on older versions and we might not want to abandon them.)

Sure, that sounds good to me. I think we can just move all vNext development efforts over to your branch, and hopefully merge it all back into here? If @marc-etienne will have us back :-)

I don't mean to spam this thread, but since I can't make issues on your dev repo @zerotypic I don't really know where else to document or discuss feature work / other issues :-).

I've refactored what I can about the plugin's IDA integration using best practices and pushed them onto your fork. The plugin should be loading/unloading much cleaner, it's tied to the lifetime of IDB's now with PLUGIN_PROC. I think any 'crash on close' issues should also be resolved, and I've ensured (and tested) compatibility with the following software SKUs on Windows:

  • IDA 6.6, Python 2.7.18 (x32)
  • IDA 7.0, Python 2.7.18 (x64)
  • IDA 7.5 SP3, Python 3.6.7 (x64)

Things seem to be working smoothly on all versions, but I can only speak with the most confidence on latest (as that's what I use mostly). Additionally, these have been some of my observations with using roughly the current 'build' w/ iPy5 the past few days.

1. We should add double click to jump to address, like IDA 7.0+ has
2. We should add right click --> clear console
3. Right-click --> select-all doesn't seem to actually work?
4. Typing in the IPython console while 'analysis' is running is slow / will stutter -- not sure there's much we can do about this as we're tied to the Qt / Mainthread, as is the analysis.

It's looking great otherwise. Some of my other locally stashed work is to help improve the theming / appearance, which I will try to push at some point.

Whoops sorry, didn't realise issues wasn't enabled for my repo, it's enabled now. I'll check out your code tomorrow. Thanks!

Hey! I'm back .o/ :)

So I finally found so time to look into this. It seems to work quite fine!

I checked asyncqt/qasync's code. It's quite complex and can't just steal the relevant parts. I think it's better to delegate this problem and depend on qasync like you did.

I can't reproduce the crash-on-close with the latest version of IDA, ipykernel, qtconsole, etc. It was likely a bug related to my environment back then. Nonetheless @gaasedelen is right that the plugin "lifecycle" isn't quite right and hackish. For example, if you close and idb and open another one in the same IDA "session", sys.stdout will be reset to IDAPythonStdOut instead of IPyIDA's IDATeeOutStream. I'll see to this problem too.

I tested running async also seem to works without issue :)

I've made a branch ipykernel5-support where I cherry-picked the changes and will be merging to master later today.

Great! We've also added a few new features, on the 'dev' branch of my fork, you could check that out. Would you prefer if we move our work over to a branch on yours instead?

I did yes :) I started testing some of the features. I'll cherry-pick some or all of it after the holidays.

Enjoy the holidays! Cheers!