RediSearch / redisearch-py

RediSearch Python client

Home Page: https://redisearch.io


Any interest in a Client based on aioredis?

alancleary opened this issue · comments

I think it would be nice if there were a RediSearch asyncio Python Client, specifically, one based on the aioredis asyncio (PEP 3156) Redis client library. Looking at the redisearch-py code, it seems like this would only require implementing a new Client class based on aioredis with async methods, rather than making a whole new library. Is this something users would be interested in? Are the maintainers open to adding support for asyncio? I'm willing to tackle the initial implementation and make a pull request. Let me know.

@alancleary thanks for the nice suggestion; yes, we would welcome such a PR.

Hey there. Sorry for the long delay on this. I went through and ported the whole client and auto_complete classes into aio code but then got hung up on how to set up testing and installation. I guess testing isn't strictly necessary since the classes implement the same API, but I'm not sure how to handle installation, since redisearch-py supports Python 2 and asyncio is Python 3 only.

Let me know if you have thoughts on this or if I should just go ahead and make a PR with just the aio classes.

@alancleary Mind sharing a link? We've recently started unifying a client to combine async, sync, and the various connection options (pools, sentinel, etc). I'd love to work with you on it if you're interested - especially as we haven't started the async side yet!

@chayim Here's a commit that adds an AioClient class (with AioBatchIndexer) and an AioAutoCompleter class: https://github.com/alancleary/redisearch-py/commit/ed3da46bc2a17b9060d0480c12954bec270b3d50. These are the only classes that needed to be ported because they're the only classes that use a Redis connection.

The classes use the aioredis module, which implements the same API as the redis module, so other than adding async and await in the correct places, the only difference is how the classes are initialized and destroyed.

For initialization, I used the same initialization pattern as the aioredis module, so you can instantiate the classes with or without using await. For example:

from redisearch import AioClient

# Without await: the client object is created, but no connection
# to Redis is made yet.
def main():
    client = AioClient("my-index")

# With await: the client connects to Redis as part of instantiation.
async def main():
    client = await AioClient("my-index")

Instantiating a class using await initializes the aioredis Client (i.e. it actually connects to Redis), whereas omitting the await does not initialize the aioredis Client (i.e. it instantiates the aioredis Client but does not connect to Redis). This has the advantage of allowing users to instantiate the classes outside of the aio event loop. Notice in the commit, though, that the classes' asynchronous initialize methods aren't called anywhere besides the __await__ method. This is because __await__ is simply calling the aioredis Client's initialize method. The aioredis Client actually calls its initialize method every time it sends a command to Redis, so if we don't explicitly initialize the aioredis Client when instantiating a redisearch class we know the aioredis Client will initialize itself when we start using it to send commands to Redis.
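The "awaitable constructor" pattern described above can be sketched with plain asyncio. This is a minimal illustration only: the class and attribute names below are hypothetical stand-ins, not the actual redisearch-py or aioredis code, and the network I/O is replaced with a placeholder.

```python
import asyncio

class LazyClient:
    """Sketch of a client that can be awaited to connect eagerly,
    or used without await to connect lazily on first command."""

    def __init__(self, index_name):
        self.index_name = index_name
        self.connected = False  # stands in for the aioredis connection state

    async def initialize(self):
        # In a real client this would open the Redis connection.
        # Calling it more than once is harmless (idempotent).
        if not self.connected:
            await asyncio.sleep(0)  # placeholder for network I/O
            self.connected = True
        return self

    def __await__(self):
        # `client = await LazyClient("idx")` runs initialize() and
        # yields an already-connected client.
        return self.initialize().__await__()

    async def send_command(self, *args):
        # Commands initialize lazily, mirroring the aioredis behavior
        # described above.
        await self.initialize()
        return args

async def demo():
    eager = await LazyClient("my-index")   # connects immediately
    lazy = LazyClient("my-index")          # no connection yet
    assert eager.connected and not lazy.connected
    await lazy.send_command("FT.SEARCH", "my-index", "*")
    assert lazy.connected                  # connected on first command
    return eager, lazy

eager, lazy = asyncio.run(demo())
```

The trick is that `__await__` simply delegates to the `initialize` coroutine, so both usage styles share one code path.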

Initializing the aioredis Client as early as possible is nice, though, because it gives users an early exit if the connection fails. If a user has to instantiate a class outside of the aio event loop but wants the early exit, they can call a redisearch class's initialize method themselves once inside the event loop. For example:

import asyncio
from redisearch import AioAutoCompleter

async def main(auto_completer):
    # Connect explicitly so a connection failure surfaces immediately.
    await auto_completer.initialize()

if __name__ == '__main__':
    # Instantiated outside the event loop; no connection is made yet.
    auto_completer = AioAutoCompleter('ac')
    asyncio.run(main(auto_completer))

Those were the only additions to the redisearch API.

Regarding class destruction, I had to update the AioBatchIndexer's __del__ method to account for the event loop. Specifically, I used the same destruction pattern as the aioredis module, which checks if the event loop is still running and tries to run a new loop if it's not. The problem with this approach is that in some cases the event loop will be stopped and you won't be able to start another, meaning the last batch commit will not happen.

aioredis gets around this by using the asynchronous __aenter__ and __aexit__ context manager magic methods, which guarantee that the event loop will still be running when the class's context is exited. Here's an example of what using such a context manager might look like:

client = await AioClient("my-index")
# __aexit__ runs the final commit while the event loop is still active.
async with client.batch_indexer() as indexer:
    pass

Not only does this guarantee that the last commit will happen, it will also be at a much more predictable time, i.e. when the indexer block is exited, instead of whenever the heck the interpreter decides to call __del__. In fact, I think predictability is a good reason to add a similar context manager to the existing BatchIndexer class as well (and maybe even the (Aio)Client and (Aio)AutoCompleter classes).
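To make the context-manager idea concrete, here is a hedged sketch of what such an async batch indexer might look like. Everything here (class name, `chunk_size`, `add_document`, `commit`) is illustrative, not the actual redisearch-py API; the Redis call is replaced by an in-memory list.

```python
import asyncio

class BatchIndexerSketch:
    """Sketch of a batch indexer that buffers documents and commits
    automatically on chunk overflow and on context exit."""

    def __init__(self, chunk_size=100):
        self.chunk_size = chunk_size
        self.pending = []    # documents buffered but not yet sent
        self.committed = []  # stands in for documents sent to Redis

    async def add_document(self, doc_id, **fields):
        self.pending.append((doc_id, fields))
        if len(self.pending) >= self.chunk_size:
            await self.commit()

    async def commit(self):
        # Placeholder for pipelining the buffered commands to Redis.
        self.committed.extend(self.pending)
        self.pending.clear()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # The final commit is guaranteed to run while the event loop
        # is still alive, unlike a commit attempted from __del__.
        await self.commit()
        return False

async def demo():
    async with BatchIndexerSketch(chunk_size=2) as indexer:
        await indexer.add_document("doc1", title="a")
        await indexer.add_document("doc2", title="b")  # triggers a commit
        await indexer.add_document("doc3", title="c")  # committed on exit
    return indexer

indexer = asyncio.run(demo())
```

The same `__aenter__`/`__aexit__` pair could be added to the existing synchronous BatchIndexer (as `__enter__`/`__exit__`) for the same predictability benefit.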

Anyway, let me know if you have any questions or anything about the commit. I'm excited you're interested in adding aio support and I'm willing to help out however I can.