beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

Simplify ffdec.py by using the Popen.communicate() method

Bomme opened this issue

Hi, thanks for this very useful library!
I was looking into ffdec.py since I need faster loading of mp3 and m4a files.
I believe that the module could be improved and simplified by using the Popen.communicate() method. This seems to be the recommended way of retrieving output from a subprocess.
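Roughly what I have in mind (a minimal sketch, not audioread's actual code; the ffmpeg invocation here is just an assumption for illustration):

```python
import subprocess

# Sketch: decode with ffmpeg and collect all output in one call.
# communicate() pumps both pipes internally, so no reader threads
# are needed and there is no stdout/stderr deadlock.
proc = subprocess.Popen(
    ["ffmpeg", "-i", "input.mp3", "-f", "s16le", "-"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
pcm, err_log = proc.communicate()  # blocks until ffmpeg exits
```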

The current implementation only allows reading the data in blocks, which is suboptimal since a user might not be able to adapt the block size. (E.g., librosa just calls audio_open(), which has no way of setting a block size.)
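For illustration, this is the usage pattern I mean (a sketch of the iterator interface as I understand it; there is nowhere to pass a block size):

```python
import audioread

# The caller only iterates over buffers; the internal block size
# is fixed and cannot be adjusted from here.
with audioread.audio_open("song.mp3") as f:
    print(f.channels, f.samplerate, f.duration)
    for buf in f:  # each buf is a bytes-like block of raw PCM data
        pass       # process the block here
```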

I did a speed comparison that shows that this way of reading data is slower than it needs to be, especially for large files:
https://gist.github.com/Bomme/d9aee452c8c1e68fb5fac743df6b2a07

If you decide to drop Python 2 support (#112), the timeout handling might become easier.
And for later versions of Python 3, the Windows Popen helpers (https://docs.python.org/3/library/subprocess.html#windows-popen-helpers) might come in handy.

Hi! It could be worth looking into! But there are a few reasons why ffdec works this way now:

  • Streaming. The interface lets clients process the audio while reading it in, rather than loading the entire file into memory first. This is important for handling large files without allocating too much memory.
  • As a kind of corollary, we need to read simultaneously from the stdout and stderr streams. Otherwise, the process could fill up its stderr buffer while we're waiting for data from the stdout buffer, and the interaction would deadlock.
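To make the second point concrete, the rough shape of the current approach is something like this (a simplified sketch, not the exact ffdec.py code; the ffmpeg arguments and the byte counting are just illustrative):

```python
import subprocess
import threading

def drain(stream, sink):
    # Continuously consume a pipe so the child never stalls because
    # the OS-level buffer for that pipe fills up.
    for chunk in iter(lambda: stream.read(1024), b""):
        sink.append(chunk)

proc = subprocess.Popen(
    ["ffmpeg", "-i", "input.mp3", "-f", "s16le", "-"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Drain stderr in the background while the caller streams stdout.
err_chunks = []
threading.Thread(target=drain, args=(proc.stderr, err_chunks),
                 daemon=True).start()

# The caller can process audio incrementally, block by block, without
# ever holding the whole file in memory.
total = 0
for block in iter(lambda: proc.stdout.read(4096), b""):
    total += len(block)  # stand-in for real per-block processing
print(total, "bytes decoded")
```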

I believe Popen.communicate() only works in a synchronous/blocking style that reads everything until EOF. So I'm not sure we can use that, but maybe there's something else useful in modern subprocess we could rely on?

Hi! Thanks for the explanation.
Since the queue is unbounded, the entire file could get read by the separate thread before the client code even calls read_data. That's why I assumed it would be a simplification to make this explicit, i.e., to load the whole file into memory.
Shouldn't the queue size be limited to support real streaming?
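Something like this is what I mean by limiting the queue (a sketch only, with an io.BytesIO stand-in for the real pipe and an assumed cap of 10 blocks):

```python
import io
import queue
import threading

MAX_BLOCKS = 10  # assumed cap: bounds memory to MAX_BLOCKS * block size

def reader(stream, q, block_size=4096):
    # put() blocks once the queue holds MAX_BLOCKS items, so decoding
    # pauses until the consumer catches up -- natural backpressure.
    for block in iter(lambda: stream.read(block_size), b""):
        q.put(block)
    q.put(None)  # sentinel marking end of stream

q = queue.Queue(maxsize=MAX_BLOCKS)
stream = io.BytesIO(b"\x00" * 100_000)  # stand-in for ffmpeg's stdout
threading.Thread(target=reader, args=(stream, q), daemon=True).start()

while True:
    block = q.get()
    if block is None:
        break
    # process the block here
```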

Since my main goal was to speed up file loading and you plan to release a new version soon, what do you think about raising the block_size? Maybe set it to io.DEFAULT_BUFFER_SIZE as a (still) conservative size.
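For reference, on CPython that constant is typically 8192 bytes, so still modest:

```python
import io

# The stdlib's default buffered-I/O block size; usually 8192 on CPython.
print(io.DEFAULT_BUFFER_SIZE)
```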

Ah, yeah, that's a good point. We probably should be limiting the queue size! It seems tricky to get exactly right, but probably worth a shot nonetheless.

Raising the block size seems like a great idea! Would you mind putting together a small PR for that?