phoboslab / pl_mpeg

Single file C library for decoding MPEG1 Video and MP2 Audio

Streaming from the net introduces latency or requires a separate thread

phoboslab opened this issue

When feeding a plm_buffer() from the network (or slow media), there's no way to tell the buffer that no data is available right now but more may arrive later.

This forces you to either decode a frame only when you are sure that the data for the whole frame is available (as the video decoder can't pause in the middle of a frame), or to run the decoder in a separate thread and busy-wait in the plm_buffer_callback until more data is available.

Making sure that enough data is available for decoding a full frame is not straightforward, because we don't know the size of a frame until it has been fully decoded or until we find the PICTURE_START code of the next frame. This introduces unnecessary latency for streaming.
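
To illustrate the "wait for the next PICTURE_START" approach, here is a minimal sketch in C. It is not part of pl_mpeg; the function names are made up for illustration. It relies only on the fact that MPEG-1 video start codes are the bytes 00 00 01 followed by a one-byte code, with 0x00 being the picture start code. It also shows where the latency comes from: the first picture is only reported as complete once the start code of the second one has arrived.

```c
#include <stddef.h>
#include <stdint.h>

/* MPEG-1 video start codes are 00 00 01 followed by a one-byte code;
 * 0x00 is the picture start code. */
#define PICTURE_START_CODE 0x00

/* Hypothetical helper: offset of the first picture start code at or after
 * `from`, or -1 if there is none in data[0..len). */
static long find_picture_start(const uint8_t *data, size_t len, size_t from) {
    for (size_t i = from; i + 3 < len; i++) {
        if (data[i] == 0x00 && data[i + 1] == 0x00 &&
            data[i + 2] == 0x01 && data[i + 3] == PICTURE_START_CODE) {
            return (long)i;
        }
    }
    return -1;
}

/* Hypothetical helper: number of bytes from the first picture start code up
 * to the next one, or 0 if the buffer does not yet hold a second picture
 * start code. In a real stream, sequence or GOP headers may sit between two
 * pictures, so the result can include some trailing header bytes. */
static size_t complete_picture_size(const uint8_t *data, size_t len) {
    long first = find_picture_start(data, len, 0);
    if (first < 0) {
        return 0;
    }
    long next = find_picture_start(data, len, (size_t)first + 4);
    if (next < 0) {
        return 0; /* the frame may be complete, but we cannot know yet */
    }
    return (size_t)(next - first);
}
```

A caller would only hand data to the decoder once complete_picture_size() returns a non-zero value, which is exactly the extra frame of latency described above.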

The problem is described in more detail towards the end of this blog post.

This issue is meant for discussion of the problem and possible solutions.

Perhaps a non-blocking I/O pattern would be useful here. The demuxer can return either a packet or EAGAIN. If the decoder gets EAGAIN from the demuxer, it does nothing and waits to be called again.
This would probably involve changing the interface for the decoder as well, so that the user can either ask for a frame if one is ready or ask the decoder to block until the next frame is ready.
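
A rough sketch of that pattern in C, purely for discussion. Nothing here is pl_mpeg API: the types, the function names and the toy wire format (one length byte followed by the payload) are all assumptions made up for the example. The point is only that demux_next() never blocks and the decoder simply passes "try again" up to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes; DEMUX_AGAIN plays the role of EAGAIN. */
typedef enum { DEMUX_PACKET, DEMUX_AGAIN, DEMUX_END } demux_status_t;

typedef struct {
    const uint8_t *data;
    size_t len;
} packet_t;

/* Toy demuxer over a fixed buffer. Assumed wire format for this sketch:
 * one length byte followed by that many payload bytes. */
typedef struct {
    uint8_t buf[1024];
    size_t len;   /* bytes buffered so far */
    size_t pos;   /* read position */
    int ended;    /* set by the caller once the source is exhausted */
} demux_t;

/* Called whenever the network delivers bytes. Returns 0 if they don't fit. */
static int demux_write(demux_t *d, const uint8_t *src, size_t n) {
    if (n > sizeof d->buf - d->len) {
        return 0;
    }
    memcpy(d->buf + d->len, src, n);
    d->len += n;
    return 1;
}

/* Never blocks: returns a packet only if all of its bytes are buffered. */
static demux_status_t demux_next(demux_t *d, packet_t *out) {
    size_t avail = d->len - d->pos;
    if (avail >= 1) {
        size_t size = d->buf[d->pos];
        if (avail >= 1 + size) {
            out->data = d->buf + d->pos + 1;
            out->len = size;
            d->pos += 1 + size;
            return DEMUX_PACKET;
        }
    }
    return d->ended ? DEMUX_END : DEMUX_AGAIN;
}

/* Decoder side: on DEMUX_AGAIN it does nothing and waits to be called
 * again. Adding up payload sizes stands in for real decoding. */
static demux_status_t decoder_poll(demux_t *d, size_t *decoded_bytes) {
    packet_t p;
    demux_status_t status = demux_next(d, &p);
    if (status == DEMUX_PACKET) {
        *decoded_bytes += p.len;
    }
    return status;
}
```

Note that this only works if a packet always contains a complete frame, since the video decoder can't pause in the middle of one.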

I guess that the wire format doesn't include the data size because doing so would force a frame of latency at the encoder whereas this way the decoder has more implementation options.

Was this implemented eventually?

Nope.

We don't know if a frame is complete until we decode it. If there's not enough data to continue decoding, we would either need to throw away everything we decoded and try again later, or store the state of the decoder and add the capability to continue decoding anywhere in the middle of any decode function. Both of these solutions are bad.

Alternatively we could have a separate decode thread that just waits in plm_buffer_read() when there's not enough data. But I now believe this is out of scope for this library. It surely could be implemented on top of pl_mpeg by orchestrating the demuxer and decoders yourself instead of using the high level plm_* interface.

This whole problem wouldn't exist if MPEG-PS just stated the size of a video frame or had a FRAME_END marker :/

I guess that the wire format doesn't include the data size because doing so would force a frame of latency at the encoder

Yes, I believe this is the reason. It's just a very bad fit for today's hardware and software. ffmpeg for instance only returns fully encoded frames anyway. You cannot read the encoded data for half a frame while the other half is still being encoded.

@ericoporto: may I ask what your use case is? Is this extra frame of latency a big deal? For what it's worth, JSMpeg VNC works around this problem by just sending a full video frame as a single WebSocket message, instead of using MPEG-TS. When the client receives a message, it can be sure that it contains a full frame and can instantly decode it.

Thinking some more about this problem: it should be possible to supply your own PICTURE_END marker in the encoder. pl_mpeg could then check for either: PLM_START_PICTURE or a custom PICTURE_END marker to determine if we have a full frame in the buffers.

This way it's backwards compatible, as long as PLM_START_FRAME_END doesn't collide with any known MPEG start code. I guess the 0xB2 ("user data") start code could be used for that, with some private payload signaling our PICTURE_END (see the MPEG-1 header reference)!?
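
A quick sketch of how such a marker could look, in C. This is an assumption-heavy illustration, not an existing pl_mpeg feature: the 4-byte "PEND" payload, the FRAME_END_MARKER name and both functions are invented for the example. Only the 00 00 01 start code prefix, the 0x00 picture start code and the 0xB2 user data code come from MPEG-1 itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define START_PICTURE   0x00
#define START_USER_DATA 0xB2

/* Hypothetical private marker: a user data start code followed by an
 * invented ASCII payload "PEND". Decoders that don't know it should skip
 * it like any other user data. */
static const uint8_t FRAME_END_MARKER[] = {
    0x00, 0x00, 0x01, START_USER_DATA, 'P', 'E', 'N', 'D'
};

/* Encoder side: append the marker after an encoded picture. Returns the new
 * length, or the old length unchanged if the marker does not fit.
 * Assumes len <= cap. */
static size_t append_frame_end(uint8_t *buf, size_t cap, size_t len) {
    if (cap - len < sizeof FRAME_END_MARKER) {
        return len;
    }
    memcpy(buf + len, FRAME_END_MARKER, sizeof FRAME_END_MARKER);
    return len + sizeof FRAME_END_MARKER;
}

/* Receiver side: returns 1 if the buffer holds a picture start code that is
 * followed by either another picture start code or our private marker. */
static int has_full_frame(const uint8_t *data, size_t len) {
    int in_picture = 0;
    for (size_t i = 0; i + 3 < len; i++) {
        if (data[i] != 0x00 || data[i + 1] != 0x00 || data[i + 2] != 0x01) {
            continue;
        }
        uint8_t code = data[i + 3];
        if (code == START_PICTURE) {
            if (in_picture) {
                return 1; /* the next picture begins, so the previous one ended */
            }
            in_picture = 1;
        } else if (in_picture && code == START_USER_DATA &&
                   len - i >= sizeof FRAME_END_MARKER &&
                   memcmp(data + i, FRAME_END_MARKER,
                          sizeof FRAME_END_MARKER) == 0) {
            return 1; /* explicit end marker, no need to wait for the next frame */
        }
    }
    return 0;
}
```

With the marker in place, the receiver can tell a frame is complete as soon as its last bytes arrive, while streams without the marker still work through the PICTURE_START check.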