w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.

Home Page: https://w3c.github.io/webcodecs/

Is it necessary to add a key frame check step for the audio decoder?

bdrtc opened this issue · comments

commented

From the main spec, step 2 of the audio decoder's decode() method:

`If [[key chunk required]] is true:

If chunk.[[type]] is not key, throw a DataError.

Implementers should inspect the chunk’s [[internal data]] to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError.

Otherwise, assign false to [[key chunk required]].`

The term "key frame" only exists for video; AFAIK there is no such thing as a key frame for audio.
This step seems to have been copied from the video decoder description. Is the check necessary for audio?
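To make the quoted step concrete, here is a minimal sketch of it in TypeScript. It is a standalone model, not a real `AudioDecoder`: `KeyChunkGate`, `ChunkLike`, `requireKeyChunk()` and the `looksLikeKey` callback are hypothetical names invented for illustration, and the local `DataError` class only stands in for the `DataError` the spec names.

```typescript
// Sketch only: models the quoted decode() step, not a real AudioDecoder.
type ChunkType = "key" | "delta";

// Hypothetical stand-in for an encoded chunk: its [[type]] and [[internal data]].
interface ChunkLike {
  type: ChunkType;
  data: Uint8Array;
}

// Local stand-in for the DataError named by the spec.
class DataError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "DataError";
  }
}

// Hypothetical helper that tracks the [[key chunk required]] flag.
class KeyChunkGate {
  private keyChunkRequired = true;
  private looksLikeKey: (data: Uint8Array) => boolean;

  // looksLikeKey is an assumed, codec-specific payload check.
  // The default accepts everything, i.e. no payload inspection.
  constructor(looksLikeKey: (data: Uint8Array) => boolean = () => true) {
    this.looksLikeKey = looksLikeKey;
  }

  // Call whenever the decoder is (re)configured.
  requireKeyChunk(): void {
    this.keyChunkRequired = true;
  }

  // Mirrors the quoted step: throws, or clears the flag and returns.
  check(chunk: ChunkLike): void {
    if (!this.keyChunkRequired) {
      return;
    }
    if (chunk.type !== "key") {
      throw new DataError("A key chunk is required.");
    }
    if (!this.looksLikeKey(chunk.data)) {
      throw new DataError("Chunk is marked key but its payload is not.");
    }
    this.keyChunkRequired = false;
  }
}
```

In this model the flag is cleared by the first accepted key chunk, so the type check and the payload inspection only run until then; later chunks pass straight through until `requireKeyChunk()` is called again.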

Newer codecs like xHE-AAC actually have required key frames.

> Newer codecs like xHE-AAC actually have required key frames.

Does this mean that these encoders will eventually need an option to trigger a key frame when a given audio sample is encoded?

I'm not sure, the reference encoder doesn't seem to provide such a capability:
https://sourceforge.net/p/opencore-amr/fdk-aac/ci/master/tree/documentation/aacEncoder.pdf

So I'm unsure how this functionality is configured. I'll pass this Q along to the xHE-AAC folks.

commented

Step 4 of the audio decoder's configure() method unconditionally sets [[key chunk required]] to true.
Does that mean the decode() method must inspect the chunk's internal data (payload) for all audio codecs in each decode step?

It seems the WebCodecs AAC registry does not currently support xHE-AAC codec strings.

For video decoders we have to say that, since we never know how hardware decoders will handle starting from a non-keyframe. In Chrome we do actually peek at the first buffer after a reset/configure to see if it's a keyframe for video:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/webcodecs/video_decoder.cc;l=577;drc=3f55fc991441e5bc1cbaa0e47796eae0f440b521

xHE-AAC is defined implicitly by the AAC registry since it's "just" another AAC profile. We haven't added true keyframe verification yet in Chrome. Assuming it's not too difficult to infer from the byte-stream we'll add that eventually.

Fraunhofer indicates that on-demand key-frame generation is limited. Generally a key frame interval is specified (or left automatic) at configuration time, but the encoder can also be told to generate a keyframe some number of samples into the future.

In general it seems similar to the Android hardware video encoding path where keyframe generation is best-effort (keyframe comes soon, but not immediately). So I think the knobs we have for video encoding could be mirrored to an optional AudioEncoderEncodeOptions for audio when eventually needed.
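As a rough illustration of what "best-effort" could look like from the caller's side, here is a small TypeScript sketch. Everything in it is an assumption: `AudioEncoderEncodeOptions` does not exist in the spec today (its `keyFrame` member simply mirrors the one on `VideoEncoderEncodeOptions`), and `BestEffortKeyFrameScheduler` with its `latencyFrames` parameter is a made-up model of an encoder that honors a request a few frames late.

```typescript
// Hypothetical: mirrors VideoEncoderEncodeOptions.keyFrame for audio.
interface AudioEncoderEncodeOptions {
  keyFrame?: boolean;
}

// Made-up model of an encoder that produces a requested key frame
// "soon, but not immediately": latencyFrames frames after the request.
class BestEffortKeyFrameScheduler {
  private latencyFrames: number;
  private framesUntilKey: number | null = null;

  constructor(latencyFrames: number) {
    this.latencyFrames = latencyFrames;
  }

  // Returns the type of the chunk produced for the next input frame.
  next(options: AudioEncoderEncodeOptions = {}): "key" | "delta" {
    // A request made while another one is pending is folded into it.
    if (options.keyFrame && this.framesUntilKey === null) {
      this.framesUntilKey = this.latencyFrames;
    }
    if (this.framesUntilKey === null) {
      return "delta";
    }
    if (this.framesUntilKey === 0) {
      this.framesUntilKey = null;
      return "key";
    }
    this.framesUntilKey -= 1;
    return "delta";
  }
}
```

The point of the sketch is only that a caller passing `{ keyFrame: true }` cannot assume the very next output chunk is a key chunk; it has to check the type of each chunk it gets back.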

They also indicate that xHE-AAC is primarily two pass encoding, so we'd probably need to figure out what to do there before worrying about key-frames.

commented

Do we need a new issue, or should this one stay open to track the concern?

I don't think so; we already have the key-frame requirement. We could create a new issue to track interest in 2-pass encoding, but we can probably just wait until someone requests it.