w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.

Home Page: https://w3c.github.io/webcodecs/

Information on encoder/decoder performance?

aboba opened this issue · comments

When hwAcceleration = "no-preference", is there a way for the application to discover whether hardware acceleration is operating at a given time? In this situation, encoding seems to fall back from hardware to software without notice (e.g. no error thrown). In the case of decoding, this doesn't seem to happen, but the specification doesn't preclude it.

Within WebRTC stats, we have RTCOutboundRtpStreamStats such as encoderImplementation and powerEfficientEncoder.

Related: w3c/webrtc-stats#732
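For reference, the WebRTC-side signal mentioned above can be read from getStats(). A rough sketch, where the helper name and fallback defaults are mine; field availability varies by browser and may be gated:

```javascript
// Sketch: reading the encoder signal from WebRTC stats. The helper name and
// defaults are illustrative; encoderImplementation and powerEfficientEncoder
// are RTCOutboundRtpStreamStats fields, but browsers may omit or gate them
// (e.g. behind capture permission).
async function getEncoderInfo(peerConnection) {
  const report = await peerConnection.getStats();
  for (const stats of report.values()) {
    if (stats.type === "outbound-rtp" && stats.kind === "video") {
      return {
        implementation: stats.encoderImplementation ?? "unknown",
        powerEfficient: stats.powerEfficientEncoder ?? null,
      };
    }
  }
  return null; // no video sender found
}
```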

It seems WebCodecs has no stats like WebRTC does. We hit a WebCodecs H.264 encoder delay issue on one Mac device with the default setting; it was solved after changing the hardwareAcceleration value to prefer-software/prefer-hardware. There is no way to check which encoder WebCodecs is currently using.

No, we don't currently provide a signal for this. The privacy concerns expressed by @youennf during development of that preference probably prevent us from signaling that information.

Without an analysis, it's not reasonable to accept a blanket "we don't like it" from any specific party.

If the feature would help developers and users, and Chromium is comfortable with it, why is it not an incubation on track for an OT?

I mean it's also not clear to me that it's actually useful. For local debugging scenarios this information is exposed through the Media panel in Dev Tools.

Ah, OK. That's a different question, and one we should use customer/user enthusiasm to judge.

We have a use for this data. In our case, we use the WebCodecs API to encode video. To make use of hardware acceleration we pass in no-preference so that the API makes the final decision. However, once we pass in that flag we have no visibility into what the API did with the flag, i.e. whether it operated efficiently, which we could use as an analog for "hardware was used".

Given the data is already exposed in WebRTC, powerEfficientEncoder, what additional privacy risk is introduced by exposing it here as well?

Without this telemetry data we lose visibility about how our system is operating and whether we're making best use of the API. This data will help us understand how our code performs in production. Debugging locally is useful to a degree but our development machines are often more powerful than the majority of our users' devices and so will not be representative of real world usage. This data will allow us to tune performance in prod and ultimately provide a better experience for the full spectrum of our users.

From a practical point of view, I agree with @mattbirman. In practice, we need to collect this information in our backend system, analyze usage in the field, and use it to guide which encoding preference we set on a specific device through configuration delivery.

What prevents you from running A/B experiments with the preference values as they exist today? I would expect the best signal is some app specific quality indicator goes up or down based on the preference setting. It's unclear to me what this adds over bucketing against existing things like UA major version, OS, etc.

Is power efficient encoder actually what you want or do you just want some way of differentiating which decoder was used? I.e., the value doesn't have to have meaning to you; it just needs to allow you to associate metrics so you can report issues to the UA?

E.g., We could allow for a per-UA scheme (Audio|Video)(Encoder|Decoder).id -- in Chromium that could return an integer or some hashed equivalent.

@dalecurtis the trouble is that this greatly increases the complexity of implementing a fallback when prefer-hardware doesn't work. Effectively, it means starting the export with prefer-hardware, expecting it to fail with some error event delivered to the VideoEncoder error callback, and then retrying the export with no-preference or prefer-software.

Unfortunately, calling VideoEncoder.isConfigSupported(...) with prefer-hardware is not indicative of whether subsequently running the encode with this setting will actually succeed.
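The retry dance described above can be sketched as a small helper. Everything here is illustrative, not a WebCodecs API: the function name and the attempt() callback (which would wrap configuring and running a VideoEncoder) are assumptions.

```javascript
// Hypothetical fallback helper: try hardwareAcceleration preferences in
// order until one succeeds. attempt(config) is an app-supplied async
// function that resolves on a successful encode and rejects on encoder
// failure; none of these names come from the WebCodecs spec.
async function encodeWithFallback(baseConfig, attempt) {
  const prefs = ["prefer-hardware", "no-preference", "prefer-software"];
  let lastError;
  for (const hardwareAcceleration of prefs) {
    try {
      return await attempt({ ...baseConfig, hardwareAcceleration });
    } catch (e) {
      lastError = e; // fall through to the next preference
    }
  }
  throw lastError;
}
```

Note that per the comment above, an isConfigSupported() preflight doesn't make this loop unnecessary, since a supported config can still fail at encode time.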

Add to this a UX issue: when no-preference is configured AND the encoder falls back to its internal software encoding stack, performance suffers quite dramatically (for us, that's an order of magnitude slower). And it happens a lot - presumably due to other apps utilizing the GPU media blocks at the same time. Ideally, we'd be able to tell the user something to the tune of hey, this is slow because you've got some other apps running in the background that you should shut down.

Lastly, for us using WebCodecs competes with using other means to speed up video encoding (eg. server-side encoding, doubling down on our local software encoder with multi-threading etc.). Right now, it's really tough for us to know how WebCodecs performs (how often can it actually use "hardware", i.e. GPU media blocks) among end users. Our performance numbers thus far seem to indicate: not very well.

To sum it up, without any better insights into the internal encoder decisions being made in the presence of no-preference, this setting is not actually all too useful in a practical setting.

@sbalko Is the info available in WebRTC-stats (encoderImplementation and powerEfficientEncoder) sufficient? Or is there additional information that you need as well?

That's sufficient yes!

Are both needed or is just implementationId or id sufficient? powerEfficient doesn't seem like it would distinguish between software and hardware capability. What do you expect to happen if implementation changes midstream?

At least in Chrome's case, exposing the implementation id doesn't seem like it would expose any additional fingerprinting concern. Clients with more complicated no-preference behavior would need further consideration.

However both of the WebRTC-stats versions are guarded by capture permission or full screen, which isn't required to use WebCodecs. I.e., if WebCodecs is just used with canvas should we deny the exposure of such information? Do you have recommendations on an equivalent signal we should use?

Maybe there's some way to generate an origin stable decoder id that isn't useful for tracking.

Happy new year and following up on this @dalecurtis:

  1. We had assumed powerEfficient to be the same as hardware accelerated. If that's not actually so, implementationId or id might be all we need. Provided that we can delineate the "good" (hardware acceleration is being used) and "bad" (software fallback kicks in) cases in our telemetry.
  2. We were wondering about mid-stream encoder changes but had mentally discarded this. In fact, it made us wonder what the bitstream looks like if, say, one starts with a particular H.264 profile/level (e.g. high/5.2) and then fails over to the software encoder (constrained baseline/4.2) or something. Not sure if every decoder/player could easily cope with this. In fact, if this is expected/frequent behavior of the existing VideoEncoder (in the presence of no-preference), perhaps another event to be emitted would be in order?
  3. Preferably, the implementation wouldn't need any user prompts. I could think of multiple approaches:
  • Some very basic stats that do not allow for fingerprinting (and perhaps more elaborate stats that require the user to consent);
  • No consent for PWAs.

  1. Correct, powerEfficient just means the UA thinks the encoder is efficient at the current settings.
  2. Fallback only works with the same configuration, so there should be no profile change. I think it's possible that level or some other properties may change though. @Djuffin - WDYT?
  3. @aboba, since this is coming from Microsoft, can you put forward a proposal that meets your needs?

@youennf any commentary on what would be acceptable to Safari would be useful here too.

Since the level is a part of the codec string, I don't expect it to change after fallback.

If you find an example when fallback breaks decoding please report a bug for a UA.
(For Chromium https://bugs.chromium.org/p/chromium/issues/entry )
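To illustrate why the level is pinned by the codec string: for avc1 the profile and level bytes are encoded directly in the string. The helper below is a sketch that handles only the common avc1.PPCCLL form.

```javascript
// Sketch: for "avc1.PPCCLL", PP is profile_idc, CC the constraint flags,
// and LL level_idc, all in hex. E.g. "avc1.640033" is High profile
// (0x64 = 100) at level 5.1 (0x33 = 51), so a fallback encoder keeping the
// same codec string cannot silently change the level.
function parseAvc1(codec) {
  const m = /^avc1\.([0-9a-fA-F]{6})$/.exec(codec);
  if (!m) return null; // not an avc1.PPCCLL string
  return {
    profile_idc: parseInt(m[1].slice(0, 2), 16),
    level_idc: parseInt(m[1].slice(4, 6), 16),
  };
}
```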

Potential approach: getStats() method to return encoderId and powerEfficient.
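A hedged sketch of how an app might consume such a result, assuming a shape like { encoderId, powerEfficient }. Neither the method nor these field names exist in the WebCodecs spec today; they come from the proposal in the comment above.

```javascript
// Hypothetical consumer of the proposed getStats() result. The field names
// come from the proposal above; nothing here is a shipped WebCodecs API.
function describeEncoderStats(stats) {
  const backend = stats.powerEfficient
    ? "likely hardware (power-efficient)"
    : "software fallback or inefficient path";
  return `encoder ${stats.encoderId}: ${backend}`;
}
```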

HW exposure is useful: on an individual level it allows fallback to maximize the chance that the user gets HW, and on an aggregate level it allows apps to A/B test different strategies to maximize HW usage. In both cases, this leads to improved user experience.

I argue that if we need specific steps to limit HW exposure, we should put those steps in one spec (e.g. MediaCapabilities) and let other specs reference those same steps for consistency. If one can query HW capabilities then one should also be able to tell if HW is being used by an encoding or decoding application.

In my opinion, some form of direct querying of a hw/sw flag on WebCodecs would be preferred, rather than referencing something second-order like MediaCapabilities. This keeps the returned information as close to ground truth as possible.

Just to relay a recent experience I had while trying to use MediaCapabilities to assess whether hardware decode was capable of playing back some video (with webrtc and as a file) at a particular resolution:

Using a Samsung Galaxy Tab A7 (SM-T500) I can call something like this in any browser (Chrome, Firefox -- doesn't matter):

```javascript
navigator.mediaCapabilities.decodingInfo({
  type: 'file',
  video: {
    contentType: "video/mp4;codecs=avc1.42001f",
    width: 3840,
    height: 2160,
    bitrate: 3000,
    framerate: 60,
  },
});
```

In the promise, it claims that the device can play this video smoothly and efficiently.

In reality, this device can't decode 4K AVC video in hardware at all, because it has an Adreno 610 GPU, which only supports 1080p in hardware. Only by attempting to play the video can you assess how it'll perform. On FF it seems to fall back to software (despite the claim that it's efficient). On Chrome it doesn't play at all. The same applies to all Adreno 610 based devices.

The real world is messy and is filled with stuff like this, and knowing in our telemetry if hardware is actually used for encode/decode in these kinds of situations is incredibly useful.
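A small helper for classifying the decodingInfo() result makes the mismatch concrete. The field names (supported, smooth, powerEfficient) are per the Media Capabilities spec; the classification labels are mine.

```javascript
// Sketch: bucketing a MediaCapabilitiesDecodingInfo result. On the device
// above, this would report "smooth and power-efficient" even though real
// 4K AVC playback fell back to software (FF) or failed outright (Chrome).
function classifyDecodingInfo(info) {
  if (!info.supported) return "unsupported";
  if (info.smooth && info.powerEfficient) return "smooth and power-efficient";
  if (info.smooth) return "smooth, not power-efficient";
  return "supported, not smooth";
}
```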

In my opinion, some form of direct querying of a hw/sw flag on WebCodecs would be preferred, rather than referencing something second-order like MediaCapabilities.

Agreed. I only propose we be consistent in answering the question "is HW exposure allowed?" (e.g. by referencing a single spec), but in terms of exposing the HW bits, each spec may want its own API. A classic example: HW is available, but SW fallback happened. IMO if you've already exposed the HW bit, you should not require additional checks to determine whether or not HW is being achieved right now.

In the promise, it claims that the device can play this video smoothly and efficiently.

Regardless of whether a hardware/software flag gets added, if the browser is unable to decode 4K AVC video at 60fps on that device as suggested, step 5 of the Create a MediaCapabilitiesDecodingInfo algorithm should set smooth to false. If that is not the case, that seems like an implementation bug to me. Said differently, Media Capabilities should also return close to ground truth responses.

@tidoust Media Capabilities results can differ from a particular application's experience, because they indicate what capabilities are available if the application is the sole user of the hardware. If another application is using the hw encoder/decoder, then there is no guarantee that decode/encode can be supported at all (e.g. there may be no software fallback), let alone smooth or efficient.

@aboba I understand the nuance. I may be reading too much into @AndrewJDR's comment but my understanding is that they suggest that "smooth" would not be realistic for a 4K video on that device in any case because (1) there is no 4K hardware decoder that the application could become a user of, and (2) software decoding would not be able to reach 60fps. If that is the case, I would hope that Media Capabilities reports that 4K playback on that device won't be smooth.

Yeah, it's possible I'm misreading something here. Putting it another way, the point I'm trying to make is that the MediaCapabilities stuff probably couldn't be a 1:1 proxy for "Is WebCodecs going to use hardware?" unless the WebCodecs spec says it must always follow the MediaCapabilities "efficient" flag for determining whether it'll use hardware, or something like that. If it's not specified that way, then a WebCodecs implementation is free to fall back to software even when hardware is available, for any reason it pleases. Or vice versa, if a WebCodecs implementation ends up with an alternate or newer way of identifying/accessing hardware than MediaCapabilities. If that sort of thing happens, and if there isn't an ability to ask WebCodecs directly whether it's using hw, IMO we lose a key insight into performance issues in the wild.

@AndrewJDR, FWIW the issue #604 (comment) mentions should report unsupported on canary: https://chromium-review.googlesource.com/c/chromium/src/+/4219883 as of a couple days ago.

We discussed this at the editor's meeting, and it is on the agenda for the March 7 MEDIA WG meeting (see notes).

  1. For hwAcceleration = no-preference, should we provide an event on fallback from hardware to software, or just throw an error if hardware acceleration is initially enabled but at some point can no longer be provided? The latter is more typically how encoder APIs behave.

  2. Should we handle this the same way for encode and decode? In the current Chromium implementation, when selecting no-preference the decoder will not fall back, only the encoder, but there is no explicit statement about this.

Previous discussion on whether exposing hardware acceleration is a good idea: #239

I cannot attend the media WG meeting but I would note the following:

  • PING WG pushed back several times against exposing this information without proper mitigation. Describing the mitigations seems a prerequisite to this proposal.
  • The use case is not clearly described. In the related GitHub issues, cloud gaming is mentioned, but the why and what for is not really described. Getting this information would be very helpful in evaluating the proposal.

@youennf Your comments seem to relate to WebRTC-Stats rather than WebCodecs. In WebCodecs, the question relates to fallback when hwAcceleration = "no-preference". Doing fallback "under the covers" without surfacing errors is atypical for encoder APIs, and leaves the application without an understanding of whether hardware acceleration is enabled or not.

...leaves the application without an understanding of whether hardware acceleration is enabled or not.

That's begging the question: It's not clear what value there is to the end user in providing the application this information in the first place.

@youennf Your comments seem to relate to WebRTC-Stats rather than WebCodecs.

It is more related to #645 that is expected to solve this particular issue.

leaves the application without an understanding of whether hardware acceleration is enabled or not.

Exposing fallback as an event has potential privacy implications so we need to understand the use case if we plan to expose it.
If hwAcceleration = "no-preference", why is the application interested in whether HW acceleration is enabled or not? What will it do with this event?

About throwing instead of silently falling back: doesn't the spec already allow the UA to do so?
FWIW, hardwareAcceleration is only a hint to the UA.

@youennf Please read to the bottom of #645. The potential need for an event arises because the error is hidden from the application when "no-preference" is selected and fallback occurs. Most encoder APIs don't do fallback "under the covers", instead providing an error when "hardware" or "software" choices cannot be accommodated. Assuming that the errors provide enough information, choosing one of the other options would provide the application with the knowledge that hardware (or software) encoding can no longer be provided. Similar considerations apply to decode (though hardware acceleration appears to fail less frequently in practice).

Discussed at 7 March 2023 Media WG meeting: https://www.w3.org/2023/03/07-mediawg-minutes.html#t01. Conclusion was to investigate adding an error as suggested in #645 (comment), and to add a note to explain that "no-preference" could allow fallback but without surfacing an error.

Conclusion was to investigate adding an error as suggested in #645 (comment)

The minutes are terse here, and I am not clear in the issue comment what 'adding an error' means. Can you clarify this?

I think @aboba's comment in w3c/webrtc-extensions#146 (comment) covers it.

Yes, that covers it. We talked about this in today's editor meeting and getting input from web developers for each error case will be greatly helpful.

@aboba should we close this now? Since this turned out to be more about documenting fallback behavior.

Yes, we should close this. There are other issues relating to error handling.