w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.

Home Page: https://w3c.github.io/webcodecs/

VideoFrame needs a way to export pixels into ImageData

Djuffin opened this issue · comments

VideoFrames can have multiple pixel formats depending on where they come from.
Currently VideoFrame.copyTo() doesn't do format conversions and can only export data in the same pixel format.
It's a bit complicated to spec and implement universal format conversion in copyTo() (#92)

Most image/video processing libraries (for example opencv.js and tensorflow.js) accept ImageData as an input.
So it makes sense to remove the format conversion burden from website/app developers and introduce a way to read-back VideoFrames into ImageData converting pixel format to RGBA along the way.
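To give a sense of the burden in question, here is a minimal sketch of what an app has to do by hand today for just one format. It assumes an I420 buffer that is tightly packed (no row padding) and BT.601 limited-range color; `i420ToRgba` is a made-up helper name, and a real frame may use a different layout, matrix or range.

```javascript
// Convert one tightly packed I420 buffer to RGBA.
// Assumptions (not from the spec): BT.601 limited range, no row padding.
function i420ToRgba(src, width, height) {
  const chromaWidth = Math.ceil(width / 2);
  const chromaHeight = Math.ceil(height / 2);
  const uOffset = width * height;                       // U plane follows Y
  const vOffset = uOffset + chromaWidth * chromaHeight; // V plane follows U
  // Uint8ClampedArray rounds and clamps to 0..255 on assignment.
  const rgba = new Uint8ClampedArray(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Each chroma sample is shared by a 2x2 block of luma samples.
      const chromaIndex = (y >> 1) * chromaWidth + (x >> 1);
      const luma = 1.164 * (src[y * width + x] - 16);
      const u = src[uOffset + chromaIndex] - 128;
      const v = src[vOffset + chromaIndex] - 128;
      const out = (y * width + x) * 4;
      rgba[out] = luma + 1.596 * v;                  // R
      rgba[out + 1] = luma - 0.392 * u - 0.813 * v;  // G
      rgba[out + 2] = luma + 2.017 * u;              // B
      rgba[out + 3] = 255;                           // opaque alpha
    }
  }
  return rgba;
}
```

In a browser the result could be wrapped with `new ImageData(i420ToRgba(buf, w, h), w, h)`. NV12, I422, I444 and the alpha variants would each need their own version of this loop.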

Something like
VideoFrame.copyToImageData(imagedata, x, y)
imagedata - the ImageData to write into; its width and height also define the size of the rectangle to extract.
x and y - coordinates of the top-left corner of that rectangle within the frame.

Most developers probably achieve the same result today by drawing the frame on a canvas and calling CanvasRenderingContext2D.getImageData() afterwards, but that introduces unnecessary copies along the way.
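For reference, the canvas round trip looks roughly like the sketch below. It is browser-only, `frameToImageDataViaCanvas` is a made-up name, and using OffscreenCanvas rather than a canvas element is just one possible choice.

```javascript
// Browser-only sketch of the canvas workaround.
function frameToImageDataViaCanvas(frame) {
  const width = frame.displayWidth;
  const height = frame.displayHeight;
  const canvas = new OffscreenCanvas(width, height);
  const ctx = canvas.getContext('2d');
  // First copy: drawImage() converts the frame into the canvas backing store.
  ctx.drawImage(frame, 0, 0, width, height);
  // Second copy: getImageData() reads the RGBA pixels back out.
  return ctx.getImageData(0, 0, width, height);
}
```

The pixel format conversion happens inside drawImage(), which is why this works for any frame format, at the cost of the intermediate canvas.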

Looks like constructing an ImageData from a Uint8ClampedArray doesn't involve any copies: https://developer.mozilla.org/en-US/docs/Web/API/ImageData/ImageData

So a dedicated method would only save developers from writing:

// ImageData requires a Uint8ClampedArray; copyTo() accepts any buffer view.
let data = new Uint8ClampedArray(...)
await videoFrame.copyTo(data, ...);
let id = new ImageData(data, ...);
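Spelled out a bit more, that path could look like the sketch below. It only handles frames that are already RGBA, since copyTo() does no conversion here; `rgbaFrameToImageData` is a made-up name, and the default copyTo() layout (visible rect, tightly packed) is assumed.

```javascript
// Sketch: copy an already-RGBA VideoFrame into an ImageData.
async function rgbaFrameToImageData(frame) {
  // Without format conversion, anything other than RGBA has to be refused.
  if (frame.format !== 'RGBA') {
    throw new Error(`Unsupported pixel format: ${frame.format}`);
  }
  const { width, height } = frame.visibleRect;
  // allocationSize() with no options gives the byte size of the visible rect.
  const pixels = new Uint8ClampedArray(frame.allocationSize());
  await frame.copyTo(pixels);
  // The constructor wraps the array instead of copying it.
  return new ImageData(pixels, width, height);
}
```

For an I420 or NV12 frame this throws, which is exactly the gap the issue is about.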

If we were to add an API, it seems we'd want something simpler, but mirroring canvas API:

let data = await videoFrame.copyToImageData(sx, sy, sw, sh, settings)

Or if we need a synchronous method just using the canvas name:

let data = videoFrame.getImageData(...);

We can't get rid of the copy, so the two issues developers face today here are:

  • copyTo doesn't do format conversion; that's only handled at the draw stage.
  • Annoyance at setting up an ImageData from a Uint8ClampedArray.

Enhancing copyTo to allow format changes (or at least to RGBA/BGRA) seems like a good first step.

Indeed if copyTo() allowed conversion to RGBA, there would be no need to add an extra method for ImageData conversion.

But adding format conversion to copyTo() would require either:

  1. Implementation of generic format change
    or
  2. Discovery mechanism to query for supported format conversions

Adding a specific method intended for RGBA conversion resolves the practical need without having to solve generic format conversion in copyTo(). Even if later on copyTo() gets generic format conversion, copyToImageData() seems like a useful helper method to have anyway.

I think we'd just have copyTo reject the promise with a not supported error if the target format isn't supported. We could also add a canCopyTo(dict) method to solve 2 if needed.
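From the caller's side, the reject-on-unsupported behaviour might be used as in the sketch below. The `format` option on copyTo() is the proposal from this thread and is treated as an assumption here, as are the names `copyAsRgba` and `fallback`. Note that an implementation that predates the option would ignore the unknown dictionary member rather than reject.

```javascript
// Sketch of the proposed behaviour: request RGBA from copyTo() and fall back
// when the promise rejects with NotSupportedError.
// `fallback` is a caller-supplied function, e.g. the canvas round trip.
async function copyAsRgba(frame, fallback) {
  const { width, height } = frame.visibleRect;
  const pixels = new Uint8ClampedArray(width * height * 4);
  try {
    // Assumed option: ask for conversion to RGBA during the copy.
    await frame.copyTo(pixels, { format: 'RGBA' });
    return pixels;
  } catch (err) {
    if (err && err.name === 'NotSupportedError') {
      return fallback(frame);
    }
    throw err; // anything else is a real failure
  }
}
```

A canCopyTo(dict) query, if added, would let callers pick the path up front instead of allocating the buffer and catching the rejection.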

If we're sure we can determine format support synchronously, we could also have allocationSize() throw on unsupported format.

Closing this in favor of #92 (comment)