w3c / mediacapture-record

MediaStream Recording

Home Page: https://w3c.github.io/mediacapture-record/

Creation of Seekable Files

SingingTree opened this issue

At the moment, implementations of the MediaStream Recording API don't write seekable WebM files. The WebM format doesn't make this particularly easy: writing the cues in a useful fashion requires mutating the start of the file, either to write the cues there or to manipulate the seek head. However, with the MediaRecorder API, once requestData() has been run, the data it returned is no longer available for rewriting.
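To make the constraint concrete, here is a toy model of a live recorder's internal buffer. It is only a sketch: `ToyRecorder`, `write` and the `finalise` callback are hypothetical names and not part of the MediaRecorder API. It only mirrors the behaviour described above, where data handed out by requestData() can no longer be rewritten.

```typescript
// Toy model only: this is NOT the real MediaRecorder, just an illustration.
class ToyRecorder {
  private buffered: Uint8Array[] = [];
  // Every entry is one delivery to the page (one "dataavailable" event).
  readonly delivered: Uint8Array[][] = [];

  // Called by the (imaginary) muxer whenever it produces bytes.
  write(chunk: Uint8Array): void {
    this.buffered.push(chunk);
  }

  // Mirrors requestData(): hand over everything buffered so far and forget it.
  requestData(): void {
    this.delivered.push(this.buffered);
    this.buffered = [];
  }

  // Mirrors stop(): only the chunks still buffered can be rewritten.
  stop(finalise: (chunks: Uint8Array[]) => Uint8Array[]): void {
    this.delivered.push(finalise(this.buffered));
    this.buffered = [];
  }
}

const recorder = new ToyRecorder();
recorder.write(new Uint8Array([1])); // stands in for the file header
recorder.write(new Uint8Array([2])); // first cluster
recorder.requestData();              // the header has now left the recorder
recorder.write(new Uint8Array([3])); // second cluster

let chunksSeenByFinalise = 0;
recorder.stop((chunks) => {
  chunksSeenByFinalise = chunks.length; // only the second cluster is left
  return chunks;
});
```

In this model the finalise step sees a single chunk, so it has no way to patch the header that was already delivered.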

Would it work to expand the spec to encompass some way to handle this?

For example:

  • expose functionality to finalise recorded blobs: allowing implementations to modify the finalised blob(s) as needed for a given container
  • expose the ability to signal that data will not be requested until the end of recording, allowing the recorder to buffer data and finalise it, without concern for further potential future data

@SingingTree thanks for moving this discussion to an issue. Like I said before, we have given a bit of thought to this internally with the libwebm folks and we thought about a different approach because:

  • there are tools for doing this now (albeit in C/C++) for libwebm,
  • MediaRecorder is a live recorder, modifying it to also, sometimes, be non-live, would dilute the API,

but most important, cues-reconstruction would be muxer-specific: I don't see a straightforward way to generalize this to any muxer and hence to add it to the Spec. @jan-ivar, @Pehrsons wdyt?

The alternative solution in the libwebm case is a simple polyfill (in the spirit of polyfill-first) calling CopyAndMoveCuesBeforeClusters(); this polyfill will create a new encoded Blob with the Cues and duration reconstructed. The JS could be generated using e.g. Emscripten. I wanted to get this done but haven't really found time TBH.
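As a rough way to tell whether a recording would need such a pass, the sketch below walks the start of a WebM byte buffer and reports whether a Cues element appears before the first Cluster. The element IDs are the standard Matroska/WebM ones; the function names are made up for this example, and the check is deliberately narrow: a file whose Cues sit after the clusters (referenced from the seek head) is reported as `false` even though it may still be seekable.

```typescript
// Standard Matroska/WebM element IDs (marker bits included).
const EBML_HEADER = 0x1a45dfa3;
const SEGMENT = 0x18538067;
const CLUSTER = 0x1f43b675;
const CUES = 0x1c53bb6b;

// Byte length of an EBML variable-length integer, from the leading zeros of its first byte.
function vintLength(bytes: Uint8Array, offset: number): number {
  if (offset >= bytes.length || bytes[offset] === 0) throw new Error("invalid or truncated vint");
  const length = Math.clz32(bytes[offset]) - 23; // 0x80 -> 1 byte, 0x01 -> 8 bytes
  if (offset + length > bytes.length) throw new Error("truncated vint");
  return length;
}

// Element IDs keep their marker bits and are at most 4 bytes long.
function readId(bytes: Uint8Array, offset: number): { value: number; length: number } {
  const length = vintLength(bytes, offset);
  if (length > 4) throw new Error("element IDs are at most 4 bytes");
  let value = 0;
  for (let i = 0; i < length; i++) value = value * 256 + bytes[offset + i];
  return { value, length };
}

// Element sizes drop the marker bit; all value bits set means "unknown size".
function readSize(bytes: Uint8Array, offset: number): { value: number; length: number; unknown: boolean } {
  const length = vintLength(bytes, offset);
  const mask = 0xff >> length;
  let value = bytes[offset] & mask;
  let unknown = value === mask;
  for (let i = 1; i < length; i++) {
    const b = bytes[offset + i];
    value = value * 256 + b; // precise enough for the small elements skipped here
    unknown = unknown && b === 0xff;
  }
  return { value, length, unknown };
}

function hasCuesBeforeFirstCluster(bytes: Uint8Array): boolean {
  let pos = 0;
  const header = readId(bytes, pos);
  if (header.value !== EBML_HEADER) throw new Error("not an EBML file");
  pos += header.length;
  const headerSize = readSize(bytes, pos);
  if (headerSize.unknown) throw new Error("EBML header with unknown size");
  pos += headerSize.length + headerSize.value;

  const segment = readId(bytes, pos);
  if (segment.value !== SEGMENT) throw new Error("Segment element not found");
  pos += segment.length;
  pos += readSize(bytes, pos).length; // live recordings often use an unknown Segment size

  while (pos < bytes.length) {
    const id = readId(bytes, pos);
    pos += id.length;
    if (id.value === CUES) return true;
    if (id.value === CLUSTER) return false;
    const size = readSize(bytes, pos);
    if (size.unknown) return false; // cannot skip an element of unknown size
    pos += size.length + size.value;
  }
  return false;
}
```

Only the elements that precede the first Cluster are inspected, so the first few kilobytes of a recording are normally enough input.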

Regarding the muxer-specific nature: is your concern that a finalise()-style function, or indicating that data will not be read back until completion, is not enough to allow all muxers to handle this case?

In the case of the polyfill, would this have no official relation to this spec, but could be included by pages using MediaRecorder to rewrite the results of their recording to contain cues? Would there be a need for the file to already have cues written, in the sense that it's a strict move operation, or would it handle writing cues in files that didn't have any?

I think [1] is a good idea that would fix a number of issues, including this one, issue #4 (support for changes to tracks), supporting resolution changes in containers that don't natively support them, etc.

[1] #67 (comment)

@SingingTree

Regarding the muxer-specific nature: is your concern that a finalise()-style function, or indicating that data will not be read back until completion, is not enough to allow all muxers to handle this case?

Aside from the concerns already mentioned, adding a finalise()-like function would face some operational issues spec-wise. Two cases:

  1. the user doesn't mind the UA holding on to the data for as long as needed, and indicates that by calling start() with no timeslice; at first sight, this situation would allow the implementation to rewrite the cues/length appropriately since it holds on to all the data, right? The problem here is that requestData() can be called at any time, flushing any internal memory, and dumping us into case 2.
  2. to add a finalise() method we would need to specify what data is passed into it, e.g. should this method be passed, as a parameter, the whole bag of Blobs received in ondataavailable? Or just some Blobs marked in some particular way...? Different container formats might need to rewrite different chunks of the output, so if the answer is 'the whole bag' then please read on...

In the case of the polyfill, would this have no official relation to this spec, but could be included by pages using MediaRecorder to rewrite the results of their recording to contain cues? Would there be a need for the file to already have cues written, in the sense that it's a strict move operation, or would it handle writing cues in files that didn't have any?

Yeah, in this case the polyfill would be a Node.js package that would be informatively linked from this very spec, and would consist of a single function call that gets the whole set of recorded Blobs and passes it through the mentioned function (CopyAndMoveCuesBeforeClusters), which tries to "clean up" the webm/mkv so that it has correct Duration, Cues and a bunch of other things. IIRC, it can create the Cues from scratch. IIUC, it's very much the equivalent of mkclean for webm files.
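The "single function call" shape described above could look roughly like the sketch below. Everything here is an assumption made for illustration: `finaliseRecording` and `RepairFn` are hypothetical names, and the repair step is injected as a parameter because no concrete JS binding of CopyAndMoveCuesBeforeClusters exists in this thread.

```typescript
// Hypothetical repair step: takes the bytes of a complete recording and
// returns bytes with Duration/Cues fixed (e.g. a compiled libwebm routine).
type RepairFn = (webm: ArrayBuffer) => ArrayBuffer | Promise<ArrayBuffer>;

// Hypothetical wrapper: joins every Blob delivered through ondataavailable,
// runs the repair step once over the whole recording, and returns a new Blob.
async function finaliseRecording(
  chunks: Blob[],
  repair: RepairFn,
  type = "video/webm",
): Promise<Blob> {
  const whole = new Blob(chunks, { type });
  const repaired = await repair(await whole.arrayBuffer());
  return new Blob([repaired], { type });
}
```

Note that this holds the whole recording in memory, which is the price of taking "the whole bag" of Blobs as input.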

A similar informative-thingy would be to use WebAudio to mix several audio tracks before passing them to Media Recorder: it's not strictly part of this Spec, but it's good to have an informative example detailing this... (either in the Spec, in MDN or in both).
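Such an informative example might look like the sketch below. It relies on two real Web Audio methods, createMediaStreamSource() and createMediaStreamDestination(), but describes them through small local interfaces (`AudioNodeLike` and `AudioContextLike`, both made up here) so the snippet stays self-contained; `mixAudioStreams` is likewise a hypothetical helper name.

```typescript
// Minimal structural types modeled on the Web Audio calls used below.
interface AudioNodeLike {
  connect(destination: AudioNodeLike): unknown;
}
interface AudioContextLike<S> {
  createMediaStreamSource(stream: S): AudioNodeLike;
  createMediaStreamDestination(): AudioNodeLike & { stream: S };
}

// Route every input stream into one destination node; the destination's
// stream then carries a single mixed audio track.
function mixAudioStreams<S>(ctx: AudioContextLike<S>, inputs: S[]): S {
  const destination = ctx.createMediaStreamDestination();
  for (const input of inputs) {
    ctx.createMediaStreamSource(input).connect(destination);
  }
  return destination.stream;
}
```

In a browser, the intent is to pass an AudioContext and the MediaStream objects to mix, then hand the returned stream to the MediaRecorder constructor.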

To give more context, I cloned and compiled https://github.com/webmproject/libwebm: among the generated executables there is a utility, mkvmuxer_sample, that is an example of what I suggested above. I have used it with a webm produced by MediaRecorder in Chrome (not seekable and with ∞ duration) to generate another webm file that is seekable and has the duration correctly recalculated.

Running another utility from the same folder, webm_info, I see the changes:
before:

Segment:
  SegmentInfo:
    TimecodeScale : 1000000 
    Duration(secs): -1e-09
    MuxingApp     : Chrome
    WritingApp    : Chrome

after:

Segment:
  SegmentInfo:
    TimecodeScale : 1000000 
    Duration(secs): 12.7203
    MuxingApp     : libwebm-0.2.1.0
    WritingApp    : mkvmuxer_sample

You can also do the same with ffmpeg -i input.webm -c copy output.webm to get a remuxed file.

Honestly, if getting a seekable file only requires remuxing the file, my vote would be a javascript library. My opinion is MediaRecorder should be more about doing the thing that is prohibitively hard to do in javascript (encoding audio and video) and less about muxing into containers.

@jnoring I agree, and didn't know that ffmpeg also reconstructs the missing parts of the file, good to know. I forked libwebm here and added an emscripten-compiled repair_webm.cc (see also emcompile.sh) as a demo, and perhaps this can be the seed of a polyfill to cover the remuxing. It's still TBD, mostly because I need to somehow teach the C routines to read from a Blob instead of from a file, and I'm no emscripten pro.

Sounds like a reasonable solution. I'm mindful of keeping the barrier to entry low, so sounds good to me that we can both advertise the JS once it's ready and keep the interface simple.

@legokichi was kind enough to provide a solution based on ts-ebml, see legokichi/ts-ebml#2 (comment) .

What is the appropriate place to discuss usage of libs such as the one above, as well as stewardship of that code (if this is the means by which MediaRecorder can have seekable files, who is involved in making sure it remains so, and how)? This issue? Another one?

FYI remuxing from ffmpeg is not viable (anymore?) with chromium as it will currently produce 1000 fps files, see https://trac.ffmpeg.org/ticket/6386 for details.

A few years have passed and there is still no official solution.
The best solution I've tried so far is webm-duration-fix, for those who need it.
It supports fixing recordings larger than 2GB and has a low memory footprint while fixing.
It is based on ts-ebml and supports both the browser and Node.
https://github.com/buynao/webm-duration-fix

import * as fs from 'node:fs';
import fixWebmDuration from 'webm-duration-fix';

const mimeType = 'video/webm;codecs=vp9';
// `let` rather than `const`, because the array is reset after each recording
let blobSlice: BlobPart[] = [];

// `stream` (the MediaStream to record) and `inputPath` (the file to write)
// are assumed to be defined elsewhere
const mediaRecorder = new MediaRecorder(stream, {
  mimeType
});

// collect every chunk the recorder hands out
mediaRecorder.ondataavailable = (event: BlobEvent) => {
  blobSlice.push(event.data);
};

mediaRecorder.onstop = async () => {
    // fix the blob; files larger than 2GB are supported
    const fixBlob = await fixWebmDuration(new Blob([...blobSlice], { type: mimeType }));
    // to write locally, it is recommended to use fs.createWriteStream to reduce memory usage
    const fileWriteStream = fs.createWriteStream(inputPath);
    const blobReader = fixBlob.stream().getReader();

    while (true) {
      const { done, value } = await blobReader.read();
      if (done) {
        console.log('write done.');
        // end() flushes pending writes before closing the file
        fileWriteStream.end();
        break;
      }
      fileWriteStream.write(value);
    }
    blobSlice = [];
};

Re-opening the issue, since it's not obvious that leaving this to user code best fulfills the requirements.

The webm-duration-fix snippet above cannot run in a project using ES2022.