Garbled CEA captions

Question

joeyparrish opened this issue 5 years ago

In shaka-project/shaka-player#2395, we received a report of garbled CEA captions in Shaka Player. We do not know what is causing it, but we can reproduce the issue with mux.js in a standalone Node script that is very similar in structure to how we use mux.js in Shaka Player:

const muxjs = require('mux.js');
const fs = require('fs');

const CaptionParser = class {
  constructor() {
    this.muxCaptionParser_ = new muxjs.mp4.CaptionParser();
    this.videoTrackIds_ = [];
    this.timescales_ = {};
  }

  parseInitSegment(data) {
    this.videoTrackIds_ = muxjs.mp4.probe.videoTrackIds(data);
    this.timescales_ = muxjs.mp4.probe.timescale(data);
    this.muxCaptionParser_.init();
  }

  parseMediaSegment(data) {
    const parsed = this.muxCaptionParser_.parse(
        data, this.videoTrackIds_, this.timescales_);
    const captions = parsed && parsed.captions ? parsed.captions : [];
    this.muxCaptionParser_.clearParsedCaptions();
    return captions;
  }
};

function readFile(path) {
  return new Uint8Array(fs.readFileSync(path));
}

// argv[0] is the name of the interpreter
// argv[1] is the name of this script
if (process.argv.length < 4) {
  // Note the leading space before <INIT_SEGMENT>, so that it does not run
  // into the script name.
  console.log('Usage: ' + process.argv[0] + ' ' + process.argv[1] +
              ' <INIT_SEGMENT> <MEDIA_SEGMENT> [<MEDIA_SEGMENT> ...]');
  process.exit(0);
}

const initSegmentPath = process.argv[2];
const mediaSegmentPaths = process.argv.slice(3);

const initSegment = readFile(initSegmentPath);
console.log('Init segment:', initSegmentPath, initSegment.length + ' bytes');

const p = new CaptionParser();
p.parseInitSegment(initSegment);

for (const path of mediaSegmentPaths) {
  const segment = readFile(path);
  console.log('Media segment:', path, segment.length + ' bytes');

  for (const caption of p.parseMediaSegment(segment)) {
    console.log(caption);
  }
}

The output is:

{ startPts: 4296348676,
  endPts: 4296519847,
  text: 'e  iuc\nri Mm,eneadiouR-- -- <i>Hetageuseu</i>',
  stream: 'CC1',
  startTime: 47737.20751111111,
  endTime: 47739.10941111111 }

That text is supposed to be English, though I don't have a working parser for comparison to say exactly what that particular piece of text is meant to be. We get the same results for both encrypted and clear versions of the content, so we know that the encryption is not being applied to the CC parts of the segment.

I've asked permission to share the init segment and one encrypted media segment with you, and I will follow up with those as soon as I have permission.

Joey Parrish · Answer 1 · Sat Apr 11 2020 22:54:28 GMT+0800 (China Standard Time)

The segments are attached.

CEA_segments.zip

Thanks!

Joey Parrish · Answer 2 · Wed Apr 15 2020 06:31:30 GMT+0800 (China Standard Time)

@gesinger, @gkatsev, please let me know if there's anything else we can do to help you debug this. Thanks so much!

Gary Katsevman · Answer 3 · Wed Apr 15 2020 07:34:09 GMT+0800 (China Standard Time)

I'll take a look tomorrow.

Gary Katsevman · Answer 4 · Thu Apr 16 2020 03:24:32 GMT+0800 (China Standard Time)

I looked into it a bit today, and as far as I can tell mux.js is parsing the segment as designed. Unfortunately, we don't really have many 608 experts anymore, so any help you can provide would be appreciated.
There also don't seem to be many tools that help with 608, and the ones that exist aren't really maintained anymore. I've tried to see what other parsers would do with this segment but couldn't get any others to work. Even ccextractor returned nothing.
We'll continue investigating as well.

Joey Parrish · Answer 5 · Thu Apr 16 2020 04:39:39 GMT+0800 (China Standard Time)

Quoting Gary Katsevman: "I've tried to see what other parsers would do with this segment but couldn't get any others to work. Even ccextractor returned nothing. We'd appreciate any help you're able to provide and we'll continue investigating as well."

@ppatlolla-turner, since this content came from you, can you offer any other information to help with this investigation?

Gary Katsevman · Answer 6 · Thu Apr 16 2020 04:46:00 GMT+0800 (China Standard Time)

One thought that @ldayananda had is that maybe we're not calculating the PTS/DTS times properly for these captions.

Also, would it be possible to get a clear segment with the garbled captions?

Gary Katsevman · Answer 7 · Thu Apr 16 2020 05:20:35 GMT+0800 (China Standard Time)

One thing we noticed is that the segment has a lot of b-frames. Unfortunately, we don't support b-frames with the 608/708 captions, though we should (see #214).
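
As a rough illustration of why b-frames matter here: caption byte pairs are carried with the video samples in decode order, but they have to be assembled in presentation order. The sketch below uses made-up sample data and a hypothetical `orderPairsByPts` helper. It is not how mux.js is implemented, only a picture of the reordering problem.

```javascript
// Hypothetical samples in decode order (I, P, B). With b-frames, the
// presentation time (pts = dts + compositionTimeOffset) is not monotonic
// in decode order.
const samples = [
  {dts: 0, compositionTimeOffset: 1000, pair: 'Li'},     // pts 1000
  {dts: 1000, compositionTimeOffset: 2000, pair: 'ol'},  // pts 3000
  {dts: 2000, compositionTimeOffset: 0, pair: 'nc'},     // pts 2000
];

// Hypothetical helper: sort caption pairs by presentation time and join them.
function orderPairsByPts(list) {
  return list
      .map((s) => ({pts: s.dts + s.compositionTimeOffset, pair: s.pair}))
      .sort((a, b) => a.pts - b.pts)
      .map((s) => s.pair)
      .join('');
}

// Reading pairs in decode order scrambles the text.
const decodeOrderText = samples.map((s) => s.pair).join('');
// Reading them in presentation order restores it.
const presentationOrderText = orderPairsByPts(samples);

console.log(decodeOrderText);
console.log(presentationOrderText);
```

Any error in the computed presentation times, not just missing reordering, would disturb the result in a similar way.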

ppatlolla-turner · Answer 8 · Thu Apr 16 2020 05:31:25 GMT+0800 (China Standard Time)

Yes, our streams do have b-frames.

Joey Parrish · Answer 9 · Sun May 03 2020 01:32:51 GMT+0800 (China Standard Time)

I've been digging into this a bit more, and I find that it's not completely garbled. If I look at a different range of segments and log the CEA character pairs from mux.js, it becomes apparent that some are missing. For example, this caption output from mux.js:

"Lioln d nodo h homork theack a svel"

Corresponds to the spoken line:

"Lincoln did not do his homework on the back of a shovel"

Several CEA character pairs are just plain missing.

When I take the same content and run it through FFmpeg to remove b-frames and Shaka Packager to re-fragment it, I find that the text is correctly parsed in mux.js. The segment I posted above, which results in:

"e  iuc\nri Mm,eneadiouR-- -- <i>Hetageuseu</i>"

Becomes:

"DOCENT TRAINER:\nHere at the American\nHeritage Museum,"

So this does seem related to b-frames in the content.

What would it take to support b-frames correctly?

Joey Parrish · Answer 10 · Thu May 07 2020 05:22:50 GMT+0800 (China Standard Time)

A colleague has just pointed this out to me:

        if (sampleCompositionTimeOffsetPresent) {
          // Note: this should be a signed int if version is 1
          sample.compositionTimeOffset = view.getUint32(offset);
          offset += 4;
        }

He says that the content in this issue has v1 TRUN boxes and some negative offsets. It's possible that my re-encoding of the content to remove b-frames may have coincidentally changed the TRUN boxes, too. So it may not be directly caused by b-frames at all.
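
To make the signedness point concrete, here is a minimal sketch (not the mux.js source) of reading the composition time offset with the signedness chosen by the trun box version. The helper name `readCompositionTimeOffset` and its parameters are hypothetical; only the standard `DataView.getInt32` and `DataView.getUint32` calls are relied on.

```javascript
// Hypothetical helper, for illustration only; not part of mux.js.
// Reads a 4-byte big-endian composition time offset at `offset`.
// The field is unsigned in version 0 trun boxes and signed in version 1.
function readCompositionTimeOffset(view, offset, version) {
  return version === 1 ? view.getInt32(offset) : view.getUint32(offset);
}

// The bytes 0xFFFFFC18 encode -1000 as a signed 32-bit integer.
const bytes = new Uint8Array([0xff, 0xff, 0xfc, 0x18]);
const view = new DataView(bytes.buffer);

// Read as unsigned, a small negative offset turns into a huge positive one.
const asUnsigned = readCompositionTimeOffset(view, 0, 0);
// Read as signed, the intended value comes back.
const asSigned = readCompositionTimeOffset(view, 0, 1);

console.log(asUnsigned, asSigned);
```

With an unsigned read, any sample with a negative offset would get a presentation time far from its neighbours, which could plausibly explain caption pairs ending up out of place or missing.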

ppatlolla-turner · Answer 11 · Thu May 07 2020 23:11:47 GMT+0800 (China Standard Time)

We will check whether our packaging is incorrect, as indicated above, and get back to you.

Joey Parrish · Answer 12 · Fri May 08 2020 02:22:48 GMT+0800 (China Standard Time)

I think I have a very simple fix. I'm now able to parse the content from @ppatlolla-turner. It looks like the problem is in the trun box parser in mux.js. I'll send a PR shortly.

ppatlolla-turner · Answer 13 · Fri May 08 2020 03:33:00 GMT+0800 (China Standard Time)

Thanks @joeyparrish
Really appreciate the effort.

Joey Parrish · Answer 14 · Fri May 08 2020 04:54:14 GMT+0800 (China Standard Time)

We're always happy to help. Thanks to the mux.js team for feedback and review.