Transcripts

farski opened this issue 8 years ago · comments

Chris Kalafarski · Answer 1 · Sun Apr 10 2016 03:31:16 GMT+0800 (China Standard Time)

Should support linking to an external document, to keep feed sizes down

Chris Kalafarski · Answer 2 · Sun Apr 10 2016 03:32:25 GMT+0800 (China Standard Time)

What are the existing standards?

Cj Malone · Answer 3 · Sun Aug 28 2016 18:00:17 GMT+0800 (China Standard Time)

MRSS (Media RSS) covers subtitles; see http://www.rssboard.org/media-rss#media-subtitle and http://www.rssboard.org/media-rss#media-text
It has basically no adoption, but I don't think it should be redone in another spec.

Charles Wiltgen · Answer 4 · Wed Oct 05 2016 03:59:22 GMT+0800 (China Standard Time)

> What are the existing standards?

A good summary of choices: https://en.wikipedia.org/wiki/Timed_text

Charles Wiltgen · Answer 5 · Fri Oct 07 2016 04:07:47 GMT+0800 (China Standard Time)

> It has basically no adoption, but I don't think it should be redone in another spec.

One Very Good Thing™ to do would be to act as a neutral but opinionated body on which existing RSS standards should be supported. Media RSS has some very nice ideas, but the well-intentioned folks behind it left its nurturing to fate. That's not how you bootstrap a community standard.

azpeters · Answer 6 · Fri Oct 14 2016 00:42:57 GMT+0800 (China Standard Time)

I like the idea of including a transcript as a separate link. I also like encouraging the formatting of that transcript using one of the standards from https://en.wikipedia.org/wiki/Timed_text that @CharlesWiltgen mentioned. However, using timed text presents challenges similar to those discussed in the other issue about ad insertion: keeping the transcript in sync with the latest audio file it points at.

Charles Wiltgen · Answer 7 · Fri Oct 14 2016 01:26:56 GMT+0800 (China Standard Time)

> However, using timed text presents challenges similar to those discussed in the other issue about ad insertion: keeping the transcript in sync with the latest audio file it points at.

On the bright side, we're standing on the shoulders of giants (ffmpeg, AV Foundation, etc.) that are really good at this kind of EDL-like manipulation. Even for folks that have to roll their own (web apps, maybe?), it's conceptually straightforward — for example, if a 0:10 ad is inserted at 2:30, any events that happen at or after 2:30 are simply offset by 0:10. Not saying it wouldn't be a PITA. 🙂
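
For anyone rolling their own, the offset rule above could be sketched roughly like this. This is only an illustration: the `Cue` type and `shift_cues_for_insert` function are hypothetical names, times are plain seconds, and letting a cue that straddles the insertion point stretch across the ad is just one possible choice (splitting it would be another).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cue:
    """Hypothetical timed-text event; times are in seconds."""
    start: float
    end: float
    text: str


def shift_cues_for_insert(cues, insert_at, duration):
    """Return new cues adjusted for `duration` seconds inserted at `insert_at`.

    Cues that start at or after the insertion point move later by `duration`.
    A cue that straddles the insertion point keeps its start, and its end is
    pushed back (an assumption; a real tool might split the cue instead).
    """
    shifted = []
    for cue in cues:
        start = cue.start + duration if cue.start >= insert_at else cue.start
        end = cue.end + duration if cue.end > insert_at else cue.end
        shifted.append(replace(cue, start=start, end=end))
    return shifted


# A 0:10 ad inserted at 2:30 (150 seconds).
cues = [
    Cue(140.0, 145.0, "before the ad"),
    Cue(150.0, 155.0, "right at the insertion point"),
    Cue(160.0, 165.0, "well after the ad"),
]
for cue in shift_cues_for_insert(cues, insert_at=150.0, duration=10.0):
    print(cue.start, cue.end, cue.text)
```

Multiple insertions could be handled by applying the function once per ad, working from the last insertion point back to the first so earlier offsets don't move later insertion points.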

Georg Holzmann · Answer 8 · Fri Nov 18 2016 21:39:48 GMT+0800 (China Standard Time)

I would like to point the discussion to WebVTT.
In an RSS feed, one could link to an external WebVTT file which includes the transcript - for example:
<atom:link rel="transcript" href="http://example.org/transcript.vtt"/>
On the podcast webpage, the WebVTT file can be added as a track element in the audio/video element. This is supported by all major browsers.

WebVTT is an existing spec with lots of possible time-based features (not only the text, but also speaker names, styling or any other custom data like GPS coordinates etc.), and quite a few systems support it already (screenreaders, (web) audio players with WebVTT display+search, software libs, etc.).
Search engines could also easily parse WebVTT files in an audio/video tag, and then we have searchable audio ;)
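
To illustrate the consuming side, here is a rough sketch of how a client might pick that link out of a feed using Python's standard `xml.etree.ElementTree`. Note that `rel="transcript"` is only the value proposed in this comment, not an established link relation, and the sample feed and the `transcript_links` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Made-up sample feed; only the atom:link line comes from the proposal above.
FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/episode1.mp3" type="audio/mpeg" length="0"/>
      <atom:link rel="transcript" href="http://example.org/transcript.vtt"/>
    </item>
    <item>
      <title>Episode 2</title>
    </item>
  </channel>
</rss>
"""


def transcript_links(feed_xml):
    """Map item titles to the href of their first rel="transcript" link."""
    root = ET.fromstring(feed_xml)
    found = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # ElementTree addresses namespaced tags as "{namespace-uri}localname".
        for link in item.findall(f"{{{ATOM_NS}}}link"):
            if link.get("rel") == "transcript":
                found[title] = link.get("href")
                break
    return found


print(transcript_links(FEED))
```

Items without a transcript link are simply left out of the result, so the feed stays small and clients that don't care about transcripts can ignore the element entirely.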

Chris Quamme Rhoden · Answer 9 · Fri Nov 18 2016 22:00:17 GMT+0800 (China Standard Time)

Once again, I think that as long as the metadata is tightly bound to the media file (e.g. through use of an ID3 tag or Link header), we can do whatever. WebVTT is definitely the obvious choice, though I think it remains to be seen whether timed text is an important feature of these transcripts. I think if we make it a requirement, we may reduce the level of participation. If we don't make timed text a requirement, then the data can be encoded directly in the feed.

Charles Wiltgen · Answer 10 · Sat Nov 19 2016 08:07:11 GMT+0800 (China Standard Time)

> I think if we make it a requirement, we may reduce the level of participation.

Agreed, transcripts (untimed) and subtitles/captions (timed) are both useful. I'd like to see both defined in the same way that MediaRSS sort of[1] does with media:text and media:subTitle.

FWIW I like WebVTT as the timed text format. It's supported in 82% of browsers in use worldwide and 96% in the USA, and there are apparently polyfills available for older browsers. The only potential downside is that I don't see any native iOS or Android parsers (iOS has one, but it only appears to work in HLS contexts) so that might create a bit of a chicken/egg problem initially.
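
To give a feel for what a hand-rolled parser involves where no native one exists, here is a deliberately minimal sketch. It only pulls out cue timings and raw payload text; it skips cue identifiers, NOTE comments and cue settings without interpreting them, and it is nowhere near a spec-compliant parser. The `parse_vtt` and `to_seconds` names and the sample file are invented for the example.

```python
import re

# Timestamps are "mm:ss.ttt" or "hh:mm:ss.ttt"; cue settings may follow the timing.
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)

SAMPLE = """WEBVTT

1
00:00:01.000 --> 00:00:04.250
<v Host>Welcome to the show.

NOTE this line is a comment

02:30.000 --> 02:35.500 align:start
Back after the break.
"""


def to_seconds(timestamp):
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    parts = timestamp.split(":")
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_vtt(text):
    """Return a list of (start_seconds, end_seconds, payload) tuples."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if not lines[0].lstrip("\ufeff").startswith("WEBVTT"):
        raise ValueError("missing WEBVTT header")
    cues = []
    i = 1
    while i < len(lines):
        match = TIMING.match(lines[i].strip())
        if not match:
            # Header text, cue identifiers, NOTE blocks and blank lines land here.
            i += 1
            continue
        i += 1
        payload = []
        # The payload runs until the next blank line (or end of file).
        while i < len(lines) and lines[i].strip():
            payload.append(lines[i])
            i += 1
        cues.append(
            (to_seconds(match.group(1)), to_seconds(match.group(2)), "\n".join(payload))
        )
    return cues


for start, end, payload in parse_vtt(SAMPLE):
    print(start, end, payload)
```

Markup inside the payload, such as the `<v Host>` voice span, is left as-is here; a real player would need to handle it, along with STYLE and REGION blocks.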

For the transcript format, it sure would be nice to be able to use Markdown (.md). If full HTML is supported, I think it's likely that enterprising people will use this for all kinds of things that go well beyond the intent.

[1] IMO they didn't quite nail it because both can be used for timed-text. People consuming the spec shouldn't be wondering which to use when.