movever / srt-parser-2-fix

An SRT parser that also handles malformed SRT timestamps (for example 00:00:12.682, which uses a dot as the millisecond separator where the SRT format requires a comma).

Home Page: https://www.npmjs.com/package/srt-parser-2

srt-parser-2

An SRT parser for JavaScript.

It parses the contents of an .srt file into an array of subtitle objects.

Install

npm

npm install srt-parser-2

or yarn

yarn add srt-parser-2

Example

Given this SRT file:

1
00:00:11,544 --> 00:00:12,682
Hello

the parser returns:

[{
    id: '1',
    startTime: '00:00:11,544',
    startSeconds: 11.544,
    endTime: '00:00:12,682',
    endSeconds: 12.682,
    text: 'Hello'
}]
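
The startSeconds and endSeconds fields above appear to be the same timestamps expressed in seconds. A minimal sketch of that conversion, assuming a well-formed HH:MM:SS,mmm input; timestampToSeconds is a hypothetical helper for illustration, not part of the package's API:

```javascript
// Hypothetical helper, not part of srt-parser-2's documented API.
// Converts a well-formed "HH:MM:SS,mmm" timestamp into seconds.
function timestampToSeconds(timestamp) {
  const [clock, millis] = timestamp.split(",");
  const [hours, minutes, seconds] = clock.split(":").map(Number);
  // Sum in whole milliseconds first, then divide once,
  // so the result is a single correctly rounded division.
  const totalMillis =
    hours * 3600000 + minutes * 60000 + seconds * 1000 + Number(millis);
  return totalMillis / 1000;
}

console.log(timestampToSeconds("00:00:11,544")); // 11.544
console.log(timestampToSeconds("00:00:12,682")); // 12.682
```

Working in integer milliseconds avoids accumulating floating-point error from adding a fractional part to a whole number of seconds.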

Environment support

Since it only processes text, it should work in both browser and Node.js environments.

Usage

import srtParser2 from "srt-parser-2";

const srt = `
1
00:00:11,544 --> 00:00:12,682
Hello
`;

// Parse the SRT string into an array of subtitle objects.
const parser = new srtParser2();
const srt_array = parser.fromSrt(srt);
console.log(srt_array);

// Turn the array back into an SRT string.
const srt_string = parser.toSrt(srt_array);
console.log(srt_string);

You can run this example using node example/1.Comma.js
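
Once parsed, the array can be queried like any other list of objects. As a rough sketch, the snippet below finds the subtitles visible at a given playback time using the startSeconds and endSeconds fields. cuesAt is a hypothetical helper, not something the package provides, and the sample array is written by hand in the shape shown in the Example section so the snippet runs without the package installed:

```javascript
// Hypothetical helper: returns the subtitles visible at `seconds`.
// `cues` is assumed to have the shape shown in the Example section.
function cuesAt(cues, seconds) {
  return cues.filter(
    (cue) => cue.startSeconds <= seconds && seconds < cue.endSeconds
  );
}

// Hand-written sample in the documented output shape.
const cues = [
  {
    id: "1",
    startTime: "00:00:11,544",
    startSeconds: 11.544,
    endTime: "00:00:12,682",
    endSeconds: 12.682,
    text: "Hello",
  },
];

console.log(cuesAt(cues, 12).map((cue) => cue.text)); // [ 'Hello' ]
console.log(cuesAt(cues, 13).length); // 0
```

The interval is treated as half-open (start included, end excluded), which is a design choice of this sketch so that back-to-back subtitles never overlap at the boundary.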

CLI

npx srt-parser-2 -i input.srt -o output.json --minify

Options:

| Option | Required | Default |
| --- | --- | --- |
| --input or -i | Yes | |
| --output or -o | No | output.json |
| --minify | No | false |

License

MIT

Why?

Why is this one special? There are plenty of SRT parsers on npm.

What's wrong with them?

Nothing.
All of them can handle this format:

1
00:00:11,544 --> 00:00:12,682
Hello

But I want to handle formats like these:

00:00:11.544

This format is wrong: it uses a period as the millisecond separator.

Or this:

00:00:11,5440

This is also wrong: the millisecond field has 4 digits (it should have 3).

Or this:

1:00:11,5

Similarly, the hour and millisecond fields have only 1 digit each (wrong).

Or this:

00:00:00.05

and so on.

Format Support

| Format | Other parsers | srt-parser-2 | srt-parser-2 would turn this into |
| --- | --- | --- | --- |
| 00:00:01,544 | Yes ✅ | Yes ✅ | 00:00:01,544 |
| 00:00:01.544 | ❓ Yes for some of them | Yes ✅ | 00:00:01,544 |
| 00:00:01.54 | ❓ Yes for some of them | Yes ✅ | 00:00:01,540 |
| 00:00:00.3333 | No ❌ | Yes ✅ | 00:00:00,333 |
| 00:00:00.3 | No ❌ | Yes ✅ | 00:00:00,300 |
| 1:2:3.4 | No ❌ | Yes ✅ | 01:02:03,400 |

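Basic principles:

  1. If the hour, minute, or second field is shorter than 2 digits, it is padded at the start with "0"; if it is longer than 2 digits, only the first 2 digits are kept.
  2. The millisecond field is normalized to 3 digits: shorter values are padded at the end with "0" (00:00:00.3 becomes 00:00:00,300 in the table above), and longer values keep only the first 3 digits.
  3. The separator can be a period (.) or a comma (,); a period (incorrect) is replaced with a comma (correct).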
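
The principles above can be sketched roughly as follows. This is an illustrative reconstruction and not necessarily how the library implements it; normalizeTimestamp and fixDigits are hypothetical names. Milliseconds are padded at the end, following the table row where 00:00:00.3 becomes 00:00:00,300, and treating a missing fractional part as 000 is an assumption of this sketch:

```javascript
// Illustrative sketch of the principles above; not the library's actual code.

// Force `value` to exactly `length` digits: keep the first `length` digits,
// then pad with "0" at the end (padAtEnd = true) or at the start (false).
function fixDigits(value, length, padAtEnd) {
  const trimmed = value.slice(0, length);
  return padAtEnd
    ? trimmed.padEnd(length, "0")
    : trimmed.padStart(length, "0");
}

function normalizeTimestamp(raw) {
  // Accept "." or "," as the separator; a missing fraction becomes "" (assumption).
  const [clock, fraction = ""] = raw.trim().replace(".", ",").split(",");
  const [hours = "", minutes = "", seconds = ""] = clock.split(":");
  const front = [
    fixDigits(hours, 2, false),
    fixDigits(minutes, 2, false),
    fixDigits(seconds, 2, false),
  ].join(":");
  return front + "," + fixDigits(fraction, 3, true);
}

console.log(normalizeTimestamp("00:00:01.544")); // 00:00:01,544
console.log(normalizeTimestamp("00:00:00.3333")); // 00:00:00,333
console.log(normalizeTimestamp("00:00:00.3")); // 00:00:00,300
console.log(normalizeTimestamp("1:2:3.4")); // 01:02:03,400
```

Hour, minute, and second fields are padded at the start because a short field there means a missing leading zero, while a short millisecond field means missing trailing precision.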

Conclusion

  1. Supports more time formats (even malformed ones)
  2. Has extensive tests

Languages

TypeScript 86.7%, JavaScript 13.3%