possibility to vary the region-detection threshold?

Question

aleks-mariusz opened this issue 2 years ago · comments

Is your feature request related to a problem? Please describe.
I have some videos where the speakers tend to pause between parts of their sentences. pyTranscriber seems to treat each of these pauses as a separate "region", so the translations are created in isolation and are often incorrect.

Is there a way to vary the region-detection threshold? It looks like the parameter is called min_region_size per this function, but I'm not sure how to make it adjustable from the GUI.
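For readers unfamiliar with this kind of parameter, here is a minimal sketch of energy-based region detection. It is not autosub's actual code: the function name `find_regions`, its arguments, and the `max_region_size` cap are assumptions made for illustration. It only shows why a larger `min_region_size` can merge speech across short pauses.

```python
def find_regions(energies, chunk_duration, threshold,
                 min_region_size=0.5, max_region_size=6.0):
    """Group loud chunks into (start, end) regions, in seconds.

    Hypothetical sketch: `energies` is one loudness value per audio chunk,
    and a chunk counts as silence when its energy is <= `threshold`.
    """
    regions = []
    region_start = None
    elapsed = 0.0
    for energy in energies:
        is_silence = energy <= threshold
        if region_start is not None:
            length = elapsed - region_start
            too_long = length >= max_region_size
            # Close the region on silence (or when it is too long),
            # but only once it has reached the minimum size.
            if (is_silence or too_long) and length >= min_region_size:
                regions.append((region_start, elapsed))
                region_start = None
        if region_start is None and not is_silence:
            region_start = elapsed
        elapsed += chunk_duration
    # Flush a region that is still open at the end of the audio.
    if region_start is not None:
        regions.append((region_start, elapsed))
    return regions


# Two bursts of speech separated by a one-second pause (1-second chunks).
energies = [5, 5, 0, 5, 5, 0, 0]
print(find_regions(energies, 1.0, threshold=1))
# [(0.0, 2.0), (3.0, 5.0)]
print(find_regions(energies, 1.0, threshold=1, min_region_size=4))
# [(0.0, 5.0)]
```

With the small minimum, the pause splits the speech into two regions; with the larger minimum, the pause is too short to end the region, so both bursts are sent as one piece.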

Is your feature request related to a problem? Please describe.
Yes, depending on the speaker, the default region-detection threshold may need to be adjusted.

Describe the solution you'd like
It would be nice if this could be varied (or maybe even disabled so the entire file counts as one region, unless there's a reason that's undesirable).

Describe alternatives you've considered
I looked into whether the pyTranscriber.app macOS deployment can be edited, but it looks like the code is encapsulated and non-modifiable there?

Raryel C. Souza · Answer 1 · Thu Dec 15 2022 10:43:44 GMT+0800 (China Standard Time)

Hi @aleks-mariusz,
Indeed, the binary distributions have the code encapsulated, so unless you are comfortable compiling it yourself (instructions are in the docs folder) there is no easy way to change that.

If you are comfortable with the command line, one suggestion is to try using autosub directly with that property edited. I confess I never tweaked it, so I'm not sure how helpful that will be in your case.

Raryel C. Souza · Answer 2 · Thu Dec 15 2022 10:48:10 GMT+0800 (China Standard Time)

If you try that in autosub and it is effective in solving your issue, please let me know so we can add an option in pyTranscriber to edit it in a future release.

Thanks

aleks · Answer 3 · Fri Dec 16 2022 21:54:03 GMT+0800 (China Standard Time)

Hi Raryel, thanks for the incredibly fast reply. I did peek at autosub; however, the repo says it's not currently maintained and its last commit is from 4 years ago, so I wasn't sure if that'll work. I'll give it a try and let you know if it makes a difference!

Raryel C. Souza · Answer 4 · Fri Dec 16 2022 22:36:10 GMT+0800 (China Standard Time)

Hi aleks, that is the underlying library that pyTranscriber uses in the background with almost no modifications... it should still be runnable.

aleks · Answer 5 · Sat Dec 17 2022 03:21:40 GMT+0800 (China Standard Time)

OK, I see. I gave autosub a try after changing the min_region_size parameter to 15 seconds, and I still got pretty lousy results. I've uploaded both the resulting .SRT file and a .SRT file generated by uploading a sample video to YouTube. The sample is less than 20 minutes long, whereas the dozens of real videos in my working set are almost identical except that they run 80-120 minutes, so it's a bit impractical to upload them to YouTube just to get subtitles :-/

If you have a few minutes, can you take a look at these and let me know why the one from YouTube is so incredibly accurate, whereas the one autosub outputs is basically rubbish, even though they both use Google?

I'm baffled as to how to make this app usable if the small region sizes aren't the cause :-/