Some part of transcript skipped in the VTT file

Question

Invincible166 opened this issue 4 years ago

There is a small issue in the code, on this line:
display_words = result['DisplayText'].split(' ')

The transcript can contain words like "person's". For such a word, Azure returns two entries in the "words" list: person and 's.
DisplayText keeps it as a single token, so the indexes of display_words fall out of sync with the words list. Use this instead:

display_words = transcript_obj['NBest'][max_confidence_index]['Lexical'].split(' ')

This would solve the problem. The Lexical text may lack formatting such as capitalization and punctuation, but that can be restored with additional code.
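
To make the mismatch concrete, here is a minimal sketch on a made-up response fragment. The field values, the "Words"/"Word"/"Confidence" key names, and the way max_confidence_index is computed are assumptions for illustration and are not taken from the actual code in this repo; real responses carry more fields.

```python
# Made-up response fragment; values are illustrative only, not real Azure output.
transcript_obj = {
    "DisplayText": "The person's bag is here.",
    "NBest": [
        {
            "Confidence": 0.62,
            "Lexical": "the persons bag is here",
            "Words": [{"Word": w} for w in ["the", "persons", "bag", "is", "here"]],
        },
        {
            "Confidence": 0.91,
            "Lexical": "the person 's bag is here",
            "Words": [{"Word": w} for w in ["the", "person", "'s", "bag", "is", "here"]],
        },
    ],
}

nbest = transcript_obj["NBest"]

# Assumed selection rule: pick the hypothesis with the highest confidence.
max_confidence_index = max(range(len(nbest)), key=lambda i: nbest[i]["Confidence"])
best = nbest[max_confidence_index]
words = best["Words"]

# Original approach: splitting DisplayText keeps "person's" as one token.
display_words = transcript_obj["DisplayText"].split(" ")

# Suggested approach: splitting Lexical yields one token per entry in the words list.
lexical_words = best["Lexical"].split(" ")

print(len(display_words), len(lexical_words), len(words))  # 5 6 6

# Sanity check before relying on index-based pairing.
in_sync = len(lexical_words) == len(words) and all(
    token == entry["Word"] for token, entry in zip(lexical_words, words)
)
print(in_sync)  # True
```

A check like in_sync is cheap to keep in the real code path: if the token count ever differs from the words list, it is safer to log and skip that segment than to write cues with shifted timings.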

Mike Wallio · Answer 1 · Mon Jun 08 2020 21:46:33 GMT+0800 (China Standard Time)

Awesome catch! Thanks!