githubharald / CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Home Page: https://towardsdatascience.com/b051d28f3d2e

Why are log probabilities not used?

janvainer opened this issue

Thank you for this awesome repo! ;)
I was wondering why log probabilities are not used. Is the beam search numerically stable even for long sequences?

  1. For my use case (text recognition), I usually had around 100 time-steps, and I did not run into numerical issues at that length.
  2. There has already been a discussion about this; the changes may already be implemented in the fork, see: #13
  3. If you're comfortable with C++, it should not be too difficult to implement the changes.
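For anyone attempting the conversion, here is a minimal sketch of the log-space arithmetic such a change would probably need. It is not code from this repo or from the fork: the helper names (`toLog`, `logMul`, `logAdd`, `linearPathProb`, `logPathProb`) and the use of `double` are assumptions made for illustration. The idea is that products of probabilities become sums of logs, and sums of probabilities (for example, when merging paths that map to the same text) become a log-sum-exp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// log(0): stands for an impossible path. Hypothetical constant, not from the repo.
const double kLogZero = -std::numeric_limits<double>::infinity();

// Convert a linear probability to log space.
inline double toLog(double p) {
    assert(p >= 0.0);
    return p > 0.0 ? std::log(p) : kLogZero;
}

// Product of two probabilities: log(a * b) = log(a) + log(b).
inline double logMul(double la, double lb) {
    return la + lb;
}

// Sum of two probabilities: log(exp(la) + exp(lb)), computed stably
// by factoring out the larger term before exponentiating.
inline double logAdd(double la, double lb) {
    if (la == kLogZero) return lb;
    if (lb == kLogZero) return la;
    const double hi = std::max(la, lb);
    const double lo = std::min(la, lb);
    return hi + std::log1p(std::exp(lo - hi));
}

// Probability of a path that has probability p at each of `steps` time-steps,
// accumulated in linear space. Underflows to 0 for long sequences.
inline double linearPathProb(double p, int steps) {
    double prob = 1.0;
    for (int t = 0; t < steps; ++t) prob *= p;
    return prob;
}

// The same quantity accumulated in log space. Stays finite.
inline double logPathProb(double p, int steps) {
    double logProb = 0.0;  // log(1)
    for (int t = 0; t < steps; ++t) logProb = logMul(logProb, toLog(p));
    return logProb;
}
```

With these helpers, `linearPathProb(0.01, 200)` underflows to zero in double precision, while `logPathProb(0.01, 200)` remains an ordinary finite negative number. Whether a real decoding run hits this depends on the per-step probabilities and the sequence length, so it is worth testing on your own data first.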

Thanks for the response, I will look into the forked repo. :)

@lordofluck FYI, I never did make any log-space changes. But as @githubharald says, the conversion process would be moderately straightforward.

@weinman Thank you for the info. I am testing the code on long sequences now. If the decoding fails for my case, I may implement the log-space operations in the future.