speechio / chinese_text_normalization

Chinese text normalization for speech processing

Add Thrax support for TN?

npuichigo opened this issue

Google uses Thrax and Sparrowhawk to build its TN system. Since Thrax is already used for ITN here, would it be handy to add the corresponding Thrax rules for TN?

@npuichigo Hi, Yuchao, you are right, and it should be done in due time.

The reasons I haven't done it yet:

  1. For now, in the speech recognition context, I use TN primarily for offline text processing, which involves other text processing scripts (such as data cleaning, simplified/traditional Chinese conversion, etc.), so keeping TN a simple Python program is convenient.
  2. For recognition, TN is not a runtime module (ITN is), so efficiency is not a problem.
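
To illustrate the first point, here is a minimal sketch of how a pure-Python TN step chains with other offline text processing. Everything in it is hypothetical: `clean`, `normalize_digits`, `pipeline`, and the `DIGITS` table are illustrative names, not the functions this repository actually provides, and the digit rule is a toy compared with real TN.

```python
import re

# Hypothetical digit table for illustration; real TN also handles
# dates, money, fractions, measures, and so on.
DIGITS = {"0": "零", "1": "一", "2": "二", "3": "三", "4": "四",
          "5": "五", "6": "六", "7": "七", "8": "八", "9": "九"}


def clean(text: str) -> str:
    """Collapse whitespace runs and trim the ends (stand-in for data cleaning)."""
    return re.sub(r"\s+", " ", text).strip()


def normalize_digits(text: str) -> str:
    """Toy TN rule: read every ASCII digit out one by one."""
    return re.sub(r"[0-9]", lambda m: DIGITS[m.group(0)], text)


def pipeline(lines):
    """Apply cleaning, then TN, to each line of an offline corpus."""
    for line in lines:
        yield normalize_digits(clean(line))


if __name__ == "__main__":
    for out in pipeline(["  房间 302  "]):
        print(out)  # prints: 房间 三零二
```

Because each step is an ordinary function on strings, adding or reordering steps (for example a simplified/traditional conversion) takes one line, which is the convenience being described.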

On the other hand, in the speech synthesis context, TN is definitely a runtime module, and it should be done in Thrax and Sparrowhawk as you said.

A PR would be welcome if anyone can start working on Thrax TN rewriting rules. It shouldn't be very difficult, using the ITN rules I've already written as a reference.