lifeiteng / NaturalSpeech2

NaturalSpeech2

Progress

  • Align datasets
  • Implement modules
  • Training
  • End-To-End Synthesizer
  • Add Loss CE RVQ
  • Subjective Evaluation
  • Objective Evaluation
  • Demo Page

Objective Evaluation

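| Prompt | WER | Speaker Cosine Similarity | Utterance-Level Pitch Mean MAE | Utterance-Level Pitch Std MAE | Utterance-Level Duration Diff |
|---|---|---|---|---|---|
| Ground Truth | 0.86 | - | - | - | - |
| 2 Seconds | | | | | |
| 4 Seconds | | | | | |
| 6 Seconds | | | | | |
| 8 Seconds | | | | | |
| 4 Seconds (PrefixPrompt) (avg utter duration) | | | | | |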
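
The README names these metrics but does not say how they are computed. As a rough illustration only, the sketch below shows one common way to define three of them in plain Python. All function names are hypothetical and are not taken from this repository. It assumes that transcripts are already normalized, that speaker embeddings come from some external speaker-verification model, and that F0 values (in Hz) have been extracted for voiced frames only.

```python
import math
from statistics import mean, pstdev


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length (a fraction, not a percent)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, substitution)
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def utterance_pitch_mae(ref_f0, syn_f0):
    """Return (pitch mean MAE, pitch std MAE) over paired utterances.

    ref_f0 and syn_f0 are lists with one list of voiced F0 values per utterance.
    Each utterance is reduced to its mean and population standard deviation,
    and the absolute differences are averaged across utterances.
    """
    if len(ref_f0) != len(syn_f0) or not ref_f0:
        raise ValueError("need the same non-zero number of utterances on both sides")
    mean_errors = [abs(mean(r) - mean(s)) for r, s in zip(ref_f0, syn_f0)]
    std_errors = [abs(pstdev(r) - pstdev(s)) for r, s in zip(ref_f0, syn_f0)]
    return mean(mean_errors), mean(std_errors)
```

In a setup like this, WER would typically be computed between the input text and an ASR transcript of the synthesized audio, and speaker similarity between embeddings of the prompt and the synthesized speech. Whether this project follows that procedure is an assumption.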
