seastar105 / kr-custom-tts

The Colab notebook used in the guide does not seem to work properly.

noname7777777 opened this issue

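To be precise: after finishing all the preparation steps, I run the last cell to start training. It downloads the pretrained KSS model and then, right before training begins, crashes with a file-not-found exception.

The log is below.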

rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'dump': No such file or directory
rm: cannot remove 'exp': No such file or directory
<<<<<<<<<< Data Processing, org to kaldi format >>>>>>>>>>
size= 225kB time=00:00:04.78 bitrate= 384.4kbits/s speed=1.14e+03x
size= 209kB time=00:00:04.44 bitrate= 384.4kbits/s speed= 509x
size= 171kB time=00:00:03.64 bitrate= 384.5kbits/s speed=1.05e+03x
size= 98kB time=00:00:02.09 bitrate= 384.7kbits/s speed= 972x
size= 206kB time=00:00:04.38 bitrate= 384.4kbits/s speed=1.03e+03x
size= 119kB time=00:00:02.52 bitrate= 384.6kbits/s speed= 970x
size= 147kB time=00:00:03.13 bitrate= 384.5kbits/s speed=1.06e+03x
size= 371kB time=00:00:07.91 bitrate= 384.3kbits/s speed=1.22e+03x
size= 340kB time=00:00:07.24 bitrate= 384.3kbits/s speed=1.13e+03x
size= 429kB time=00:00:09.13 bitrate= 384.3kbits/s speed=1.22e+03x
size= 26kB time=00:00:00.54 bitrate= 386.4kbits/s speed= 105x
size= 42kB time=00:00:00.90 bitrate= 385.6kbits/s speed= 724x
size= 141kB time=00:00:03.00 bitrate= 384.5kbits/s speed=1.04e+03x
size= 103kB time=00:00:02.20 bitrate= 384.7kbits/s speed= 936x
size= 58kB time=00:00:01.23 bitrate= 385.1kbits/s speed= 891x
size= 123kB time=00:00:02.61 bitrate= 384.6kbits/s speed=1.03e+03x
size= 213kB time=00:00:04.54 bitrate= 384.4kbits/s speed=1.1e+03x
size= 193kB time=00:00:04.11 bitrate= 384.5kbits/s speed=1.11e+03x
size= 249kB time=00:00:05.30 bitrate= 384.3kbits/s speed=1.18e+03x
size= 300kB time=00:00:06.40 bitrate= 384.3kbits/s speed=1.17e+03x
size= 249kB time=00:00:05.30 bitrate= 384.3kbits/s speed=1.13e+03x
size= 300kB time=00:00:06.40 bitrate= 384.3kbits/s speed=1.12e+03x
size= 256kB time=00:00:05.46 bitrate= 384.4kbits/s speed=1.18e+03x
size= 122kB time=00:00:02.60 bitrate= 384.6kbits/s speed=1e+03x
size= 213kB time=00:00:04.52 bitrate= 384.4kbits/s speed=1.06e+03x
size= 165kB time=00:00:03.51 bitrate= 384.5kbits/s speed=1.1e+03x
size= 163kB time=00:00:03.48 bitrate= 384.5kbits/s speed=1.11e+03x
size= 123kB time=00:00:02.62 bitrate= 384.6kbits/s speed=1e+03x
size= 113kB time=00:00:02.41 bitrate= 384.7kbits/s speed=1.01e+03x
size= 100kB time=00:00:02.13 bitrate= 384.7kbits/s speed=1.03e+03x
size= 144kB time=00:00:03.06 bitrate= 384.5kbits/s speed=1.11e+03x
size= 245kB time=00:00:05.21 bitrate= 384.4kbits/s speed=1.08e+03x
size= 201kB time=00:00:04.28 bitrate= 384.4kbits/s speed=1.1e+03x
size= 121kB time=00:00:02.58 bitrate= 384.6kbits/s speed= 902x
size= 222kB time=00:00:04.73 bitrate= 384.4kbits/s speed=1.14e+03x
size= 283kB time=00:00:06.03 bitrate= 384.3kbits/s speed=1.15e+03x
size= 267kB time=00:00:05.69 bitrate= 384.3kbits/s speed=1.15e+03x
size= 205kB time=00:00:04.36 bitrate= 384.4kbits/s speed=1.14e+03x
size= 212kB time=00:00:04.51 bitrate= 384.4kbits/s speed=1.11e+03x
size= 116kB time=00:00:02.47 bitrate= 384.7kbits/s speed=1.04e+03x
utils/ file data/train/utt2spk is not in sorted order or not unique, sorting it
utils/ file data/train/text is not in sorted order or not unique, sorting it
utils/ file data/train/wav.scp is not in sorted order or not unique, sorting it kept all 30 utterances. old files are kept in data/train/.backup
utils/ file data/eval/utt2spk is not in sorted order or not unique, sorting it
utils/ file data/eval/text is not in sorted order or not unique, sorting it
utils/ file data/eval/wav.scp is not in sorted order or not unique, sorting it kept all 10 utterances. old files are kept in data/eval/.backup
utils/ You are splitting into too many pieces! [reduce $nj (32) to be smaller than the number of lines (30) in data/train/wav.scp] kept all 30 utterances. old files are kept in data/train/.backup
utils/ You are splitting into too many pieces! [reduce $nj (32) to be smaller than the number of lines (10) in data/eval/wav.scp] kept all 10 utterances. old files are kept in data/eval/.backup
<<<<<<<<<< Data Processing, dump data >>>>>>>>>>
2022-10-30T11:25:01 ( ./ --tts_task tts --feats_extract fbank --feats_normalize global_mvn --local_data_opts --text_format raw --audio_format wav --lang ko --feats_type raw --fs 24000 --n_fft 2048 --n_shift 300 --win_length 1200 --token_type phn --cleaner none --g2p g2pk_no_space --train_config conf/train.yaml --inference_config conf/decode.yaml --train_set train --valid_set eval --test_sets eval --srctexts data/train/text --tts_task gan_tts --min_wav_duration 0.683 --fs 24000 --fmin 0 --fmax null --n_fft 1024 --n_shift 256 --win_length null --train_config conf/tuning/finetune_jets.yaml --token_type phn --g2p g2pk --cleaner null --stage 2 --stop-stage 5 --expdir /content/drive/MyDrive/exp --expdir /content/drive/MyDrive/exp
2022-10-30T11:25:01 ( Stage 2: Format wav.scp: data/ -> dump/raw/
utils/ copied data from data/train to dump/raw/org/train
utils/ WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
for more information.
utils/ Successfully validated data-directory dump/raw/org/train
2022-10-30T11:25:01 ( scripts/audio/ --nj 32 --cmd --audio-format wav --fs 24000 data/train/wav.scp dump/raw/org/train
2022-10-30T11:25:01 ( [info]: without segments
2022-10-30T11:25:17 ( Successfully finished. [elapsed=16s]
utils/ copied data from data/eval to dump/raw/org/eval
utils/ WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
for more information.
utils/ Successfully validated data-directory dump/raw/org/eval
2022-10-30T11:25:17 ( scripts/audio/ --nj 32 --cmd --audio-format wav --fs 24000 data/eval/wav.scp dump/raw/org/eval
2022-10-30T11:25:17 ( [info]: without segments
2022-10-30T11:25:23 ( Successfully finished. [elapsed=6s]
utils/ copied data from data/eval to dump/raw/org/eval
utils/ WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
for more information.
utils/ Successfully validated data-directory dump/raw/org/eval
2022-10-30T11:25:23 ( scripts/audio/ --nj 32 --cmd --audio-format wav --fs 24000 data/eval/wav.scp dump/raw/org/eval
2022-10-30T11:25:23 ( [info]: without segments
2022-10-30T11:25:28 ( Successfully finished. [elapsed=5s]
2022-10-30T11:25:28 ( Stage 3: Remove long/short data: dump/raw/org -> dump/raw
utils/ copied data from dump/raw/org/train to dump/raw/train
utils/ WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
for more information.
utils/ Successfully validated data-directory dump/raw/train kept 29 utterances out of 30 old files are kept in dump/raw/train/.backup
utils/ copied data from dump/raw/org/eval to dump/raw/eval
utils/ WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
for more information.
utils/ Successfully validated data-directory dump/raw/eval kept all 10 utterances. old files are kept in dump/raw/eval/.backup
2022-10-30T11:25:29 ( Stage 4: Generate token_list from data/train/text
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /root/nltk_data...
[nltk_data] Unzipping taggers/
[nltk_data] Downloading package cmudict to /root/nltk_data...
[nltk_data] Unzipping corpora/
/usr/bin/python3 /usr/local/lib/python3.7/dist-packages/espnet2/bin/ --token_type phn -f 2- --input dump/raw/srctexts --output dump/token_list/phn_null_g2pk/tokens.txt --non_linguistic_symbols none --cleaner null --g2p g2pk --write_vocabulary true --add_symbol ':0' --add_symbol ':1' --add_symbol '<sos/eos>:-1'
2022-10-30 11:25:32,161 (tokenize_text:172) INFO: OOV rate = 0.0 %
2022-10-30T11:25:32 ( Stage 5: TTS collect stats: train_set=dump/raw/train, valid_set=dump/raw/eval
2022-10-30T11:25:38 ( Generate '/content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/'. You can resume the process from stage 5 using this script
2022-10-30T11:25:38 ( TTS collect_stats started... log: '/content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.*.log'
/usr/bin/python3 /usr/local/lib/python3.7/dist-packages/espnet2/bin/ --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.1 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.2 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.3 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.4 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.5 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.6 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.7 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.8 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.9 --input_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk/logdir/stats.10 --output_dir /content/drive/MyDrive/exp/tts_stats_raw_phn_null_g2pk
2022-10-30T11:27:10 ( Skip the uploading stage
2022-10-30T11:27:10 ( Skip the uploading to HuggingFace stage
2022-10-30T11:27:10 ( Successfully finished. [elapsed=129s]
<<<<<<<<<< Download Pretrained KSS Model >>>>>>>>>>
Fetching 36 files: 0% 0/36 [00:00<?, ?it/s]
Downloading: 100% 1.17k/1.17k [00:00<00:00, 1.30MB/s]
Fetching 36 files: 3% 1/36 [00:00<00:06, 5.26it/s]
Downloading: 100% 11.4k/11.4k [00:00<00:00, 8.77MB/s]
Fetching 36 files: 6% 2/36 [00:00<00:06, 5.39it/s]
Downloading: 100% 770/770 [00:00<00:00, 930kB/s]
Fetching 36 files: 8% 3/36 [00:00<00:06, 5.34it/s]
Downloading: 100% 1.40k/1.40k [00:00<00:00, 1.99MB/s]
Fetching 36 files: 11% 4/36 [00:00<00:05, 5.35it/s]
Downloading: 100% 770/770 [00:00<00:00, 1.04MB/s]
Fetching 36 files: 14% 5/36 [00:00<00:05, 5.28it/s]
Downloading: 100% 9.31k/9.31k [00:00<00:00, 11.4MB/s]
Fetching 36 files: 17% 6/36 [00:01<00:05, 5.32it/s]
Downloading: 100% 73.6k/73.6k [00:00<00:00, 1.90MB/s]
Fetching 36 files: 19% 7/36 [00:01<00:05, 4.93it/s]
Downloading: 100% 73.8k/73.8k [00:00<00:00, 3.97MB/s]
Fetching 36 files: 22% 8/36 [00:01<00:05, 4.79it/s]
Downloading: 100% 79.8k/79.8k [00:00<00:00, 2.02MB/s]
Fetching 36 files: 25% 9/36 [00:01<00:05, 4.69it/s]
Downloading: 100% 64.3k/64.3k [00:00<00:00, 2.97MB/s]
Fetching 36 files: 28% 10/36 [00:02<00:05, 4.50it/s]
Downloading: 100% 69.3k/69.3k [00:00<00:00, 1.78MB/s]
Fetching 36 files: 31% 11/36 [00:02<00:05, 4.47it/s]
Downloading: 100% 61.3k/61.3k [00:00<00:00, 1.58MB/s]
Fetching 36 files: 33% 12/36 [00:02<00:05, 4.45it/s]
Downloading: 100% 75.7k/75.7k [00:00<00:00, 3.82MB/s]
Fetching 36 files: 36% 13/36 [00:02<00:05, 4.47it/s]
Downloading: 100% 33.6k/33.6k [00:00<00:00, 1.69MB/s]
Fetching 36 files: 39% 14/36 [00:02<00:04, 4.54it/s]
Downloading: 100% 36.2k/36.2k [00:00<00:00, 1.84MB/s]
Fetching 36 files: 42% 15/36 [00:03<00:04, 4.62it/s]
Downloading: 100% 30.6k/30.6k [00:00<00:00, 1.56MB/s]
Fetching 36 files: 44% 16/36 [00:03<00:04, 4.67it/s]
Downloading: 100% 68.1k/68.1k [00:00<00:00, 1.73MB/s]
Fetching 36 files: 47% 17/36 [00:03<00:04, 4.56it/s]
Downloading: 100% 47.9k/47.9k [00:00<00:00, 1.30MB/s]
Fetching 36 files: 50% 18/36 [00:03<00:03, 4.55it/s]
Downloading: 100% 61.9k/61.9k [00:00<00:00, 3.33MB/s]
Fetching 36 files: 53% 19/36 [00:04<00:03, 4.49it/s]
Downloading: 100% 45.1k/45.1k [00:00<00:00, 1.15MB/s]
Fetching 36 files: 56% 20/36 [00:04<00:03, 4.47it/s]
Downloading: 100% 35.9k/35.9k [00:00<00:00, 1.83MB/s]
Fetching 36 files: 58% 21/36 [00:04<00:03, 4.54it/s]
Downloading: 100% 36.0k/36.0k [00:00<00:00, 1.98MB/s]
Fetching 36 files: 61% 22/36 [00:04<00:03, 4.55it/s]
Downloading: 100% 33.1k/33.1k [00:00<00:00, 1.75MB/s]
Fetching 36 files: 64% 23/36 [00:04<00:02, 4.65it/s]
Downloading: 100% 69.1k/69.1k [00:00<00:00, 3.51MB/s]
Fetching 36 files: 67% 24/36 [00:05<00:02, 4.57it/s]
Downloading: 100% 64.5k/64.5k [00:00<00:00, 1.66MB/s]
Fetching 36 files: 69% 25/36 [00:05<00:02, 4.51it/s]
Downloading: 100% 36.4k/36.4k [00:00<00:00, 1.85MB/s]
Fetching 36 files: 72% 26/36 [00:05<00:02, 4.59it/s]
Downloading: 100% 36.7k/36.7k [00:00<00:00, 1.92MB/s]
Fetching 36 files: 75% 27/36 [00:05<00:01, 4.62it/s]
Downloading: 100% 32.0k/32.0k [00:00<00:00, 1.72MB/s]
Fetching 36 files: 78% 28/36 [00:05<00:01, 4.71it/s]
Downloading: 100% 36.8k/36.8k [00:00<00:00, 1.87MB/s]
Fetching 36 files: 81% 29/36 [00:06<00:01, 4.69it/s]
Downloading: 100% 29.0k/29.0k [00:00<00:00, 1.57MB/s]
Fetching 36 files: 83% 30/36 [00:06<00:01, 4.69it/s]
Downloading: 100% 77.7k/77.7k [00:00<00:00, 2.07MB/s]
Fetching 36 files: 86% 31/36 [00:06<00:01, 4.53it/s]
Downloading: 100% 25.3k/25.3k [00:00<00:00, 1.34MB/s]
Fetching 36 files: 89% 32/36 [00:06<00:00, 4.67it/s]
Downloading: 100% 25.1k/25.1k [00:00<00:00, 1.27MB/s]
Fetching 36 files: 92% 33/36 [00:07<00:00, 4.71it/s]
Downloading: 100% 66.7k/66.7k [00:00<00:00, 1.74MB/s]
Fetching 36 files: 94% 34/36 [00:07<00:00, 4.61it/s]
Downloading: 0% 0.00/334M [00:00<?, ?B/s]
Downloading: 0% 329k/334M [00:00<01:43, 3.21MB/s]
Downloading: 1% 1.95M/334M [00:00<00:30, 10.8MB/s]
Downloading: 1% 4.04M/334M [00:00<00:21, 15.4MB/s]
Downloading: 2% 6.65M/334M [00:00<00:16, 19.6MB/s]
Downloading: 3% 10.1M/334M [00:00<00:13, 24.8MB/s]
Downloading: 4% 14.4M/334M [00:00<00:10, 31.1MB/s]
Downloading: 6% 19.9M/334M [00:00<00:08, 39.1MB/s]
Downloading: 8% 26.3M/334M [00:00<00:06, 47.0MB/s]
Downloading: 10% 34.0M/334M [00:00<00:05, 56.1MB/s]
Downloading: 12% 41.6M/334M [00:01<00:04, 62.3MB/s]
Downloading: 15% 49.7M/334M [00:01<00:04, 68.1MB/s]
Downloading: 17% 57.3M/334M [00:01<00:03, 70.6MB/s]
Downloading: 19% 64.7M/334M [00:01<00:03, 71.5MB/s]
Downloading: 22% 71.9M/334M [00:01<00:03, 70.2MB/s]
Downloading: 24% 79.3M/334M [00:01<00:03, 71.5MB/s]
Downloading: 26% 87.2M/334M [00:01<00:03, 73.6MB/s]
Downloading: 28% 94.6M/334M [00:01<00:04, 55.2MB/s]
Downloading: 31% 103M/334M [00:01<00:03, 61.3MB/s]
Downloading: 33% 109M/334M [00:02<00:05, 42.9MB/s]
Downloading: 35% 116M/334M [00:02<00:04, 47.5MB/s]
Downloading: 37% 124M/334M [00:02<00:03, 54.8MB/s]
Downloading: 39% 131M/334M [00:02<00:03, 58.3MB/s]
Downloading: 42% 139M/334M [00:02<00:03, 64.0MB/s]
Downloading: 44% 146M/334M [00:02<00:03, 60.3MB/s]
Downloading: 46% 153M/334M [00:02<00:03, 57.1MB/s]
Downloading: 48% 161M/334M [00:02<00:02, 63.2MB/s]
Downloading: 50% 168M/334M [00:03<00:02, 66.3MB/s]
Downloading: 53% 176M/334M [00:03<00:02, 70.9MB/s]
Downloading: 55% 184M/334M [00:03<00:02, 68.8MB/s]
Downloading: 57% 191M/334M [00:03<00:02, 66.4MB/s]
Downloading: 59% 198M/334M [00:03<00:01, 69.2MB/s]
Downloading: 62% 206M/334M [00:03<00:01, 71.2MB/s]
Downloading: 64% 214M/334M [00:03<00:01, 72.6MB/s]
Downloading: 66% 222M/334M [00:03<00:01, 75.3MB/s]
Downloading: 69% 229M/334M [00:03<00:01, 74.3MB/s]
Downloading: 71% 238M/334M [00:04<00:01, 76.9MB/s]
Downloading: 74% 246M/334M [00:04<00:01, 77.6MB/s]
Downloading: 76% 253M/334M [00:04<00:01, 77.2MB/s]
Downloading: 78% 261M/334M [00:04<00:00, 78.2MB/s]
Downloading: 81% 269M/334M [00:04<00:01, 64.1MB/s]
Downloading: 83% 277M/334M [00:04<00:00, 67.5MB/s]
Downloading: 85% 284M/334M [00:04<00:00, 68.2MB/s]
Downloading: 87% 291M/334M [00:04<00:00, 64.1MB/s]
Downloading: 89% 298M/334M [00:04<00:00, 66.1MB/s]
Downloading: 91% 305M/334M [00:05<00:00, 62.2MB/s]
Downloading: 94% 312M/334M [00:05<00:00, 65.0MB/s]
Downloading: 96% 319M/334M [00:05<00:00, 58.9MB/s]
Downloading: 97% 325M/334M [00:05<00:00, 55.7MB/s]
Downloading: 100% 334M/334M [00:05<00:00, 60.6MB/s]
Fetching 36 files: 97% 35/36 [00:13<00:01, 1.95s/it]
Downloading: 100% 297/297 [00:00<00:00, 462kB/s]
Fetching 36 files: 100% 36/36 [00:13<00:00, 2.68it/s]
{'train_config': '/content/espnet/egs2/finetune/tts1/downloads/models--imdanboy--kss_tts_train_jets_raw_phn_null_g2pk_train.total_count.ave/snapshots/b059fd8f0fefd7c779cdca610fd29ab7cab692cf/exp/tts_train_jets_raw_phn_null_g2pk/config.yaml', 'model_file': '/content/espnet/egs2/finetune/tts1/downloads/models--imdanboy--kss_tts_train_jets_raw_phn_null_g2pk_train.total_count.ave/snapshots/b059fd8f0fefd7c779cdca610fd29ab7cab692cf/exp/tts_train_jets_raw_phn_null_g2pk/train.total_count.ave_5best.pth'}
Traceback (most recent call last):
File "pyscripts/utils/", line 33, in
File "pyscripts/utils/", line 20, in main
with open(args.inyaml, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'downloads/imdanboy--kss_tts_train_jets_raw_phn_null_g2pk_train.total_count.ave.main.b059fd8f0fefd7c779cdca610fd29ab7cab692cf/exp/tts_train_jets_raw_phn_null_g2pk/config.yaml'
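
The last lines of the log show two different locations for the same file. The dict printed after the download reports `train_config` under `downloads/models--imdanboy--.../snapshots/<hash>/...`, while the script tries to open `downloads/imdanboy--....main.<hash>/...`. A minimal sketch of one way to avoid that mismatch is to read the path from the returned dict instead of rebuilding it by hand. The helper name `resolve_train_config` is hypothetical, and this is not necessarily how the notebook was actually fixed.

```python
from pathlib import Path


def resolve_train_config(downloaded: dict) -> Path:
    """Return the config path reported by the downloader.

    `downloaded` is assumed to look like the dict printed in the log,
    i.e. it has a 'train_config' key holding a file path.
    """
    config_path = Path(downloaded["train_config"])
    # Fail early with a clear message instead of failing later inside open().
    if not config_path.is_file():
        raise FileNotFoundError(f"train_config not found: {config_path}")
    return config_path
```

Passing the dict from the download step straight into a helper like this keeps the script independent of how the download cache names its directories.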

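@noname7777777 Hello. I have fixed the notebook so that training proceeds if you rerun it starting from the cell that fetches this repo again.

From the log you posted, it looks like you are using only 30 training samples. With that few, the results may be poor.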
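
With a dataset this small, the log also warns that `$nj (32)` is larger than the number of lines in `data/train/wav.scp` (30) and `data/eval/wav.scp` (10). A minimal sketch of choosing a job count that never exceeds the number of utterances is below. The helper name `effective_nj` is hypothetical, and how the value is passed on to the recipe is not shown in this thread.

```python
from pathlib import Path


def effective_nj(wav_scp: Path, requested_nj: int) -> int:
    """Clamp the number of parallel jobs to the number of utterances."""
    with open(wav_scp, encoding="utf-8") as f:
        # Each non-empty line of wav.scp describes one utterance.
        num_utts = sum(1 for line in f if line.strip())
    # Never return fewer than one job, even for an empty file.
    return max(1, min(requested_nj, num_utts))
```

For the two sets in the log, this would give 30 jobs for the train set and 10 for the eval set when 32 are requested.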

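Sorry for the late reply. Thank you for the fix.

As for the sample count, I only put in a small number as a quick test before working on a large batch. Now that I have confirmed the fix, it should be fine.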