babysor / MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

Running the model provided here raises this RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).

wangkewk opened this issue · comments

Running with the model provided here produces this RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).

Can anyone solve this?

commented

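This is an incompatibility caused by a fix I made recently. You can change line 11 of synthesizer/utils/symbols.py to:
_characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? '
and that should do it. Please don't close this issue for now. If too many people run into this, I'll add a compatibility fix.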
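
Why a one-line edit changes the shape in the error: an embedding table has one row per entry in the symbol list, so shortening `_characters` shrinks the first dimension. The following is a minimal sketch, not the repository's code. It assumes the symbol list is a padding symbol plus an end-of-sequence symbol plus the characters; the names `_pad` and `_eos`, the helper `embedding_rows`, and the ten-digit "newer" string are all assumptions that are not quoted anywhere in this thread.

```python
import string

# Assumption: two special symbols (padding, end-of-sequence) precede the characters.
_pad = "_"
_eos = "~"

PUNCTUATION = "!'(),-.:;? "

# Character set from the fix quoted above (digits "12340").
old_characters = string.ascii_uppercase + string.ascii_lowercase + "12340" + PUNCTUATION

# Assumption: the newer code lists all ten digits; this string is not quoted in the thread.
new_characters = string.ascii_uppercase + string.ascii_lowercase + "1234567890" + PUNCTUATION


def embedding_rows(characters):
    """Number of rows an embedding table needs for this symbol list."""
    return len([_pad, _eos] + list(characters))


print(embedding_rows(old_characters))  # 70
print(embedding_rows(new_characters))  # 75
```

Under these assumptions the two counts match the 70 and 75 in the error message, which is why the edit lets a 70-row checkpoint load.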

No worries, I'll leave it open.

Same problem!

Thanks, this works. After the change, what used to be pure noise became normal speech.

Thanks, problem solved.

Thanks, it's resolved.

commented

Same problem.

Thanks! The problem was resolved without trouble.

Same here!

Works perfectly after the change, thanks~

+1

commented

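It works after the change, thank you.

The problem is indeed solved, but the voice quality isn't as good as in the Bilibili demo. I deliberately picked a recording of a novel being read aloud, so I'm not sure what's wrong.
If I want the output to sound very much like a specific person, how can I improve it?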

commented

Same problem.

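If I use a model I trained myself, do I need to change this back for it to work, or is it fine to leave it? I tried my newly trained model and got no sound (which could also be a problem with my own training), but the provided model works normally.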

commented

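Changing it back gives slightly better results, but it also works without changing it back.

It finally works. I struggled with this for a long time and thought it was a problem with my local environment.

The output sounds like a robot. Is that because results differ between computer environments? If so, do I have to retrain the model myself?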

commented

No. It is probably caused by a different vocoder or different input audio.

commented

+1

Sigh, it still doesn't match the video. It sounds like the stilted Chinese of a foreigner who has just arrived in **.

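I'm also using the Bilibili uploader's model, but I don't get the quality shown on Bilibili. Mine sounds like 伏拉夫's intonation, not like Chinese at all.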

commented

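If the recording is clear and the delivery is flat, the timbre cloning is actually decent. Did something perhaps not run correctly?

I modified synthesizer/utils/symbols.py, but I still get an error: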

Synthesizer using device: cuda
Trainable Parameters: 32.735M
Traceback (most recent call last):
  File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 123, in <lambda>
    func = lambda: self.synthesize() or self.vocode()
  File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 238, in synthesize
    specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token)
  File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 87, in synthesize_spectrograms
    self.load()
  File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 65, in load
    self._model.load(self.model_fpath)
  File "D:\AI\sv2tts_china\MockingBird\synthesizer\models\tacotron.py", line 525, in load
    self.load_state_dict(checkpoint["model_state"], strict=False)
  File "D:\ProgramData\Anaconda3\envs\Real-Time-Voice-Cloning\lib\site-packages\torch\nn\modules\module.py", line 1483, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron:
        size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
        size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
        size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
        size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
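
None of the four parameters in this traceback is the embedding, so editing `symbols.py` would not be expected to affect them. One quick way to see what they have in common is to subtract the shapes. The sketch below uses only the numbers printed in the traceback; `shape_delta` is a hypothetical helper introduced for illustration.

```python
# Shapes copied from the traceback above:
# parameter name -> (shape in checkpoint, shape in current model).
mismatches = {
    "encoder_proj.weight": ((128, 512), (128, 1024)),
    "decoder.attn_rnn.weight_ih": ((384, 768), (384, 1280)),
    "decoder.rnn_input.weight": ((1024, 640), (1024, 1152)),
    "decoder.stop_proj.weight": ((1, 1536), (1, 2048)),
}


def shape_delta(checkpoint_shape, model_shape):
    """Per-dimension difference: current model size minus checkpoint size."""
    return tuple(m - c for c, m in zip(checkpoint_shape, model_shape))


deltas = {name: shape_delta(ckpt, model) for name, (ckpt, model) in mismatches.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta}")
```

Every layer's second dimension is larger by exactly 512 in the current code. That would be consistent with the checkpoint having been trained on an older version of the architecture that fed a narrower vector into these layers, which fits the later advice in this thread to switch code versions. This reading is an inference from the numbers, not something confirmed in the thread.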

Me too

commented

Try this model:
Link: https://pan.baidu.com/s/1fMh9IlgKJlL2PIiRTYDUvw
Extraction code: om7f

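It runs now, but only the first half of the generated sentence is actually spoken; the second half is all noise. Regenerating several times sometimes improves it and sometimes makes it worse again. Also, the generated voice doesn't resemble the original audio. It's quite far off, haha.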

This model is fine. Just change _characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? ' back to the original.

This fixes it, but when I test with the demo audio, the generated output is a lot worse, emmm.

commented

It was trained for very few steps. You can continue training it to 100k+.

commented

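The ceshi model can be used after switching the code to a commit from around October 20 and then applying the change from issue #37.
The author's model, by contrast, should be used after switching the code to a commit from around October 20.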

Does that mean the output keeps getting better the more times I click "Synthesize only"?

It still doesn't work after the change... I hope you can take another look.

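I think you probably copied the model into your program folder at the very start. Re-extract the program's archive and start over, and it should work.

Why is my source audio greyed out? Does anyone know?

The Dataset and Speaker fields for the source audio are all greyed out and can't be selected?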

commented

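No dataset was detected. If you're not training, you can ignore it.

A question for the experts: to clone my own voice, do I need to train on my own recordings rather than directly using the models provided by the community? Yesterday I used the provided models (including the synthesizer and vector) to clone my own recording, and the resulting mel spectrogram was a mess, with nothing but electrical hum and noise. Please point out what I'm doing wrong.

I read through the whole comment thread: raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(. Everything generated is noise. I changed the code as described and it still doesn't work. Switching models doesn't help either...

Same error, copying a param with shape torch.Size([128, 512]), and the output audio is all noise.

Complete beginner here. How do I switch to tag 0.01? I don't understand at all.
I trained the 75k model on my target voice for a while, but the imitation still doesn't sound alike, so I'd like to switch to this model and train again.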

Same error. Did you get yours fixed?

It only outputs noise. I made the changes from the comments and it's still the same.

commented

Switch versions first, then apply #37.

How do I switch versions?

Same issue

Same error after the change. It looks like there's yet another new problem...

File "E:\语音克隆\MockingBird\synthesizer\models\tacotron.py", line 564, in load
self.load_state_dict(checkpoint["model_state"], strict=False)
File "E:\anaconda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron2:
size mismatch for embedding.weight: copying a param with shape torch.Size([148, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).

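How do I fix this? After my change, the 70 at the end changed.

Please help me out.

Mine is the NVIDIA one with a few modifications. Why is the first number 148, and how do I change that value?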

I changed it. It didn't help.

Same error. Did you get yours fixed?

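Same question. Switching to v0.0.1 still doesn't work (with the fix applied). PyTorch is the latest version, CUDA 11.7.

It still doesn't work after making the change as instructed; same error. Do I have to train it myself?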

It's been almost two years. Will compatibility for this ever be added...

RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "D:\ProgramData\Anaconda3\envs\voiceClone\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "C:\Users\zhanglong\AppData\Local\Temp\tmp342r_iv9.py", line 13, in <module>
render_streamlit_ui()
File "H:\MockingBird\MockingBird\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "H:\MockingBird\MockingBird\control\mkgui\base\core.py", line 203, in __call__
return self.function(input_obj, **kwargs)
File "H:\MockingBird\MockingBird\control\mkgui\app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
File "H:\MockingBird\MockingBird\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "H:\MockingBird\MockingBird\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "H:\MockingBird\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "D:\ProgramData\Anaconda3\envs\voiceClone\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(

Traceback (most recent call last):
File "D:\codeinstall\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "C:\Users\zy820\AppData\Local\Temp\tmpegto92vs.py", line 13, in <module>
render_streamlit_ui()
File "D:\develop\workspace-project\MockingBird\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "D:\develop\workspace-project\MockingBird\control\mkgui\base\core.py", line 203, in __call__
return self.function(input_obj, **kwargs)
self._model.load(self.model_fpath, self.device)
File "D:\develop\workspace-project\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "D:\codeinstall\Python310\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).
size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).

I made the change from the solution above and still get an error. Is there a newer solution?
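
Note that the direction in this traceback is the reverse of the original report: the checkpoint has 75 embedding rows and the running code has 70. That is what one would expect if the 70-symbol edit had been applied to code being used with a checkpoint trained on the longer symbol list. To read the direction off a long error message reliably, the message can be parsed. A minimal sketch follows; the regular expression and the helper `parse_mismatches` are hypothetical and written against the message format shown in this thread only.

```python
import re

# Matches one "size mismatch for ..." entry as printed in the errors in this thread.
MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def parse_mismatches(message):
    """Return (parameter name, checkpoint shape, model shape) for each mismatch."""
    def to_shape(text):
        return tuple(int(part) for part in text.split(","))

    return [
        (m["name"], to_shape(m["ckpt"]), to_shape(m["model"]))
        for m in MISMATCH.finditer(message)
    ]


# Error text copied from the traceback above.
message = (
    "size mismatch for encoder.embedding.weight: copying a param with shape "
    "torch.Size([75, 512]) from checkpoint, the shape in current model is "
    "torch.Size([70, 512]).\n"
    "size mismatch for gst.stl.attention.W_query.weight: copying a param with "
    "shape torch.Size([512, 256]) from checkpoint, the shape in current model "
    "is torch.Size([512, 512])."
)

for name, ckpt, model in parse_mismatches(message):
    print(name, "checkpoint:", ckpt, "model:", model)
```

When the checkpoint's embedding has more rows than the model's, restoring the original `_characters` line, as suggested earlier in the thread, is a reasonable first thing to try. The `gst.stl.attention.W_query.weight` mismatch does not involve the symbol list, so it presumably needs a code version that matches the checkpoint; the thread does not confirm which one.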

Traceback (most recent call last):
File "G:\PycharmProjects\MockingBird\control\toolbox\__init__.py", line 260, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token, steps=int(self.ui.length_slider.value())*200)
File "G:\PycharmProjects\MockingBird\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "G:\PycharmProjects\MockingBird\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "G:\PycharmProjects\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\Users\Admin.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
QWindowsWindow::setGeometry: Unable to set geometry 1992x1001+0+29 (frame: 2010x1048-9-9) on QWidgetWindow/"UIClassWindow" on "\.\DISPLAY1". Resulting geometry: 1920x1001+0+29 (frame: 1938x1048-9-9) margins: 9, 38, 9, 9 minimum size: 1992x583 MINMAXINFO maxSize=0,0 maxpos=0,0 mintrack=2010,630 maxtrack=0,0)

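Hello. Apart from one 75k-step synthesizer that runs normally, the 25k, 150k, and 200k ones all produce similar errors. The error above is from loading the mandarin_200k.pt synthesizer. Is there a solution by now? Thanks.

I've tried every fix suggested above and none of them work. I don't know where in the runtime environment the real cause of the mismatch lies.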

RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "C:\Users\Admin\AppData\Local\Temp\tmpsj6156uv.py", line 13, in <module>
render_streamlit_ui()
File "E:\GithubProjects\MockingBird-main\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "E:\GithubProjects\MockingBird-main\control\mkgui\base\core.py", line 203, in __call__
return self.function(input_obj, **kwargs)
File "E:\GithubProjects\MockingBird-main\control\mkgui\app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
File "E:\GithubProjects\MockingBird-main\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "E:\GithubProjects\MockingBird-main\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "E:\GithubProjects\MockingBird-main\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(

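Following the fix suggested above, I modified line 11 of symbols.py under MockingBird-main\models\synthesizer\utils, but it still errors. I don't know why.

In my own use, I found that when the size mismatch occurs, some say it is caused by how the text in the input box is split; the original repository, Real-Time-Voice-Cloning, has this problem too.
However, after clicking a few more times the error seems to stop appearing, though the output audio is still mostly noise.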

Same here. Has this been solved?

Still getting an error after the change:
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "/Users/ywy/Library/Python/3.11/lib/python/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "/private/var/folders/53/3r03mt7d4v9bsvhljvnd_zs80000gn/T/tmpo53ek00n.py", line 13, in <module>
render_streamlit_ui()
File "/Users/ywy/MockingBird/control/mkgui/base/ui/streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/control/mkgui/base/core.py", line 203, in __call__
return self.function(input_obj, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/control/mkgui/app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/models/synthesizer/inference.py", line 91, in synthesize_spectrograms
self.load()
File "/Users/ywy/MockingBird/models/synthesizer/inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "/Users/ywy/MockingBird/models/synthesizer/models/base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(

Traceback (most recent call last):
File "D:\python\tts\chatgpt\tts\MockingBird-main\control\toolbox\__init__.py", line 144, in <lambda>
func = lambda: self.synthesize() or self.vocode()
^^^^^^^^^^^^^^^^^
File "D:\python\tts\chatgpt\tts\MockingBird-main\control\toolbox\__init__.py", line 260, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token, steps=int(self.ui.length_slider.value())*200)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\ProgramData\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).
size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).