Bert_vits2

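Deployment

2024/4/26 update

2024/3/18 update

You no longer need to deploy the service yourself. Run the code below to get bert_vits2 speech synthesis without running anything locally.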

import asyncio

import httpx

BASE = "https://www.modelscope.cn/api/v1/studio/xzjosh"

# Alias accepted by this script -> (ModelScope studio name, speaker name sent to that studio)
SPEAKERS = {
    "阿梓": ("Azusa-Bert-VITS2-2.3", "阿梓"),
    "otto": ("otto-Bert-VITS2-2.3", "otto"),
    "塔菲": ("Taffy-Bert-VITS2-2.3", "永雏塔菲"),
    "星瞳": ("2568-Bert-VITS2", "星瞳"),
    "丁真": ("DZ-Bert-VITS2-2.3", "丁真"),
    "东雪莲": ("Azuma-Bert-VITS2.0.2", "东雪莲"),
    "嘉然": ("Diana-Bert-VITS2-2.3", "嘉然"),
    "孙笑川": ("SXC-Bert-VITS2", "孙笑川"),
    "鹿鸣": ("Lumi-Bert-VITS2", "Lumi"),
    "文静": ("Wenjing-Bert-VITS2", "Wenjing"),
    "亚托克斯": ("Aatrox-Bert-VITS2", "Aatrox"),
    "奶绿": ("LAPLACE-Bert-VITS2-2.3", "明前奶绿"),
    "七海": ("Nana7mi-Bert-VITS2", "Nana7mi"),
    "恬豆": ("Bekki-Bert-VITS2", "Bekki"),
    "科比": ("Kobe-Bert-VITS2-2.3", "科比"),
}


async def modelscopeTTS(data, out_path="12.wav"):
    """Synthesize data["text"] with the voice data["speaker"] and save the audio to out_path."""
    text = data.get("text") or ""
    if text.strip() == "":
        text = "哼哼"  # fallback line for empty input
    alias = data.get("speaker")
    if alias not in SPEAKERS:
        # Without this check an unknown speaker would leave the URLs undefined.
        raise ValueError(f"Unknown speaker: {alias!r}")
    studio, speaker = SPEAKERS[alias]
    url = f"{BASE}/{studio}/gradio/run/predict"
    file_prefix = f"{BASE}/{studio}/gradio/file="
    payload = {
        "data": ["<zh>" + text, speaker, 0.5, 0.5, 0.9, 1, "ZH", None, "Happy", "Text prompt", "", 0.7],
        "event_data": None,
        "fn_index": 0,
        "dataType": ["textbox", "dropdown", "slider", "slider", "slider", "slider", "dropdown", "audio", "textbox",
                     "radio", "textbox", "slider"],
        "session_hash": "xjwen214wqf",
    }
    # One client serves both requests: the prediction call, then the audio download.
    async with httpx.AsyncClient(timeout=200) as client:
        r = await client.post(url, json=payload)
        r.raise_for_status()
        audio_url = file_prefix + r.json()["data"][1]["name"]
        r = await client.get(audio_url)
        r.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(r.content)
    return out_path


if __name__ == "__main__":
    asyncio.run(modelscopeTTS({"speaker": "塔菲", "text": "关注塔菲喵"}))

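1. Colab deployment (recommended)

Open the Colab notebook and run the cells in order.

2. Deployment from source

Make sure Python 3.9.0 is installed. If you deploy with any other interpreter version, you will have to troubleshoot the resulting errors yourself.

1.1 Clone the repository

Pick a directory you like, open cmd there, and run

git clone https://github.com/avilliai/Bert_Vits2_Sever.git

The next steps fetch the required files. You can also copy them out of the archive provided for archive deployment; working from the git clone is what keeps stable updates available.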

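1.2 Download the required files

Download the following three files from Huggingface and put them in the bert/chinese-roberta-wwm-ext-large folder

flax_model.msgpack
pytorch_model.bin
tf_model.h5

1.3 Get a model

Get a model.
Create a logs folder and place the model and its config file as shown below. Note that both files go in the same {character name} folder.

Bert_Vits2_Sever/logs/{character name}/{model name (anything works)}.pth
Bert_Vits2_Sever/logs/{character name}/config.json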
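The file placement from steps 1.2 and 1.3 can be checked before starting the service. The following is a minimal sketch, not part of the repository: check_layout is a hypothetical helper, and it only assumes the folder and file names listed above.

```python
from pathlib import Path

# File names listed in step 1.2.
BERT_FILES = ("flax_model.msgpack", "pytorch_model.bin", "tf_model.h5")


def check_layout(root):
    """Return a list of human-readable problems found under the repository root."""
    root = Path(root)
    problems = []

    # Step 1.2: the three BERT files.
    bert_dir = root / "bert" / "chinese-roberta-wwm-ext-large"
    for name in BERT_FILES:
        if not (bert_dir / name).is_file():
            problems.append(f"missing {bert_dir / name}")

    # Step 1.3: each character folder needs a .pth model and a config.json.
    logs_dir = root / "logs"
    character_dirs = []
    if logs_dir.is_dir():
        character_dirs = sorted(p for p in logs_dir.iterdir() if p.is_dir())
    if not character_dirs:
        problems.append(f"no character folders under {logs_dir}")
    for char_dir in character_dirs:
        if not any(char_dir.glob("*.pth")):
            problems.append(f"{char_dir.name}: no .pth model file")
        if not (char_dir / "config.json").is_file():
            problems.append(f"{char_dir.name}: missing config.json")
    return problems
```

Calling check_layout("Bert_Vits2_Sever") from the parent directory returns an empty list when everything is in place, and one message per missing piece otherwise.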

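1.4 Install the environment and start the service

Double-click 安装脚本.bat (the install script)
Double-click 启动脚本.bat (the start script)

Updating

Double-click update.bat

3. Archive deployment (not recommended, rarely updated)

The files are too large to host on GitHub. The archive bundles an azusa model. Archive deployments cannot pull updates from GitHub.
1. Download bert_vits2_sever.rar

Join group 628763673 and download it from the group files
Baidu Netdisk, extraction code: 9uyg

2. Double-click 安装脚本.bat
3. Double-click 启动脚本.bat

Updating

Download the source archive from the git repository, extract it, and overwrite the files that have the same names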

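Testing whether the service works

The test code is in test.py. After activating the virtual environment and installing the corresponding model, run test.py. If it produces a playable audio file, the test passed.

Installing more models

Create a new folder under logs, for example otto, and put your model and config file in it.
For example, adding an otto voice model gives

logs/otto/G_114514.pth
logs/otto/config.json

You must also fill in characters.yaml so that it contains an index entry for your model.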

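Usage

To use a model, pass the matching parameters when you call the service.

Connecting to a QQ bot

The existing project Manyana already integrates with this service.
You can also use berglm, which is a prepackaged exe file.

Manyana user guide

Fill in Manyana/config/settings.yaml according to your characters.yaml, listing under bert_speakers every character available in your bert_vits setup.
Start Manyana first, then run 启动脚本.bat.
Command format: xx说XXXXXXXXX (the speaker name, then 说, "says", then the text to speak)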
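To illustrate the command format, here is a rough sketch of how a message of that shape could be split into a speaker and a text. This is not Manyana's actual parser, which is not shown here; parse_command and the COMMAND pattern are hypothetical, and the speakers argument stands in for the names listed under bert_speakers.

```python
import re

# Hypothetical pattern: speaker name (shortest run without whitespace), 说, then the text.
COMMAND = re.compile(r"^(?P<speaker>\S+?)说(?P<text>.+)$", re.S)


def parse_command(message, speakers):
    """Split 'xx说XXXX' into (speaker, text); return None unless the speaker is known."""
    match = COMMAND.match(message.strip())
    if match is None or match["speaker"] not in speakers:
        return None
    text = match["text"].strip()
    if not text:
        return None
    return match["speaker"], text
```

With speakers = {"塔菲", "otto"}, the message 塔菲说早上好 yields the pair ("塔菲", "早上好"), while a message from an unlisted speaker yields None.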

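Calling the API yourself

Make sure characters.yaml is filled in. You can delete the entries for models you do not have.
You can call this API from any language you like.
After running 启动脚本.bat,
send a POST request to http://localhost:9080/synthesize
If you deployed on Colab, the details are written in the notebook.
The API takes the following parameters

text: the text to synthesize
speaker: the speaker (see characters.yaml)

Here is an example (JSON)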

{
    "text": "早上好，请关注我",
    "speaker": "塔菲"
}

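The API returns the audio as binary data, which you need to save yourself.

If you know Python, test.py contains a working call example, though I recommend calling the API with the asynchronous httpx rather than requests.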

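GPU acceleration

If you have already started using the service, you have probably noticed that synthesis is not as fast as you would like.
If you have an NVIDIA card, you can install CUDA and use GPU acceleration for faster speech synthesis.

Install CUDA
- You can skip the PyTorch part of the tutorial above
Choose a torch version that matches your setup

Copy the part highlighted in blue
Open cmd in the Bert_Vits2_Sever folder
- Enter the following commands in order
- cd venv/Scripts
- call activate.bat
- pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio===0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
  (This command was generated for my system, package manager, Python version, and CUDA version; it may not suit yours)

Then restart.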
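To confirm that the CUDA build of torch is the one in use, a short check can be run inside the activated virtual environment. This is a sketch rather than part of the project; cuda_status is a hypothetical helper, and it relies only on standard torch attributes.

```python
import importlib.util


def cuda_status():
    """Describe whether the installed torch build can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)} (torch {torch.__version__})"
    # torch.version.cuda is None for CPU-only builds.
    built_for = torch.version.cuda or "none (CPU-only build)"
    return f"CUDA not available (torch {torch.__version__}, built for CUDA: {built_for})"


print(cuda_status())
```

If the output reports a CPU-only build after the pip3 command above, the most likely cause is that the packages were installed outside the project's venv.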

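Disclaimer

This project must not be used for any purpose that violates the Constitution of the People's Republic of China, the Criminal Law of the People's Republic of China, the Public Security Administration Punishments Law of the People's Republic of China, or the Civil Code of the People's Republic of China.
Use for any politics-related purpose is strictly prohibited.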

avilliai / Bert_Vits2_Sever