folotoy-server-self-hosting

Config files for self-hosting the FoloToy Server.
Recommended platform: Linux x86_64 (Debian 10 or 11, or Ubuntu 22.04).

Preparation

Required: an OpenAI key with access to whisper-1 and gpt-3.5-turbo/gpt-4,
or an Azure OpenAI key and an Azure Whisper key.
Optional: an Azure TTS key or an ElevenLabs key.

Environment Dependency

Docker, with the Compose plugin (the commands below use docker compose)

Quick Start

```
git clone git@github.com:FoloToy/folotoy-server-self-hosting.git
cd folotoy-server-self-hosting
```

In docker-compose.yml, replace every occurrence of 192.168.41.154 with your server's external IP address:

AUDIO_DOWNLOAD_URL: http://192.168.41.154:8082
SPEECH_UDP_SERVER_HOST: 192.168.41.154
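The replacement can also be scripted. The following is a minimal sketch, not part of the project: the function name `replace_server_ip` and the address 203.0.113.10 are hypothetical, and the function works on the file's text rather than parsing YAML.

```python
import re

DEFAULT_IP = "192.168.41.154"  # sample address shipped in docker-compose.yml


def replace_server_ip(compose_text: str, new_ip: str, old_ip: str = DEFAULT_IP) -> str:
    """Return compose_text with every whole occurrence of old_ip replaced by new_ip."""
    # The lookarounds stop a longer address (e.g. 192.168.41.1540) from matching.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)")
    return pattern.sub(lambda _match: new_ip, compose_text)


sample = (
    "AUDIO_DOWNLOAD_URL: http://192.168.41.154:8082\n"
    "SPEECH_UDP_SERVER_HOST: 192.168.41.154\n"
)
# 203.0.113.10 is a documentation-only address; use your own external IP.
print(replace_server_ip(sample, "203.0.113.10"))
```

To apply it to the real file, read docker-compose.yml into a string, pass it through the function, and write the result back.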

Set OPENAI_OPENAI_KEY or AZURE_OPENAI_KEY in docker-compose.yml to your OpenAI key or Azure OpenAI key:

OPENAI_OPENAI_KEY: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

AZURE_OPENAI_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

To use azure-tts, set AZURE_TTS_KEY in docker-compose.yml to your Azure TTS key:
```
AZURE_TTS_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
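Before starting the server, it can help to confirm that no sample key is left in place. The sketch below is an assumption-laden helper, not a project tool: `find_placeholder_keys` is a hypothetical name, and "placeholder" is taken to mean one letter repeated at least eight times, as in the sample values above.

```python
import re

# A value counts as a placeholder when it is a single letter repeated at
# least eight times, optionally after an "sk-" prefix (sk-xxxxxxxx, aaaaaaaa).
PLACEHOLDER = re.compile(r"(?:sk-)?([A-Za-z])\1{7,}")

# Matches uncommented "NAME: value" lines; lines starting with # are skipped
# because the name must directly follow the indentation.
ASSIGNMENT = re.compile(
    r"^[ \t]*([A-Z][A-Z0-9_]*)[ \t]*:[ \t]*(\S+)[ \t]*$", re.MULTILINE
)


def find_placeholder_keys(compose_text: str) -> list:
    """Return names of uncommented variables whose value is still a placeholder."""
    return [
        name
        for name, value in ASSIGNMENT.findall(compose_text)
        if PLACEHOLDER.fullmatch(value)
    ]


example = (
    "  OPENAI_OPENAI_KEY: sk-xxxxxxxxxxxxxxxx\n"
    "  #AZURE_OPENAI_KEY: xxxxxxxxxxxxxxxx\n"
)
print(find_placeholder_keys(example))
```

An empty list means every uncommented key has been changed from its sample value; it does not prove that the key itself is valid.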

Run

```
docker compose up -d
```

Update

```
docker compose pull
docker compose up -d
```

Advanced

Using Custom OpenAI API Path

Uncomment the following line in docker-compose.yml (remove the leading #) and replace https://xxx.com/v1 with your custom OpenAI API path:

#OPENAI_OPENAI_API_BASE: https://xxx.com/v1
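Uncommenting a setting can be sketched as a small text transformation. The helper name `uncomment_key` is hypothetical, and the sketch assumes the style shown above, where the # sits immediately before the variable name.

```python
import re


def uncomment_key(compose_text: str, key: str) -> str:
    """Remove the leading # from lines of the form '<indent>#KEY: value'."""
    # Assumes the # directly precedes the key, so the existing indentation
    # (and therefore the YAML nesting) is preserved.
    pattern = re.compile(
        r"^([ \t]*)#(" + re.escape(key) + r"[ \t]*:)", re.MULTILINE
    )
    return pattern.sub(r"\1\2", compose_text)


before = "      #OPENAI_OPENAI_API_BASE: https://xxx.com/v1\n"
print(uncomment_key(before, "OPENAI_OPENAI_API_BASE"))
```

Lines that are already uncommented, or that belong to other keys, are left unchanged.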

Using Azure OpenAI

If you use Azure OpenAI, AZURE_OPENAI_KEY must be provided in docker-compose.yml.

Uncomment the following line in docker-compose.yml (remove the leading #) and set your key:

AZURE_OPENAI_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Using elevenlabs

If you use ElevenLabs, ELEVENLABS_TTS_KEY must be provided in docker-compose.yml.

Uncomment the following lines in docker-compose.yml (remove the leading #) and set your key:

ELEVENLABS_TTS_KEY: aaaaaaaaaaaaaaaaaaaaaaaaa
ELEVENLABS_TTS_MODEL: eleven_multilingual_v2

Then change voice_name in roles.json to an ElevenLabs voice ID.
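As an illustration, the roles.json change can be expressed as a function over the parsed file. This is a sketch under assumptions: `use_elevenlabs_voice` is a hypothetical helper, and setting tts_type to "elevenlabs" alongside voice_name follows the full-field example later in this document rather than a documented requirement.

```python
import copy


def use_elevenlabs_voice(roles: dict, role_id: str, voice_id: str) -> dict:
    """Return a copy of roles in which role_id speaks with an ElevenLabs voice."""
    updated = copy.deepcopy(roles)  # leave the caller's data untouched
    updated[role_id]["voice_name"] = voice_id
    # Assumption: tts_type selects the TTS engine, as in the full-field example.
    updated[role_id]["tts_type"] = "elevenlabs"
    return updated


roles = {"1": {"model": "gpt-3.5-turbo", "voice_name": "zh-CN-XiaoshuangNeural"}}
# The voice ID below is the one used in the example later in this README.
print(use_elevenlabs_voice(roles, "1", "6xTjFMIxZYYFJag51KZe"))
```

Load roles.json with `json.load`, apply the function, and write the result back with `json.dump(..., ensure_ascii=False)` so that non-ASCII prompts stay readable.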

Using Custom Prompt and Voice

If you use Azure OpenAI, the model field of a role must be the deployment name you chose when deploying the model.

Speech to text language config

In roles.json, you can set a language for speech-to-text (STT) recognition. Each speech recognition engine expects a different language code format, so match the code to the engine you use.

If stt_type is set to azure-stt, the role's language must be configured; otherwise an error occurs. azure-stt uses BCP-47 language codes, such as zh-CN.
If stt_type is set to openai-whisper, the role's language is optional. If a language is set, openai-whisper recognizes speech in that language; if not, it detects the language automatically. openai-whisper uses ISO 639-1 language codes, such as zh.
If stt_type is set to azure-whisper, language configuration is not supported.
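The three rules above can be checked before deploying a role. The following is a minimal sketch: `check_role_language` is a hypothetical helper, and the regular expressions only test the rough shape of a code, not whether the engine actually supports that language.

```python
import re
from typing import List


def check_role_language(role: dict) -> List[str]:
    """Return a list of problems with the role's language setting (empty if none)."""
    stt_type = role.get("stt_type")
    language = role.get("language")
    problems = []
    if stt_type == "azure-stt":
        if not language:
            problems.append("azure-stt requires a language, e.g. zh-CN")
        elif not re.fullmatch(r"[A-Za-z]{2,3}(-[A-Za-z0-9]+)+", language):
            # Rough BCP-47 shape: a language subtag plus at least one more subtag.
            problems.append("azure-stt expects a BCP-47 code such as zh-CN")
    elif stt_type == "openai-whisper":
        # Language is optional here; only check the shape when it is present.
        if language and not re.fullmatch(r"[a-z]{2}", language):
            problems.append("openai-whisper expects an ISO 639-1 code such as zh")
    elif stt_type == "azure-whisper":
        if language:
            problems.append("azure-whisper does not support a language setting")
    return problems


for candidate in (
    {"stt_type": "azure-stt", "language": "zh-CN"},
    {"stt_type": "openai-whisper", "language": "zh-CN"},
    {"stt_type": "azure-whisper", "language": "zh"},
):
    print(candidate, check_role_language(candidate))
```

Treating a language on an azure-whisper role as a problem is a choice made for this sketch; the text above only says the setting is not supported.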

Voice List

Language Codes

Role Level Config Examples

If stt_type is openai-whisper, language should use ISO 639-1 codes:

{"1": {
    "model": "gpt-3.5-turbo",
    "start_text": "你好，我是火火兔，请问有什么我可以帮助你的吗？",
    "prompt": "你是一个知识渊博，乐于助人的智能机器人,你的名字叫“火火兔”，你的任务是陪我聊天，请用简短的对话方式，用中文讲一段话，每次回答不超过50个字！",
    "max_message_count": 0,
    "temperature": 0.7,
    "max_tokens": 800,
    "top_p": 0.95,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "voice_name": "zh-CN-XiaoshuangNeural",
    "language": "zh",
    "stt_type": "openai-whisper"
  }}

If stt_type is azure-stt, language should use BCP-47 codes:

{"1": {
    "model": "gpt-3.5-turbo",
    "start_text": "你好，我是火火兔，请问有什么我可以帮助你的吗？",
    "prompt": "你是一个知识渊博，乐于助人的智能机器人,你的名字叫“火火兔”，你的任务是陪我聊天，请用简短的对话方式，用中文讲一段话，每次回答不超过50个字！",
    "max_message_count": 20,
    "temperature": 0.7,
    "max_tokens": 800,
    "top_p": 0.95,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "voice_name": "zh-CN-XiaoshuangNeural",
    "language": "zh-CN",
    "stt_type": "azure-stt"
  }}

Full fields for reference

{"1": {
  "model": "gpt-3.5-turbo",
  "start_text": "你好，我是火火兔，请问有什么我可以帮助你的吗？",
  "prompt": "你是一个知识渊博，乐于助人的智能机器人,你的名字叫“火火兔”，你的任务是陪我聊天，请用简短的对话方式，用中文讲一段话，每次回答不超过50个字！",
  "max_message_count": 10,
  "temperature": 0.7,
  "max_tokens": 800,
  "top_p": 0.95,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "voice_name": "zh-CN-XiaoshuangNeural",
  "language": "zh-CN",
  "stt_type": "azure-stt",
  "llm_type": "openai",
  "tts_type": "azure-tts"
}}

{"1": {
  "model": "gpt-3.5-turbo",
  "start_text": "你好，我是火火兔，请问有什么我可以帮助你的吗？",
  "prompt": "你是一个知识渊博，乐于助人的智能机器人,你的名字叫“火火兔”，你的任务是陪我聊天，请用简短的对话方式，用中文讲一段话，每次回答不超过50个字！",
  "max_message_count": 10,
  "temperature": 0.7,
  "max_tokens": 800,
  "top_p": 0.95,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "voice_name": "6xTjFMIxZYYFJag51KZe",
  "language": "zh-CN",
  "stt_type": "azure-stt",
  "llm_type": "azure-openai",
  "tts_type": "elevenlabs"
}}
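To compare an existing roles.json against the full-field examples, a short script can list which fields each role leaves out. This is a sketch only: `missing_fields` is a hypothetical helper, the field names are copied from the examples above, and whether the server requires every one of them is an assumption.

```python
import json

# Field names copied from the full-field examples above. Whether the server
# treats each one as mandatory is an assumption, not a documented rule.
REFERENCE_FIELDS = (
    "model", "start_text", "prompt", "max_message_count", "temperature",
    "max_tokens", "top_p", "frequency_penalty", "presence_penalty",
    "voice_name", "language", "stt_type", "llm_type", "tts_type",
)


def missing_fields(roles_json: str) -> dict:
    """Map each role id to the reference fields that the role does not define."""
    roles = json.loads(roles_json)  # raises json.JSONDecodeError on invalid JSON
    return {
        role_id: [name for name in REFERENCE_FIELDS if name not in role]
        for role_id, role in roles.items()
    }


sample_roles = '{"1": {"model": "gpt-3.5-turbo", "language": "zh", "stt_type": "openai-whisper"}}'
print(missing_fields(sample_roles))
```

Because the file is parsed with `json.loads`, the same call also catches syntax errors such as a trailing comma before the server ever reads the file.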

MQTT Authentication

By default, EMQX allows any anonymous client to connect. To restrict your EMQX service to connections from your own devices, follow these steps.

YouTube Video
Open http://your_external_server_ip:18083 in your browser (default username: admin, default password: public). Change the password after logging in.
Create a Password-Based database from the sidebar under Access Control > Authentication.
Create a new superuser from the sidebar under Access Control > Authentication > database_you_created > User Management. The username and password must match MQTT_USERNAME and MQTT_PASSWORD in docker-compose.yml.
Create a new user from the sidebar under Access Control > Authentication > database_you_created > User Management. The username and password appear in the log after you connect your device using the Web Serial Tool: https://tool.folotoy.com/index > Console.
