Clicking sound at the start of generated speech
zsanjin-p opened this issue · comments
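The audio clips were generated through the API.
Not every generated clip has this noise, but quite a few start with a click or a tick, something like an electrical pop. What causes this? Has anyone else run into it?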
Could you please provide more details about this issue, such as the specific text, speaker ID, and audio samples?
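I ran into this too. Changing the speaker ID makes no difference, whatever value I use. Please help figure out what is wrong. Audio samples are attached: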
response.zip
When using the webpage-based demo by running streamlit run demo_page.py, the generated audio contains no noise. However, I do notice noise at the beginning of the sample audio. Can you please provide more details about this issue?
I am using the API. Here is my docker run command:
docker run --gpus "device=3" -d --name EmotiVoice -p 28021:8000 -v /raid/liuhao/EmotiVoice:/workspace/EmotiVoice -w /workspace/EmotiVoice/EmotiVoice emoti-voice:v1 env LANG=C.UTF-8 sh -c "uvicorn openaiapi:app --reload --host 0.0.0.0 --port 8000 >> log/all.log 2>&1"
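To narrow down which API-generated clips are affected, one option is to measure the peak amplitude in the first few milliseconds of each file. The following is only a rough sketch using the Python standard library. The helper names, the 20 ms window, and the 0.1 threshold are all assumptions made for illustration and are not part of EmotiVoice. It assumes 16-bit PCM WAV input, and a clip whose speech starts immediately would also be flagged.

```python
import array
import sys
import wave


def read_mono_int16(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
        if sys.byteorder == "big":
            # WAV data is little-endian; swap on big-endian machines.
            samples.byteswap()
        # Keep only the first channel if the file has several.
        return samples[:: wf.getnchannels()], wf.getframerate()


def head_peak(samples, sample_rate, window_ms=20):
    """Peak absolute amplitude in the first window_ms, scaled to 0..1."""
    n = max(1, sample_rate * window_ms // 1000)
    return max((abs(s) for s in samples[:n]), default=0) / 32768.0


def has_leading_click(samples, sample_rate, window_ms=20, threshold=0.1):
    """Return True if the start of the clip is louder than the threshold.

    Both window_ms and threshold are illustrative guesses, not tuned values.
    """
    return head_peak(samples, sample_rate, window_ms) > threshold
```

Calling has_leading_click(*read_mono_int16("clip.wav")) on each downloaded clip (the file name is a placeholder) gives a quick list of candidates to attach when reporting the problem.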
import logging
import os

from pydub import AudioSegment

# Workaround script: trims or silences the first few milliseconds of every
# .mp3/.wav file in a directory to get rid of the noise at the start.

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')


def remove_or_silence_noise_from_audio_files(directory, noise_duration_ms, mode):
    # Reject unknown modes up front so processed_audio is always defined below
    if mode not in (1, 2):
        raise ValueError("Invalid mode, must be 1 or 2")

    # Determine the output folder for processed audio files
    output_folder = os.path.join(directory, "Processed_Audio")
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
        logging.info(f"Folder created: {output_folder}")

    # Get all audio files (the extension match is case-insensitive)
    audio_files = [file for file in os.listdir(directory) if file.lower().endswith(('.mp3', '.wav'))]
    logging.info(f"Found {len(audio_files)} audio files.")

    # Initialize statistics variables
    success_count = 0
    fail_count = 0
    failed_files = []

    # Process each file
    for file in audio_files:
        file_path = os.path.join(directory, file)
        try:
            # Load the audio
            audio = AudioSegment.from_file(file_path)
            logging.info(f"Processing audio file: {file_path}")

            if mode == 1:
                # Remove noise from the beginning of the audio for noise_duration_ms milliseconds
                processed_audio = audio[noise_duration_ms:]
            else:
                # Mode 2: replace the first noise_duration_ms milliseconds with silence
                silence = AudioSegment.silent(duration=noise_duration_ms, frame_rate=audio.frame_rate)
                processed_audio = silence + audio[noise_duration_ms:]

            # Save the new audio file, using the lower-cased extension as the export format
            new_file_path = os.path.join(output_folder, file)
            processed_audio.export(new_file_path, format=os.path.splitext(file)[1][1:].lower())
            logging.info(f"Processed audio file saved to: {new_file_path}")
            success_count += 1
        except Exception as e:
            logging.error(f"Error processing audio file {file_path}: {e}")
            fail_count += 1
            failed_files.append((file_path, str(e)))

    # Log the results
    logging.info(f"Processing complete. Success: {success_count}, Failures: {fail_count}")
    if fail_count > 0:
        logging.info("Failed files and reasons:")
        for file, error in failed_files:
            logging.info(f"File: {file}, Error: {error}")


if __name__ == "__main__":
    # User inputs the processing time, default is 100ms
    try:
        noise_duration_ms = int(input("Enter the noise processing time (ms, default 100ms): ") or "100")
    except ValueError:
        print("Invalid input, using default value of 100ms")
        noise_duration_ms = 100

    # User chooses the processing mode; keep asking until the answer is 1 or 2
    mode = None
    while mode not in (1, 2):
        try:
            mode = int(input("Choose the mode (1: Remove beginning noise, 2: Replace beginning noise with silence): "))
        except ValueError:
            mode = None
        if mode not in (1, 2):
            print("Invalid mode, must be 1 or 2")

    # Call the function to process audio files in the current directory
    remove_or_silence_noise_from_audio_files(os.getcwd(), noise_duration_ms, mode)
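Cutting or silencing a fixed 100 ms can clip the first syllable when speech starts early. A gentler variant of the same workaround is a short linear fade-in, which attenuates a click at the very start while keeping the clip length unchanged. The sketch below works on a plain list of PCM samples. The function name and the 10 ms default are assumptions for illustration, and whether a fade is enough depends on how long the noise actually lasts.

```python
def fade_in(samples, sample_rate, fade_ms=10):
    """Return a copy of samples with a linear fade-in over the first fade_ms.

    The gain ramps from 0 at the first sample up to (but not including) 1
    at the end of the fade window; later samples are left untouched.
    """
    n = min(len(samples), sample_rate * fade_ms // 1000)
    out = list(samples)
    for i in range(n):
        out[i] = int(out[i] * i / n)
    return out
```

When working with pydub objects as in the script above, AudioSegment also provides a fade_in method that takes a duration in milliseconds, so audio.fade_in(10) would be the equivalent one-liner.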