ufal / whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

For real-time transcription, crashes after loading around the 60 sec mark.

soforeward opened this issue

Hi there, this has been super awesome, but for some reason it always crashes for me around the 60 second mark on the default model. One interesting thing to note is that as I increase the min-chunk-size, the time it takes to crash also increases (but it always inevitably crashes). I can't figure out why and I don't even know where to begin to debug. It also does not matter which model I use, but I usually use the default.

I've connected the real-time streaming via a simple streaming socket app. I'm not sure if that is causing any issues, but they both tend to crash together. I would appreciate any guidance on how to solve or troubleshoot the issue. The very basic streaming app is as follows:

import socket
import pyaudio

# Audio configuration

FORMAT = pyaudio.paInt16 # Audio format (16-bit PCM)
CHANNELS = 1 # Mono audio
RATE = 16000 # Sampling rate 16kHz
CHUNK = 1024 # Size of each audio chunk

# Server configuration

SERVER = 'localhost' # Server IP address (adjust as needed)
PORT = 43007 # Port number for the server

def stream_audio_to_server():
    audio = pyaudio.PyAudio()

    # Open the microphone stream
    stream = audio.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, frames_per_buffer=CHUNK)

    # Create a socket connection to the server
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
        client_socket.connect((SERVER, PORT))

        print("Streaming audio to server... Press Ctrl+C to stop.")

        try:
            while True:
                # Read audio chunk from microphone
                data = stream.read(CHUNK, exception_on_overflow=False)
                # Send audio chunk to server
                client_socket.sendall(data)
        except KeyboardInterrupt:
            pass
        finally:
            # Stop and close the stream
            stream.stop_stream()
            stream.close()
            audio.terminate()

            print("Stopped streaming.")

if __name__ == "__main__":
    stream_audio_to_server()

Any recommendations to solve this would be greatly appreciated! Thanks!
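
For reference, the constants in the script above fix how much audio each `sendall` call carries. A rough sketch of that arithmetic follows; the helper name `chunk_stats` is hypothetical, and it assumes 2 bytes per sample because the format is `paInt16`. It only describes the client's data rate and does not by itself explain the crash.

```python
def chunk_stats(rate_hz, frames_per_chunk, bytes_per_sample=2, channels=1):
    """Return (bytes per chunk, seconds per chunk, bytes per second).

    bytes_per_sample=2 is an assumption matching 16-bit PCM (paInt16).
    """
    bytes_per_chunk = frames_per_chunk * bytes_per_sample * channels
    seconds_per_chunk = frames_per_chunk / rate_hz
    bytes_per_second = rate_hz * bytes_per_sample * channels
    return bytes_per_chunk, seconds_per_chunk, bytes_per_second


# Values taken from the script: RATE = 16000, CHUNK = 1024, mono, 16-bit
bytes_per_chunk, seconds_per_chunk, bytes_per_second = chunk_stats(16000, 1024)
```

With these settings each chunk is 2048 bytes covering 64 ms of audio, so the client sends 32000 bytes per second, or about 1.9 MB by the 60 second mark.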

Hi, I'm sorry, I can't help you debug your code. Please make sure the bug is reproducible with the code in this repo, and post all the info needed for reproducibility: input, output, arguments, expected behaviour, wrong behaviour, hardware info.

You can also investigate the logs, which should indicate the reason for your crash. Maybe OOM (out of memory)?

Good luck!

Sorry about that. As soon as you use netcat it works and doesn't crash. Probably just an efficiency issue. Thanks anyway!
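
For anyone who wants netcat-like behaviour from Python instead, a minimal sketch is to copy raw bytes from a binary input stream straight to the socket, leaving audio capture to an external tool. The function name `forward_raw_audio` and the 4096-byte buffer size are assumptions for illustration, not part of this repo.

```python
import socket


def forward_raw_audio(source, sock, bufsize=4096):
    """Copy raw bytes from a binary file-like `source` to a connected socket.

    Reads until end of input and returns the total number of bytes sent.
    This only moves bytes; it assumes `source` already yields audio in the
    format the server expects.
    """
    total = 0
    while True:
        data = source.read(bufsize)
        if not data:  # empty bytes object means end of stream
            break
        sock.sendall(data)
        total += len(data)
    return total
```

To use it the way netcat was used here, one could open a connection with `socket.create_connection(("localhost", 43007))` and call `forward_raw_audio(sys.stdin.buffer, sock)`, piping raw 16 kHz mono 16-bit PCM into the script's standard input.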