collabora / WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Problem about long stop when processing a given audio file

KujoStar opened this issue · comments

I run the code below to process an audio file, replacing sys.stdout so that the printed output is collected in a list.

from whisper_live.client import TranscriptionClient
import sys

class ListStream:
    def __init__(self):
        self.data = []
    def write(self, s: str):
        # keep only transcript text: skip bare newlines and [INFO] lines
        if s != '\n' and not s.startswith('[INFO]'):
            self.data.append(s)
    def flush(self):
        # no-op, so that print(..., flush=True) does not fail
        pass
# change stdout
sys.stdout = x = ListStream()

client = TranscriptionClient(
  "172.18.64.66",
  9090,
  lang="en",
  translate=False,
  model="small",
  use_vad=False,
)

client.play_file("./output.wav")
client.close_all_clients()
sys.stdout = sys.__stdout__
print(x.data[-1])

It runs well, but after the audio file has been processed, the script waits about 10 s before it exits and prints x.data[-1] to stdout. I can't figure out why. Is there something I can do to avoid this pause?
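As a side note on the stdout capture itself, here is a minimal sketch of the same idea using the standard library's contextlib.redirect_stdout, which restores sys.stdout even if the wrapped call raises. The names capture_output and fake_transcription are hypothetical and only stand in for the client call; this does not change the 10 s wait.

```python
import contextlib


class ListStream:
    """File-like object that collects written text in a list."""

    def __init__(self):
        self.data = []

    def write(self, s: str) -> int:
        # Skip bare newlines and [INFO] lines, as in the snippet above.
        if s != "\n" and not s.startswith("[INFO]"):
            self.data.append(s)
        return len(s)

    def flush(self) -> None:
        # Nothing is buffered, so flushing is a no-op.
        pass


def capture_output(func, *args, **kwargs):
    """Run func with stdout redirected and return the captured pieces."""
    stream = ListStream()
    with contextlib.redirect_stdout(stream):
        func(*args, **kwargs)
    return stream.data


def fake_transcription():
    # Hypothetical stand-in for client.play_file(...); it only prints.
    print("[INFO]: connected")
    print("hello")
    print("hello world")


captured = capture_output(fake_transcription)
print(captured[-1])
```

With the real client, the function passed to capture_output would presumably be the one that plays the file and closes the client.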

Moreover, I have another problem. When I change client.play_file("./output.wav") to simply client() to record my voice, the server can't detect that I have stopped speaking. After I finish speaking, I have to press Ctrl+C to stop the client and print the data. It seems client.close_all_clients() does not work. Is there a solution for this?

@KujoStar the reason we wait for 10s [here]

client.wait_before_disconnect()

is in case a response from the server hasn't been received yet (for example, when the server is slow because it runs on CPU).
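To illustrate the general idea of such a grace period, here is a minimal sketch of an idle-timeout wait. It is not WhisperLive's actual implementation: the function name, its parameters, and the injectable clock and sleep arguments are all assumptions made for this example.

```python
import time


def wait_for_pending_responses(get_last_response_time, idle_timeout=10.0,
                               poll_interval=0.1, clock=time.monotonic,
                               sleep=time.sleep):
    """Block until no response has arrived for idle_timeout seconds.

    get_last_response_time is a hypothetical callable returning the
    timestamp of the most recent server response.
    """
    while clock() - get_last_response_time() < idle_timeout:
        sleep(poll_interval)


# Demo with a fake clock so the example finishes instantly.
now = [0.0]


def fake_sleep(seconds):
    # Advance the fake clock instead of really sleeping.
    now[0] += seconds


wait_for_pending_responses(lambda: 0.0, idle_timeout=10.0, poll_interval=1.0,
                           clock=lambda: now[0], sleep=fake_sleep)
print(now[0])
```

Under this scheme, a shorter idle_timeout would shorten the pause after the file ends, at the risk of dropping late responses from a slow server.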

Yes, for the recording client, the only way to stop processing for now is a KeyboardInterrupt; otherwise it keeps listening.
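Given that, one possible workaround on the caller's side is to catch the KeyboardInterrupt and still use whatever was captured. The following is a minimal sketch under that assumption; run_until_interrupted and fake_recording are hypothetical names, and fake_recording only simulates a recording session that is stopped with Ctrl+C.

```python
import contextlib
import io


def run_until_interrupted(blocking_call):
    """Run a blocking call with stdout captured and return the captured
    lines, even if the call is stopped with Ctrl+C."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            blocking_call()
    except KeyboardInterrupt:
        # Ctrl+C is the expected way to stop; stdout is already restored.
        pass
    return buffer.getvalue().splitlines()


def fake_recording():
    # Hypothetical stand-in for the recording client.
    print("first segment")
    print("second segment")
    raise KeyboardInterrupt  # simulates the user pressing Ctrl+C


lines = run_until_interrupted(fake_recording)
print(lines[-1])
```

Because redirect_stdout restores sys.stdout when the block exits, the final print goes to the terminal as usual.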