ufal / whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

'str' object has no attribute 'sep'

agandhinit opened this issue · comments

I am using this code to set up a WebSocket server. The client sends audio chunks to this server. The connection is established, but the first upstream audio send fails with the error 'str' object has no attribute 'sep'.
It fails at https://github.com/ufal/whisper_streaming/blob/main/whisper_online.py#L250, and I'm not sure why. Do you see an obvious issue I am missing?


src_lan = "en"  # source language
tgt_lan = "en"  # target language -- same as source for ASR, "en" if translate task is used
# Set options:
asr = FasterWhisperASR(src_lan, "large-v2") 
online = OnlineASRProcessor(tgt_lan, asr)
async def audio_processing(websocket, path):
    try:
        online.init() 
        async for audio_chunk in websocket:
            a = audio_chunk  # Receive new audio chunk
            print("audio",a)
            online.insert_audio_chunk(a)
            o = online.process_iter()
            await websocket.send(o)  # Send the current partial output
        # At the end of audio processing, send the final output
        final_output = online.finish()
        await websocket.send(final_output)
    except websockets.exceptions.ConnectionClosedOK:
        pass
    except Exception as e:
        print(e)
# Start the WebSocket server
start_server = websockets.serve(audio_processing, "localhost", 8765)

asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()

Hi, no, I can't see any obvious error at the moment. Can you debug and give more info? E.g., what's in the variables around line 250?
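One way to gather that information, as a rough sketch: the handler in the question only does `print(e)`, which shows the message but not where it was raised. Printing the full traceback exposes the failing line and the call chain. Here `handle_chunk` and `safe_handle` are hypothetical stand-ins for the real per-chunk processing, not part of whisper_streaming.

```python
import traceback


def handle_chunk(chunk):
    # Hypothetical stand-in for the per-chunk processing. It fails the same
    # way the issue reports: by reading `.sep` from a plain string.
    return "en".sep


def safe_handle(chunk):
    """Run handle_chunk and return the full traceback text on failure."""
    try:
        return handle_chunk(chunk)
    except Exception:
        # format_exc() includes file names, line numbers and the call chain,
        # whereas print(e) shows only the exception message.
        return traceback.format_exc()


report = safe_handle(b"\x00\x00")
print(report)
```

In the WebSocket handler, the same idea would mean replacing `print(e)` with `traceback.print_exc()` inside the `except Exception` branch.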

I encountered the same problem. The likely cause is that the arguments passed in the example do not match the parameters of the actual constructor:

`class OnlineASRProcessor:

    SAMPLING_RATE = 16000

    def __init__(self, asr, tokenizer):
        """asr: WhisperASR object
        tokenizer: sentence tokenizer object for the target language. Must have a method *split* that behaves like the one of MosesTokenizer.
        """
        self.asr = asr
        self.tokenizer = tokenizer`

But the example passes the arguments as follows:
online = OnlineASRProcessor(tgt_lan, asr)

Perhaps you can try calling it in the following way?
tokenizer = create_tokenizer("en")
online = OnlineASRProcessor(asr, tokenizer)
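
To illustrate why the argument order would produce exactly this message, here is a minimal sketch with stand-in classes. `StubASR`, `StubProcessor` and `join_words` are hypothetical and do not come from whisper_streaming. The sketch assumes only what the thread shows: the constructor stores its first argument as `self.asr`, and some later code presumably reads `self.asr.sep`.

```python
class StubASR:
    """Hypothetical stand-in for an ASR backend; only `sep` matters here."""
    sep = ""  # assumed: separator used when joining recognized words


class StubProcessor:
    """Stand-in using the (asr, tokenizer) parameter order quoted above."""

    def __init__(self, asr, tokenizer):
        self.asr = asr
        self.tokenizer = tokenizer

    def join_words(self, words):
        # Reading self.asr.sep fails if a language string was passed as `asr`.
        return self.asr.sep.join(words)


asr = StubASR()

# Argument order from the question: the string "en" lands in the `asr` slot.
wrong = StubProcessor("en", asr)
try:
    wrong.join_words(["hello", " world"])
except AttributeError as e:
    error_message = str(e)
    print(error_message)

# Suggested order: backend first, tokenizer second (None is a placeholder).
right = StubProcessor(asr, None)
joined = right.join_words(["hello", " world"])
print(joined)
```

With the swapped order the string "en" is treated as the ASR object, so the first attribute lookup on it raises the `AttributeError`. With the backend in the first position the lookup succeeds.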