mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

How does Java work

Jzow opened this issue · comments

Here is my code

from deepspeech import Model


audio_path = 'models/test1.wav'
# Path to the downloaded model (a correct model package contains a file ending in .pb)
model_path = "models/deepspeech-0.9.3-models.pbmm"
ars = Model(model_path)
translate_txt = ars.stt(audio_path)

print(translate_txt)

The following error occurred when I ran it. I couldn't find a similar problem in the existing issues.

C:\Users\Administrator\AppData\Local\Programs\Python\Python38\python.exe E:/iston_algorithm/util/speech/Test.py
TensorFlow: v2.3.0-6-g23ad988fcd
DeepSpeech: v0.9.3-0-gf2e9c858
2022-01-13 12:15:21.954871: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "E:/iston_algorithm/util/speech/Test.py", line 8, in <module>
    translate_txt = ars.stt(audio_path)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\deepspeech\__init__.py", line 162, in stt
    return deepspeech.impl.SpeechToText(self._impl, audio_buffer)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\site-packages\deepspeech\impl.py", line 175, in SpeechToText
    return _impl.SpeechToText(aCtx, aBuffer)
ValueError: invalid literal for int() with base 10: 'models/test1.wav'

Process finished with exit code 1

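I've solved it. stt() expects a buffer of 16-bit audio samples rather than a file path, so the WAV file has to be read into an array first: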

from deepspeech import Model
import numpy as np
import wave

audio_path = 'models/yasi2.wav'
# Path to the downloaded model (a correct model package contains a file ending in .pb)
model_path = "models/deepspeech-0.9.3-models.pbmm"
ars = Model(model_path)

# Read every frame as 16-bit samples; the with block closes the file afterwards
with wave.open(audio_path, 'rb') as fin:
    audio = np.frombuffer(fin.readframes(fin.getnframes()), np.int16)

translate_txt = ars.stt(audio)
print(translate_txt)

Run result:

C:\Users\Administrator\AppData\Local\Programs\Python\Python38\python.exe E:/iston_algorithm/util/speech/Test.py
TensorFlow: v2.3.0-6-g23ad988fcd
DeepSpeech: v0.9.3-0-gf2e9c858
2022-01-13 12:53:25.896530: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
a nine year old girl in new mexico has raised morethan five hundred dollars for her little brother who needs hart surgery in whosten texas this july ad assun were tolsy's grand mother came alread said addison probably overheared a conversation between family members talking about the fun's needed to get her little brather to treatment i guess she overheard her grandfather and me talking about how we are worried about how we are going to get to hoston for my grandson's hearp surgery said alrd she decided to go outside and have a lemonade stand and make some drawings and pictures and seldom that's when addestan and her friends herrecer and emily bordon decided to sell lemonade for fifty cents accup and sel pictures for twenty five cents each before all red new it new mexico state police officers were among the many stopping by helping the reach a total of five hundred ad sixty eight dollars the family turned to social media expressing their gratitude saying from the bottom of our hearts we would like to deeply thank each an every person that stopped by questions one and two a bestd on the mews report you ave just heard question one god id addison raised money for eston to how did abison raised money

Process finished with exit code 0
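
Reading the frames as int16 only gives a sensible result when the WAV file really holds 16-bit mono PCM at the rate the model expects. Below is a minimal sketch of a format check that could run before decoding. The helper name read_pcm16_mono and the constant EXPECTED_RATE are hypothetical and not part of the deepspeech package, and the 16 kHz default is an assumption about the released 0.9.3 English model (with a loaded model, ars.sampleRate() should report the actual value).

```python
import wave

# Assumption: the released 0.9.3 English model expects 16 kHz audio.
EXPECTED_RATE = 16000


def read_pcm16_mono(path, expected_rate=EXPECTED_RATE):
    """Return the raw PCM bytes of a WAV file after checking its format.

    Hypothetical helper for illustration; raises ValueError on a mismatch.
    """
    with wave.open(path, 'rb') as fin:
        # getsampwidth() is in bytes, so 2 means 16-bit samples
        if fin.getsampwidth() != 2:
            raise ValueError('expected 16-bit samples, got %d-bit'
                             % (fin.getsampwidth() * 8))
        if fin.getnchannels() != 1:
            raise ValueError('expected mono audio, got %d channels'
                             % fin.getnchannels())
        if fin.getframerate() != expected_rate:
            raise ValueError('expected %d Hz, got %d Hz'
                             % (expected_rate, fin.getframerate()))
        return fin.readframes(fin.getnframes())
```

The returned bytes can be converted with np.frombuffer(data, np.int16) and passed to ars.stt() as in the code above. A file that fails the check would need to be resampled or converted first.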

I would also like to know how to integrate DeepSpeech with Java using Maven.