Python SDK for Yandex SpeechKit API.
This library supports all Yandex SpeechKit methods, including “Streaming mode for short audio recognition”. For more information, see the Yandex SpeechKit API docs.
Assuming that you have Python and virtualenv installed, set up your environment and install the required dependencies as shown below. Alternatively, install the library using pip (the last command):
$ git clone https://github.com/TikhonP/yandex-speechkit-lib-python.git
$ cd yandex-speechkit-lib-python
$ virtualenv venv
...
$ . venv/bin/activate
$ python -m pip install -r requirements.txt
$ python -m pip install .
$ python -m pip install speechkit
See the speechkit documentation on Read the Docs or the speechkit docs in PDF for more info.
The library supports speech synthesis and recognition of both long and short audio. For more information, read the documentation.
First, create a session for authorization:
from speechkit import Session
# Placeholders: substitute your own credentials
oauth_token = '<oauth_token>'
folder_id = '<folder_id>'
api_key = '<api-key>'
jwt_token = '<jwt_token>'
# Create a session from whichever credential you have
oauth_session = Session.from_yandex_passport_oauth_token(oauth_token, folder_id)
api_key_session = Session.from_api_key(api_key)
jwt_session = Session.from_jwt(jwt_token)
Use the created session to make other requests.
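In practice, credentials usually come from the environment rather than from literals in the code. The following is a minimal sketch of that idea, not part of the library: the environment variable names (`YANDEX_API_KEY`, `YANDEX_JWT`, `YANDEX_OAUTH_TOKEN`, `YANDEX_FOLDER_ID`) and both helper functions are hypothetical. Only the three `Session.from_*` constructors come from the example above.

```python
import os


def pick_credentials(env):
    """Return (constructor_name, args) for the first credential found in env.

    The variable names are hypothetical; rename them to suit your setup.
    """
    if env.get('YANDEX_API_KEY'):
        return 'from_api_key', (env['YANDEX_API_KEY'],)
    if env.get('YANDEX_JWT'):
        return 'from_jwt', (env['YANDEX_JWT'],)
    if env.get('YANDEX_OAUTH_TOKEN') and env.get('YANDEX_FOLDER_ID'):
        return (
            'from_yandex_passport_oauth_token',
            (env['YANDEX_OAUTH_TOKEN'], env['YANDEX_FOLDER_ID']),
        )
    raise RuntimeError('No SpeechKit credentials found in the environment')


def make_session(env=None):
    """Build a speechkit Session from environment variables."""
    # Imported here so that pick_credentials can be used without speechkit.
    from speechkit import Session

    name, args = pick_credentials(os.environ if env is None else env)
    return getattr(Session, name)(*args)
```

With this in place, `session = make_session()` would yield a session that can be passed to the recognition and synthesis classes.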
There are also functions for getting credentials (read the documentation for more info): speechkit.auth.generate_jwt, speechkit.auth.get_iam_token, speechkit.auth.get_api_key
Short audio:
from speechkit import ShortAudioRecognition
# `session` is one of the Session objects created above
recognizeShortAudio = ShortAudioRecognition(session)
with open('/Users/tikhon/Desktop/out.wav', 'rb') as f:
    data = f.read()
print(recognizeShortAudio.recognize(data, format='lpcm', sampleRateHertz='48000'))
# Prints the recognized text, e.g. 'text that needs to be recognized'
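The example above sends the whole `.wav` file with `format='lpcm'` and a hard-coded sample rate. If, as the SpeechKit API docs describe, `lpcm` means PCM data without a WAV header, it may be safer to strip the header and read the real sample rate from the file. Here is a minimal sketch using only the standard `wave` module. The helper name `read_pcm` is hypothetical and not part of speechkit.

```python
import wave


def read_pcm(wav_source):
    """Return (pcm_bytes, sample_rate, channels) for a WAV path or file object.

    pcm_bytes holds the raw audio frames only, without the WAV header.
    """
    with wave.open(wav_source, 'rb') as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        pcm_bytes = wav.readframes(wav.getnframes())
    return pcm_bytes, sample_rate, channels
```

The result could then be passed on as `recognizeShortAudio.recognize(pcm_bytes, format='lpcm', sampleRateHertz=str(sample_rate))`, after checking that `channels` and `sample_rate` are values the service accepts.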
For long audio, see the example long_audio_recognition.py.
For streaming audio, see the example streaming_recognize.py.
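Streaming recognition generally works on a sequence of small audio chunks rather than on one large buffer. As a rough illustration of the chunking step only, the generator below splits any binary stream into fixed-size pieces. The function name `iter_chunks` and the default chunk size are assumptions; how the library actually consumes the chunks is shown in streaming_recognize.py.

```python
import io


def iter_chunks(stream, chunk_size=4000):
    """Yield chunks of at most chunk_size bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == '__main__':
    # Demo on an in-memory stream; a real source would be a file or microphone.
    demo = io.BytesIO(b'\x00' * 10)
    print([len(chunk) for chunk in iter_chunks(demo, chunk_size=4)])
```

A generator keeps memory use constant regardless of the length of the recording, which is why it is a common shape for streaming input.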
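Synthesis:
from speechkit import SpeechSynthesis
synthesizeAudio = SpeechSynthesis(session)
# Synthesize speech and write it to the file path given as the first argument.
# The Russian sample text means "Text that needs to be synthesized".
synthesizeAudio.synthesize(
    '/Users/tikhon/Desktop/out.wav', text='Текст который нужно синтезировать',
    voice='oksana', format='lpcm', sampleRateHertz='16000'
)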
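The example above requests `format='lpcm'` but saves the result under a `.wav` name. Assuming the service returns headerless 16-bit mono PCM for `lpcm`, as the SpeechKit API docs describe, most players will not open that file until a WAV header is added. Here is a minimal sketch that wraps raw PCM in a WAV container using the standard `wave` module. The helper name `pcm_to_wav` and its defaults are hypothetical and not part of speechkit.

```python
import wave


def pcm_to_wav(pcm_bytes, wav_target, sample_rate=16000, channels=1, sample_width=2):
    """Write raw PCM bytes to wav_target (a path or writable file object) as WAV.

    sample_width is in bytes per sample; 2 corresponds to 16-bit audio.
    """
    with wave.open(wav_target, 'wb') as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
```

For example, after synthesis you could read the saved bytes back and call `pcm_to_wav(raw_bytes, 'playable.wav', sample_rate=16000)`, where `playable.wav` is a placeholder name and the sample rate matches the `sampleRateHertz` you requested.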
- Provide a wide range of exceptions (there is only speechkit.exceptions.RequestError right now)
- Add troubleshooting headers to speechkit.Session
- Add gRPC streaming synthesis
MIT
Tikhon Petrishchev 2021