Yandex SpeechKit Python SDK

Python SDK for Yandex SpeechKit API.

This library supports all Yandex SpeechKit methods, including “Streaming mode for short audio recognition”. For more information, see the Yandex SpeechKit API docs.

Getting Started

Assuming that you have Python and virtualenv installed, set up your environment and install the library with its dependencies from source:

$ git clone https://github.com/TikhonP/yandex-speechkit-lib-python.git
$ cd yandex-speechkit-lib-python
$ virtualenv venv
...
$ . venv/bin/activate
$ python -m pip install -r requirements.txt
$ python -m pip install .

Alternatively, install the library from PyPI using pip:

$ python -m pip install speechkit
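
To confirm that either install route worked, a small check like the one below can help. It is only a sketch: `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the name `speechkit`, as in the pip command above.

```python
from importlib import metadata


def installed_version(package="speechkit"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if the package is installed, otherwise a short notice.
print(installed_version() or "speechkit is not installed in this environment")
```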

SpeechKit documentation

See speechkit readthedocs or speechkit docs in PDF for more info.

Using speechkit

The library supports speech synthesis and recognition of both long and short audio. For more information, see the documentation.

First, create a session for authorization:

from speechkit import Session

# Replace the placeholders with your own credentials
oauth_token = '<oauth_token>'
folder_id = '<folder_id>'
api_key = '<api-key>'
jwt_token = '<jwt_token>'

oauth_session = Session.from_yandex_passport_oauth_token(oauth_token, folder_id)
api_key_session = Session.from_api_key(api_key)
jwt_session = Session.from_jwt(jwt_token)

Use the created session to make other requests.

There are also functions for obtaining credentials (see the documentation for details): speechkit.auth.generate_jwt, speechkit.auth.get_iam_token, speechkit.auth.get_api_key
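
In practice, credentials are often kept out of the source code. The sketch below picks one of the three session constructors shown above based on environment variables. The variable names (`YANDEX_API_KEY`, `YANDEX_JWT`, `YANDEX_OAUTH_TOKEN`, `YANDEX_FOLDER_ID`) and the helper `choose_session_factory` are assumptions made for illustration, not part of the library.

```python
import os


def choose_session_factory(env=None):
    """Return (constructor_name, args) for the first credential found in `env`.

    The environment variable names are hypothetical; adapt them to your setup.
    """
    env = os.environ if env is None else env
    if env.get("YANDEX_API_KEY"):
        return "from_api_key", (env["YANDEX_API_KEY"],)
    if env.get("YANDEX_JWT"):
        return "from_jwt", (env["YANDEX_JWT"],)
    if env.get("YANDEX_OAUTH_TOKEN") and env.get("YANDEX_FOLDER_ID"):
        return (
            "from_yandex_passport_oauth_token",
            (env["YANDEX_OAUTH_TOKEN"], env["YANDEX_FOLDER_ID"]),
        )
    raise RuntimeError("No SpeechKit credentials found in the environment")


# Usage with the library (requires speechkit to be installed):
#   from speechkit import Session
#   name, args = choose_session_factory()
#   session = getattr(Session, name)(*args)
```

Returning the constructor name rather than a session keeps the helper free of network calls, so it can be checked without real credentials.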

For audio recognition

Short audio:

from speechkit import ShortAudioRecognition

# `session` is any of the Session objects created above, e.g. api_key_session
recognize_short_audio = ShortAudioRecognition(session)

# Read the audio file as raw bytes
with open('/Users/tikhon/Desktop/out.wav', 'rb') as f:
    data = f.read()

print(recognize_short_audio.recognize(data, format='lpcm', sampleRateHertz='48000'))

# Prints the recognized text, e.g. 'text that needs to be recognized'
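
The sample rate passed to the recognizer has to match the audio itself. If the input is a standard WAV file, the standard-library `wave` module can read the rate and the raw PCM frames from it. The helper below is a sketch with a hypothetical name (`read_wav_as_lpcm`). It also assumes that the `lpcm` format expects headerless PCM, which should be checked against the Yandex SpeechKit documentation.

```python
import wave


def read_wav_as_lpcm(path):
    """Return (pcm_bytes, sample_rate, channels, sample_width) for a WAV file.

    pcm_bytes holds the raw frames only, without the WAV header.
    """
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample, e.g. 2 for 16-bit
        pcm = wav.readframes(wav.getnframes())
    return pcm, sample_rate, channels, sample_width
```

The returned `pcm` bytes and `sample_rate` could then be passed as the data and `sampleRateHertz` arguments of the recognition call shown above.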

For long audio, see the example long_audio_recognition.py.

For streaming audio, see the example streaming_recognize.py.
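
Streaming recognition generally works on small pieces of audio rather than a whole file. The generator below is a generic sketch of how a file can be split into chunks. `iter_audio_chunks` and the default chunk size are assumptions for illustration and are not taken from the library. See streaming_recognize.py for the actual interface.

```python
def iter_audio_chunks(path, chunk_size=4000):
    """Yield successive byte chunks of at most `chunk_size` bytes from a file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            yield chunk
```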

For synthesis

from speechkit import SpeechSynthesis

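# `session` is any of the Session objects created above
synthesize_audio = SpeechSynthesis(session)
synthesize_audio.synthesize(
    # The text is Russian for "Text that needs to be synthesized"
    '/Users/tikhon/Desktop/out.wav', text='Текст который нужно синтезировать',
    voice='oksana', format='lpcm', sampleRateHertz='16000'
)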
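
If the file produced with `format='lpcm'` turns out to contain headerless PCM, most players will not open it directly. The sketch below wraps such data in a WAV container using the standard `wave` module. The helper name `wrap_lpcm_in_wav` is hypothetical, and the defaults (mono, 16-bit samples) are assumptions that should be verified against the Yandex SpeechKit documentation. The sample rate must match the one requested during synthesis.

```python
import wave


def wrap_lpcm_in_wav(pcm_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Copy raw PCM from `pcm_path` into a WAV file at `wav_path`.

    Assumes mono 16-bit samples by default; adjust if your audio differs.
    """
    with open(pcm_path, "rb") as src:
        pcm = src.read()
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```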

TODO

Provide a wider range of exceptions (currently there is only speechkit.exceptions.RequestError)
Add troubleshooting headers to speechkit.Session
Add gRPC streaming synthesis

License

MIT

Tikhon Petrishchev 2021
