drygdryg / yandex-speechkit-lib-python

Python SDK for Yandex SpeechKit API.

Yandex SpeechKit Python SDK

Python SDK for Yandex SpeechKit API.

This library supports all Yandex SpeechKit methods, including “Streaming mode for short audio recognition”. For more information, see the Yandex SpeechKit API Docs.

Getting Started

Assuming you have Python and virtualenv installed, clone the repository, set up a virtual environment, and install the required dependencies:

$ git clone https://github.com/TikhonP/yandex-speechkit-lib-python.git
$ cd yandex-speechkit-lib-python
$ virtualenv venv
...
$ . venv/bin/activate
$ python -m pip install -r requirements.txt
$ python -m pip install .

Alternatively, install the library using pip:

$ python -m pip install speechkit

SpeechKit documentation

See speechkit readthedocs or speechkit docs in PDF for more info.

Using speechkit

The library supports speech synthesis and recognition of both short and long audio. For more information, see the Documentation.

First, create a session for authorization:

from speechkit import Session

oauth_token = '<oauth_token>'
folder_id = '<folder_id>'
api_key = '<api-key>'
jwt_token = '<jwt_token>'

oauth_session = Session.from_yandex_passport_oauth_token(oauth_token, folder_id)
api_key_session = Session.from_api_key(api_key)
jwt_session = Session.from_jwt(jwt_token)

Use the created session to make all other requests.
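In practice only one of the three constructors shown above is needed. The sketch below picks one based on which credentials are present in the environment. The variable names (YANDEX_API_KEY, YANDEX_JWT, YANDEX_OAUTH_TOKEN, YANDEX_FOLDER_ID) and both helper functions are hypothetical and not part of speechkit; only the Session constructor names come from the example above.

```python
import os


def choose_session_args(env=None):
    """Return (constructor_name, args) for the first credential found.

    The environment variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    if env.get('YANDEX_API_KEY'):
        return 'from_api_key', (env['YANDEX_API_KEY'],)
    if env.get('YANDEX_JWT'):
        return 'from_jwt', (env['YANDEX_JWT'],)
    if env.get('YANDEX_OAUTH_TOKEN') and env.get('YANDEX_FOLDER_ID'):
        return (
            'from_yandex_passport_oauth_token',
            (env['YANDEX_OAUTH_TOKEN'], env['YANDEX_FOLDER_ID']),
        )
    raise RuntimeError('No SpeechKit credentials found in the environment')


def make_session(env=None):
    """Build a speechkit Session from whichever credential is available."""
    # Imported lazily so choose_session_args works without the package installed.
    from speechkit import Session

    name, args = choose_session_args(env)
    return getattr(Session, name)(*args)
```

Calling make_session() then returns a session that can be passed to the recognition and synthesis classes. Keeping the selection logic separate from the import makes it easy to check without network access.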

There are also functions for obtaining credentials (see the Documentation for more info): speechkit.auth.generate_jwt, speechkit.auth.get_iam_token, speechkit.auth.get_api_key

For audio recognition

Short audio:

from speechkit import ShortAudioRecognition

recognizeShortAudio = ShortAudioRecognition(api_key_session)  # any session created above works
with open('/Users/tikhon/Desktop/out.wav', 'rb') as f:
    data = f.read()

print(recognizeShortAudio.recognize(data, format='lpcm', sampleRateHertz='48000'))

# Prints the recognized text, e.g. 'text that needs to be recognized'
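The example above sends the whole .wav file with format='lpcm'. As far as the Yandex SpeechKit API docs describe it, lpcm means PCM data without a WAV header, so it can help to strip the header and read the actual sample rate from the file first. Below is a minimal sketch using only the standard library's wave module; read_pcm_from_wav is a hypothetical helper and not part of speechkit.

```python
import io
import wave


def read_pcm_from_wav(source):
    """Read a WAV file (path or binary file-like object).

    Returns (raw_pcm_bytes, sample_rate, channels, sample_width_bytes).
    """
    with wave.open(source, 'rb') as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        # readframes returns the sample data only, without the RIFF header.
        pcm = wav.readframes(wav.getnframes())
    return pcm, sample_rate, channels, sample_width


# Possible usage with the recognizer from the example above
# (assumes sampleRateHertz is passed as a string, as in that example):
#
# pcm, rate, channels, width = read_pcm_from_wav('/Users/tikhon/Desktop/out.wav')
# print(recognizeShortAudio.recognize(pcm, format='lpcm', sampleRateHertz=str(rate)))
```

Reading the rate from the file avoids a mismatch between the declared sampleRateHertz and the actual audio, which would otherwise degrade recognition quality.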

For long audio, see the example in long_audio_recognition.py.

For streaming audio, see the example in streaming_recognize.py.

For synthesis

from speechkit import SpeechSynthesis

synthesizeAudio = SpeechSynthesis(api_key_session)  # any session created above works
synthesizeAudio.synthesize(
    '/Users/tikhon/Desktop/out.wav',
    # Russian sample text: "Text that needs to be synthesized"
    text='Текст который нужно синтезировать',
    voice='oksana', format='lpcm', sampleRateHertz='16000'
)
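With format='lpcm', the SpeechKit API docs describe the output as raw PCM samples without a WAV header, so a file written this way may not open in ordinary players even if it is named out.wav. Below is a minimal sketch that wraps such data in a WAV container using the standard library. It assumes 16-bit mono samples at the requested sample rate; wrap_pcm_as_wav is a hypothetical helper and not part of speechkit.

```python
import io
import wave


def wrap_pcm_as_wav(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Return the bytes of a WAV file containing the given raw PCM data.

    Defaults assume 16-bit (2-byte) mono audio; adjust if your data differs.
    """
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
    return buf.getvalue()


# Possible usage after the synthesis call above:
#
# with open('/Users/tikhon/Desktop/out.wav', 'rb') as f:
#     raw = f.read()
# with open('/Users/tikhon/Desktop/out_playable.wav', 'wb') as f:
#     f.write(wrap_pcm_as_wav(raw, sample_rate=16000))
```

The sample_rate passed here should match the sampleRateHertz value used for synthesis, otherwise playback speed and pitch will be off.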

TODO

  • Provide a wider range of exceptions (currently there is only speechkit.exceptions.RequestError)
  • Add troubleshooting headers to speechkit.Session
  • Add gRPC streaming synthesis

License

MIT

Tikhon Petrishchev 2021
