CMU-TBD / tbd_polly_speech

ROS Wrapper for Amazon Polly Speech with local caching of generated audio.

tbd_polly_speech

License - MIT
Maintainer - zhi.tan@ri.cmu.edu

A ROS wrapper for the Amazon Polly Text-to-Speech service. Each generated sound file is also cached locally, so if you repeat the same sentence with the same voice, the local copy of the audio is used instead of synthesizing it again.
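The caching behaviour described above can be illustrated with a minimal sketch. This is not the package's implementation: the `SpeechCache` class and the `fake_synthesize` stand-in are hypothetical names introduced for illustration, and the real package stores audio files on disk rather than bytes in memory. The sketch only shows the idea that the pair (voice, text) acts as the cache key.

```python
from typing import Callable, Dict, Tuple


class SpeechCache:
    """Toy in-memory cache keyed on (voice_id, text). Hypothetical, for illustration."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize
        self._store: Dict[Tuple[str, str], bytes] = {}
        self.misses = 0  # number of times the synthesizer was actually called

    def get_audio(self, text: str, voice_id: str) -> bytes:
        key = (voice_id, text)
        if key not in self._store:
            # Cache miss: synthesize once and remember the result.
            self.misses += 1
            self._store[key] = self._synthesize(text, voice_id)
        return self._store[key]


def fake_synthesize(text: str, voice_id: str) -> bytes:
    """Stand-in for a real text-to-speech call; returns placeholder bytes."""
    return f"{voice_id}:{text}".encode("utf-8")
```

Requesting the same sentence twice with the same voice calls the synthesizer once, while changing the voice produces a separate cache entry.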

Dependencies

ROS Packages:

Python Dependencies

  • boto3

Usage

Running

  1. Make sure you have an AWS credentials file set up on the system (see the guide by AWS). Make sure the account has Amazon Polly enabled.
  2. Launch the backend services:
roslaunch tbd_polly_speech polly_speech.launch
  3. You can access the service either through the Python API:
from tbd_polly_speech import PollySpeech

ps = PollySpeech()

ps.speak("I am a good robot", voice_id='Joanna')
ps.speak("I am not a scary robot", voice_id='Joanna', block=False)
ps.wait()
ps.speak('Hello World. I will be interrupted', voice_id='Emma', block=False, cancel=True)
ps.stopAll()  # Interrupts the sentence and stops the voice command

or by calling the action server directly at the topic tbd_polly_speech/speak, using the action type pollySpeechAction.
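Before launching, step 1 above can be sanity-checked with a short script. The following is a sketch, not part of this package: `has_aws_profile` is a hypothetical helper that only inspects the shared credentials file (conventionally `~/.aws/credentials`) for a profile with both key fields. It does not contact AWS, and it does not cover credentials supplied by other means such as environment variables.

```python
import configparser
import os


def has_aws_profile(credentials_path: str, profile: str = "default") -> bool:
    """Return True if the credentials file has the given profile with both keys."""
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips files that do not exist.
    parser.read(credentials_path)
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section


if __name__ == "__main__":
    path = os.path.expanduser("~/.aws/credentials")
    print("credentials profile found:", has_aws_profile(path))
```

A result of False only means the file-based profile is missing; it does not say whether the account is allowed to use Amazon Polly.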

ROS Parameters

The launch file has three ROS parameters:

  • no_audio: set to true if you just want to simulate speech without actually playing any audio.

  • play_type: the default is sound_play from the ros-drivers/audio_common repository. You can install it with sudo apt install ros-melodic-audio-common. The alternative is the TBD lab's own audio stack (tbd_audio_common), which uses actionlib instead of ROS messages and plays faster.

  • polly_audio_storage_path: the directory in which to store the audio files and masterlist.txt, which records the mapping from phrases/text to filenames. The default is PACKAGE_ROOT/audio_storage.
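To make the role of masterlist.txt more concrete, here is a sketch of an on-disk index that maps a phrase and voice to a filename. The actual file format used by this package is not documented here, so everything below is an assumption: the one-JSON-object-per-line layout, the hash-based `.mp3` filenames, and the helper names `audio_filename`, `lookup`, and `record` are all hypothetical.

```python
import hashlib
import json
import os
from typing import Optional

MASTERLIST = "masterlist.txt"  # index file name taken from the parameter description


def audio_filename(text: str, voice_id: str) -> str:
    """Derive a stable filename from voice and text (hypothetical naming scheme)."""
    digest = hashlib.sha1(f"{voice_id}\n{text}".encode("utf-8")).hexdigest()
    return digest[:16] + ".mp3"


def lookup(storage_path: str, text: str, voice_id: str) -> Optional[str]:
    """Return the recorded filename for (text, voice_id), or None if not indexed."""
    index_path = os.path.join(storage_path, MASTERLIST)
    if not os.path.exists(index_path):
        return None
    with open(index_path, encoding="utf-8") as index_file:
        for line in index_file:
            if not line.strip():
                continue
            entry = json.loads(line)
            if entry["text"] == text and entry["voice_id"] == voice_id:
                return entry["filename"]
    return None


def record(storage_path: str, text: str, voice_id: str) -> str:
    """Append a new (text, voice_id) -> filename entry and return the filename."""
    os.makedirs(storage_path, exist_ok=True)
    filename = audio_filename(text, voice_id)
    entry = {"text": text, "voice_id": voice_id, "filename": filename}
    with open(os.path.join(storage_path, MASTERLIST), "a", encoding="utf-8") as index_file:
        index_file.write(json.dumps(entry) + "\n")
    return filename
```

In this sketch a caller would try `lookup` first and fall back to synthesizing and calling `record` only when it returns None.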

Voices

A list of voice IDs can be found here: https://docs.aws.amazon.com/polly/latest/dg/voicelist.html

Past Contributors

  • Joe Connolly - 07/2018
