AppDevGuy / OSSSpeechKit

OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.

OSSSpeechKit

OSSSpeechKit was developed to provide easier accessibility options to apps.

Apple does not make it easy to get the right voice, nor do they provide a simple way of selecting a language or using speech to text. OSSSpeechKit makes the hassle of trying to find the right language go away.

Requirements

  • Swift 5.0 or higher
  • iOS 13.0 or higher
  • CocoaPods

Supported Languages

The table below shows the original 37 languages first supported. Since v0.3.3, an additional 10 languages have been added.

English - Australian 🇦🇺
Hebrew 🇮🇱
Japanese 🇯🇵
Romanian 🇷🇴
Swedish 🇸🇪
Norsk 🇳🇴
Portuguese - Brazilian 🇧🇷
Hindi - Indian 🇮🇳
Korean 🇰🇷
Russian 🇷🇺
Chinese - Taiwanese 🇹🇼
Dutch - Belgium 🇧🇪
French - Canadian 🇨🇦
Hungarian 🇭🇺
Spanish - Mexican 🇲🇽
Arabic - Saudi Arabian 🇸🇦
Thai 🇹🇭
French 🇫🇷
Chinese 🇨🇳
Indonesian 🇮🇩
Norwegian 🇳🇴
Slovakian 🇸🇰
Turkish 🇹🇷
Finnish 🇫🇮
Chinese - Hong Kong 🇭🇰
English - Irish 🇮🇪
Polish 🇵🇱
English - South African 🇿🇦
English - United States 🇺🇸
Danish 🇩🇰
Czech 🇨🇿
Italian 🇮🇹
Portuguese 🇵🇹
Spanish 🇪🇸
English 🇬🇧
Dutch 🇳🇱
Greek 🇬🇷

Features

OSSSpeechKit offers simple text to speech and speech to text in 47 different languages.

OSSSpeechKit is built on top of both the AVFoundation and Speech frameworks.

You can achieve text to speech or speech to text in as little as two lines of code.

The speech will play over the top of other sounds such as music.

Installation

OSSSpeechKit is available through CocoaPods. To install it, simply add the following line to your Podfile:

pod 'OSSSpeechKit'

Implementation

Text to Speech

These methods enable you to pass in a string and hear the text played back.

Simple

import OSSSpeechKit

.....

// Declare an instance of OSSSpeechKit
let speechKit = OSSSpeech.shared
// Set the voice you wish to use. Enum cases are currently capitalised and named after the language or country
speechKit.voice = OSSVoice(quality: .enhanced, language: .Australian)
// Speak text written in the language you have set
speechKit.speakText(text: "Hello, my name is OSSSpeechKit.")

Advanced

import OSSSpeechKit

.....

// Declare an instance of OSSSpeechKit
let speechKit = OSSSpeech.shared
// Create a voice instance
let newVoice = OSSVoice()
// Set the language
newVoice.language = OSSVoiceEnum.Australian.rawValue
// Set the voice quality
newVoice.quality = .enhanced
// Set the voice of the speech kit
speechKit.voice = newVoice
// Initialise an utterance
let utterance = OSSUtterance(string: "Testing")
// Set the recognition task type
speechKit.recognitionTaskType = .dictation
// Set volume
utterance.volume = 0.5
// Set rate of speech
utterance.rate = 0.5
// Set the pitch
utterance.pitchMultiplier = 1.2
// Set speech utterance
speechKit.utterance = utterance
// Ask to speak
speechKit.speakText(text: utterance.speechString)

Speech to Text

Currently speech to text is offered in a very simple format. Starting and stopping of recording is handled by the app.

iOS 13 On-Device Speech to Text support is now available as of 0.3.0 🎉

SpeechKit implements delegates to handle the recording authorization, output of text and failure to record.

speechKit.delegate = self
// Call to start recording
speechKit.recordVoice()
// Call to end recording
speechKit.endVoiceRecording()

It is important that you include the following in your Info.plist:

Privacy - Speech Recognition Usage Description

Privacy - Microphone Usage Description

Without these, you will not be able to access the microphone or speech recognition.
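The two entries above are the display names Xcode shows; in the raw Info.plist they correspond to the keys NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription. The following is a minimal sketch of a debug-time check that both keys are present. The function name missingUsageDescriptions is hypothetical and not part of OSSSpeechKit.

```swift
import Foundation

// Raw Info.plist keys behind the two Xcode display names listed above.
let requiredUsageKeys = [
    "NSSpeechRecognitionUsageDescription", // Privacy - Speech Recognition Usage Description
    "NSMicrophoneUsageDescription"         // Privacy - Microphone Usage Description
]

/// Hypothetical helper: returns the required keys that are absent or blank
/// in the given Info.plist dictionary.
func missingUsageDescriptions(in info: [String: Any]) -> [String] {
    return requiredUsageKeys.filter { key in
        guard let value = info[key] as? String else { return true }
        return value.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
    }
}

// Typical call site, for example in a debug build before recordVoice() is called.
let missing = missingUsageDescriptions(in: Bundle.main.infoDictionary ?? [:])
if !missing.isEmpty {
    print("Missing Info.plist keys: \(missing)")
}
```

Running a check like this early tends to surface a missing usage description as a readable log line rather than as a failed authorization later on.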

Delegates

Returns the authorization status to the user. Its primary use is handling the non-authorized state.

func authorizationToMicrophone(withAuthentication type: OSSSpeechKitAuthorizationStatus)

If the speech recogniser and request fail to set up, this method will be called.

func didFailToCommenceSpeechRecording()

When the microphone has finished accepting audio, this delegate method will be called with the final best text output.

func didFinishListening(withText text: String)

For further information you can check out the Apple documentation directly.
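As a rough illustration of how the three delegate methods documented above might feed a screen's state, here is a self-contained sketch. The delegate protocol's name and the cases of OSSSpeechKitAuthorizationStatus are not shown in this README, so the sketch uses local stand-ins: MicrophoneAuthorization, SpeechEventHandling and TranscriptModel are all hypothetical names, and only the three method names come from the kit. In a real app the class would conform to the kit's own delegate protocol instead.

```swift
import Foundation

// Stand-in for OSSSpeechKitAuthorizationStatus; the real cases are an assumption here.
enum MicrophoneAuthorization {
    case authorized
    case notAuthorized
}

// Stand-in protocol mirroring the three delegate methods documented above.
protocol SpeechEventHandling: AnyObject {
    func authorizationToMicrophone(withAuthentication type: MicrophoneAuthorization)
    func didFailToCommenceSpeechRecording()
    func didFinishListening(withText text: String)
}

/// Collects delegate callbacks into state that a view could render.
final class TranscriptModel: SpeechEventHandling {
    private(set) var transcript = ""
    private(set) var errorMessage: String?

    func authorizationToMicrophone(withAuthentication type: MicrophoneAuthorization) {
        // Only the non-authorized state needs user-facing handling.
        if type == .notAuthorized {
            errorMessage = "Microphone or speech recognition access was not granted."
        }
    }

    func didFailToCommenceSpeechRecording() {
        errorMessage = "Recording could not be started."
    }

    func didFinishListening(withText text: String) {
        // Final best transcription; clear any earlier error.
        transcript = text
        errorMessage = nil
    }
}

// Simulate the callbacks a recording session might produce.
let model = TranscriptModel()
model.authorizationToMicrophone(withAuthentication: .notAuthorized)
model.didFinishListening(withText: "Hello")
```

Keeping the callbacks in a small model object like this makes the handling testable without a microphone.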

Other Features

List all available voices:

let allLanguages = OSSVoiceEnum.allCases

Get specific voice information:

// All supported languages
let allVoices = OSSVoiceEnum.allCases
// Language details
let languageInformation = allVoices[0].getDetails()
// Flag of country
let flag = allVoices[0].flag

The getDetails() method returns a struct containing:

struct OSSVoiceInfo {
    /// The name of the voice; all AVSpeechSynthesisVoice instances have a person's name.
    var name: String?
    /// The name of the language being used.
    var language: String?
    /// The language code is what is internationally used in Locale settings.
    var languageCode: String?
    /// Identifier is a unique bundle url provided by Apple for each AVSpeechSynthesisVoice.
    var identifier: Any?
}
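Because every field in the struct is optional, code that displays voice details needs fallbacks. Here is a minimal sketch of building a one-line label from those fields. VoiceInfo is a local mirror of the String fields shown above so the example runs on its own, label(for:) is a hypothetical helper, and the sample values are illustrative only.

```swift
import Foundation

// Local mirror of the String fields shown above (not the kit's own type).
struct VoiceInfo {
    var name: String?
    var language: String?
    var languageCode: String?
}

/// Hypothetical helper: joins whichever parts are present, for example
/// "Karen, English (en-AU)", and skips anything that is nil.
func label(for info: VoiceInfo) -> String {
    let main = [info.name, info.language].compactMap { $0 }.joined(separator: ", ")
    guard let code = info.languageCode else { return main }
    return main.isEmpty ? code : "\(main) (\(code))"
}

// Illustrative values only.
let sample = VoiceInfo(name: "Karen", language: "English", languageCode: "en-AU")
print(label(for: sample)) // Karen, English (en-AU)
```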

Other Info

The OSSVoiceEnum contains other methods, such as a hello message, title variable and subtitle variable so you can use it in a list.

You can also set the speech:

  • volume
  • pitchMultiplier
  • rate

As well as using an NSAttributedString.
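The volume, pitchMultiplier and rate values listed above have documented ranges on AVSpeechUtterance: volume runs from 0.0 to 1.0, rate from 0.0 to 1.0 with 0.5 as the default, and pitchMultiplier from 0.5 to 2.0. Assuming OSSUtterance accepts the same ranges, a small sketch of clamping user-supplied values before assigning them could look like this. SpeechSettings is a hypothetical helper type, not part of the kit.

```swift
import Foundation

/// Hypothetical value type holding the three settings.
/// Ranges follow AVSpeechUtterance and are assumed to apply to OSSUtterance too.
struct SpeechSettings {
    var volume: Float          // 0.0 (silent) to 1.0 (loudest)
    var rate: Float            // 0.0 to 1.0, with 0.5 as the default rate
    var pitchMultiplier: Float // 0.5 (lower) to 2.0 (higher)

    /// Returns a copy with every value pulled back into its valid range.
    func clamped() -> SpeechSettings {
        return SpeechSettings(
            volume: min(max(volume, 0.0), 1.0),
            rate: min(max(rate, 0.0), 1.0),
            pitchMultiplier: min(max(pitchMultiplier, 0.5), 2.0)
        )
    }
}

let requested = SpeechSettings(volume: 1.4, rate: 0.5, pitchMultiplier: 0.2)
let safe = requested.clamped()
// safe.volume == 1.0, safe.rate == 0.5, safe.pitchMultiplier == 0.5
```

The clamped values can then be assigned to the utterance, for example utterance.volume = safe.volume.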

There are plans to implement flags for each country as well as some more features, such as being able to play the voice if the device is on silent.

If the language or voice you require is not available, this is because either:

  • Apple has not made it available through AVFoundation;
  • or the SDK has not been updated to include the newly added voice.

Important Information

Apple do not make the voice of Siri available for use.

This kit makes Apple's AVFoundation voices available and easy to use, so you do not need to know all the voice codes, among many other things.

To say things correctly in each language, you need to set the voice to the correct language and supply text in that language; this SDK is not a translator.

Code Example:

If you want your app to use a Chinese voice, you need to ensure the text being passed in is Chinese.

Disclaimer: I do not know how to speak Chinese; I have used Google Translate for the Chinese characters.

Correct:

speechKit.voice = OSSVoice(quality: .enhanced, language: .Chinese)
speechKit.speakText(text: "你好我的名字是 ...")

Incorrect:

speechKit.voice = OSSVoice(quality: .enhanced, language: .Australian)
speechKit.speakText(text: "你好我的名字是 ...")

OR

speechKit.voice = OSSVoice(quality: .enhanced, language: .Chinese)
speechKit.speakText(text: "Hello, my name is ...")

The same principle applies to all other languages, such as German, Arabic and French. If the voice's language does not match the language of the text, the speech will not sound correct.
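A cheap guard against the mismatch described above is to check the script of the text before speaking it. The sketch below covers only the Chinese case, by looking for CJK Unified Ideographs. It is a heuristic on Unicode ranges rather than real language detection, and both function names are hypothetical and not part of OSSSpeechKit.

```swift
import Foundation

/// Rough check: true when the text contains at least one CJK Unified Ideograph
/// (U+4E00 to U+9FFF). Han characters are also used in Japanese, so this is
/// a script check, not a language check.
func containsHanCharacters(_ text: String) -> Bool {
    return text.unicodeScalars.contains { scalar in
        (0x4E00...0x9FFF).contains(scalar.value)
    }
}

/// Hypothetical guard: `voiceIsChinese` stands in for whatever check your app
/// makes on the currently selected voice.
func textPlausiblyMatches(voiceIsChinese: Bool, text: String) -> Bool {
    return containsHanCharacters(text) == voiceIsChinese
}

// Mirrors the correct and incorrect pairings shown above.
let chineseVoiceChineseText = textPlausiblyMatches(voiceIsChinese: true, text: "你好我的名字是")
let australianVoiceChineseText = textPlausiblyMatches(voiceIsChinese: false, text: "你好我的名字是")
let chineseVoiceEnglishText = textPlausiblyMatches(voiceIsChinese: true, text: "Hello, my name is")
```

An app could log a warning or fall back to a default voice when the check fails, instead of producing speech that sounds wrong.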

Contributions and Queries

If you have a question, please create a ticket or email me directly.

If you wish to contribute, please create a pull request.

Example Project

To run the example project, clone the repo, and run pod install from the Example directory first.

Unit Tests

For further examples, please look at the Unit Test class.

Author

App Dev Guy


License

OSSSpeechKit is available under the MIT license. See the LICENSE file for more info.
