Watson Developer Cloud iOS SDK

The Watson Developer Cloud iOS SDK is a collection of services to allow developers to quickly add Watson Cognitive Computing services to their Swift iOS applications.

Visit our Quickstart Guide to build your first iOS app with Watson!

The Watson Developer Cloud iOS SDK is currently in beta.

Upgrading to Xcode 7.3

Apple released Xcode 7.3 and Swift 2.2 on March 21, 2016. To use the Watson Developer Cloud iOS SDK with Xcode 7.3 you will have to rebuild all dependencies (including those with pre-built binaries), because there is no binary compatibility between Swift 2.1 and Swift 2.2.

Please use the terminal to navigate to your project directory and execute the following command: carthage update --platform iOS --no-use-binaries

This will rebuild all dependencies (including those with pre-built binaries) using Xcode 7.3 and Swift 2.2. Be aware that you will receive many warnings related to deprecations that will occur in Swift 3. These warnings do not affect the operation of the SDK and will be addressed in future releases of our dependencies.

Installation

The Watson Developer Cloud iOS SDK requires third-party dependencies such as ObjectMapper and Alamofire. The dependency management tool Carthage is used to help manage those frameworks. The recommended version of Carthage is v0.11 or higher.

There are two main methods to install Carthage. The first method is to download and run the Carthage.pkg installer. You can locate the latest release here.

The second method is to use Homebrew to download and install Carthage with the following command:

brew update && brew install carthage

Once the dependency manager is installed, the next step is to download the frameworks the SDK needs into your project path. Make sure you are in the root of the project directory and run the following command. The built frameworks can be found on the filesystem at ./Carthage/Build/iOS/.

carthage update --platform iOS
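
Note that carthage update resolves the dependencies listed in your project's Cartfile. If you have not yet added the SDK to a Cartfile, a minimal entry might look like the following (the repository path shown is an assumption, so substitute the repository you are actually installing from):

github "watson-developer-cloud/ios-sdk"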

For more details on using the iOS SDK in your application, please review the Quickstart Guide.

Frameworks Used:

IBM Watson Services

Getting started with Watson Developer Cloud and Bluemix

The IBM Watson™ Developer Cloud (WDC) offers a variety of services for developing cognitive applications. Each Watson service provides a Representational State Transfer (REST) Application Programming Interface (API) for interacting with the service. Some services, such as the Speech to Text service, provide additional interfaces.

IBM Bluemix™ is the cloud platform in which you deploy applications that you develop with Watson Developer Cloud services. The Watson Developer Cloud documentation provides information for developing applications with Watson services in Bluemix. You can learn more about Bluemix from the following links:

  • The IBM Bluemix documentation, specifically the pages What is Bluemix? and the Bluemix overview.
  • IBM developerWorks, specifically the IBM Bluemix section of IBM developerWorks and the article An introduction to the application lifecycle on IBM Bluemix.

Alchemy Language

The AlchemyLanguage API utilizes sophisticated natural language processing techniques to provide high-level semantic information about your content.

AlchemyLanguage Features
  • Entity Extraction
  • Sentiment Analysis
  • Keyword Extraction
  • Concept Tagging
  • Relation Extraction
  • Taxonomy Classification
  • Author Extraction
  • Language Detection
  • Text Extraction
  • Microformats Parsing
  • Feed Detection
Requirements
  • Review the original AlchemyLanguage API here
  • An Alchemy API Key
Usage

Instantiate an AlchemyLanguage object and set its API key:

let alchemyLanguage = AlchemyLanguage(apiKey: "your-apikey-here")

API calls are instance methods, and model class instances are returned as part of the callback. For example:

alchemyLanguage.getEntities(requestType: .URL,
  html: nil,
  url: "http://www.google.com",
  text: nil) {

    (error, entities) in

    // returned data is inside "entities" in this case
    // code here

}

Alchemy Vision

AlchemyVision is an API that can analyze an image and return the objects, people, and text found within the image. AlchemyVision can enhance the way businesses make decisions by integrating image cognition.

Links
  • AlchemyVision API docs here
  • Try out the demo
Requirements
Usage

Instantiate an AlchemyVision object and set its API key:

let alchemyVision = AlchemyVision(apiKey: "your-apikey-here")

API calls are instance methods, and model class instances are returned as part of the callback. For example:

let failure = { (error: NSError) in print(error) }

alchemyVision.getRankedImageFaceTags(url: url,
                                     failure: failure) { facetags in
    // code here
}

Dialog

The IBM Watson Dialog service provides a comprehensive and robust platform for managing conversations between virtual agents and users through an application programming interface (API). Developers automate branching conversations that use natural language to automatically respond to user questions, cross-sell and up-sell, walk users through processes or applications, or even hand-hold users through difficult tasks.

To use the Dialog service, developers script conversations as they would happen in the real world, upload them to a Dialog application, and enable back-and-forth conversations with a user.

Instantiate the Dialog service:

let dialog = Dialog(username: "your-username-here", password: "your-password-here")

Create a Dialog application by uploading a Dialog file:

var dialogID: Dialog.DialogID?
let failure = { (error: NSError) in print(error) }
dialog.createDialog(dialogName,
                    fileURL: fileURL,
                    failure: failure) { (dialogID) in
    // code here
}

Start a conversation with the Dialog application:

var conversationID: Int?
var clientID: Int?
let failure = { (error: NSError) in print(error) }
dialog.converse(dialogID!,
                failure: failure) { conversationResponse in
    // save conversation parameters
    conversationID = conversationResponse.conversationID
    clientID = conversationResponse.clientID
    
    // print message from Watson
    print(conversationResponse.response)
}

Continue a conversation with the Dialog application:

let failure = { (error: NSError) in print(error) }
dialog.converse(dialogID!,
                conversationID: conversationID!,
                clientID: clientID!,
                input: input,
                failure: failure) { conversationResponse in
                
    // print message from Watson
    print(conversationResponse.response)
}

The following links provide additional information about the IBM Watson Dialog Service:

Language Translator

The IBM Watson™ Language Translator service provides an Application Programming Interface (API) that lets you select a domain, customize it, identify or select the language of text, and then translate the text from one supported language to another.

How to instantiate and use the Language Translator service:

let languageTranslator = LanguageTranslator(username: "your-username-here", password: "your-password-here")
let failure = { (error: NSError) in print(error) }
languageTranslator.getIdentifiableLanguages(failure) { identifiableLanguage in
    // code here
}

The following links provide more information about the Language Translator service:

Natural Language Classifier

The IBM Watson™ Natural Language Classifier service uses machine learning algorithms to return the top matching predefined classes for short text inputs.

How to instantiate and use the Natural Language Classifier service:

let naturalLanguageClassifier = NaturalLanguageClassifier(username: "your-username-here", password: "your-password-here")
let failure = { (error: NSError) in print(error) }
naturalLanguageClassifier.classify(self.classifierIdInstanceId,
                                   text: "is it sunny?",
                                   failure: failure) { classification in
    // code here
}

The following links provide more information about the Natural Language Classifier service:

Personality Insights

The IBM Watson™ Personality Insights service provides an Application Programming Interface (API) that enables applications to derive insights from social media, enterprise data, or other digital communications. The service uses linguistic analytics to infer personality and social characteristics, including Big Five, Needs, and Values, from text.

let personalityInsights = PersonalityInsights(username: "your-username-here", password: "your-password-here")
let failure = { (error: NSError) in print(error) }
personalityInsights.getProfile(text: "Some text here",
                               failure: failure) { profile in
    // code here                          
}

The following links provide more information about the Personality Insights service:

Speech to Text

The IBM Watson Speech to Text service enables you to add speech transcription capabilities to your application. It uses machine intelligence to combine information about grammar and language structure to generate an accurate transcription. Transcriptions are supported for various audio formats and languages.

Recorded Audio

The following example demonstrates how to use the Speech to Text service to transcribe an audio file.

let bundle = NSBundle(forClass: self.dynamicType)
guard let fileURL = bundle.URLForResource("filename", withExtension: "wav") else {
	print("File could not be loaded.")
	return
}

let speechToText = SpeechToText(username: "your-username-here", password: "your-password-here")
let settings = TranscriptionSettings(contentType: .WAV)
let failure = { (error: NSError) in print(error) }

speechToText.transcribe(fileURL,
                        settings: settings,
                        failure: failure) { results in
    if let transcription = results.last?.alternatives.last?.transcript {
        print(transcription)
    }
}

Streaming Audio

Audio can also be streamed from the microphone to the Speech to Text service for real-time transcriptions. The following example demonstrates how to use the Speech to Text service with streaming audio. (Unfortunately, the microphone is not accessible from within the Simulator. Only applications on a physical device can stream microphone audio to Speech to Text.)

let speechToText = SpeechToText(username: "your-username-here", password: "your-password-here")

var settings = TranscriptionSettings(contentType: .L16(rate: 44100, channels: 1))
settings.continuous = true
settings.interimResults = true

let failure = { (error: NSError) in print(error) }
let stopStreaming = speechToText.transcribe(settings,
                                            failure: failure) { results in
    if let transcription = results.last?.alternatives.last?.transcript {
        print(transcription)
    }
}

// Streaming will continue until either an end-of-speech event is detected by
// the Speech to Text service or the `stopStreaming` function is executed.
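
For example, to end the microphone stream from your own code (say, when the user taps a hypothetical stop button), execute the function returned by transcribe:

// Stop streaming in response to a user action:
stopStreaming()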

Custom Capture Sessions

Advanced users who want to create and manage their own AVCaptureSession can construct an AVCaptureAudioDataOutput to stream audio to the Speech to Text service. This is particularly useful for users who would like to visualize an audio waveform, save audio to disk, or otherwise access the microphone audio data while simultaneously streaming to the Speech to Text service.

The following example demonstrates how to use an AVCaptureSession to stream audio to the Speech to Text service.

class ViewController: UIViewController {
    var captureSession: AVCaptureSession?
    
    override func viewDidLoad() {
        super.viewDidLoad()
        
        let speechToText = SpeechToText(username: "your-username-here", password: "your-password-here")
        
        captureSession = AVCaptureSession()
        guard let captureSession = captureSession else {
            return
        }
        
        let microphoneDevice = AVCaptureDevice.defaultDeviceWithMediaType(AVMediaTypeAudio)
        let microphoneInput = try? AVCaptureDeviceInput(device: microphoneDevice)
        if captureSession.canAddInput(microphoneInput) {
            captureSession.addInput(microphoneInput)
        }
        
        var settings = TranscriptionSettings(contentType: .L16(rate: 44100, channels: 1))
        settings.continuous = true
        settings.interimResults = true
        
        let failure = { (error: NSError) in print(error) }
        let outputOpt = speechToText.createTranscriptionOutput(settings,
                                                               failure: failure) { results in
            if let transcription = results.last?.alternatives.last?.transcript {
                print(transcription)
            }
        }
        
        guard let output = outputOpt else {
            return
        }
        let transcriptionOutput = output.0
        let stopStreaming = output.1
        
        if captureSession.canAddOutput(transcriptionOutput) {
            captureSession.addOutput(transcriptionOutput)
        }
        
        captureSession.startRunning()
    }
    
    // Streaming will continue until either an end-of-speech event is detected by
    // the Speech to Text service, the `stopStreaming` function is executed, or
    // the capture session is stopped.
}
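
As with the plain streaming example above, transcription can also be ended from your own code. A minimal sketch, assuming a hypothetical stop action inside the same view controller:

// Hypothetical stop action: stopping the capture session ends the stream to
// the Speech to Text service. (Executing `stopStreaming` also works.)
func stopTranscribing() {
    captureSession?.stopRunning()
}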

Additional Information

The following links provide additional information about the IBM Speech to Text service:

Text to Speech

The Text to Speech service gives your app the ability to synthesize spoken text in a variety of voices.

Create a TextToSpeech service:

let textToSpeech = TextToSpeech(username: "your-username-here", password: "your-password-here")

To call the service to synthesize text:

let failure = { (error: NSError) in print(error) }
textToSpeech.synthesize("Hello World", failure: failure) { data in
        // code here
}

When the callback function is invoked and the request was successful, the data object is an NSData structure containing WAVE-formatted audio at a 48kHz sampling rate with a single (mono) channel.

If you wish to play the audio through the device's speakers, create an AVAudioPlayer with that NSData object:

let audioPlayer = try AVAudioPlayer(data: data)
audioPlayer.prepareToPlay()
audioPlayer.play()

The Watson TTS service supports many voices with different genders, languages, and dialects. For a complete list, see the documentation or call the service's getVoices method to list the available voices in an asynchronous callback:

let failure = { (error: NSError) in print(error) }
textToSpeech.getVoices(failure) { voices in
    // code here
}

You can review the different voices and languages here.

To use a specific voice, such as Kate's, specify the voice identifier in the synthesize method:

textToSpeech.synthesize("Hello World", voice: SynthesisVoice.GB_Kate) { data in
    // code here
}

The following links provide more information about the Text To Speech service:

Tone Analyzer

The Tone Analyzer service uses linguistic analysis to detect three types of tones from text: emotion, social tendencies, and language style.

How to instantiate and use the Tone Analyzer service:

let username = "your-username-here"
let password = "your-password-here"
let versionDate = "YYYY-MM-DD" // use today's date for the most recent version
let service = ToneAnalyzer(username: username, password: password, versionDate: versionDate)

let failure = { (error: NSError) in print(error) }
service.getTone("Text that you want to get the tone of", failure: failure) { responseTone in
    print(responseTone.documentTone)
}

The following links provide more information about the Tone Analyzer service:

Visual Recognition

The Visual Recognition service helps you to understand the contents of images. Submit an image, and the service returns scores for relevant classifiers representing the things it recognizes. It can even detect objects, text, or faces.

Here is an example of how to use the service to detect faces in an image:

let apiKey = "your-apikey-here"
let versionDate = "YYYY-MM-DD" // use today's date for the most recent version

let service = VisualRecognition(apiKey: apiKey, version: versionDate)

let failure = { (error: NSError) in print(error) }
service.detectFaces(url, failure: failure) { imagesWithFaces in
    // code here
}

The following links provide more information about the Visual Recognition service:

Authentication

IBM Watson Services are hosted on the Bluemix platform. Before you can use a service in the SDK, the service must first be created in Bluemix and bound to an application, and you must have the credentials that Bluemix generates for that service. Alchemy services use a single API key, while all other Watson services use a username and password credential.
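
For reference, the two credential styles look like this in code (the placeholder values are replaced with your own service credentials):

// Alchemy services authenticate with a single API key:
let alchemyLanguage = AlchemyLanguage(apiKey: "your-apikey-here")

// All other Watson services authenticate with the username and password that
// Bluemix generates when the service is bound to an application:
let textToSpeech = TextToSpeech(username: "your-username-here", password: "your-password-here")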

Build + Test

Xcode is used to build the project for testing and deployment. To build the project, select Product -> Build For -> Testing from Xcode's menu.

In order to build the project and run the unit tests, a credentials.plist file needs to be populated with the proper credentials to communicate with the running Watson services. A copy of this file is located in the project's folder under Source/SupportingFiles. The credentials.plist file contains a key and value for each service's username and password. For example, Personality Insights has a key of PersonalityInsightsUsername for the username and a key of PersonalityInsightsPassword for the password. A username and password can be obtained from a running Watson service on Bluemix. Please refer to the IBM Watson Services section for more information about Watson services and Bluemix.
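
As an illustration, a test might read these values from the bundled plist before instantiating a service. The snippet below is a sketch rather than code taken from the SDK's test suite; it assumes credentials.plist is included in the test bundle and uses the Personality Insights key names described above:

// Load test credentials from credentials.plist (assumed to be in the test bundle).
let bundle = NSBundle(forClass: self.dynamicType)
guard let path = bundle.pathForResource("credentials", ofType: "plist"),
      credentials = NSDictionary(contentsOfFile: path) as? [String: String],
      username = credentials["PersonalityInsightsUsername"],
      password = credentials["PersonalityInsightsPassword"] else {
    print("Unable to read credentials.")
    return // assumes this code runs inside a function, such as a test's setUp()
}
let personalityInsights = PersonalityInsights(username: username, password: password)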

There are many tests already in place, positive and negative, that can be displayed when selecting the Test Navigator in Xcode. Right-click on the test you want to run and select Test in the context menu to run that specific test. You can also select a full node and right-click to run all of the tests in that node or service.

Tests can be found in the ServiceName+Tests target, as well as in each individual service’s directory. All of them can be run through Xcode’s testing interface using XCTest. Travis CI will also execute tests for pull requests and pushes to the repository.
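
For illustration, an asynchronous service test typically follows the pattern below: create an expectation, call the service, fulfill the expectation in the completion handler, and wait. This is a sketch rather than a copy of an existing test; the class name and assertion are illustrative, and the import of the Watson framework built by Carthage is omitted:

import XCTest

class TextToSpeechTests: XCTestCase {

    func testSynthesize() {
        let textToSpeech = TextToSpeech(username: "your-username-here", password: "your-password-here")
        let expectation = expectationWithDescription("Synthesize text to spoken audio.")

        let failure = { (error: NSError) in XCTFail("Synthesize failed: \(error)") }
        textToSpeech.synthesize("Hello World", failure: failure) { data in
            XCTAssertGreaterThan(data.length, 0) // expect non-empty WAVE audio
            expectation.fulfill()
        }

        waitForExpectationsWithTimeout(30) { error in
            XCTAssertNil(error, "Timed out waiting for the service to respond.")
        }
    }
}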

Open Source @ IBM

Find more open source projects on the IBM GitHub page.

License

This library is licensed under Apache 2.0. Full license text is available in LICENSE.

This SDK is intended solely for use with an Apple iOS product and intended to be used in conjunction with officially licensed Apple development tools.

Contributing

See CONTRIBUTING on how to help out.
