mramshaw / Google-Assistant

Getting familiar with Google Assistant

Google Assistant

Having heavily investigated Amazon Alexa, it seemed to be time to take a good look at Google Assistant.

About Google Assistant

Google Assistant is Google's voice assistant, which grew out of Google Now.

Wikipedia has a pretty good article about Google Assistant:

http://en.wikipedia.org/wiki/Google_Assistant

Software versus Hardware

Google Assistant is the software part of Google's voice offerings while Google Home devices are (one of) the hardware components.

Devices

In addition to Google Home and other Google devices, Google Assistant is available on Android devices (including Wear OS devices such as smartwatches) and also iOS devices.

So what's it good for?

An awful lot of things. For one thing, it's pretty much a fully-fledged VUI (voice user interface), as you can access just about every feature on your Android phone with a simple voice command. And with the upcoming App Actions it will soon be possible to call into your phone apps (this will require some action on the part of app developers, however).

Installation

Google Assistant is available for a variety of different devices, often pre-installed.

Android phones

To enable Google Assistant on your Android phone, navigate to the Google Play store and install it:

http://play.google.com/store/apps/details?id=com.google.android.apps.googleassistant

Entry-level Android devices

For entry-level Android phones there is Google Assistant Go:

http://play.google.com/store/apps/details?id=com.google.android.apps.assistant

[This app should come pre-installed on Android (Go edition) devices.]

There is a nice summary of this app's uses here:

http://support.google.com/assistant/answer/7556235

iOS Devices

To enable Google Assistant on iOS devices, navigate to the App Store and install it:

http://itunes.apple.com/us/app/the-google-assistant-get-help-anytime-anywhere/id1220976145

[As might be expected, Google Assistant is not integrated as tightly with iOS as Siri is, so there are things that Siri can do that Google Assistant cannot. Nevertheless, the reviews are pretty good.]

Voices

Internally, Google Assistant has several voices available. These are colour-coded, with names like "Red" and "Purple". Additionally, there are voices with names that subtly hint at their origins - these have names like "British Racing Green" and "Sydney Harbour Blue".

As you would expect, the default voice (Red) is actually the best choice - although being able to choose a personalized custom voice is a very nice touch.

Cancel

It is possible to stop at any point by saying (or typing) "cancel".

Here in the Actions Simulator we can see this (Android devices will look slightly different but function in the same way):

[Screenshot: Google Assistant cancel]

Likewise, the large "X" (top left) may be tapped or clicked to terminate the app.

In the Actions Simulator, the "cancel" lozenge may be tapped or clicked to terminate the app.

[This lozenge will not be presented on Android devices unless programmed.]

Wake Word

To let Google know you want to invoke a Google Action, start with:

"Hey Google"

As in:

"Hey Google, Talk to Peanut Allergy Facts"

Here Peanut Allergy Facts is the app to be invoked, and Hey Google is what is known as a Wake Word.

Google have taken the concept of a wake word literally, as you can set up your Android phone to wake up whenever you say the words Hey Google - this requires configuration and is not the default behaviour.

As opposed to Amazon Alexa, which expects verbs such as Open, Launch or Start, in Google Assistant the standard way to invoke an app is with Talk to (of course you may still specify Open, Launch or Start instead).

This breaks down as follows:

Vendor    Wake word      Verb       App name
Amazon    Alexa,         open       Peanut Allergy Facts
Google    Hey Google,    Talk to    Peanut Allergy Facts

This difference in the default verb is probably insignificant, as both vendors allow customization.

[Amazon also allows customization of its wake word.]
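The breakdown above can be sketched in a few lines of Python. This is only a toy illustration of the wake word / verb / app name split, not how either vendor actually parses speech; the lists of wake words and verbs are taken from the text above, and the function name `parse_invocation` is my own invention.

```python
import re

# Wake words and verbs mentioned in the notes above. Both lists are
# illustrative assumptions, not an exhaustive vendor specification.
WAKE_WORDS = ("hey google", "ok google", "alexa")
VERBS = ("talk to", "open", "launch", "start")

# Optional wake word (with optional comma), then a verb, then the app name.
INVOCATION = re.compile(
    r"^(?:({wake})\s*,?\s+)?({verbs})\s+(.+)$".format(
        wake="|".join(re.escape(w) for w in WAKE_WORDS),
        verbs="|".join(re.escape(v) for v in VERBS),
    ),
    flags=re.IGNORECASE,
)


def parse_invocation(utterance):
    """Split an utterance into (wake_word, verb, app_name).

    wake_word is None when the utterance has no wake word (for example
    when typing into the Google Assistant app). Returns None when the
    utterance does not look like an invocation at all.
    """
    match = INVOCATION.match(utterance.strip())
    if match is None:
        return None
    wake_word, verb, app_name = match.groups()
    return (wake_word, verb, app_name.strip())
```

Calling `parse_invocation("Hey Google, Talk to Peanut Allergy Facts")` splits the phrase into the same three parts as the Google row of the table, while a wake-word-free phrase such as "start Peanut Allergy Facts" yields `None` in the first slot.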

Smartwatches

With a Wear OS device, either say "Ok Google" or else press and hold the power button to get started.

Smartphones

If you are using the Google Assistant app on your smartphone, simply press the Google Assistant icon.

There is no need to say either "Hey Google" or "Ok Google" if you are talking to the Google Assistant app.

Simply say "start Peanut Allergy Facts" and proceed accordingly.

Easter Eggs

Google Assistant comes with a nice selection of Easter Eggs.

Say any of the following phrases for an Easter Egg:

"Testing"
"Tell me a joke"
"Tell me a story"
"Do a barrel roll"
"What's the loneliest number?"
"Make me a sandwich"
"When am I?"
"Can you pass the turing test?"
"I am your father"
"Set phasers to stun"
"Set phasers to kill"
"It's my birthday"
"Do you want to build a snowman?"
"How many roads must a man walk down?"

[There are other Easter Eggs as well. They are mostly topical, so some of them may be replaced or removed.]

Google Actions

Individual components for Google Assistant are called Actions.

These are similar to, but not quite the same thing as, what Amazon Alexa calls Skills. It seems Google uses a very narrow definition - as in, an action consists of an intent and its matching fulfillment - while Alexa skills are groupings of these, usually oriented towards a particular end use. This apparently explains the gap between the 80,000 or so current Alexa Skills and Google's purported 1,000,000+ Actions.

To see available Google Actions, refer to:

http://assistant.google.com/explore

Note that certain actions may not be available in all languages or all regions.

It is possible to configure Google Assistant so as to trigger multiple actions with a single voice command.

Google Actions offers Built-in intents, Templates and Home automation - any of which may serve your purposes.

For everything else there are Custom intents - which will open a Dialogflow console in a new window.

Actions on Google

Internally the various web pages that address Google Actions refer to them as Actions on Google.

Simulator

Actions on Google has an excellent simulator, which is invaluable for testing.

Be aware that there can be subtle differences between how things look and sound in this simulator and how they look and sound on an actual device. As always, remember to test on any targeted devices, as this is the real test of your app.

Languages

It is possible to make specific Actions on Google multilingual. While this may be fine for simpler projects, in my experience it is not a good idea for projects that will use Dialogflow (generally more advanced projects). While there will be some extra maintenance with having separate projects for each language, it will simplify testing and has other benefits.

Likewise, unless you are planning on addressing regional language differences (using specific vocabulary and terms for American English versus British English, say) then it is another good practice to NOT specify language locales - this will also simplify testing, but mainly reduces maintenance efforts.

Dialogflow

While it is possible to create simple actions within the Google Actions console, for more sophisticated actions there is the aptly-named Dialogflow.

Google originally purchased API.AI which it rebranded as Dialogflow. [API.AI was previously known as Speaktoit. Note that the API.AI URL redirects to Dialogflow.] Nevertheless, YouTube videos and the like occasionally still refer to API.AI; any changes are generally minor and cosmetic.

One interesting thing about Dialogflow is that it can interact with multiple backend services, such as Slack and Alexa (it refers to these as integrations). It is not limited to Google Actions although these are obviously the prime target. However, it does require a Google Project for the frontend portion.

What Alexa calls skills and Google calls apps, Dialogflow calls Agents.

Dialogflow offers Prebuilt agents as well as Small Talk - both of which may serve your purposes and are well worth a look.

Agents are generally coordinated within a Request/Response format, possibly using webhooks.

Individual intents must be established - after which optional entities may be established and fulfillment or integrations may take place.

Caveat

If work is to be carried out in a team setting, be aware that Dialogflow (and Google Actions) offer limited protection against concurrent access. Random undocumented errors may occur if more than one person is trying to modify a given resource at any one time.

Dialogflow has good Import/Export functionality and it is probably a best practice to use this to take frequent backups - as corruption may easily occur if there is multi-user access.

[This is not often a problem, but may well be serious if it happens. Forewarned is forearmed.]

Intents

Broadly speaking these are the main concepts of a question or statement.

[I have diagrammed these for the Wit.ai API.]

Intents can be individually tested from within Dialogflow.

Follow-up intents are a special case - these can only match if their parents have previously matched.

[Screenshot: Dialogflow Follow-up intents]

Read more about Follow-up intents here:

http://cloud.google.com/dialogflow/docs/contexts-follow-up-intents
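The rule that a follow-up intent can only match after its parent has matched can be modelled with contexts. The following is a toy sketch, not Dialogflow's actual matching engine: the classes `Intent` and `ToyAgent`, the intent names and the context name are all hypothetical, phrase matching is exact rather than machine-learned, and context lifespans are ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Intent:
    name: str
    phrases: Tuple[str, ...]
    input_context: Optional[str] = None   # must be active for a match
    output_context: Optional[str] = None  # activated once matched


class ToyAgent:
    """Matches utterances against intents, honouring input contexts."""

    def __init__(self, intents):
        self.intents = intents
        self.active_contexts = set()

    def match(self, utterance):
        text = utterance.lower().strip()
        for intent in self.intents:
            # A follow-up intent is skipped until its parent has set the context.
            if intent.input_context and intent.input_context not in self.active_contexts:
                continue
            if text in intent.phrases:
                if intent.output_context:
                    self.active_contexts.add(intent.output_context)
                return intent.name
        return "fallback"


def build_demo_agent():
    """A hypothetical parent intent with one 'yes' follow-up."""
    return ToyAgent([
        Intent("order_pizza", ("order a pizza",), output_context="order_pizza-followup"),
        Intent("order_pizza - yes", ("yes",), input_context="order_pizza-followup"),
    ])
```

With this agent, saying "yes" before "order a pizza" falls through to the fallback, while saying it afterwards matches the follow-up intent.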

Parameters

These seem to be parameters of the Intent question or statement, for instance in the phrase:

"tell me yesterday's weather"

[Screenshot: Dialogflow Parameter]

In this case yesterday constitutes an intent parameter of type 'date', which can be given a default value for when the user does not supply one.
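To make the idea of a date parameter with a fallback value concrete, here is a minimal sketch. It is not how Dialogflow extracts dates (its system entities are far more capable); the helper `extract_date` and the small table of relative day words are assumptions for illustration only.

```python
import datetime
import re

# Hypothetical lookup of relative day words to day offsets.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def extract_date(utterance, today):
    """Return the date named in the utterance, or `today` as the default."""
    text = utterance.lower()
    for word, offset in RELATIVE_DAYS.items():
        # \b keeps "yesterday's" matching while avoiding partial words.
        if re.search(r"\b" + word + r"\b", text):
            return today + datetime.timedelta(days=offset)
    return today  # no date given, so fall back to the default
```

For "tell me yesterday's weather" the parameter resolves to the day before `today`; for "tell me the weather" it falls back to `today`.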

Entities

These are the concepts to be established in the dialogue, such as departure date or animals.

In general they are useful for whitelisting a finite set of acceptable values. For instance, a finite list of cities (such as "New York, Paris, Tokyo"). For maintaining an infinite list (such as any city) they do not work well.

For certain specialized uses where they can be precisely defined programmatically (such as dates, for instance) they work exceptionally well.

For more information, refer to the documentation:

http://cloud.google.com/dialogflow/docs/entities-overview
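The whitelisting role of an entity can be sketched as a lookup from synonyms to a reference value. The cities come from the example above; the synonyms, the name `CITY_ENTITY` and the function `resolve_entity` are hypothetical, and a real entity would be defined in the Dialogflow console rather than in code.

```python
# Reference value -> accepted synonyms (all lower case).
CITY_ENTITY = {
    "New York": ("new york", "nyc", "the big apple"),
    "Paris": ("paris",),
    "Tokyo": ("tokyo",),
}


def resolve_entity(entity, value):
    """Map a user-supplied value to its reference value, or None if not whitelisted."""
    needle = value.strip().lower()
    for reference, synonyms in entity.items():
        if needle in synonyms:
            return reference
    return None
```

Anything outside the finite list (say "London") resolves to `None`, which is exactly why this approach does not scale to "any city".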

Fulfillment

Generally speaking, these operate as extension points to the existing dialogue flow.

Normally these would operate as external API calls.

Fulfillment is code that's deployed as a webhook

http://cloud.google.com/dialogflow/docs/fulfillment-overview

[It may also consist of code that is defined in the Inline Editor. It's either/or, as in EITHER a webhook OR inline code. The inline code will be JavaScript (Node.js) and will be deployed to Cloud Functions for Firebase.]
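As a rough idea of what a webhook does, here is a minimal sketch of the request/response handling, written as a plain function so it can sit behind any web framework. The field names (`queryResult`, `intent.displayName`, `parameters`, `fulfillmentText`) follow the Dialogflow v2 webhook format as I understand it, so check them against the documentation linked above; the intent name `get_weather` and the replies are made up.

```python
import json


def handle_webhook(request_body):
    """Take a webhook request body (JSON text) and return a response body (JSON text)."""
    request = json.loads(request_body)
    query_result = request.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    parameters = query_result.get("parameters", {})

    if intent_name == "get_weather":  # hypothetical intent name
        # An empty or missing parameter falls back to "today".
        date = parameters.get("date") or "today"
        reply = "Looking up the weather for {}.".format(date)
    else:
        reply = "Sorry, I can't help with that yet."

    # The agent speaks or displays whatever is returned as fulfillmentText.
    return json.dumps({"fulfillmentText": reply})
```

A real webhook would call an external API at the point where the reply is built, which is the "extension point" role described above.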

Integrations

Generally speaking, integrations will be to Google Assistant - but many other options are possible.

Small Talk

This is where to set up responses to random human utterances, which is generally known as Small Talk. Wikipedia has a pretty good article:

http://en.wikipedia.org/wiki/Small_talk

Generally it seems to be for social purposes; how appropriate it is for human-computer interactions is still to be established.

Google Assistant has a number of preprogrammed responses but it is possible to create responses specific to your app here. So basically app-specific Easter Eggs.

Surprisingly, the answer to the most useful question ("tell me about yourself") does not seem to be available in the "About agent" list to be over-ridden (however the Google-supplied replies seem to be adequate). Still, it might be nice to be able to define more specific responses about the particular app being interacted with.

Code repositories

Dialogflow maintains a useful set of code repositories:

http://github.com/dialogflow

SSML

SSML or Speech Synthesis Markup Language is a markup language that was created by the W3C’s Voice Browser working group. It is used in Amazon Alexa (and probably other voice apps) as well as in Google Assistant.

The version of SSML available for Google Assistant is a subset of the W3 SSML specification but also includes Google-specific SSML extensions, such as <par> and <seq> (these respectively allow parallel playback of media clips and sequential playback of media clips).

While SSML is supported, it is not supported in the Dialogflow simulator:

Note: SSML is supported in the Actions Simulator, but not the Dialogflow simulator.

Likewise SSML is not fully supported:

Note that not all of the elements and options described in the W3 SSML specification are currently supported by the Actions on Google platform.

Both of the above quotes are from the following page:

http://developers.google.com/actions/reference/ssml

[This page is worth bookmarking.]
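Since SSML is just XML, a response can be assembled and sanity-checked before it is sent. The sketch below is an assumption-laden illustration: it uses only the standard `<speak>` and `<break>` elements, the helper names `build_ssml` and `is_well_formed` are my own, and the check only confirms well-formed XML, not that the platform supports every element used.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500):
    """Join sentences into one <speak> document with a pause between them."""
    pause = '<break time="{}ms"/>'.format(pause_ms)
    # Escape &, < and > so user-supplied text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    return "<speak>{}</speak>".format(body)


def is_well_formed(ssml):
    """Return True if the text parses as XML with <speak> as its root."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError:
        return False
    return root.tag == "speak"
```

Remember to try the result in the Actions Simulator rather than the Dialogflow simulator, for the reason quoted above.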

Certification

While Alexa offers beta testing, Google Assistant offers alpha testing and beta testing.

Unlike Alexa, Google Assistant will increment a version number as each successive Google Action is certified.

Also unlike Alexa, Google requires a published privacy policy as a part of its certification process (it is impossible to get an Action certified without one, even if it captures no data of any kind).

[It seems that certification takes about 4 business days.]

Privacy

As noted above, Google has taken steps to be transparent as to the type of data it captures.

Read Google's published Privacy Policy:

http://policies.google.com/privacy

Also, each certified Action has a published privacy policy.

If privacy is a concern, it is possible to view (and manage) the personal information that Google tracks:

http://myactivity.google.com/

[This is very much worth looking at, if simply to see the type of granular detail that Google tracks.]

The following privacy checkup link is worth a look as well:

http://myaccount.google.com/intro/privacycheckup

Google appears to be fairly responsible in the way that it always asks for consent to capture personal data (such as voice snippets, or gesture usage, etc). While Google Apps generally require opting-in to this type of data capture (if only for voice recognition or gesture recognition purposes), permission can always be revoked at a later stage (and the captured data can also be deleted).

Reference

One or two useful references are listed below.

Dialogflow Concepts

A good place to start is by reading up on Dialogflow Concepts:

http://cloud.google.com/dialogflow/docs/concepts

Google's Conversation design

Some very useful content that is well worth a read:

http://designguidelines.withgoogle.com/conversation/

[These guides seem to include a lot of information that doesn't appear anywhere else.]

Google Design is a cooperative effort led by a group of designers, writers, and developers at Google. We work across teams to publish original content, produce events, and foster creative and educational partnerships that advance design and technology.

From: http://design.google/resources/

To Do

  • Continue testing
  • Investigate Firebase integration
  • Write some webhooks to investigate fulfillment
  • Add notes on what Google Assistant is useful for
  • Add practical considerations for working in a team environment
  • Add comprehensive installation notes
  • Add a selection of Easter Eggs
  • Add a Reference section
  • Investigate Dialogflow
  • Investigate Dialogflow and SSML
  • Investigate Dialogflow fulfillment
  • Investigate Dialogflow integrations (other than Google Actions)
  • Investigate Dialogflow Small Talk
  • Investigate Google Assistant versus Alexa versus Siri
  • Investigate Google Stackdriver logging
  • Update links for the migration of Dialogflow documentation to Google docs
  • Publish a Google Action