Edison is a voice-activated way to navigate the Chrome browser.
The voice interface can be triggered by the wakeword "Hey Edison", followed by one of the supported commands documented in the User Guide below.
You can download the extension from the Chrome Web Store here.
The following commands are currently supported:
- Open
  Opens the first Google search result that best matches the words spoken after the "open" command.
  Examples: "Hey Edison, Open News", "Hey Edison, Open YouTube", "Hey Edison, Open Netflix".
- Click
  Tries to click anything that approximately matches the words spoken after the "click" command.
  Examples: "Hey Edison, Click Sign-In", "Hey Edison, Click the title of a video", "Hey Edison, Click a Netflix profile".
- Close
  Closes the current tab. Useful if an unintended tab was opened by mistake.
  Example: Just say "Hey Edison, Close tab".
- Scroll
  Scrolls the page up/down/left/right.
  Examples: "Hey Edison, Scroll Down", "Hey Edison, Scroll Up", "Hey Edison, Scroll Left", "Hey Edison, Scroll Right".
- Media Controls for Video
  Plays or pauses the video in the current tab.
  Example: Just say "Hey Edison, Play" or "Hey Edison, Pause" when viewing a video.
- Focus Next Tab
  Navigates to the next tab.
  Example: "Hey Edison, Focus Next Tab".
- Focus Previous Tab
  Navigates to the previous tab.
  Example: "Hey Edison, Focus Previous Tab".
- Go back
  Hits the browser back button.
  Example: "Hey Edison, Go Back".
- Go forward
  Hits the browser forward button.
  Example: "Hey Edison, Go Forward".
- Rewind
  Specific to Netflix. Rewinds the current Netflix title by 10 seconds.
  Example: Just say "Hey Edison, Rewind" when viewing a Netflix title.
- Skip
  Specific to Netflix. Fast forwards the current Netflix title by 10 seconds.
  Example: Just say "Hey Edison, Skip" when viewing a Netflix title.
Note that the interface currently handles one command at a time, so you need to invoke it again with "Hey Edison" before each command.
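To make the "wake word plus one command" shape concrete, here is a minimal sketch of how a single recognized utterance could be split into a command and its argument. It is an illustration only: `WAKE_WORD`, `COMMANDS`, and `parseTranscript` are hypothetical names, not taken from the Edison source, and the real extension may parse speech differently.

```javascript
// Hypothetical sketch: none of these names come from the Edison source code.
const WAKE_WORD = "hey edison";

// Command phrases as listed in the User Guide.
const COMMANDS = [
  "open", "click", "close", "scroll", "play", "pause",
  "focus next tab", "focus previous tab",
  "go back", "go forward", "rewind", "skip",
];

// Splits one utterance into a command and the words that follow it.
// Returns null when the wake word or a known command is missing.
function parseTranscript(transcript) {
  // Lowercase, replace punctuation (except hyphens) with spaces, collapse whitespace.
  const words = transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  // The utterance must begin with the wake word as whole words.
  if (words !== WAKE_WORD && !words.startsWith(WAKE_WORD + " ")) return null;
  const rest = words.slice(WAKE_WORD.length).trim();

  for (const command of COMMANDS) {
    if (rest === command || rest.startsWith(command + " ")) {
      return { command, argument: rest.slice(command.length).trim() };
    }
  }
  return null;
}

console.log(parseTranscript("Hey Edison, Open News"));
console.log(parseTranscript("Hey Edison, Focus Next Tab"));
```

Under these assumptions, "Hey Edison, Open News" yields the command "open" with the argument "news", while an utterance without the wake word yields null, which mirrors the one-command-per-invocation behaviour described above.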
For accessibility use cases, it is recommended that passwords be saved for the most commonly used websites to improve the overall user experience.
Check out the Demos to see the tool in action!
To start up a development environment:
- Ensure you have Node.js installed.
- Clone the project and run:
    npm install
    npm run build
- Load the project directory as an unpacked Chrome extension by:
  - Going to chrome://extensions
  - Toggling on "Developer mode" in the top right corner of the page
  - Clicking the "Load unpacked" button on the top left and pointing to the directory you cloned in step 2.
- If you are making .jsx changes, you can run the watch command to automatically convert your .jsx changes to loadable .js files:
    npm run watch
The entry point for all voice commands is located in the background script here.
Logging from extension-side JavaScript is viewable by inspecting the background.html view from the extension entry under chrome://extensions. Note that developer mode must be enabled.
For injected content scripts, logging is viewable by opening the regular developer tools on the webpage the content script was injected into.
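As a rough illustration of where each kind of log ends up, the sketch below shows a background script forwarding a command to the content script in the active tab, with a `console.log` on each side. The function names, the message shape, and the `chromeApi` parameter are assumptions made for this example (inside an extension you would pass the global `chrome` object); this README does not confirm that Edison is structured this way. The calls used (`chrome.tabs.query`, `chrome.tabs.sendMessage`, `chrome.runtime.onMessage.addListener`) are standard extension APIs in their callback form.

```javascript
// Hypothetical helpers, not taken from the Edison source. `chromeApi` stands
// in for the global `chrome` object so the functions can run outside a browser.

// Background side: this console.log appears in the background.html inspector.
function sendCommandToActiveTab(chromeApi, message, onResponse) {
  // Find the focused tab in the current window.
  chromeApi.tabs.query({ active: true, currentWindow: true }, (tabs) => {
    if (!tabs || tabs.length === 0) {
      console.log("[background] no active tab for", message);
      return;
    }
    console.log("[background] sending", message, "to tab", tabs[0].id);
    // The content script injected into that tab receives the message.
    chromeApi.tabs.sendMessage(tabs[0].id, message, onResponse);
  });
}

// Content-script side: this console.log appears in the web page's own
// developer tools, not in the background.html inspector.
function listenForCommands(chromeApi, handleCommand) {
  chromeApi.runtime.onMessage.addListener((message, sender, sendResponse) => {
    console.log("[content] received", message);
    sendResponse(handleCommand(message));
  });
}

// Example wiring inside an extension (assumed message shape):
//   listenForCommands(chrome, (msg) => ({ handled: msg.command }));
//   sendCommandToActiveTab(chrome, { command: "scroll", argument: "down" }, console.log);
```

Prefixing each log line with its origin, as done here, makes it easier to tell which of the two consoles you should be looking at.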
Note, the extension currently utilizes a few external dependencies:
- Speech recognition with annyang.
- Fuzzy search powered by fuse.
- Wakeword detection powered by bumblebee.
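To give a feel for what fuzzy search contributes to a command like "Click", the sketch below picks the on-page label that best matches a spoken phrase. It deliberately does not use fuse: it substitutes a simple bigram (Dice coefficient) similarity so that it runs without dependencies, and its scores will differ from what fuse produces. The function names and the 0.5 threshold are assumptions chosen for illustration.

```javascript
// Stand-in for a fuzzy search library; this is NOT how fuse scores matches.

// Splits text into lowercase two-character chunks: "sign" -> si, ig, gn.
function bigrams(text) {
  const s = text.toLowerCase().replace(/[^a-z0-9]/g, "");
  const grams = [];
  for (let i = 0; i < s.length - 1; i++) grams.push(s.slice(i, i + 2));
  return grams;
}

// Dice coefficient: 0 when no bigrams are shared, 1 when all of them are.
function similarity(a, b) {
  const left = bigrams(a);
  const right = bigrams(b);
  if (left.length === 0 || right.length === 0) return 0;
  const remaining = [...right];
  let shared = 0;
  for (const gram of left) {
    const index = remaining.indexOf(gram);
    if (index !== -1) {
      shared++;
      remaining.splice(index, 1); // each bigram can only be matched once
    }
  }
  return (2 * shared) / (left.length + right.length);
}

// Returns the best-scoring label, or null when nothing scores at least
// `threshold` (0.5 is an arbitrary illustrative cutoff).
function bestMatch(spoken, labels, threshold = 0.5) {
  let best = null;
  let bestScore = threshold;
  for (const label of labels) {
    const score = similarity(spoken, label);
    if (score >= bestScore) {
      best = label;
      bestScore = score;
    }
  }
  return best;
}

console.log(bestMatch("sign in", ["Home", "Sign-In", "Help"])); // prints Sign-In
```

The threshold matters for a voice interface: returning null when nothing is close enough is usually preferable to clicking an unrelated element.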
Some useful resources:
(1) Chrome Extension Architecture Overview
(2) Chrome Extension Message Passing
If you have any questions, feel free to shoot me an email at klee2010@gmail.com.
A design document for this project is available here.
You can also watch a presentation on the motivations behind the project here.
If you have any ideas on how to improve the tool, or encounter any unexpected behaviour, please feel free to shoot me an email at klee2010@gmail.com.