nfreear / dictation

An adaptive dictation-mode speech recognizer ponyfill compatible with WebChat that gives the user time to think and stutter!

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Test status

@nfreear/speech-dictation

An adaptive dictation-mode speech recognizer ponyfill compatible with WebChat that gives the user time to think and stutter (stammer)!

Mastering 'endSilenceTimeoutMs' in Microsoft Speech SDK dictation mode!

(08-Oct-2020)

Ponyfill

See Integrating with Cognitive Services Speech Services.

import { createDictationRecognizerPonyfill } from './createDictationRecognizerPonyfill.js';

const asrPonyfill = await createDictationRecognizerPonyfill({ region, key });

// ... Combine speech synthesis from default
// 'createCognitiveServicesSpeechServicesPonyfillFactory()' ...

renderWebChat(
  {
    directLine: createDirectLine({ ... }),
    // ...
    webSpeechPonyfillFactory: await createCustomHybridPonyfill({ ... })
  },
  document.getElementById('webchat')
);

Dictation mode

The key lines in createDictationRecognizerPonyfill to force dictation mode, and enable the setting of initialSilenceTimeoutMs and endSilenceTimeoutMs:

const initialSilenceTimeoutMs = 5 * 1000;
const endSilenceTimeoutMs = 5 * 1000;
// Scroll to right! → →
const url = `wss://${region}.stt.speech.microsoft.com/speech/recognition/dictation/cognitiveservices/v1?initialSilenceTimeoutMs=${initialSilenceTimeoutMs || ''}&endSilenceTimeoutMs=${endSilenceTimeoutMs}&`;
const urlObj = new URL(url);

const speechConfig = SpeechConfig.fromEndpoint(urlObj, subscriptionKey);

speechConfig.enableDictation();

// ...

const recognizer = new SpeechRecognizer(speechConfig, audioConfig);

Usage

npm install
npm start
npm test

Useful links

Credit

Developed in IET at The Open University for the ADMINS project, funded by Microsoft.


About

An adaptive dictation-mode speech recognizer ponyfill compatible with WebChat that gives the user time to think and stutter!

License:MIT License


Languages

Language:JavaScript 86.6%Language:HTML 6.9%Language:TypeScript 4.2%Language:CSS 2.3%