tempo-riz / deepgram_speech_to_text

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.

Home Page:https://pub.dev/packages/deepgram_speech_to_text

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.

You need something else ? Feel free to create issues, contribute to this project or to ask for new features on GitHub !

Features

Speech to text (STT) transcription from:

  • Local file, remote URL, Raw data
  • Streaming audio

Text to speech (TTS) is also supported

  • Text to raw audio data

Getting started

All you need is a Deepgram API key. You can get a free one by signing up on Deepgram

Usage

First create the client with optional parameters

String apiKey = 'your_api_key';

Deepgram deepgram = Deepgram(apiKey, baseQueryParams: {
  'model': 'nova-2-general',
  'detect_language': true,
  'filler_words': false,
  'punctuation': true,
    // more options here : https://developers.deepgram.com/reference/listen-file
});

Then you can transcribe audio from different sources :

STT Result

All STT methods return a DeepgramSttResult object with the following properties :

class DeepgramSttResult {
  final String json; // raw json response
  final Map<String, dynamic> map; // parsed json response into a map
  final String? transcript; // the transcript extracted from the response
  final String? type; // the response type (Result, Metadata, ...) non-null for streaming
}

File

File audioFile = File('audio.wav');
DeepgramSttResult res = await deepgram.transcribeFromFile(audioFile); // or transcribeFromPath() if you prefer
print(res.transcript); // you can also acces .json and .map (json already parsed)

URL

final res = await deepgram.transcribeFromUrl('https://somewhere/audio.wav');

Raw data

final res = await deepgram.transcribeFromBytes(List.from([1, 2, 3, 4, 5]));

Stream

let's say from a microphone :

//  https://pub.dev/packages/record (other packages would work too)
Stream<List<int>> micStream = await AudioRecorder().startStream(RecordConfig(
      encoder: AudioEncoder.pcm16bits,
      sampleRate: 16000,
      numChannels: 1,
));

final streamParams = {
  'detect_language': false, // not supported by streaming API
  'language': 'en',
  // must specify encoding and sample_rate according to the audio stream
  'encoding': 'linear16',
  'sample_rate': 16000,
};

then you got 2 options depending if you want to have more control over the stream or not :

// 1. you want the stream to manage itself automatically
Stream<DeepgramSttResult> stream = deepgram.transcribeFromLiveAudioStream(micStream, queryParams:streamParams);

// 2. you want to manage the stream manually
DeepgramLiveTranscriber transcriber = deepgram.createLiveTranscriber(micStream, queryParams:streamParams);
transcriber.stream.listen((res) {
    print(res.transcript);
});

transcriber.start();

// you can pause and resume the transcription (stop sending audio data to the server)
transcriber.pause(); 
// ...
transcriber.resume();

// then close the stream when you're done, you can call start() again if you want to restart a transcription 
transcriber.close(); 

Text to speech

Deepgram deepgram = Deepgram(apiKey, baseQueryParams: {
  'model': 'aura-asteria-en',
  'encoding': "linear16",
  'container': "wav",
  // options here: https://developers.deepgram.com/reference/text-to-speech-api

  final res = await deepgram.speakFromText('Hello world');
  print(res.data); // raw audio data that you can use as you wish. Check flutter example for a simple player
});

For more detailed usage check the /example tab

There is a full flutter demo here

Tested on Android and iOS, but should work on other platforms too.

Debugging common errors

  • make sure your API key is valid and has enough credits
deepgram.isApiKeyValid()
  • "Websocket was not promoted ..." : you are probably using wrong parameters, for example trying to use a whisper model with live streaming (not supported by deepgram)
  • empty transcript/only metadata : if streaming check that you specified encoding and sample_rate properly and that it matches the audio stream
  • double check the parameters you are using, some are not supported for streaming or for some models

Additional information

I created this package for my own needs since there are no dart sdk for deepgram. Happy to share !

Don't hesitate to ask for new features or to contribute on GitHub !

Support

If you'd like to support this project, consider contributing here. Thank you! :)

About

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.

https://pub.dev/packages/deepgram_speech_to_text

License:MIT License


Languages

Language:Dart 54.3%Language:C++ 21.1%Language:CMake 17.2%Language:Ruby 2.5%Language:Swift 1.8%Language:HTML 1.6%Language:C 1.3%Language:Kotlin 0.1%Language:Objective-C 0.0%