Stawa/GTTS

ai gemini google-gemini tts stt typescript

Gemini Text-To-Speech

Convert written material into speech, using Google AI (Gemini) to generate text or run internet searches.

📜 Table of Contents

How It Works
Project Note
Project Installation
Project Examples
Contributors

❓ How It Works

This project is based on the example in test/app.ts. That example fetches a voice input, sends a request to the Google Gemini API to get an AI-generated response, and automatically plays the response back as speech (TTS).
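The flow described above can be sketched as a small pipeline. This is only an illustration, not the code in test/app.ts: the interface, the stage names, and the separate transcription step are all assumptions made for the example, and none of these names come from @stawa/gtts.

```typescript
// Hypothetical stage signatures; none of these names are part of @stawa/gtts.
interface VoicePipelineStages {
  /** Fetch the user's voice input as raw audio bytes. */
  fetchVoice: () => Promise<Uint8Array>;
  /** Speech-to-text: turn the audio into a text prompt (assumed step). */
  transcribe: (audio: Uint8Array) => Promise<string>;
  /** Ask the language model (Gemini in this project) for a reply. */
  chat: (prompt: string) => Promise<string>;
  /** Text-to-speech: synthesize the reply and play it. */
  speak: (reply: string) => Promise<void>;
}

// Runs the stages strictly in order and returns the generated reply.
async function runVoicePipeline(stages: VoicePipelineStages): Promise<string> {
  const audio = await stages.fetchVoice();
  const prompt = await stages.transcribe(audio);
  const reply = await stages.chat(prompt);
  await stages.speak(reply);
  return reply;
}
```

Passing the stages in as functions keeps the order of operations visible and lets each stage be swapped for a stub while experimenting.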

📌 Project Note

This project has been tested on Linux (Ubuntu 24.04 LTS x86_64). Windows users can install SoX from SourceForge. No macOS-specific information is available.

| Task | Priority | Complete | Status |
| --- | --- | --- | --- |
| Implement Gemini Chat | High | ✓ | Completed |
| Develop Voice Recognition | High | ✓ | Completed |
| Implement Audio Language Detection | High | ✓ | Completed |
| Implement Text Language Detection | Medium | ✓ | Completed |
| Implement an Audio Player | Low | ✓ | Completed |
| Define Enums | Low | ✓ | Completed |
| Integrate Debugging | Low | ✓ | Completed |

📦 Project Installation

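Before using this repository, make sure the following tools are installed. The apt commands are for Debian/Ubuntu Linux; the choco command is for Windows.

SoX
- Linux: sudo apt-get install sox
- Windows: download from SourceForge
libsox-fmt-all
- Linux: sudo apt-get install libsox-fmt-all
FFmpeg
- Windows (Chocolatey): choco install ffmpeg
- Linux: sudo apt install ffmpeg
- Other platforms: see the FFmpeg Downloads page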

Then install the repository using the following commands:

# npm
$ npm install git+https://github.com/Stawa/GTTS.git --legacy-peer-deps
# Bun
$ bun install git+https://github.com/Stawa/GTTS.git --trust
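
To confirm that the command-line prerequisites listed above are reachable before running the examples, a small check along these lines may help. It is a minimal sketch that assumes Node.js; the helper name isOnPath is made up for this example, and libsox-fmt-all is not checked because it is a library package rather than a command.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `command` can be launched from the current PATH.
// spawnSync reports a launch failure (such as ENOENT for a missing
// executable) through `result.error`, independent of the exit code.
function isOnPath(command: string, args: string[] = []): boolean {
  const result = spawnSync(command, args, { stdio: "ignore" });
  return result.error === undefined;
}

// Tools named in the installation notes, each with a harmless version flag.
const tools: Array<[string, string[]]> = [
  ["sox", ["--version"]],
  ["ffmpeg", ["-version"]],
];

for (const [name, args] of tools) {
  console.log(`${name}: ${isOnPath(name, args) ? "found" : "missing"}`);
}
```

The check only tells you whether each executable starts; it does not verify versions or installed audio format support.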

📄 Project Examples

Requirements for successful execution (each credential is needed by the library component named in parentheses):

Google Gemini API Key (lib.GoogleGemini)
- Obtain from Google Cloud
TikTok SessionID (lib.TextToSpeech)
- Obtain from your TikTok cookies
Google Speech API Key (lib.VoiceRecognition.fetchTranscriptGoogle)
- Obtain from the Chromium API Keys page
Deepgram API Key (lib.VoiceRecognition.fetchTranscriptDeepgram)
- Obtain from Deepgram
EdenAI API Key (lib.SummarizeText)
- Obtain from EdenAI
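
One way to keep these credentials out of source files is to read them from environment variables and fail early when one is missing. The sketch below is an assumption-laden illustration: the helper loadCredentials and the variable names (such as GEMINI_API_KEY) are invented for this example and are not defined by the library.

```typescript
// Hypothetical helper: collects the named variables from `env` and throws
// if any of them is absent or blank. Variable names are illustrative only.
function loadCredentials<K extends string>(
  env: Record<string, string | undefined>,
  names: readonly K[],
): Record<K, string> {
  const found = {} as Record<K, string>;
  const missing: K[] = [];
  for (const name of names) {
    const value = env[name]?.trim();
    if (value) {
      found[name] = value;
    } else {
      missing.push(name);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing credentials: ${missing.join(", ")}`);
  }
  return found;
}
```

In a Node.js script this could be called as loadCredentials(process.env, ["GEMINI_API_KEY"]), listing only the credentials for the components you actually use, and the result passed on as the apiKey option.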

The following example gets a generated response from the Google Gemini API; it takes a single call to chat():

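import { GoogleGemini } from "@stawa/gtts";

// Create the client; replace "XXXXX" with your Google Gemini API key.
const google = new GoogleGemini({
  apiKey: "XXXXX",
  logger: true,
});

async function app() {
  // Send a prompt and wait for the generated response.
  const res = await google.chat("When was Facebook launched?");
  console.log(res);
}

// Log failures instead of leaving the promise rejection unhandled.
app().catch(console.error);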

👥 Contributors

About

https://stawa.github.io/GTTS/

MIT License

Languages: TypeScript 84.6%, Python 15.4%