paulhshort's repositories

TrueAudioVIdeoGemini

This repo demonstrates Gemini 1.5 Pro's ability to ingest audio directly, not just transcribed text: it can listen to qualities of voice, guess regional accents, and more. Use your own Vertex API key.

License: MIT | Stargazers: 0 | Issues: 0

chatpad

Not just another ChatGPT user-interface!

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

cookbook

A collection of guides and examples for the Gemini API.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

transcribe

Transcribe is a real-time transcription, conversation, and language-learning platform based on OpenAI's ChatGPT. It provides live transcripts from the microphone and speaker, generates a suggested conversation response using OpenAI's GPT API, and reads the responses aloud, simulating a live conversation in English or another language.

License: MIT | Stargazers: 1 | Issues: 0

speak-gpt

Your personal voice assistant based on OpenAI ChatGPT.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

License: MIT | Stargazers: 0 | Issues: 0

AlwaysReddy

AlwaysReddy is an LLM voice assistant that is always just a hotkey away.

License: MIT | Stargazers: 0 | Issues: 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

License: BSD-2-Clause | Stargazers: 1 | Issues: 0

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

License: NOASSERTION | Stargazers: 0 | Issues: 0

thepipe

Multimodal file/web extraction for GPT-4o in one line of code ⚡

License: MIT | Stargazers: 0 | Issues: 0

LocalAI

🤖 The free, open-source OpenAI alternative. Self-hosted, community-driven, and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers, and many more model architectures. It can generate text, audio, video, and images, and also offers voice cloning capabilities.

License: MIT | Stargazers: 0 | Issues: 0

lobe-chat

🤯 Lobe Chat - an open-source, modern-design LLM/AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Bedrock / Azure / Mistral / Perplexity), multimodal features (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT chat application.

License: MIT | Stargazers: 0 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

License: BSD-4-Clause | Stargazers: 0 | Issues: 0

Awesome-LLMOps

An awesome, curated list of the best LLMOps tools for developers

License: CC0-1.0 | Stargazers: 0 | Issues: 0

PentestGPT

A GPT-empowered penetration testing tool

License: MIT | Stargazers: 0 | Issues: 0

gemini-ai-processaudio-js

Process audio files with the Gemini API in JavaScript

Stargazers: 0 | Issues: 0