ViaLect streamlines your media intake by transforming audio into workable text and generated summaries!
Tip
Audio Extraction: Pull from various media platforms or uploads
ASR & Diarization: Identify and align speakers to timestamped dialogues
Translation: Detect languages and translate to English
Speech-to-Text: Accurately transcribe text from extracted audio
Summarization: Focus on key concepts with transcript-based summaries
Text-to-Speech: Have your generated summaries read back to you
Media Collection: Locally store and navigate your transformed data
Intuitive UI: Seamless frontend layout via Streamlit
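To illustrate what "align speakers to timestamped dialogues" can mean in practice, here is a minimal sketch that labels each transcript segment with the speaker whose diarization turns overlap it the most. It is not taken from ViaLect's code: the `Turn` class, `assign_speakers`, and the `"UNKNOWN"` fallback are hypothetical names, and the segment shape (`start`, `end`, `text`) is assumed to resemble Whisper's segment output.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One diarization turn: who spoke between start and end (seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker whose turns overlap it the most.

    `segments` is a list of dicts with "start", "end", and "text" keys
    (an assumed shape). Segments that overlap no turn get `unknown`.
    """
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append({**seg, "speaker": speaker})
    return labeled


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    demo_segments = [
        {"start": 0.0, "end": 2.0, "text": "Welcome back."},
        {"start": 2.0, "end": 5.0, "text": "Thanks for having me."},
    ]
    demo_turns = [Turn(0.0, 2.2, "SPEAKER_00"), Turn(2.2, 5.0, "SPEAKER_01")]
    for seg in assign_speakers(demo_segments, demo_turns):
        print(f'{seg["speaker"]}: {seg["text"]}')
```

Summing overlap per speaker, rather than taking the first overlapping turn, keeps a segment from being mislabeled when a speaker change falls slightly inside its boundaries.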
Important
Key Packages: OpenAI Whisper, PyTorch (CUDA v11.8), pyannote.audio, yt-dlp, Streamlit
1. Clone this repository:
git clone https://github.com/imgta/vialect.git
2. Install ffmpeg and requirements:
sudo apt install ffmpeg
pip install -r requirements.txt
3. Obtain a Hugging Face access token (and model access) and an OpenAI API key
4. Create and update .streamlit/secrets.toml with your keys (alternatively, enter the keys in the Secret Keys drawer after launch)
5. Launch the Streamlit app:
streamlit run app/Home.py
Note
1. Select a Whisper model and options
2. Input or upload a video/audio file
3. Submit for transcription
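For readers who want to see roughly what happens after submitting, the steps above can be approximated outside the UI with the `openai-whisper` package. This is a minimal sketch, not ViaLect's actual pipeline: `transcribe`, `format_segments`, and `format_timestamp` are hypothetical helper names, and the `"base"` model is only an example choice.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS (fractions truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Turn segments (dicts with start, end, text) into timestamped lines."""
    return "\n".join(
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    )


def transcribe(path, model_name="base", translate=False):
    """Transcribe an audio/video file with Whisper (needs ffmpeg on the PATH).

    With translate=True, Whisper is asked to translate the speech to English.
    """
    import whisper  # imported lazily so the helpers work without the package

    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    return model.transcribe(path, task=task)


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        result = transcribe(sys.argv[1])
        print("Detected language:", result["language"])
        print(format_segments(result["segments"]))
```

Larger Whisper models are generally more accurate but slower and need more memory, which is the trade-off behind the model selection in step 1.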