Audio Transcriber

📜 Description

Audio Transcriber lets users upload audio files through a custom API. The backend converts each file to WAV format and transcribes it to text. Users can choose between transcription models (e.g., Whisper), and the transcript is then passed to a language model (e.g., ChatGPT) to generate a summary. Both the transcript and the summary are returned via the API.
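
As a rough illustration of the upload step, a client could send a file as a multipart form request. This is a minimal sketch using only the Python standard library. The field name "file", the JSON reply, and whatever URL is passed to upload_audio are assumptions made for the example, not the project's documented API.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes, boundary: str):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_audio(url: str, path: str) -> dict:
    """POST an audio file and return the decoded JSON reply.

    Assumption: the server accepts a multipart field named "file" and
    answers with JSON. Neither is confirmed by this README.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    body, content_type = build_multipart(
        "file", os.path.basename(path), data, uuid.uuid4().hex
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Check the backend's actual routes and response format before relying on this; only build_multipart is independent of the server.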

💫 Features

Custom API platform
Users can upload audio files via the API
The backend receives the audio files and converts them to WAV format
The converted audio is transcribed to text
Speech recognition with optional language selection for different languages
Selectable transcription models (e.g., Whisper) for precise transcriptions
The transcribed text is sent to a language model (e.g., ChatGPT) to generate a summary
The transcribed text and summary are provided via the API
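
The processing steps listed above can be sketched roughly as follows. This is not the project's implementation. It assumes the ffmpeg binary is on PATH for the WAV conversion and the openai-whisper package for transcription; the helper names, the 16 kHz mono setting, and the summary prompt are hypothetical choices for illustration.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def wav_path_for(source: str, out_dir: str) -> Path:
    """Return the WAV path for `source` inside `out_dir` (same stem, .wav suffix)."""
    return Path(out_dir) / (Path(source).stem + ".wav")


def ffmpeg_command(source: str, target: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg call: overwrite (-y), mono (-ac 1), resampled (-ar)."""
    return ["ffmpeg", "-y", "-i", source, "-ac", "1", "-ar", str(sample_rate), target]


def convert_to_wav(source: str, out_dir: str) -> Path:
    """Convert `source` to WAV. Assumes the ffmpeg binary is on PATH."""
    target = wav_path_for(source, out_dir)
    subprocess.run(ffmpeg_command(source, str(target)), check=True)
    return target


def transcribe(wav_file: str, model_name: str = "base",
               language: Optional[str] = None) -> str:
    """Transcribe with openai-whisper; language=None lets Whisper detect it."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    return model.transcribe(wav_file, language=language)["text"]


def build_summary_prompt(transcript: str, max_chars: int = 8000) -> str:
    """Hypothetical prompt for the summarizing language model."""
    return "Summarize the following transcript:\n\n" + transcript[:max_chars]
```

A caller would chain these as convert_to_wav, then transcribe, then send the output of build_summary_prompt to whichever language model is configured. Truncating the transcript at max_chars is a crude guard against overlong prompts; a real backend would more likely split long transcripts into chunks.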

📝 TODO

Planned features and enhancements are tracked on the project's board. Check it for updates and future developments.

📩 Installation

1. Ensure Python 3.8 or later and Node.js are installed.
2. Clone the repository.
3. In a console, navigate to the root folder of the repository.
4. OPTIONAL: Create a virtual environment with python -m venv .venv. Otherwise, proceed to step 6.
5. OPTIONAL: Activate the virtual environment.
6. Install all required Python packages by running pip install -r requirements.txt.
7. Install PyTorch by following its official installation instructions.
8. OPTIONAL: On a supported OS, install Accelerate for better performance, following its installation instructions.
9. OPTIONAL: Install DeepSpeed for additional performance gains, following its installation instructions.
10. Rename config.example.py to config.py and adjust the settings as needed.
11. Start the backend by running main.py in the backend folder.
12. Start the frontend by running ng serve in the frontend folder.
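
Before working through the installation, the prerequisites can be checked with a short script. The sketch below is not part of the repository. It tests the running interpreter against the Python 3.8 minimum and looks for the node and ng executables on PATH; the function names are made up for this example.

```python
import shutil
import sys


def python_ok(version=None, minimum=(3, 8)) -> bool:
    """True if `version` (default: the running interpreter) meets `minimum`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def missing_tools(tools=("node", "ng")) -> list:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or later is required.")
    for tool in missing_tools():
        print(f"Missing executable on PATH: {tool}")
```

The ng check matters only for the frontend step, since ng serve is the Angular CLI command used to start it.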

💾 Contributing

For more information on how to contribute, please visit the Contributing page.

License: GNU General Public License v3.0
