Niklashere / Transcriber

Audio Transcriber is a tool that allows users to upload audio files via a custom API. It converts audio to WAV format, performs text transcription, and generates summaries using language models like ChatGPT. It supports multiple languages and provides the transcribed text and summaries via the API. Check the project board for planned features.

Audio Transcriber

📜 Description

Audio Transcriber lets users upload audio files through a custom API platform. The backend converts each uploaded file to WAV format and transcribes it to text. Users can choose among different speech recognition models (e.g., Whisper) for the transcription, and the transcribed text is then passed to a language model (e.g., ChatGPT) to generate a summary. Both the transcribed text and the summary are returned via the API.

💫 Features

  • A custom API platform
  • Users can upload audio files via the API
  • The backend receives the audio files and converts them to WAV format
  • The converted audio is then transcribed to text
  • Speech recognition with language selection for multiple languages
  • Option to select different speech recognition models (e.g., Whisper) for the transcription
  • The transcribed text is sent to a language model (e.g., ChatGPT) to generate a summary
  • The transcribed text and summary are provided via the API
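
The processing flow described in the feature list can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `process_upload`, `TranscriptionResult`, and `wav_path_for` are hypothetical, and the conversion, transcription, and summarization steps are passed in as plain callables so that any backend (for example an ffmpeg wrapper, Whisper, or a ChatGPT client) could be plugged in.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class TranscriptionResult:
    """What the API would return for one upload (hypothetical shape)."""
    transcript: str
    summary: str


def wav_path_for(source: Path) -> Path:
    """Return the path the converted WAV file would be written to."""
    return source.with_suffix(".wav")


def process_upload(
    source: Path,
    convert: Callable[[Path, Path], None],
    transcribe: Callable[[Path, str], str],
    summarize: Callable[[str], str],
    language: str = "en",
) -> TranscriptionResult:
    """Run the three backend stages in order: convert, transcribe, summarize."""
    wav = wav_path_for(source)
    convert(source, wav)                    # stage 1: any format -> WAV
    transcript = transcribe(wav, language)  # stage 2: WAV -> text
    summary = summarize(transcript)         # stage 3: text -> summary
    return TranscriptionResult(transcript=transcript, summary=summary)


if __name__ == "__main__":
    # Stub stages stand in for the real converter and models (assumptions).
    result = process_upload(
        Path("meeting.mp3"),
        convert=lambda src, dst: None,
        transcribe=lambda wav, lang: f"[{lang}] transcript of {wav.name}",
        summarize=lambda text: text[:20],
    )
    print(result)
```

Passing the stages in as functions keeps the sketch independent of any particular model or converter, which mirrors the option to select different models mentioned above.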

📝 TODO

Planned features and enhancements can be found on the project's board. Check out the board for updates and future developments.

📩 Installation

  1. Ensure Python 3.8 or later and Node.js are installed.
  2. Clone the repository.
  3. In a console, navigate to the root folder of the repository.
  4. OPTIONAL: Create a virtual environment with python -m venv .venv if needed. Otherwise, proceed to step 6.
  5. OPTIONAL: Activate the virtual environment.
  6. Install all required Python packages by running pip install -r requirements.txt.
  7. Install PyTorch by following the official PyTorch installation instructions.
  8. OPTIONAL: On a supported OS, install Accelerate for better performance. See the Accelerate documentation for instructions.
  9. OPTIONAL: Install DeepSpeed for additional performance gains. See the DeepSpeed documentation for instructions.
  10. Rename config.example.py to config.py and configure to your liking.
  11. Start the backend by executing main.py in the backend folder.
  12. Start the frontend by typing ng serve into the console inside the frontend folder.
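
Before working through the steps above, it can help to confirm that the prerequisites are in place. The script below is a small sketch of such a check and is not part of the repository: the function name `check_prerequisites` is hypothetical, and it assumes the Node.js and Angular CLI executables are named `node` and `ng` and are on the PATH (a locally installed Angular CLI run through `npx` would not be detected).

```python
import shutil
import sys
from typing import Dict, Sequence, Tuple

# Minimum Python version named in step 1 of the installation list.
MIN_PYTHON: Tuple[int, int] = (3, 8)


def check_prerequisites(tools: Sequence[str] = ("node", "ng")) -> Dict[str, bool]:
    """Report whether Python is recent enough and each tool is on the PATH."""
    label = f"python>={MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
    results = {label: sys.version_info[:2] >= MIN_PYTHON}
    for tool in tools:
        # shutil.which returns None when the executable cannot be found.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```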

💾 Contributing

For more information on how to contribute, please visit the Contributing page.

License: GNU General Public License v3.0


Languages

Python 52.7%, TypeScript 20.9%, HTML 17.3%, CSS 9.1%