This project uses the OpenAI Whisper model to transcribe audio and video files into text. It is designed to be easy to use: place your files in a folder, run a script, and collect the transcriptions.
- Python 3.8 to 3.11
- Chocolatey
- CUDA (for users with NVIDIA GPUs)
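
The prerequisites above can be checked before running the setup script. The following is a minimal sketch, not part of the project: the function names are hypothetical, it assumes the supported Python range is 3.8 through 3.11 inclusive, and it assumes Chocolatey and FFmpeg are exposed on `PATH` as `choco` and `ffmpeg`. The CUDA check only works once PyTorch is installed, so it reports `False` before setup.

```python
import shutil
import sys

# Assumed supported range, read from the prerequisites list (inclusive).
SUPPORTED_MIN = (3, 8)
SUPPORTED_MAX = (3, 11)


def python_version_supported(version=None):
    """Return True if the (major, minor) version is inside the assumed range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True or False."""
    status = {
        "python": python_version_supported(),
        # shutil.which returns None when the executable is not on PATH.
        "chocolatey": shutil.which("choco") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }
    try:
        import torch  # only importable after the environment is set up
        status["cuda"] = torch.cuda.is_available()
    except ImportError:
        status["cuda"] = False
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'not found'}")
```

CUDA is optional, so a `not found` result for it only means that transcription would run on the CPU.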
- `requirements.txt`: Lists the Python packages required to run the transcription scripts, including the Whisper model.
- `setup_environment.bat`: Batch script that sets up the Python environment for the project. It creates a virtual environment, installs the dependencies specified in `requirements.txt`, installs FFmpeg through Chocolatey, and installs PyTorch built for CUDA 12.1.
- `run_script.bat`: Batch file that starts the transcription process with a double-click.
- `transcribe.py`: The main Python script. It uses the Whisper model to process the audio and video files and generate the transcriptions.
- `input/`: Folder where users place the audio or video files they want to transcribe.
- `output/`: Folder where the generated transcriptions are saved. Each transcription is named after its source file and is saved in srt, tsv, txt, and vtt formats.
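
To illustrate the naming rule and one of the output formats, here is a small sketch. It is not the project's code: `output_paths`, `srt_timestamp`, and `segments_to_srt` are hypothetical helpers, and the project may well rely on Whisper's own writers instead. The sketch assumes each segment is a dict with `start`, `end` (in seconds), and `text` keys, which is the shape of the segments Whisper returns.

```python
from pathlib import Path

# The four formats listed for the output folder.
OUTPUT_FORMATS = ("srt", "tsv", "txt", "vtt")


def output_paths(source_file, output_dir="output"):
    """Map a source file to one output path per format, named after the source."""
    stem = Path(source_file).stem
    return [Path(output_dir) / f"{stem}.{fmt}" for fmt in OUTPUT_FORMATS]


def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm, the SRT timestamp layout."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments as SRT text: index, time range, then the caption."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive SRT blocks.
    return "\n".join(blocks)
```

For example, `output_paths("input/talk.mp4")` yields `talk.srt`, `talk.tsv`, `talk.txt`, and `talk.vtt` under `output/`.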
- Clone the Project: Start by cloning the repository or downloading the project files to your local machine.
- Set up the Environment:
  - Navigate to the project directory.
  - Run `setup_environment.bat` as an administrator. This script creates a Python virtual environment and installs the necessary dependencies.

  Refer to our detailed guide for installing Whisper on Windows.
- Prepare Your Multimedia Files: Place the audio or video files you want to transcribe in the `input/` folder.
- Run the Transcription:
  - Double-click `run_script.bat` to start the transcription process.
  - The script automatically processes all the files in the `input/` folder and generates the transcriptions in the `output/` folder.
- Access the Transcriptions: Once the transcription process is complete, you can find the resulting texts in the `output/` folder. Each transcription file is named after the original file for easy identification.
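
The workflow above (read every file in `input/`, transcribe it, write a result named after the source into `output/`) can be sketched as follows. This is an illustration under stated assumptions, not the contents of `transcribe.py`: the extension list, the `base` model name, and the function names are hypothetical, and only the plain-text output is written here. It uses `whisper.load_model` and `model.transcribe` from the `openai-whisper` package, which needs FFmpeg installed to decode the media.

```python
from pathlib import Path

# Assumed extension filter; the project's actual filter is not documented.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".mkv", ".mov"}


def find_media_files(input_dir="input"):
    """Return the media files directly inside input_dir, sorted by name."""
    candidates = (
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )
    return sorted(candidates, key=lambda p: p.name.lower())


def transcribe_folder(input_dir="input", output_dir="output", model_name="base"):
    """Transcribe each media file and write <source name>.txt to output_dir."""
    import whisper  # imported here so find_media_files works without it

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    model = whisper.load_model(model_name)  # uses the GPU when one is available

    written = []
    for media_file in find_media_files(input_dir):
        result = model.transcribe(str(media_file))
        target = out / f"{media_file.stem}.txt"
        target.write_text(result["text"].strip() + "\n", encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    for path in transcribe_folder():
        print(f"wrote {path}")
```

Loading the model once outside the loop avoids reloading its weights for every file, which matters when the folder holds many recordings.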