Welcome to Overdub API. This API is an automated tool that creates narrated, subtitled videos from raw content with minimal guidance. It streamlines the conversion of raw video into accessible, social-media-ready content.

Getting Started
To use this software, you will need:
- Python 3.6 or higher
- An ElevenLabs API key
- A Google Cloud account with gcloud installed on your local machine. You can follow this guide to set up gcloud.
git clone https://github.com/your-username/Overdub-API.git
cd Overdub-API
It's best practice to create a virtual environment to manage dependencies for the project:
python -m venv venv
source venv/bin/activate
On Windows, use venv\Scripts\activate instead.
Install the required packages for the project using the provided requirements.txt file:
pip install -r requirements.txt
Rename the .env.example file to .env and add your ElevenLabs API key to it.
ELEVENLABS_API_KEY='your-api-key'
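As a minimal sketch of how the key in .env could be read at runtime, the snippet below parses simple KEY=VALUE lines using only the standard library. The helper names load_env_file and get_elevenlabs_key are hypothetical and are not part of this project; the project itself may load the file differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def get_elevenlabs_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("ELEVENLABS_API_KEY") or load_env_file(path).get("ELEVENLABS_API_KEY")
    if not key or key == "your-api-key":
        raise RuntimeError("ELEVENLABS_API_KEY is missing or still set to the placeholder")
    return key
```

Failing early on the placeholder value avoids a less obvious authentication error later in the pipeline.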
Make sure gcloud is installed, then authenticate and configure it for your local environment by following the Google Cloud documentation referenced above.
Configure your Google Cloud project to match the settings in the settings.py file.
Adjust the settings.py file to match your requirements. The file includes parameters for video processing, API keys, and other configurations.
The software processes each video file listed in the input CSV. By default, this file should be located in the __temp__/ directory. The CSV should contain video URLs or file paths, with an optional content direction field to describe what is happening in the video, which can improve the results.
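The input described above could be read along these lines. This is only a sketch: the column names video and content_direction are assumptions made for illustration, so check the project's code or settings.py for the headers it actually expects.

```python
import csv
import io


def read_video_rows(csv_file):
    """Yield (source, direction) pairs; direction is None when absent or blank."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        # "video" and "content_direction" are assumed column names.
        source = (row.get("video") or "").strip()
        if not source:
            continue  # skip rows without a video URL or file path
        direction = (row.get("content_direction") or "").strip() or None
        yield source, direction


# In-memory sample standing in for a CSV file in __temp__/.
sample = io.StringIO(
    "video,content_direction\n"
    "https://example.com/clip.mp4,A chef plating a dessert\n"
    "/videos/local_clip.mp4,\n"
)
rows = list(read_video_rows(sample))
```

Here the second row leaves the content direction empty, which the sketch treats as "not provided".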
The API uses Google's Vertex AI to generate metadata. It classifies content through image data extracted from the video, complemented with the optional content direction field if provided.
Once metadata is obtained, ElevenLabs' capabilities are used to overdub the video with natural-sounding speech. Subsequent steps involve generating dub timestamps for the overdub and creating dynamic subtitles that fit the requirements for various social media platforms, all of which are configurable within the settings.py file.
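To illustrate how timestamps can drive subtitles, here is a small sketch that groups word timings into short cues and renders them in the SubRip (SRT) format. It assumes word timings arrive as (text, start_seconds, end_seconds) tuples and that a character limit per cue is the relevant constraint; the project's real subtitle format, data shapes, and limits are defined by its own code and settings.py, and the function names below are hypothetical.

```python
def format_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words, max_chars=20):
    """Group (text, start, end) word timings into cues of at most max_chars.

    A single word longer than max_chars still gets a cue of its own.
    """
    cues, current = [], []
    for word in words:
        candidate = " ".join(w[0] for w in current + [word])
        if current and len(candidate) > max_chars:
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)
    # Each cue spans from its first word's start to its last word's end.
    return [(" ".join(w[0] for w in cue), cue[0][1], cue[-1][2]) for cue in cues]


def to_srt(cues):
    """Render (text, start, end) cues as SRT text."""
    blocks = []
    for index, (text, start, end) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

A smaller max_chars produces shorter, faster-changing cues, which tends to suit vertical short-form video; a larger value suits landscape layouts.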
Processed videos are written to the __temp__/02_processed directory. Any generated metadata is stored in __temp__/01_metadata.
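If you want to inspect or prepare these directories from your own scripts, a sketch like the following may help. The directory names come from the text above; the helper ensure_output_dirs is hypothetical, and the project may well create the directories itself.

```python
from pathlib import Path


def ensure_output_dirs(base="__temp__"):
    """Create the metadata and processed directories if needed and return their paths."""
    base_path = Path(base)
    metadata_dir = base_path / "01_metadata"
    processed_dir = base_path / "02_processed"
    for directory in (metadata_dir, processed_dir):
        # parents=True also creates the base directory when it does not exist yet.
        directory.mkdir(parents=True, exist_ok=True)
    return metadata_dir, processed_dir
```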
To start processing your content, place your CSV file in the __temp__/ directory, or specify a different location in settings.py, and run the main script:
python main.py
Make sure your environment variables and Google Cloud settings are correctly configured before running the script.
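A quick pre-flight check along these lines can catch the most common setup problems before a run. It is a sketch, not part of the project: the function name preflight is hypothetical, and it assumes the ElevenLabs key has already been exported or loaded into the process environment. It only confirms that a gcloud executable is on PATH, not that you are authenticated.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable setup problems; an empty list means no problems found."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    # shutil.which returns None when the executable cannot be found on PATH.
    if which("gcloud") is None:
        problems.append("gcloud was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"Setup problem: {problem}")
```

The env and which parameters exist only so the check can be exercised without touching the real environment.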
Currently, this project is self-maintained. If you are interested in contributing, please open a pull request. Thank you.
This project is released under the MIT License.
By using this API, you agree to the terms and conditions of both ElevenLabs and Google Cloud services. We strive to make automated video processing as seamless as possible and to simplify the path from raw content to engaging digital media. Should you encounter any issues or have suggestions, please open an issue in the repository. Happy processing!