chatgpt huggingface image-to-audio image-to-text langchain llm transformer

PictureTales

PictureTales generates a short story from an image you upload. An image-captioning model describes the image's content, an LLM expands the caption into a story, and text-to-speech converts the story to audio, so you can both read the generated story and listen to it.
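The flow described above (image to caption, caption to story, story to audio) can be pictured as three stages chained together. The following is a minimal sketch, not the code in app.py: the names picture_to_tale and StoryResult and the three stage callables are hypothetical, and the stages are passed in as plain functions so the sketch runs without any model or API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoryResult:
    """Everything the pipeline produces for one image (hypothetical type)."""
    caption: str
    story: str
    audio: bytes


def picture_to_tale(
    image_bytes: bytes,
    caption_image: Callable[[bytes], str],
    write_story: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
) -> StoryResult:
    """Chain the three stages: image -> caption -> story -> audio."""
    if not image_bytes:
        raise ValueError("no image data was provided")
    caption = caption_image(image_bytes)   # image-captioning model
    story = write_story(caption)           # LLM call
    audio = synthesize_speech(story)       # text-to-speech model
    return StoryResult(caption=caption, story=story, audio=audio)


# Stand-in stages so the sketch can be exercised offline.
def fake_captioner(image_bytes: bytes) -> str:
    return "a dog running on a beach"


def fake_storyteller(caption: str) -> str:
    return f"Once upon a time there was {caption}."


def fake_speaker(story: str) -> bytes:
    return story.encode("utf-8")  # placeholder for real audio bytes


result = picture_to_tale(b"\x89PNG", fake_captioner, fake_storyteller, fake_speaker)
print(result.story)
```

Passing the stages in as callables keeps each model swappable; in the real app they would wrap the Hugging Face and OpenAI calls.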

Demo

Launching the application
Select an image and Upload
Download the audio story

story.mp4

Features

Upload an image.
Generate a story based on the content of the image.
Listen to the generated story as an audio file.

Usage

Clone this repository to your local machine.

git clone https://github.com/SartajBhuvaji/Image-to-Story-Generator.git

From the repository directory, install the dependencies.

pip install -r requirements.txt

Create a .env file and paste in your Hugging Face and OpenAI API keys (see the dummy_env file for the expected entries).

Start the app.

python app.py

Open your web browser and navigate to http://localhost:7860 to access the app.
Upload an image to the app and click "Generate Story". You will see the generated story and can listen to it as audio.
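Before starting the app, it can help to confirm that the .env file actually contains both API keys. The sketch below parses a .env file using only the standard library and reports which keys are missing. The key names HUGGINGFACEHUB_API_TOKEN and OPENAI_API_KEY are assumptions; the names the app really expects are the ones listed in dummy_env. The helper names are hypothetical too.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Assumed key names; check dummy_env for the names the app really uses.
REQUIRED_KEYS = ("HUGGINGFACEHUB_API_TOKEN", "OPENAI_API_KEY")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str],
                 required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path: str = ".env") -> List[str]:
    """Read a .env file and list missing keys (all keys if the file is absent)."""
    env_path = Path(path)
    if not env_path.is_file():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env_text(env_path.read_text(encoding="utf-8")))


sample = '# API keys\nOPENAI_API_KEY="sk-test"\nHUGGINGFACEHUB_API_TOKEN=\n'
print(missing_keys(parse_env_text(sample)))
```

This only validates the file. Loading the values into the process environment is a separate step, commonly handled by a package such as python-dotenv; whether this project uses it is not stated here.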

Tech

HuggingFace
Image to Caption model
ChatGPT 3.5 LLM
Text-to-speech
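To make the first two components more concrete, here is a hedged sketch of captioning an image with the Hugging Face transformers pipeline and turning the caption into an LLM prompt. The model id Salesforce/blip-image-captioning-base, the prompt wording, and all function names are assumptions for illustration; the repository may use a different model and prompt. The "image-to-text" pipeline task exists in transformers 4.x and returns a list of dicts with a generated_text field. The import is done lazily so the two pure helpers work without transformers installed.

```python
from typing import Any, Dict, List

# Assumed model id; the Tech list only says "Image to Caption model".
DEFAULT_CAPTION_MODEL = "Salesforce/blip-image-captioning-base"


def caption_from_output(output: List[Dict[str, Any]]) -> str:
    """Extract the caption from an image-to-text pipeline result."""
    if not output:
        raise ValueError("captioning model returned no results")
    return str(output[0]["generated_text"]).strip()


def caption_image(image_path: str, model_id: str = DEFAULT_CAPTION_MODEL) -> str:
    """Caption an image file (downloads the model on first use)."""
    # Lazy import: requires `pip install transformers` plus a backend such as torch.
    from transformers import pipeline

    captioner = pipeline("image-to-text", model=model_id)
    return caption_from_output(captioner(image_path))


def build_story_prompt(caption: str, max_words: int = 50) -> str:
    """Build a hypothetical prompt asking an LLM for a short story."""
    scene = caption.strip()
    if not scene:
        raise ValueError("caption must not be empty")
    return (
        "You are a storyteller. Write a short story of at most "
        f"{max_words} words based on this scene.\n"
        f"Scene: {scene}\n"
        "Story:"
    )


# Offline example using a result shaped like the pipeline's output.
example_output = [{"generated_text": " a dog running on a beach "}]
prompt = build_story_prompt(caption_from_output(example_output))
print(prompt)
```

The resulting prompt string would then be sent to the LLM, and the LLM's reply passed on to the text-to-speech step.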

About

Generating a simple story from an image.


MIT License

Languages

Language:Python 100.0%