TomatoFT / Image-Captioning

Image Captioning with EfficientNet and Transformer

Image Captioning with EfficientNet and Transformer

Tech stack and Tools

  • TensorFlow
  • Streamlit
  • PostgreSQL
  • Visual Studio Code
  • Anaconda
  • Google Colab (Jupyter Notebook)

What is the Image Captioning problem?

    Image captioning is the process of generating a natural language description of an image. It is a task in the field of computer vision and natural language processing. The goal of image captioning is to generate a coherent and fluent sentence that accurately describes the image content.

    An image captioning system typically consists of two main components:

  • An image feature extractor: This component is responsible for extracting features from the input image, such as object locations, sizes, and colors.
  • A natural language generator: This component takes the image features as input and generates a natural language description of the image.

    The generated captions are typically evaluated using metrics such as BLEU, METEOR, ROUGE, and CIDEr.
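
    To make the idea of a caption metric concrete, here is a minimal sketch of unigram BLEU (BLEU-1): clipped unigram precision multiplied by the brevity penalty. It is a simplified illustration, not the evaluation code of this project; the function name `bleu1` and the whitespace tokenization are assumptions, and full BLEU also combines higher-order n-grams.

```python
import math
from collections import Counter
from typing import Sequence


def bleu1(candidate: str, references: Sequence[str]) -> float:
    """Unigram BLEU: clipped unigram precision times the brevity penalty."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    cand = candidate.lower().split()
    refs = [ref.lower().split() for ref in references]
    if not cand or not refs:
        return 0.0

    # For each word, keep the highest count seen in any single reference.
    max_ref_counts: Counter = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)

    # Clip candidate counts so repeated words cannot inflate the precision.
    cand_counts = Counter(cand)
    clipped = sum(min(count, max_ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty: compare against the reference length closest to the candidate.
    ref_len = min((len(ref) for ref in refs), key=lambda n: (abs(n - len(cand)), n))
    penalty = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return penalty * precision
```

    For example, `bleu1("the the the", ["the cat"])` gives 1/3, because only one occurrence of "the" is credited, while a caption identical to its reference scores 1.0.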

How to run this project

    This project uses Streamlit to demo the EfficientNet + Transformer model (trained for 11 epochs) and connects to PostgreSQL to save information about each picture, along with some metadata, to a database.
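
    As a rough illustration of the "save the picture information" step, the sketch below builds one row of metadata and writes it with a parameterized INSERT. The table name `image_captions`, its columns, and both function names are hypothetical; the project's real schema is not shown in this README. The `%s` placeholders assume a DB-API driver such as psycopg2.

```python
from datetime import datetime, timezone
from typing import Any, Tuple

# Hypothetical table and column names, used only for illustration.
INSERT_SQL = (
    "INSERT INTO image_captions (filename, caption, size_bytes, created_at) "
    "VALUES (%s, %s, %s, %s)"
)


def build_caption_row(filename: str, image_bytes: bytes, caption: str) -> Tuple[str, str, int, datetime]:
    """Return the parameter tuple matching INSERT_SQL for one captioned image."""
    return (filename, caption.strip(), len(image_bytes), datetime.now(timezone.utc))


def save_caption(conn: Any, filename: str, image_bytes: bytes, caption: str) -> None:
    """Insert one row through a DB-API connection that accepts %s placeholders."""
    cursor = conn.cursor()
    try:
        # Values are passed separately from the SQL text, never formatted into it.
        cursor.execute(INSERT_SQL, build_caption_row(filename, image_bytes, caption))
        conn.commit()
    finally:
        cursor.close()
```

    Passing the values as parameters instead of formatting them into the SQL string keeps user-supplied file names and captions from being interpreted as SQL.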

    First, install Anaconda, PostgreSQL, and Python 3. The installation steps depend on your OS; in this project I installed all of them on Ubuntu. Here are some video tutorials for installing them.

    PostgreSQL + pgAdminIII:

    Python 3:


    After installing these, follow the steps below.

    Clone the project

    git clone
    cd Image-Captioning-with-Transformer

    Create and Enter the Anaconda Environment

    conda create --name image-captioning
    conda activate image-captioning

    Install dependencies

    conda install -c anaconda pip
    pip install -r requirements.txt
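
    After installing, a quick check like the one below can confirm that the main packages are importable from the active environment. This is only a sketch: the helper name `missing_packages` is made up for illustration, and the import names listed (`tensorflow`, `streamlit`, `psycopg2`) are assumptions about what requirements.txt provides.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for this project's main dependencies.
    required = ["tensorflow", "streamlit", "psycopg2"]
    missing = missing_packages(required)
    print("Missing:", ", ".join(missing) if missing else "none")
```

    Run it with the image-captioning environment activated; any name it reports as missing was not installed into that environment.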

    Connect Streamlit to PostgreSQL

    Read this document from Streamlit: Then go to pgAdminIII, choose to add a connection to a server, and fill in this form.


    In the .streamlit/secrets.toml file, change the values to your own PostgreSQL information.
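
    The sketch below shows one way the values from the secrets file could be turned into a PostgreSQL connection string. It assumes the settings sit under a `[postgres]` table with the keys `host`, `port`, `dbname`, `user`, and `password`; that layout and the helper name `build_dsn` are assumptions, so adjust them to match the actual file. In the app, the mapping would typically come from `st.secrets["postgres"]`.

```python
from typing import Mapping

# Keys assumed to exist under a [postgres] table in .streamlit/secrets.toml.
REQUIRED_KEYS = ("host", "port", "dbname", "user", "password")


def _quote(value: object) -> str:
    """Quote a value for a libpq key=value connection string."""
    # libpq requires backslashes and single quotes inside a value to be escaped.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{text}'"


def build_dsn(secrets: Mapping[str, object]) -> str:
    """Validate the settings and return a libpq-style connection string."""
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise KeyError(f"missing PostgreSQL settings: {', '.join(missing)}")
    return " ".join(f"{key}={_quote(secrets[key])}" for key in REQUIRED_KEYS)
```

    The resulting string can be passed to a driver such as psycopg2 via `psycopg2.connect(dsn)`. Checking for missing keys up front gives a clearer error than a failed connection attempt.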


    Open the Streamlit file and run the demo

    streamlit run



    Exit the Anaconda Environment

    conda deactivate


    The model training file is here: I use this file to train the model and save its weights to the local computer for deployment in Streamlit (you can find them at model/model_IC.h5).

    The model tutorial:

    You can read the submitted report to understand how I carried out this project.

    Feel free to clone and use my code.

