zhenqi-he / COMP7404-Project-Flamingo

COMP7404-Project-Flamingo

This repository implements a demo that uses Flamingo for few-shot image classification and simple OCR (optical character recognition).

Quick Start

Architecture

Examples

Few-shot Classification for cats

Input

Category 1:

Text input: An image of a cat named Hanbao 🍔 .

Category 2:

Text input: An image of a cat named Tuanzi 🍡 .

Test Case:

Text input: An image of a cat named

Output

An image of a cat named tuanzi.
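The text side of the few-shot prompt above can be sketched as one interleaved string. This is a minimal sketch, assuming the Open-Flamingo prompt convention in which each image is marked by `<image>` and each completed example ends with `<|endofchunk|>`. `build_few_shot_prompt` is a hypothetical helper, not a function from this repository, and the demo script may assemble its prompt differently.

```python
# Hypothetical helper: builds the text side of an interleaved few-shot prompt.
# Assumes Open-Flamingo-style special tokens; the demo script may differ.
IMAGE_TOKEN = "<image>"
END_OF_CHUNK = "<|endofchunk|>"


def build_few_shot_prompt(support_captions, query_prefix):
    """Return one prompt string: captioned support examples, then the open query."""
    parts = [f"{IMAGE_TOKEN}{caption}{END_OF_CHUNK}" for caption in support_captions]
    # The query gets an image token but no end-of-chunk token, so the model
    # continues the sentence instead of starting a new example.
    parts.append(f"{IMAGE_TOKEN}{query_prefix}")
    return "".join(parts)


prompt = build_few_shot_prompt(
    ["An image of a cat named Hanbao.", "An image of a cat named Tuanzi."],
    "An image of a cat named",
)
print(prompt)
```

The images themselves would be passed to the model separately, in the same order as the `<image>` markers appear in the string.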

Simple OCR for University Logos

Input

Text Input: This is the logo of {University Name}

Images Input:

Test Case:

Text Input: This is the logo of

Images Input:

Output

This is the logo of the university of hong kong
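To use the OCR result programmatically, the university name has to be separated from the prompt prefix. The following is a minimal sketch under the assumption that the generated text still contains the prefix "This is the logo of"; `extract_completion` is a hypothetical helper and is not part of this repository.

```python
def extract_completion(generated_text, prompt_prefix):
    """Return the text the model added after the prompt prefix, trimmed."""
    # Split on the last occurrence of the prefix, because the few-shot
    # examples may repeat the same prefix earlier in the text.
    _, found, tail = generated_text.rpartition(prompt_prefix)
    if not found:
        # Prefix missing: fall back to the whole text.
        return generated_text.strip()
    return tail.strip(" .\n")


name = extract_completion(
    "This is the logo of the university of hong kong", "This is the logo of"
)
print(name)
```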

Set-up

Installation

To set up the environment, install the required packages:

pip install -r requirements.txt

Run

Run Cats Classification

python flamingo.py --class_1_path=PATH_TO_HANBAO --class_2_path=PATH_TO_TUANZI --test_cases_path=PATH_TO_TESTCASES 
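The three flags in the command above could be read with `argparse` roughly as follows. This is only a sketch: it assumes the script uses `argparse`, and the README does not say whether each path points to a directory or a single file. The values `hanbao`, `tuanzi`, and `tests` are placeholder paths for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the three path flags used by the classification command."""
    parser = argparse.ArgumentParser(description="Few-shot cat classification demo")
    parser.add_argument("--class_1_path", required=True, help="images of the first cat")
    parser.add_argument("--class_2_path", required=True, help="images of the second cat")
    parser.add_argument("--test_cases_path", required=True, help="images to classify")
    return parser.parse_args(argv)


# Placeholder paths, passed in the same --flag=value form as the command above.
args = parse_args(
    ["--class_1_path=hanbao", "--class_2_path=tuanzi", "--test_cases_path=tests"]
)
print(args.class_1_path, args.class_2_path, args.test_cases_path)
```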

Run Logo OCR

python flamingo_OCR.py --image_paths=PATH_TO_OCR_EXAMPLES

Acknowledgement

Our demo builds on Open-Flamingo and Flamingo-Pytorch, both of which implement the model from the paper Flamingo: a Visual Language Model for Few-Shot Learning.
