ArtElingo

The dataset and model checkpoints can be found here.

You will also need to download the WikiArt images from here. They are provided by the ArtGAN repo.

Dataset Preparation

Installing the raw dataset and tokenizer

Download the dataset from here

Place the dataset at dataset/raw/

Download our tokenizer from here

Place the tokenizer at dataset/

Download the WikiArt images from here

Unzip the images into dataset/
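
Before preprocessing, it can help to confirm that the downloads landed where the later commands expect them. The sketch below is a hypothetical helper (not part of the repository) that checks only the two paths the preprocessing command refers to, dataset/raw/artelingo.csv and dataset/wikiart/. The tokenizer's file name is not stated in this README, so it is deliberately not checked.

```python
from pathlib import Path

# Paths taken from the preprocessing command in this README, relative to dataset/.
EXPECTED = ("raw/artelingo.csv", "wikiart")


def missing_paths(dataset_root="dataset"):
    """Return the expected paths that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [str(root / rel) for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root; an empty result means both paths are present.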

Installing the required environment

cd sat
conda create -n artemis-sat python=3.6.9 cudatoolkit=10.0
conda activate artemis-sat
pip install -e .
cd ..

Preprocessing the dataset

cd dataset
python preprocess.py --raw_data 'raw/artelingo.csv' --wikiart_dir 'wikiart/'

Show, Attend and Tell

To train a SAT model,

conda activate artemis-sat
mkdir -p sat_logs/sat_english
python sat/artemis/scripts/train_speaker.py \
 -log-dir sat_logs/sat_english \
 -data-dir dataset/english/train/  \
 -img-dir dataset/wikiart/  \
 --use-emo-grounding True

The trained SAT model will be saved under sat_logs/sat_english. Alternatively, you can download one of our checkpoints from here.

To generate captions from a trained SAT model,

conda activate artemis-sat
mkdir -p sat_generations
python sat/artemis/scripts/sample_speaker.py \
-speaker-saved-args sat_logs/sat_english/config.json.txt \
-speaker-checkpoint sat_logs/sat_english/checkpoints/best_model.pt \
-img-dir dataset/wikiart/ \
-out-file  sat_generations/sat_english.pkl \
--custom-data-csv  dataset/test_english/test_english.csv

The generations will be saved under sat_generations/sat_english.pkl.
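
To take a quick look at the output, the file can be opened with Python's standard pickle module. This is a minimal sketch: the structure of the pickled object is not described in this README, so the code only reports its type and size rather than assuming a layout. Unpickling may need the packages from the artemis-sat environment, so it is safest to run it inside that environment. `load_generations` and `describe` are hypothetical helper names.

```python
import pickle


def load_generations(path="sat_generations/sat_english.pkl"):
    """Load the pickled generations; only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Summarize an object without assuming its internal layout."""
    size = len(obj) if hasattr(obj, "__len__") else "n/a"
    return f"type={type(obj).__name__}, len={size}"


# Example (after sampling has produced the file):
#   print(describe(load_generations()))
```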

Note that for training and testing, you can use any combination of the datasets found under dataset/.

To evaluate the generated captions:

conda activate artemis-sat
pip install jieba
python sat/get_scores.py \
--references dataset/test_english/test_english.csv  \
--generations sat_generations/sat_english.pkl

Meshed Memory Transformer

Setting up the env

cd m2/
conda env create -f environment.yml
conda activate artemis-m2
python -m spacy download en

For training, sampling, and evaluation, please follow the instructions in m2/README.md.

Emotion Prediction

Setting up the env

cd emotion_prediction/
conda env create -f environment.yml
conda activate artemis-emo

There are two separate training scripts: one for the 3-headed transformer and one for the single-head models.

For the 3-headed transformer, you just need to run emotion_prediction/three_heads.py without any arguments, i.e.,

conda activate artemis-emo
python emotion_prediction/three_heads.py

For the single head models, you need to provide the tokenizer and the dataset language, i.e.,

conda activate artemis-emo
python emotion_prediction/single_head.py --bert_version bert-base-uncased --dataset_language english

For Arabic, we used the 'CAMeL-Lab/bert-base-arabic-camelbert-mix' tokenizer and model.

For Chinese, we used 'bert-base-chinese'.
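
The three language/model pairings above can be collected in one place. The sketch below is a hypothetical wrapper (not part of the repository) that builds the single_head.py command line for each language. The English pairing and both flags come from the command above; passing 'arabic' and 'chinese' as --dataset_language values is an assumption based on the 'english' example, so check the script's argument parser before relying on it.

```python
# BERT versions named in this README, keyed by dataset language.
# The "arabic" and "chinese" keys are assumed spellings of --dataset_language.
BERT_VERSIONS = {
    "english": "bert-base-uncased",
    "arabic": "CAMeL-Lab/bert-base-arabic-camelbert-mix",
    "chinese": "bert-base-chinese",
}


def single_head_command(language):
    """Build the argument list for training a single-head model."""
    if language not in BERT_VERSIONS:
        raise ValueError(f"unsupported language: {language!r}")
    return [
        "python", "emotion_prediction/single_head.py",
        "--bert_version", BERT_VERSIONS[language],
        "--dataset_language", language,
    ]


if __name__ == "__main__":
    # Print one command per language; none of the arguments contain spaces.
    for lang in BERT_VERSIONS:
        print(" ".join(single_head_command(lang)))
```

The returned list can be passed directly to subprocess.run inside the artemis-emo environment.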

For evaluation metrics and analysis, see the three notebooks matching dataset/test_*.ipynb, which analyze the different models.
