Automatic generation of comic dialogues

Author: Sergi Masip Cabeza
Thesis director: Ernest Valveny Llobet
Developed as a bachelor's thesis at Universitat Autònoma de Barcelona. Check the thesis for more information (in Catalan).

Project Description

The purpose of this project is to generate subsequent dialogues given a multimodal context. To do this, we used the database provided in COMICS. First, we trained a text-only and a multimodal model to perform a Text cloze task. Then, we trained another based on the previous one for generating dialogues.

Getting Started

Clone this repo (for help see this tutorial).

git clone https://github.com/Atenrev/comics-dialogue-generation.git
cd comics-dialogue-generation

Download and prepare the dataset.

a. Download our preprocessed version of COMICS and place it in datasets/COMICS. If you would like to use the original dataset, you can download it from the COMICS repository.

Note that there are two files for the test set. The original authors filtered some dialogues from this set based on their tokenizer. To be able to compare the results of our model with the original ones, we maintain this filter in the test set. The "full" version, on the other hand, is not filtered.

b. Extract the visual features using Detectron2. Follow the instructions here within this repo.
Install the dependencies.

# Create python environment (optional but recommended)
conda create -n comicsgen python=3.9
source activate comicsgen

# Install python dependencies
pip install -r requirements.txt

[Optional] Download the pre-trained weights (not uploaded yet). If you use the VL-T5 model, download the pretrained weights from their repository here.

Configuration

Every model, dataset, and trainer is configured in a configuration file. The configuration file is a YAML file. The configuration files are located in the configs folder. In case you want to add a new model, dataset, or trainer, you should create a new configuration file and add it to the configs folder, as well as the corresponding model or dataset script in src.

Dataset configuration

For each dataset, you need a configuration file in the configs/datasets folder. The file must contain the "name" parameter, which is the same as the name of the dataset script in src/datasets that will be used to load the dataset.

Model configuration

For each model, you need a configuration file in the configs/models folder. The name of the file must be the same as the name of the model script in src/models that will be used to load the model. The file must contain the the following parameters:

classname: <class name of the model>

tokenizer: <name of the tokenizer (we use the AutoTokenizer class from HuggingFace)>
# or
feature_extractor: <name of the feature extractor>

Trainer configuration

For the trainer, you need a configuration file in the configs/trainers folder. The file must contain the the following parameters:

epochs: <number of epochs>
runs_path: <path to the runs folder>
report_path: <path to the report folder>

optimizer:
    type: <type of optimizer>
    # ... parameters of the optimizer

The runs folder is where the training logs will be saved. The report folder is where the evaluation reports will be saved.

Code structure

# Run dataset setup and feature extraction
./tools
    extract_visual_features/
    setup_dataset/

# Store dataset
./datasets
    COMICS/
        frcnn_features/
            boxes36.h5
        text_cloze_train_easy.csv
        ...
    ...

# Store pre-trained weights
./pretrained_weights
    vlt5_epoch30.pth                                            <= The VL-T5 model expects this file to be in this folder (you can change this in its config file).
    ...

# Store configuration
./config
    datasets/
    models/
    trainers/

# Create you own models or datasets
.src/
    models/                                               <= This is where you should add your own models. They should inherit from the BaseModel class. 
        base_model.py
    datasets/                                             <= This is where you should add your own datasets. They must inherit from the ```BaseDataset``` class.
        base_dataset.py

# Run the model
./main.py

Training and evaluation

To train the model, run the following command:

python main.py
  --mode "train"
  --model               Model to run
  --dataset_config      Dataset config to use
  --trainer_config      Trainer params to use
  --dataset_dir         Dataset directory path
  --load_checkpoint     Path to model checkpoint
  --batch_size          Batch size
  --seed                Seed to use

To evaluate the model, change the --mode to "eval".

Reference

@mastersthesis{ddd.uab.cat:264189,
      author = {Masip Cabeza, Sergi and Valveny Llobet, Ernest, dir.},
       title = {Generació automàtica de diàlegs de còmic},
        year = {2022},
         url = {https://ddd.uab.cat/record/264189},
}

Atenrev / comics-dialogue-generation