mever-team / rine

Implementation of paper "Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection"

Paper

This repository contains the implementation code for the ECCV 2024 accepted paper:

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection (available at arXiv:2402.19091)

Christos Koutlis, Symeon Papadopoulos

Figure 1. The RINE architecture. A batch of $b$ images is processed by CLIP's image encoder. The concatenation of the $n$ $d$-dimensional CLS tokens (one from each Transformer block) is first projected and then multiplied with the blocks' scores, estimated by the Trainable Importance Estimator (TIE) module. Summation across the second dimension results in one feature vector per image. Finally, after the second projection and the subsequent classification head modules, two loss functions are computed. Binary cross-entropy $\mathfrak{L}_{CE}$ directly optimizes SID, while the contrastive loss $\mathfrak{L}_{Cont.}$ assists the training by forming a dense feature vector cluster per class.
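The weighting and summation step in the caption can be illustrated with a small, dependency-free sketch for a single image. It is not the repository's code: the function names are hypothetical, and normalising the TIE scores with a softmax across blocks is an assumption that the caption does not state.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_blocks(block_features, block_scores):
    """Collapse n per-block feature vectors into one vector for a single image.

    block_features: n lists of d floats (projected CLS token of each block).
    block_scores:   n lists of d floats (hypothetical raw TIE scores).
    """
    n, d = len(block_features), len(block_features[0])
    pooled = []
    for k in range(d):
        # Assumption: scores are normalised across the n blocks for each feature k.
        weights = softmax([block_scores[i][k] for i in range(n)])
        # Weighted sum over blocks, i.e. over the second dimension of a b x n x d tensor.
        pooled.append(sum(weights[i] * block_features[i][k] for i in range(n)))
    return pooled


# Two blocks, three features; equal scores reduce to a plain average over blocks.
features = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
uniform_scores = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(aggregate_blocks(features, uniform_scores))
```

When one block receives a much larger score for a given feature, the pooled value for that feature approaches that block's value, which is how learned scores can emphasise particular encoder blocks.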

News

🎉 4/7/2024 Paper acceptance at ECCV 2024

✨ 29/2/2024 Pre-print release --> arXiv:2402.19091

💥 29/2/2024 Code and checkpoints release

Setup

Clone the repository:

git clone https://github.com/mever-team/rine

Create the environment:

conda create -n rine python=3.9
conda activate rine
conda install pytorch==2.1.1 torchvision==0.16.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
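After creating the environment, the pinned versions can be double-checked without importing the packages. The snippet below is only a sketch: the helper name is hypothetical, and it assumes the packages register themselves under the distribution names torch and torchvision.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the conda command above. The distribution names are an
# assumption about how the installed packages register themselves.
PINNED = {"torch": "2.1.1", "torchvision": "0.16.1"}


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs from its pin to what was found.

    A value of None means the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        # Drop local build tags such as "+cu118" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    print(version_mismatches() or "Pinned versions match.")
```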

Store the datasets in data/. The directory should be organized as follows:

data
├── coco
├── latent_diffusion_trainingset
├── RAISEpng
├── synthbuster
├── train
│   ├── airplane
│   ├── bicycle
│   └── ...
├── val
│   ├── airplane
│   ├── bicycle
│   └── ...
└── test
    ├── progan
    ├── cyclegan
    ├── biggan
    ├── ...
    └── diffusion_datasets
        ├── guided
        ├── ldm_200
        └── ...
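Before launching any script, it can help to confirm that the top-level folders from the layout above are in place. The following is a minimal sketch, not part of the repository: the helper name missing_dirs is hypothetical, and only the top-level folders are checked, since the nested class and generator folders are elided in the layout.

```python
from pathlib import Path

# Top-level folders named in the layout above; deeper folders are not checked.
EXPECTED_DIRS = [
    "coco",
    "latent_diffusion_trainingset",
    "RAISEpng",
    "synthbuster",
    "train",
    "val",
    "test",
]


def missing_dirs(root="data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing under data/:", ", ".join(missing))
    else:
        print("All expected top-level folders are present.")
```

Run it from the repository root so that the relative path data/ resolves correctly.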

Evaluation

To evaluate the 1-class, 2-class, and 4-class checkpoints, as well as the LDM-trained model, provided in ckpt/, run python scripts/validation.py. The results will be displayed in the terminal.

To reproduce all the figures and tables reported in the paper, run python scripts/results.py.

Re-run experiments

To reproduce the conducted experiments, re-run in the following order:

  1. the 1-epoch hyperparameter grid experiments with python scripts/experiments.py
  2. the ablation study with python scripts/ablations.py
  3. the training duration experiments with python scripts/epochs.py
  4. the training set size experiments with python scripts/dataset_size.py
  5. the perturbation experiments with python scripts/perturbations.py
  6. the LDM training experiments with python scripts/diffusion.py

Finally, to save the best 1-class, 2-class, and 4-class models (already stored in ckpt/), run python scripts/best.py, which re-trains the best configurations and stores the corresponding trainable model parts.

The following snippet reproduces the whole project by running every script in order:

import subprocess
import sys

# Experiments first, then checkpoint saving, evaluation, and paper results.
SCRIPTS = [
    "experiments.py",
    "ablations.py",
    "epochs.py",
    "dataset_size.py",
    "perturbations.py",
    "diffusion.py",
    "best.py",
    "validation.py",
    "results.py",
]

for script in SCRIPTS:
    # Use the current interpreter; check=True stops at the first failing script.
    subprocess.run([sys.executable, f"scripts/{script}"], check=True)

Demo

In demo/, we also provide code for inference on one real image and one fake image generated by DALL-E. To run the demo, execute python demo/demo.py.

Citation

@article{koutlis2024leveraging,
  title={Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection},
  author={Koutlis, Christos and Papadopoulos, Symeon},
  journal={arXiv preprint arXiv:2402.19091},
  year={2024}
}

Contact

Christos Koutlis (ckoutlis@iti.gr)

License: Apache License 2.0

