DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection (NeurIPS 2023 D&B)

Authors: Zhiyuan Yan, Yong Zhang, Xinhang Yuan, Siwei Lyu, Baoyuan Wu*

Welcome to DeepfakeBench, your one-stop solution for deepfake detection! Here are some key features of our platform:

✅ Unified Platform: DeepfakeBench presents the first comprehensive benchmark for deepfake detection, resolving the issue of lack of standardization and uniformity in this field.

✅ Data Management: DeepfakeBench provides a unified data management system that ensures consistent input across all detection models.

✅ Integrated Framework: DeepfakeBench offers an integrated framework for the implementation of state-of-the-art detection methods.

✅ Standardized Evaluations: DeepfakeBench introduces standardized evaluation metrics and protocols to enhance the transparency and reproducibility of performance evaluations.

✅ Extensive Analysis and Insights: DeepfakeBench facilitates an extensive analysis from various perspectives, providing new insights to inspire the development of new technologies.

Table of Contents

Features
Quick Start
Supported Detectors
Results
Citation
License

📚 Features

DeepfakeBench has the following features:

⭐️ Detectors (15 detectors):

5 Naive Detectors: Xception, MesoNet, MesoInception, CNN-Aug, EfficientNet-B4
7 Spatial Detectors: Capsule, DSP-FWA, Face X-ray, FFD, CORE, RECCE, UCF
3 Frequency Detectors: F3Net, SPSL, SRM

⭐️ Datasets (9 datasets): FaceForensics++, FaceShifter, DeepfakeDetection, Deepfake Detection Challenge (Preview), Deepfake Detection Challenge, Celeb-DF-v1, Celeb-DF-v2, DeepForensics-1.0, UADFV

DeepfakeBench will be continuously updated to track the latest advances in deepfake detection. The implementations of more detection methods, as well as their evaluations, are on the way. You are welcome to contribute your detection methods to DeepfakeBench.

⏳ Quick Start

1. Installation

(option 1) You can run the following script to configure the necessary environment:

git clone git@github.com:SCLBD/DeepfakeBench.git
cd DeepfakeBench
conda create -n DeepfakeBench python=3.7.2
conda activate DeepfakeBench
sh install.sh

(option 2) You can also use the supplied Dockerfile to set up the entire environment with Docker. This lets you run all the code in the benchmark without environment-related problems. Run the following commands to build the image and start a container:

docker build -t deepfakebench .
docker run --gpus all -itd -v /path/to/this/repository:/app/ --shm-size 64G deepfakebench

Note that we used Docker version 19.03.14 in our setup. We recommend this version for consistency, but later versions of Docker may also work.

2. Download Data

All datasets used in DeepfakeBench can be downloaded from their own websites or repositories. For convenience, we also provide the data we use in our research. All the downloaded datasets have been organized and arranged in the same folder. Users can easily access and download the preprocessed data, including original videos, mask videos, frames, and landmarks:

Dataset Name	OneDrive Link	Notes
Celeb-DF-v1	Code: cdfv1	-
Celeb-DF-v2	Code: cdfv2	-
FaceForensics++, DeepfakeDetection, FaceShifter	Code: ffpp	both c23 and c40 version
UADFV	Code: uadfv	-
Deepfake Detection Challenge (Preview)	Code: dfdcp	-
Deepfake Detection Challenge	Code: dfdc	Only Test Data
DeepForensics-1.0	Coming Soon	Only Test Data

🛡️ Copyright of the above datasets belongs to their original providers.

Other detailed information about the datasets used in DeepfakeBench is summarized below:

Dataset	Real Videos	Fake Videos	Total Videos	Rights Cleared	Total Subjects	Synthesis Methods	Perturbations	Original Repository
FaceForensics++	1000	4000	5000	NO	N/A	4	2	Hyper-link
FaceShifter	1000	1000	2000	NO	N/A	1	-	Hyper-link
DeepfakeDetection	363	3000	3363	YES	28	5	-	Hyper-link
Deepfake Detection Challenge (Preview)	1131	4119	5250	YES	66	2	3	Hyper-link
Deepfake Detection Challenge	23654	104500	128154	YES	960	8	19	Hyper-link
Celeb-DF-v1	408	795	1203	NO	N/A	1	-	Hyper-link
Celeb-DF-v2	590	5639	6229	NO	59	1	-	Hyper-link
DeepForensics-1.0	50000	10000	60000	YES	100	1	7	Hyper-link
UADFV	49	49	98	NO	49	1	-	Hyper-link

Upon downloading the datasets, please ensure to store them in the ./datasets folder, arranging them in accordance with the directory structure outlined below:

datasets
├── FaceForensics++
│   ├── original_sequences
│   │   ├── youtube
│   │   │   ├── c23
│   │   │   │   ├── videos
│   │   │   │   │   └── *.mp4
│   │   │   │   ├── frames (if you download my processed data)
│   │   │   │   │   └── *.png
│   │   │   │   ├── masks (if you download my processed data)
│   │   │   │   │   └── *.png
│   │   │   │   └── landmarks (if you download my processed data)
│   │   │   │       └── *.npy
│   │   │   └── c40
│   │   │       ├── videos
│   │   │       │   └── *.mp4
│   │   │       ├── frames (if you download my processed data)
│   │   │       │   └── *.png
│   │   │       ├── masks (if you download my processed data)
│   │   │       │   └── *.png
│   │   │       └── landmarks (if you download my processed data)
│   │   │           └── *.npy
│   │   └── actors
│   │       ├── c23
│   │       │   ├── videos
│   │       │   │   └── *.mp4
│   │       │   ├── frames (if you download my processed data)
│   │       │   │   └── *.png
│   │       │   ├── masks (if you download my processed data)
│   │       │   │   └── *.png
│   │       │   └── landmarks (if you download my processed data)
│   │       │       └── *.npy
│   │       └── c40
│   │           ├── videos
│   │           │   └── *.mp4
│   │           ├── frames (if you download my processed data)
│   │           │   └── *.png
│   │           ├── masks (if you download my processed data)
│   │           │   └── *.png
│   │           └── landmarks (if you download my processed data)
│   │               └── *.npy
│   └── manipulated_sequences
│       ├── Deepfakes
│       │   ├── c23
│       │   │   ├── videos
│       │   │   │   └── *.mp4
│       │   │   ├── frames (if you download my processed data)
│       │   │   │   └── *.png
│       │   │   ├── masks (if you download my processed data)
│       │   │   │   └── *.png
│       │   │   └── landmarks (if you download my processed data)
│       │   │       └── *.npy
│       │   └── c40
│       │       ├── videos
│       │       │   └── *.mp4
│       │       ├── frames (if you download my processed data)
│       │       │   └── *.png
│       │       ├── masks (if you download my processed data)
│       │       │   └── *.png
│       │       └── landmarks (if you download my processed data)
│       │           └── *.npy
│       ├── Face2Face
│       │   └── ...
│       ├── FaceSwap
│       │   └── ...
│       ├── NeuralTextures
│       │   └── ...
│       ├── FaceShifter
│       │   └── ...
│       └── DeepFakeDetection
│           └── ...

Other datasets follow a similar structure.

If you choose to store your datasets in a different folder, for instance, ./deepfake/data, it's important to reflect this change in the dataset path in the config.yaml for preprocessing purposes.
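Before preprocessing, it can help to confirm that the folders on disk match the layout above. The following is a minimal sketch, not part of DeepfakeBench: the helper names are hypothetical, and it assumes that the sub-folders elided with "..." in the tree (Face2Face, FaceSwap, and so on) mirror the Deepfakes layout with c23 and c40 folders.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
ORIGINAL_SOURCES = ["youtube", "actors"]
MANIPULATED_SOURCES = [
    "Deepfakes", "Face2Face", "FaceSwap",
    "NeuralTextures", "FaceShifter", "DeepFakeDetection",
]
COMPRESSIONS = ["c23", "c40"]


def expected_video_dirs(dataset_root):
    """Return every FaceForensics++ 'videos' folder implied by the tree.

    Assumption: the elided sub-folders follow the same c23/c40 layout.
    """
    ffpp = Path(dataset_root) / "FaceForensics++"
    groups = (
        ("original_sequences", ORIGINAL_SOURCES),
        ("manipulated_sequences", MANIPULATED_SOURCES),
    )
    dirs = []
    for group, sources in groups:
        for source in sources:
            for compression in COMPRESSIONS:
                dirs.append(ffpp / group / source / compression / "videos")
    return dirs


def missing_video_dirs(dataset_root):
    """List the expected 'videos' folders that do not exist on disk."""
    return [d for d in expected_video_dirs(dataset_root) if not d.is_dir()]


if __name__ == "__main__":
    missing = missing_video_dirs("./datasets")
    print(f"{len(missing)} expected folder(s) missing under ./datasets")
```

If you store the data elsewhere, pass that folder (for instance ./deepfake/data) instead of ./datasets.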

3. Preprocessing (optional)

❗️Note: If you want to directly utilize the data, including frames, landmarks, masks, and more, that I have provided above, you can skip the pre-processing step. However, you still need to run the rearrangement script to generate the JSON file for each dataset for the unified data loading in the training and testing process.

DeepfakeBench follows a sequential workflow for face detection, alignment, and cropping. The processed data, including face images, landmarks, and masks, are saved in separate folders for further analysis.

To start preprocessing your dataset, please follow these steps:

Download the shape_predictor_81_face_landmarks.dat file and copy it into the ./preprocessing/dlib_tools folder. Dlib needs this file to predict facial landmarks.
Open the ./preprocessing/config.yaml and locate the line default: DATASET_YOU_SPECIFY. Replace DATASET_YOU_SPECIFY with the name of the dataset you want to preprocess, such as FaceForensics++.
Specify the dataset_root_path in the config.yaml file. Search for the line that mentions dataset_root_path. By default, it looks like this: dataset_root_path: ./datasets. Replace ./datasets with the actual path to the folder where your dataset is arranged.

Once you have completed these steps, you can proceed with running the following line to do the preprocessing:

cd preprocessing

python preprocess.py
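If you preprocess several datasets, the two config edits described in the steps above can be scripted. The sketch below works on the file as plain text and assumes only what those steps state: that the file contains the placeholder DATASET_YOU_SPECIFY and a line of the form dataset_root_path: ./datasets. The function names are hypothetical, and the real config.yaml may be structured differently, so check the result before running preprocess.py.

```python
import re


def set_yaml_value(text, key, value):
    """Replace the value on the first `key: ...` line, keeping its indentation."""
    pattern = re.compile(rf"^([ \t]*){re.escape(key)}:.*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda match: f"{match.group(1)}{key}: {value}", text, count=1
    )
    if count == 0:
        raise KeyError(f"{key!r} not found in config text")
    return new_text


def apply_preprocess_settings(text, dataset_name, dataset_root):
    """Fill in the dataset name placeholder and the dataset root path."""
    placeholder = "DATASET_YOU_SPECIFY"
    if placeholder not in text:
        raise ValueError(f"placeholder {placeholder} not found in config text")
    text = text.replace(placeholder, dataset_name, 1)
    return set_yaml_value(text, "dataset_root_path", dataset_root)


# Illustrative input only; the real file has more content.
sample = "default: DATASET_YOU_SPECIFY\ndataset_root_path: ./datasets\n"
print(apply_preprocess_settings(sample, "FaceForensics++", "./deepfake/data"))
```

To apply it to the real file, read ./preprocessing/config.yaml into a string, pass it through apply_preprocess_settings, and write the result back.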

4. Rearrangement

To simplify the handling of different datasets, we propose a unified and convenient way to load them. This approach eliminates the need to write separate input/output (I/O) code for each dataset, reducing duplication of effort and easing data management.

After the preprocessing above, you will obtain the processed data (i.e., frames, landmarks, and masks) for each dataset you specify. Similarly, you need to set the parameters in ./preprocessing/config.yaml for each dataset. After that, run the following line:

cd preprocessing

python rearrange.py

After running the above line, you will obtain the JSON files for each dataset in the ./preprocessing/dataset_json folder. The rearranged structure organizes the data in a hierarchical manner, grouping videos based on their labels and data splits (i.e., train, test, validation). Each video is represented as a dictionary entry containing relevant metadata, including file paths, labels, compression levels (if applicable), etc.
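To make the hierarchy concrete, here is a small sketch that counts videos per label and split. The nesting used here (dataset, then label, then split, then video entries) and the field names are assumptions made for illustration; inspect a generated file in ./preprocessing/dataset_json for the actual schema.

```python
import json
from collections import Counter


def count_videos(dataset_json):
    """Count videos per (label, split).

    Assumed nesting: dataset -> label -> split -> {video_name: metadata}.
    """
    counts = Counter()
    for labels in dataset_json.values():
        for label, splits in labels.items():
            for split, videos in splits.items():
                counts[(label, split)] += len(videos)
    return counts


# Hypothetical example; names and fields are placeholders, not the real schema.
example = json.loads("""
{
  "ExampleDataset": {
    "real": {
      "train": {"000": {"label": "real", "frames": ["frames/000/000.png"]}},
      "test":  {"001": {"label": "real", "frames": ["frames/001/000.png"]}}
    },
    "fake": {
      "train": {
        "000_003": {"label": "fake", "frames": ["frames/000_003/000.png"]},
        "001_870": {"label": "fake", "frames": ["frames/001_870/000.png"]}
      }
    }
  }
}
""")

for (label, split), n in sorted(count_videos(example).items()):
    print(label, split, n)
```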

5. Training (optional)

To run the training code, first download the pretrained weights for the corresponding backbones (these weights are pretrained on ImageNet). You can download them from Link. After downloading, put all the weight files into the ./training/pretrained folder.

Then, go to the ./training/config/detector/ folder and choose the detector to train. For instance, you can edit xception.yaml to set parameters such as the training and testing datasets, epoch, frame_num, etc.

After setting the parameters, run the following command to train the Xception detector:

python training/train.py \
--detector_path ./training/config/detector/xception.yaml

You can also adjust the training and testing datasets using the command line, for example:

python training/train.py \
--detector_path ./training/config/detector/xception.yaml  \
--train_dataset "FaceForensics++" \
--test_dataset "Celeb-DF-v1" "Celeb-DF-v2"

By default, the checkpoints and features will be saved during the training process. If you do not want to save them, run with the following:

python training/train.py \
--detector_path ./training/config/detector/xception.yaml \
--train_dataset "FaceForensics++" \
--test_dataset "Celeb-DF-v1" "Celeb-DF-v2" \
--no-save_ckpt \
--no-save_feat
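When launching many runs, the commands above can be assembled programmatically. This is a sketch with a hypothetical helper, build_train_command; it uses only the flags shown in the examples above and does not start training itself.

```python
import shlex
import sys


def build_train_command(detector_config, train_dataset=None, test_datasets=None,
                        save_ckpt=True, save_feat=True):
    """Build the argument list for training/train.py from the flags shown above."""
    cmd = [sys.executable, "training/train.py", "--detector_path", detector_config]
    if train_dataset:
        cmd += ["--train_dataset", train_dataset]
    if test_datasets:
        cmd += ["--test_dataset", *test_datasets]
    if not save_ckpt:
        cmd.append("--no-save_ckpt")  # skip saving checkpoints
    if not save_feat:
        cmd.append("--no-save_feat")  # skip saving features
    return cmd


cmd = build_train_command(
    "./training/config/detector/xception.yaml",
    train_dataset="FaceForensics++",
    test_datasets=["Celeb-DF-v1", "Celeb-DF-v2"],
    save_ckpt=False,
    save_feat=False,
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to start the run.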

To train other detectors with the commands above, specify the corresponding config file. The Face X-ray detector, however, requires an additional step before training. To save training time, a pickle file stores the Top-N nearest images for each given image. To generate this file, run the generate_xray_nearest.py file. Once the pickle file is created, you can train the Face X-ray detector in the same way as above. If you want to check or use the files I have already generated, please refer to the link.
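To illustrate what a Top-N nearest lookup could contain, the sketch below ranks images by the Euclidean distance between their landmark coordinates and pickles the result. This is an illustration under stated assumptions, not the contents of generate_xray_nearest.py: the distance measure, the data layout, and the function names are all hypothetical.

```python
import math
import pickle


def landmark_distance(points_a, points_b):
    """Euclidean distance between two flat coordinate lists [x0, y0, x1, y1, ...]."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(points_a, points_b)))


def top_n_nearest(landmarks, n):
    """Map each image name to the n other images with the closest landmarks."""
    nearest = {}
    for name, points in landmarks.items():
        candidates = [
            (landmark_distance(points, other_points), other_name)
            for other_name, other_points in landmarks.items()
            if other_name != name
        ]
        candidates.sort()  # closest first; ties fall back to name order
        nearest[name] = [other_name for _, other_name in candidates[:n]]
    return nearest


# Toy landmarks with a single point per image, for illustration only.
toy = {"a.png": [0.0, 0.0], "b.png": [1.0, 0.0], "c.png": [5.0, 0.0]}
lookup = top_n_nearest(toy, n=1)

# Round-trip through pickle, as one would when caching the lookup on disk.
restored = pickle.loads(pickle.dumps(lookup))
print(restored)
```

Computing this lookup once and caching it avoids repeating the pairwise search in every training epoch, which is the time saving the paragraph above refers to.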

6. Evaluation

If you only want to evaluate the detectors to produce the cross-dataset evaluation results, you can use test.py. Here is an example:

python3 training/test.py \
--detector_path ./training/config/detector/xception.yaml \
--test_dataset "Celeb-DF-v1" "Celeb-DF-v2" "DFDCP" \
--weights_path ./training/weights/xception_best.pth

Note that we have provided the pre-trained weights for each detector (you can download them from the link). Make sure to put these weights in the ./training/weights folder.

📦 Supported Detectors

	File name	Paper
Xception	xception_detector.py	FaceForensics++: Learning to Detect Manipulated Facial Images ICCV 2019
Meso4	meso4_detector.py	MesoNet: a Compact Facial Video Forgery Detection Network WIFS 2018
Meso4Inception	meso4Inception_detector.py	MesoNet: a Compact Facial Video Forgery Detection Network WIFS 2018
CNN-Aug	resnet34_detector.py	CNN-generated images are surprisingly easy to spot... for now CVPR 2020
EfficientNet-B4	efficientnetb4_detector.py	Efficientnet: Rethinking model scaling for convolutional neural networks ICML 2019
Capsule	capsule_net_detector.py	Capsule-Forensics: Using Capsule Networks to Detect Forged Images and Videos ICASSP 2019
DSP-FWA	fwa_detector.py	Exposing DeepFake Videos By Detecting Face Warping Artifacts CVPRW 2019
Face X-ray	facexray_detector.py	Face X-ray for More General Face Forgery Detection CVPR 2020
FFD	ffd_detector.py	On the Detection of Digital Face Manipulation CVPR 2020
CORE	core_detector.py	CORE: COnsistent REpresentation Learning for Face Forgery Detection CVPRW 2022
RECCE	recce_detector.py	End-to-End Reconstruction-Classification Learning for Face Forgery Detection CVPR 2022
UCF	ucf_detector.py	UCF: Uncovering Common Features for Generalizable Deepfake Detection ICCV 2023
F3Net	f3net_detector.py	Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues ECCV 2020
SPSL	spsl_detector.py	Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain CVPR 2021
SRM	srm_detector.py	Generalizing Face Forgery Detection with High-frequency Features CVPR 2021

🏆 Results

In our benchmark, we use TensorBoard to monitor training progress. It provides a visual representation of the training process, allowing users to examine training results conveniently.

To demonstrate the effectiveness of different detectors, we present partial results from both within-domain and cross-domain evaluations. The evaluation metric used is the frame-level Area Under the Curve (AUC). In this particular scenario, we train the detectors on the FF++ (c23) dataset and assess their performance on other datasets.
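For readers unfamiliar with the metric, frame-level AUC treats every frame as one sample and measures how often a fake frame receives a higher score than a real one. The sketch below computes it from that pairwise definition in plain Python. It is an illustration rather than the benchmark's evaluation code, and the convention that label 1 means fake is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (fake, real) frame pairs ranked correctly.

    labels: 1 for a fake frame, 0 for a real frame (assumed convention).
    scores: predicted probability of being fake, one per frame.
    Ties between a fake and a real frame count as half a correct pair.
    """
    fake_scores = [s for label, s in zip(labels, scores) if label == 1]
    real_scores = [s for label, s in zip(labels, scores) if label == 0]
    if not fake_scores or not real_scores:
        raise ValueError("AUC needs at least one fake and one real frame")
    correct = 0.0
    for fake in fake_scores:
        for real in real_scores:
            if fake > real:
                correct += 1.0
            elif fake == real:
                correct += 0.5
    return correct / (len(fake_scores) * len(real_scores))


# Four frames: two real (label 0) and two fake (label 1).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The double loop is quadratic in the number of frames, so it suits small checks only; for full test sets a sort-based or library implementation is the practical choice.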

For a comprehensive overview of the results, we strongly recommend referring to our paper. It provides a detailed analysis of the training outcomes and offers a deeper understanding of the methodology and findings.

Type	Detector	Backbone	FF++_c23	FF++_c40	FF-DF	FF-F2F	FF-FS	FF-NT	Avg. (within)	Top3 (within)	CDFv1	CDFv2	DF-1.0	DFD	DFDC	DFDCP	Fsh	UADFV	Avg. (cross)	Top3 (cross)
Naive	Meso4	MesoNet	0.6077	0.5920	0.6771	0.6170	0.5946	0.5701	0.6097	0	0.7358	0.6091	0.9113	0.5481	0.5560	0.5994	0.5660	0.7150	0.6551	1
Naive	MesoIncep	MesoNet	0.7583	0.7278	0.8542	0.8087	0.7421	0.6517	0.7571	0	0.7366	0.6966	0.9233	0.6069	0.6226	0.7561	0.6438	0.9049	0.7364	3
Naive	CNN-Aug	ResNet	0.8493	0.7846	0.9048	0.8788	0.9026	0.7313	0.8419	0	0.7420	0.7027	0.7993	0.6464	0.6361	0.6170	0.5985	0.8739	0.7020	0
Naive	Xception	Xception	0.9637	0.8261	0.9799	0.9785	0.9833	0.9385	0.9450	4	0.7794	0.7365	0.8341	0.8163	0.7077	0.7374	0.6249	0.9379	0.7718	2
Naive	EfficientB4	Efficient	0.9567	0.8150	0.9757	0.9758	0.9797	0.9308	0.9389	0	0.7909	0.7487	0.8330	0.8148	0.6955	0.7283	0.6162	0.9472	0.7718	3
Spatial	Capsule	Capsule	0.8421	0.7040	0.8669	0.8634	0.8734	0.7804	0.8217	0	0.7909	0.7472	0.9107	0.6841	0.6465	0.6568	0.6465	0.9078	0.7488	2
Spatial	FWA	Xception	0.8765	0.7357	0.9210	0.9000	0.8843	0.8120	0.8549	0	0.7897	0.6680	0.9334	0.7403	0.6132	0.6375	0.5551	0.8539	0.7239	1
Spatial	Face X-ray	HRNet	0.9592	0.7925	0.9794	0.9872	0.9871	0.9290	0.9391	3	0.7093	0.6786	0.5531	0.7655	0.6326	0.6942	0.6553	0.8989	0.6985	0
Spatial	FFD	Xception	0.9624	0.8237	0.9803	0.9784	0.9853	0.9306	0.9434	1	0.7840	0.7435	0.8609	0.8024	0.7029	0.7426	0.6056	0.9450	0.7733	1
Spatial	CORE	Xception	0.9638	0.8194	0.9787	0.9803	0.9823	0.9339	0.9431	2	0.7798	0.7428	0.8475	0.8018	0.7049	0.7341	0.6032	0.9412	0.7694	0
Spatial	Recce	Designed	0.9621	0.8190	0.9797	0.9779	0.9785	0.9357	0.9422	1	0.7677	0.7319	0.7985	0.8119	0.7133	0.7419	0.6095	0.9446	0.7649	2
Spatial	UCF	Xception	0.9705	0.8399	0.9883	0.9840	0.9896	0.9441	0.9527	6	0.7793	0.7527	0.8241	0.8074	0.7191	0.7594	0.6462	0.9528	0.7801	5
Frequency	F3Net	Xception	0.9635	0.8271	0.9793	0.9796	0.9844	0.9354	0.9449	1	0.7769	0.7352	0.8431	0.7975	0.7021	0.7354	0.5914	0.9347	0.7645	0
Frequency	SPSL	Xception	0.9610	0.8174	0.9781	0.9754	0.9829	0.9299	0.9408	0	0.8150	0.7650	0.8767	0.8122	0.7040	0.7408	0.6437	0.9424	0.7875	3
Frequency	SRM	Xception	0.9576	0.8114	0.9733	0.9696	0.9740	0.9295	0.9359	0	0.7926	0.7552	0.8638	0.8120	0.6995	0.7408	0.6014	0.9427	0.7760	2

In the above table, "Avg." denotes the average AUC for within-domain and cross-domain evaluation, and the overall results. "Top3" represents the number of times each method ranks within the top 3 across all testing datasets. The best-performing method for each column is highlighted.
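The two summary columns can be reproduced from the per-dataset AUCs. The sketch below shows one way to do so; the function names are hypothetical, and the tie-breaking rule (earlier entries win) is an assumption, since the table does not state how ties are handled.

```python
def average_auc(aucs):
    """Arithmetic mean of a detector's per-dataset AUCs."""
    return sum(aucs) / len(aucs)


def top3_counts(table):
    """Count how often each detector ranks in the top 3 across datasets.

    table maps a detector name to its AUCs, one per dataset, with the
    datasets in the same order for every detector.
    """
    counts = {name: 0 for name in table}
    n_datasets = len(next(iter(table.values())))
    for j in range(n_datasets):
        ranked = sorted(table, key=lambda name: table[name][j], reverse=True)
        for name in ranked[:3]:
            counts[name] += 1
    return counts


# Within-domain AUCs of Meso4, copied from the table above.
meso4_within = [0.6077, 0.5920, 0.6771, 0.6170, 0.5946, 0.5701]
print(average_auc(meso4_within))  # close to the 0.6097 reported in the table

# Toy scores for four detectors on two datasets (not benchmark results).
toy = {"A": [0.9, 0.5], "B": [0.8, 0.6], "C": [0.7, 0.7], "D": [0.6, 0.8]}
print(top3_counts(toy))
```

Reproducing the Top3 columns of the table requires passing all 15 detectors at once, because a rank depends on every other row.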

We also provide all experimental results in Link (code: qjpd). You can use these results for further analysis with the code in the ./analysis folder.

📝 Citation

If you find our benchmark useful to your research, please cite it as follows:

@article{yan2023deepfakebench,
  title={DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection},
  author={Yan, Zhiyuan and Zhang, Yong and Yuan, Xinhang and Lyu, Siwei and Wu, Baoyuan},
  journal={arXiv preprint arXiv:2307.01426},
  year={2023}
}

If you are interested, you can also read our recent work on deepfake detection; more of our work on trustworthy AI can be found here.

@article{yan2023ucf,
  title={UCF: Uncovering Common Features for Generalizable Deepfake Detection},
  author={Yan, Zhiyuan and Zhang, Yong and Fan, Yanbo and Wu, Baoyuan},
  journal={arXiv preprint arXiv:2304.13949},
  year={2023}
}

🛡️ License

This repository is licensed by The Chinese University of Hong Kong, Shenzhen under the Creative Commons Attribution-NonCommercial 4.0 International Public License (identified as CC-BY-NC-4.0 in SPDX). More details about the license can be found in LICENSE.

This project is built by the Secure Computing Lab of Big Data (SCLBD) at The School of Data Science (SDS) of The Chinese University of Hong Kong, Shenzhen, directed by Professor Baoyuan Wu. SCLBD focuses on the research of trustworthy AI, including backdoor learning, adversarial examples, federated learning, fairness, etc.

If you have any suggestions, comments, or wish to contribute code or propose methods, we warmly welcome your input. Please contact us at wubaoyuan@cuhk.edu.cn or yanzhiyuan1114@gmail.com. We look forward to collaborating with you in pushing the boundaries of deepfake detection.
