DFSpot-Deepfake-Recognition

Determine whether a given video sequence has been manipulated or synthetically generated
Report Bug · Request Feature

⚡️ Try the demo here ⚡️

Example output: an ensemble of four models produces frame-level predictions on test videos from datasets such as Celeb-DF (v2), FaceForensics++ and DFDC.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Acknowledgments

📖 About The Project

PyTorch code for DFSpot, a model ensemble that determines whether an input video or image is real or fake. To spot deepfakes, this work proposes an ensemble-based metric-learning technique built on a Siamese network architecture, in which four models are trained starting from a base network. The method has been validated on publicly available datasets such as Celeb-DF (v2), FaceForensics++ and DFDC.
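
For intuition, here is a minimal, hypothetical sketch of score-level ensembling: each ensemble member scores a batch of face crops and the scores are averaged. The names and shapes below are assumptions for illustration; the actual model loading and ensembling logic lives in src/utils and src/spot_deepfakes.py.

# Hypothetical sketch of score-level ensembling (not the repository's exact code).
# Assumes each model maps a batch of preprocessed face crops to one logit per crop.
import torch

@torch.no_grad()
def ensemble_score(models, faces):
    # models: list of loaded nn.Module ensemble members (e.g. the four models)
    # faces:  float tensor of shape (N, 3, H, W) holding face crops
    # returns N scores in [0, 1]; higher means more likely fake
    scores = [torch.sigmoid(model(faces).flatten()) for model in models]
    return torch.stack(scores).mean(dim=0)

# A video can then be flagged as fake when the mean score over its frames
# crosses a chosen threshold, e.g. ensemble_score(models, faces).mean() > 0.5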

(back to top)

🧱 Built With

  • PyTorch

⚡️ Getting Started

Set up the project on your local machine by following the instructions below. You can also run the demo on Google Colab here

✔️ Prerequisites

  • Update system and install pip3

    sudo apt update
    sudo apt -y install python3-pip
  • Python virtual environment (optional)

    sudo apt install python3-venv   

⚙️ Installation

  1. Create a python virtual environment (optional)
    mkdir df_spot
    cd df_spot
    python3 -m venv df_spot_env
    source df_spot_env/bin/activate
  2. Clone the repo
    git clone https://github.com/chinmaynehate/DeepFake-Spot.git
  3. Install dependencies
    cd DFSpot-Deepfake-Recognition
    sudo chmod +x setup.sh

🔔 Note: The following command runs setup.sh and requires the -m flag, which accepts dfdc, celeb, ffpp or all. If -m is given dfdc, setup.sh downloads the models trained on the DFDC dataset. The models are currently hosted on Google Drive, and the command-line utility gdown appears to impose a limit on how many times a file can be downloaded. If this limit has been reached and the download fails, try running the script again after 24 hours. You can also download the models manually from the Google Drive links listed in setup.sh (see the sketch after the example commands below).
Downloading the models manually is recommended.

./setup.sh -m <dataset>

For example, to download the models trained on the DFDC dataset, run:

./setup.sh -m dfdc

The other options are: celeb, ffpp or all
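
If the gdown quota blocks the download, the model archives can also be fetched manually. The snippet below is only an illustrative sketch that uses gdown's Python API; the file ID and output path are placeholders and must be copied from the corresponding lines in setup.sh.

# Illustrative manual download via gdown's Python API.
# FILE_ID and the output path are placeholders; take the real values
# from the Google Drive links inside setup.sh.
import gdown

FILE_ID = "<drive-file-id-from-setup.sh>"
gdown.download(url=f"https://drive.google.com/uc?id={FILE_ID}",
               output="models/dfdc_models.zip", quiet=False)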

💾 Project file structure

After completing the prerequisites and installation steps, the directory structure of DFSpot-Deepfake-Recognition/ is as follows:

|-- assets # contains images & gifs for readme
|-- examples.sh # contains example for running spot_deepfakes.py 
|-- models # contains twelve .pth files. These are downloaded using gdown and extracted in setup.sh
|   |-- celeb_v2.pth
|   |-- dfdc_v2st.pth
|   |-- ffpp_v2.pth
|-- README.md
|-- requirements.txt
|-- sample_images # contains sample images from test set of ffpp, celebdf & dfdc dataset. Save the images that have to be tested in this folder           
|-- sample_output_videos # contains sample output videos that are obtained after running the code 
|-- sample_videos # contains all the sample videos downloaded using gdown and extracted in setup.sh. Save the video files that have to be tested in this folder
|   |-- abc.mp4 # video whose authenticity has to be tested
|   |-- pqr.mp4 # video whose authenticity has to be tested
|-- setup.sh # downloads all the models, sample_videos and installs dependencies
|-- src
    |-- architectures # contains definitions of models
    |-- blazeface # for face extraction
    |-- ensemble_model.ipynb 
    |-- output # contains the annotated video files generated by running spot_deepfakes.py
    |   |-- abc.avi # annotated video with frame-level predictions done by the ensemble of models for sample_videos/abc.mp4
    |   |-- pqr.avi # annotated video with frame-level predictions done by the ensemble of models for sample_videos/pqr.mp4
    |   |-- predictions.csv # final prediction class of abc.mp4 and pqr.mp4 i.e real or fake is stored as csv
    |-- spot_deepfakes.py # main()
    |-- utils # contains functions for extraction of faces from videos in sample_videos, loading models, ensemble of models and annotation

(back to top)

⚡️ Usage

📹 For videos

  1. When setup.sh is executed, a few example videos from the test sets of datasets such as DFDC, FFPP and CelebDF (v2) are saved in the sample_videos/ folder. Suppose you ran setup.sh with the -m flag set to dfdc: pass dfdc as the --dataset argument, and the code will look for models trained on the DFDC dataset in the directory given by --model_dir. To check these videos for deepfakes using models trained on the DFDC dataset, run:
python3 spot_deepfakes.py --media_type video --data_dir ../sample_videos/dfdc/fake/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST  --model_dir ../models/ --video_id 2 3 4 --annotate True --device 0 --output_dir output/  

The predictions are stored in output/predictions.csv, and a video with frame-level annotations of the predictions made by the individual models and by the ensemble is saved in the output/ folder.

  2. Say you have three videos, video1.mp4, video2.mp4 and video3.mp4, whose authenticity you want to check. Place them in the sample_videos/ folder and run:
python3 spot_deepfakes.py --media_type video --data_dir ../sample_videos/ --dataset ffpp --model TimmV2 TimmV2ST ViT ViTST  --model_dir ../models/ --video_id 0 1 2 --annotate True --device 0 --output_dir output/  

As before, the predictions are stored in output/predictions.csv and the annotated videos are saved in the output/ folder.
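
To consume these results programmatically, output/predictions.csv can be read like any other CSV file. The snippet below is a minimal sketch; it does not assume any particular column names and simply prints each row as written by spot_deepfakes.py.

# Minimal sketch: print every row of output/predictions.csv.
# No schema is assumed; the columns are whatever spot_deepfakes.py wrote.
import csv

with open("output/predictions.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row)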

🖼️ For images

  1. Running setup.sh during installation saves a few sample images from the test sets of datasets such as DFDC, FFPP and CelebDF (v2) in sample_images/. To check the authenticity of these images, run:
python3 spot_deepfakes.py --media_type image --data_dir ../sample_images/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models  --device 0 --output_dir output/  
  2. Say you have a few images whose authenticity you want to check. Place them in the sample_images/ folder and run:
python3 spot_deepfakes.py --media_type image --data_dir ../sample_images/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models  --device 0 --output_dir output/  

The predictions are stored in output/img_predictions.json
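
As with the video predictions, the JSON output can be inspected from Python. This sketch makes no assumption about the file's schema and simply pretty-prints its contents.

# Pretty-print output/img_predictions.json without assuming its structure.
import json

with open("output/img_predictions.json") as f:
    print(json.dumps(json.load(f), indent=2))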

For more examples, please refer to examples.sh

(back to top)

⭐️ Contributing

Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

⚠️ License

Distributed under the MIT License. See LICENSE for more information.

(back to top)

👍 Acknowledgments

(back to top)

⭐ How to cite

Plain text:

C. Nehate, P. Dalia, S. Naik and A. Bhan, "Exposing DeepFakes using Siamese Training," 2022 IEEE India Council International Subsections Conference (INDISCON), 2022, pp. 1-6, doi: 10.1109/INDISCON54605.2022.9862825.

Bibtex:

@INPROCEEDINGS{9862825,
  author={Nehate, Chinmay and Dalia, Parth and Naik, Saket and Bhan, Aditya},
  booktitle={2022 IEEE India Council International Subsections Conference (INDISCON)}, 
  title={Exposing DeepFakes using Siamese Training}, 
  year={2022},
  volume={},
  number={},
  pages={1-6},
  doi={10.1109/INDISCON54605.2022.9862825}}