Mohsen Fayyaz's repositories
PyTorch_Video_Dataset
Video dataset class for loading videos in PyTorch using DataLoader
C3DSegmentation
Video Semantic Segmentation using 3D CNNs in Torch
3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
Mini-Kinetics-200
Mini-Kinetics-200 data splits used in the paper "Rethinking Spatiotemporal Feature Learning For Video Understanding"
ActivityNet
This repository is intended to host tools and demos for ActivityNet
C3DPyTorch
Implementation of C3D Network in PyTorch
facenet-pytorch
Pretrained PyTorch face detection (MTCNN) and recognition (InceptionResnet) models
Lets-keep-it-simple-using-simple-architecture-to-outperform-deeper-architectures
This repository contains the architectures, models, logs, etc. pertaining to the SimpleNet paper (Lets keep it simple: Using simple architectures to outperform deeper architectures)
mv-extractor
Extract frames and motion vectors from H.264 and MPEG-4 encoded video.
task-driven-object-detection
Author's implementation of "What Object Should I Use? - Task Driven Object Detection" (CVPR 2019)
vq-vae-2-pytorch
Implementation of "Generating Diverse High-Fidelity Images with VQ-VAE-2" in PyTorch
YOLOv2_Pytorch
This is a repository containing the implementation of YOLOv2.
DynamicViT
[NeurIPS 2021] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
LLaVA
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
pytorch-ssim
PyTorch structural similarity (SSIM) loss
slowfast_feature_extractor
Feature Extractor module for videos using the PySlowFast framework
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".