secutron's repositories
bvh-python
Python module for parsing BVH (Biovision hierarchical data) mocap files
Emote-hack
Using ChatGPT (now Claude 3) to reverse-engineer code from the Emote white paper. WIP.
MachineLearning-AI
This repository collects the work I did and studied regularly, drawn from Medium blogs, several research papers, and other repos (some related to those papers, some not).
stable-diffusion-webui
Stable Diffusion web UI
DiffFace
DiffFace: Diffusion-based Face Swapping with Facial Guidance
Easy-Wav2Lip
Colab for making Wav2Lip high quality and easy to use
football_analysis
This repository contains a computer vision/machine learning football project that uses YOLO for object detection, K-means for pixel segmentation, optical flow for motion tracking, and perspective transformation to analyze player movements in football videos.
hallo-for-windows
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
ImageBind
ImageBind: One Embedding Space to Bind Them All
jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
KoProgressiveTransformersSLP
Source code for "Progressive Transformers for End-to-End Sign Language Production" (ECCV 2020)
ListenDenoiseAction
Code to reproduce the results for our SIGGRAPH 2023 paper "Listen Denoise Action"
LivePortrait-Advanced-Portrait-Animation-System
LivePortrait is an advanced deep learning-based system for animating portrait images. It uses a two-stage training process to create realistic and controllable animations from static portrait images.
MARLIN
[CVPR] MARLIN: Masked Autoencoder for facial video Representation LearnINg
mediapipe_pose_compare
Joint-angle comparison of MediaPipe pose predictions, converted to BVH, against ground-truth BVH
Meteor
Official PyTorch implementation of Mamba-based traversal of rationale (Meteor), which aims to improve vision-language performance across diverse capabilities. (Under Review)
minimal-diffusion
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)
MultiTalk
[INTERSPEECH'24] Official repository for "MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset"
nitec
NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction (Accepted at WACV24)
SHOW
This is the codebase for SHOW.
Speech-driven-expressions
Speech-Driven Expression Blendshape Based on Single-Layer Self-attention Network (AIWIN 2022)
videollm-online
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)