Create UIs for prototyping your machine learning model in 3 minutes
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
End-to-End Speech Processing Toolkit
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Official TensorFlow implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Large-scale pretraining for dialogue
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text to speech.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Simple image captioning model
Search inside YouTube videos using natural language
Omniscient Mozart: transcribes everything in the music, including vocals, drums, chords, beats, instruments, and more.
Resources for the "CTRLsum: Towards Generic Controllable Text Summarization" paper
A Unified Toolkit for Deep Learning Based Document Image Analysis
Clone a voice in 5 seconds to generate arbitrary speech in real-time
EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings
Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.
[ICCV '21] This repository contains the code for our paper "Keypoint Communities".
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Detectron2 is FAIR's next-generation platform for object detection, segmentation and other visual recognition tasks.
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
General Speech Restoration
Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
State-of-the-art 2D and 3D Face Analysis Project
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
Dense Prediction Transformers
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2020"
SwinIR: Image Restoration Using Swin Transformer
Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021