
AK391's repositories

gradio

Create UIs for prototyping your machine learning model in 3 minutes

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

lama

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

mlsd

Official TensorFlow implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

DialoGPT

Large-scale pretraining for dialogue

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

License: NOASSERTION · Stargazers: 0 · Forks: 0 · Issues: 0

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

CLIP_prefix_caption

Simple image captioning model

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

natural-language-youtube-search

Search inside YouTube videos using natural language

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

omnizart

Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.

License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

ctrl-sum

Resources for the "CTRLsum: Towards Generic Controllable Text Summarization" paper

Language: Python · License: BSD-3-Clause · Stargazers: 0 · Forks: 0 · Issues: 0

layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python · License: NOASSERTION · Stargazers: 0 · Forks: 0 · Issues: 0

SimCSE

EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

openpifpaf

Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.

Language: Python · License: NOASSERTION · Stargazers: 0 · Forks: 0 · Issues: 0

Keypoint_Communities

[ICCV '21] Code for our paper "Keypoint Communities".

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

doctr

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

detectron2

Detectron2 is FAIR's next-generation platform for object detection, segmentation and other visual recognition tasks.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

AudioCLIP

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

voicefixer_main

General Speech Restoration

Language: Python · License: AGPL-3.0 · Stargazers: 0 · Forks: 0 · Issues: 0

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Language: Jupyter Notebook · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Language: Python · License: GPL-3.0 · Stargazers: 0 · Forks: 0 · Issues: 0

RobustVideoMatting-1

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Language: Python · License: GPL-3.0 · Stargazers: 0 · Forks: 0 · Issues: 0

DPT

Dense Prediction Transformers

Language: Python · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2020"

Language: Java · License: MIT · Stargazers: 0 · Forks: 0 · Issues: 0

SwinIR

SwinIR: Image Restoration Using Swin Transformer

Language: Python · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0

LoFTR

Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Forks: 0 · Issues: 0