GreedyIsGood's repositories
asap-dataset
A dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
Awesome-CoreML-Models
Largest list of models for Core ML (for iOS 11+)
awesome-speech-enhancement
Speech enhancement / speech separation / sound source localization
CAT
A CRF-based ASR Toolkit
ChineseBQB
🇨🇳 Chinese sticker pack, more joy / A museum of stickers, the most addictive repository on GitHub, a big sticker collection that brings the fun together~
DSTC8-AVSD
We ranked 1st in the DSTC8 Audio-Visual Scene-Aware Dialog competition. This is the source code for our IEEE/ACM TASLP (AAAI2020-DSTC8-AVSD) paper "Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog".
end-to-end-synthetic-speech-detection
Time-domain synthetic speech detection net (TSSDNet) for end-to-end synthetic speech detection, with classic ResNet-style and Inception-style variants (Res-TSSDNet and Inc-TSSDNet). They achieve state-of-the-art EER on the ASVspoof 2019 challenge and show promising generalization when tested on ASVspoof 2015.
hair
Remove image background
Ideal-Piano
This is a smart piano application that analyzes which chords you are playing in real time, using music-theory-based chord detection algorithms written by me, and displays the chord types on screen. It supports MIDI keyboard input, computer keyboard input, playing and analyzing MIDI files, synchronized display with DAW project playback, and more.
MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
mmdetection
OpenMMLab Detection Toolbox and Benchmark
mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
MODNet
A Trimap-Free Solution for Portrait Matting in Real Time under Changing Scenes
OpenTransformer
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
PandaOCR
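PandaOCR - multifunctional OCR: image and text recognition + translation + read-aloud + pop-up windows + formulas + tables + image hosting + image search + QR codes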
pytorch-softdtw-cuda
Fast CUDA implementation of (differentiable) soft dynamic time warping for PyTorch using Numba
qlib
Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, you can easily try your ideas to create better Quant investment strategies.
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
RRPN_plusplus
RRPN++: Guidance Towards More Accurate Scene Text Detection
SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
SoundLocation
Sound source localization system based on the PYNQ-Z2
source_separation
Deep-learning-based speech source separation using PyTorch
SpeechAlgorithms
Speech Algorithms Collections
speechbrain
A PyTorch-based Speech Toolkit
swapping-autoencoder-pytorch
Official Implementation of Swapping Autoencoder for Deep Image Manipulation (NeurIPS 2020)
wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit