Levon Dang (Droliven)

Company: South China University of Technology, @shuopensourcecommunity

Location: Guangzhou, Guangdong, China

Organizations
shuosc

Levon Dang's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 66570 | Issues: 557 | Issues: 0

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23890 | Issues: 253 | Issues: 294

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

Language: Python | License: MIT | Stargazers: 13505 | Issues: 127 | Issues: 309

chatgpt-mirai-qq-bot

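🚀 One-click deployment! A real AI chatbot! Supports ChatGPT, ERNIE Bot, iFlytek Spark, Bing, Bard, ChatGLM, and POE, with multiple accounts, persona tuning, virtual maid, image rendering, and voice messages | Supports QQ, Telegram, Discord, WeChat, and other platforms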

Language: Python | License: AGPL-3.0 | Stargazers: 12873 | Issues: 72 | Issues: 1045

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stargazers: 10820 | Issues: 184 | Issues: 1900

ASRT_SpeechRecognition

A Deep-Learning-Based Chinese Speech Recognition System

Language: Python | License: GPL-3.0 | Stargazers: 7711 | Issues: 186 | Issues: 289

fast-stable-diffusion

fast-stable-diffusion + DreamBooth

Language: Python | License: MIT | Stargazers: 7459 | Issues: 85 | Issues: 2036

Language: Python | License: Apache-2.0 | Stargazers: 4766 | Issues: 53 | Issues: 908

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | Stargazers: 4260 | Issues: 63 | Issues: 93

mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

Language: Python | License: Apache-2.0 | Stargazers: 3350 | Issues: 30 | Issues: 770

Mubert-Text-to-Music

A simple notebook demonstrating prompt-based music generation via Mubert API

Language: Jupyter Notebook | Stargazers: 2731 | Issues: 46 | Issues: 16

deep-motion-editing

An end-to-end library for editing and rendering motion of 3D characters with deep learning [SIGGRAPH 2020]

Language: Python | License: BSD-2-Clause | Stargazers: 1541 | Issues: 65 | Issues: 200

stable-diffusion-webui-wd14-tagger

Labeling extension for Automatic1111's Web UI

improved-aesthetic-predictor

CLIP+MLP Aesthetic Score Predictor

Language: Python | License: Apache-2.0 | Stargazers: 845 | Issues: 6 | Issues: 10

Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Language: Python | License: MIT | Stargazers: 731 | Issues: 71 | Issues: 14

stable-diffusion-aesthetic-gradients

Personalization for Stable Diffusion via Aesthetic Gradients 🎨

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 716 | Issues: 18 | Issues: 18

musicnn

Pronounced as "musician", musicnn is a set of pre-trained deep convolutional neural networks for music audio tagging.

Language: Jupyter Notebook | License: ISC | Stargazers: 586 | Issues: 20 | Issues: 21

aesthetic-predictor

A linear estimator on top of CLIP to predict the aesthetic quality of pictures

Language: Jupyter Notebook | License: MIT | Stargazers: 436 | Issues: 13 | Issues: 7

ubisoft-laforge-ZeroEGGS

All about ZeroEggs

Language: Python | License: NOASSERTION | Stargazers: 360 | Issues: 12 | Issues: 41

HPSv2

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 350 | Issues: 10 | Issues: 38

llark

Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 287 | Issues: 7 | Issues: 7

Gesture-Generation-from-Trimodal-Context

Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity (SIGGRAPH Asia 2020)

Language: Python | License: NOASSERTION | Stargazers: 242 | Issues: 10 | Issues: 58

DiffuseStyleGesture

DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models (IJCAI 2023) | The DiffuseStyleGesture+ entry to the GENEA Challenge 2023 (ICMI 2023, Reproducibility Award)

Language: Python | License: MIT | Stargazers: 146 | Issues: 7 | Issues: 39

Audio2Gestures

Official implementation of Audio2Motion: Generating Diverse Gestures from Speech with Conditional Variational Autoencoders.

youtube-gesture-dataset

This repository contains scripts to build the YouTube Gesture Dataset.

Language: Python | License: BSD-3-Clause | Stargazers: 115 | Issues: 4 | Issues: 9

AudioEmotion

Recognize Audio Emotion.

Language: Python | License: MIT | Stargazers: 86 | Issues: 3 | Issues: 2

ImageAestheticAssessmentPyTorch

Image Aesthetic Assessment in PyTorch with implemented popular datasets and models (possibly providing the pretrained ones).

Language: Python | License: Apache-2.0 | Stargazers: 36 | Issues: 3 | Issues: 1

Language: Python | Stargazers: 9 | Issues: 0 | Issues: 2