Ruiqi Li (RickyL-2000)

Location: ZJU

Ruiqi Li's starred repositories

CogVideo

Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

Language: Python | License: Apache-2.0 | Stargazers: 3552 | Issues: 0

pytorchvideo

A deep learning library for video understanding research.

Language: Python | License: Apache-2.0 | Stargazers: 3234 | Issues: 0

acad-homepage.github.io

AcadHomepage: A Modern and Responsive Academic Personal Homepage

Language: SCSS | License: MIT | Stargazers: 1131 | Issues: 0

av-superb

A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)

Language: Python | License: NOASSERTION | Stargazers: 43 | Issues: 0

CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Language: Python | License: Apache-2.0 | Stargazers: 2299 | Issues: 0

RectifiedFlow

Official Implementation of Rectified Flow (ICLR2023 Spotlight)

Language: Python | Stargazers: 704 | Issues: 0

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | License: MIT | Stargazers: 870 | Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 27957 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 23381 | Issues: 0

ROSVOT

Robust Singing Voice Transcription and MIDI Extraction

Language: Python | Stargazers: 28 | Issues: 0

Prompt-Singer

Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).

Language: Python | License: MIT | Stargazers: 49 | Issues: 0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 12497 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2888 | Issues: 0

VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

Language: Python | License: MIT | Stargazers: 456 | Issues: 0

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python | License: NOASSERTION | Stargazers: 1271 | Issues: 0

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1140 | Issues: 0

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 1908 | Issues: 0

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | Stargazers: 6922 | Issues: 0

autochord

Automatic Chord Recognition tools - ISMIR2021 Late-Breaking Demo presentation

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 99 | Issues: 0

MERT

Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".

Language: Python | License: Apache-2.0 | Stargazers: 272 | Issues: 0

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27363 | Issues: 0

WeChatMsg

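Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.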

Language: Python | License: GPL-3.0 | Stargazers: 31573 | Issues: 0

BeatNet

BeatNet is a state-of-the-art real-time and offline system for joint music beat, downbeat, tempo, and meter tracking, using a CRNN and particle filtering (implementation of the ISMIR 2021 paper).

Language: Python | License: CC-BY-4.0 | Stargazers: 306 | Issues: 0

Language: Jupyter Notebook | Stargazers: 417 | Issues: 0

muzic

Muzic: Music Understanding and Generation with Artificial Intelligence

Language: Python | License: MIT | Stargazers: 4373 | Issues: 0

musegan

An AI for Music Generation

Language: Python | License: MIT | Stargazers: 1773 | Issues: 0

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Language: Python | License: MIT | Stargazers: 277 | Issues: 0

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python | License: NOASSERTION | Stargazers: 2344 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20248 | Issues: 0

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ | License: Apache-2.0 | Stargazers: 9861 | Issues: 0