Jiashuo Yu (JustinYuu)

Company: OpenGVLab

Location: Shanghai, China


Organizations
OpenGVLab
VideoIntern

Jiashuo Yu's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 35062 | Issues: 322 | Issues: 430

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 29698 | Issues: 172 | Issues: 480

mojo

The Mojo Programming Language

Language: Mojo | License: NOASSERTION | Stargazers: 22735 | Issues: 267 | Issues: 1998

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20507 | Issues: 201 | Issues: 371

OpenGFW

OpenGFW is a flexible, easy-to-use, open source implementation of GFW (Great Firewall of China) on Linux

Language: Go | License: MPL-2.0 | Stargazers: 9433 | Issues: 66 | Issues: 69

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7177 | Issues: 63 | Issues: 184

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model with performance approaching GPT-4o.

Language: Python | License: MIT | Stargazers: 5225 | Issues: 50 | Issues: 488

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4403 | Issues: 58 | Issues: 149

InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

Language: Python | License: Apache-2.0 | Stargazers: 3182 | Issues: 43 | Issues: 49

Baichuan-13B

A 13B large language model developed by Baichuan Intelligent Technology

Language: Python | License: Apache-2.0 | Stargazers: 2978 | Issues: 31 | Issues: 195

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 1365 | Issues: 25 | Issues: 64

LLaSM

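The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.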

Language: Python | License: Apache-2.0 | Stargazers: 505 | Issues: 13 | Issues: 8

VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

Language: Python | License: Apache-2.0 | Stargazers: 460 | Issues: 12 | Issues: 54

CLAP

Learning audio concepts from natural language supervision

Language: Python | License: MIT | Stargazers: 451 | Issues: 14 | Issues: 19

Vlogger

[CVPR2024] Make Your Dream A Vlog

Language: Python | License: Apache-2.0 | Stargazers: 404 | Issues: 10 | Issues: 15

llark

Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 287 | Issues: 7 | Issues: 7

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Language: Python | License: MIT | Stargazers: 283 | Issues: 5 | Issues: 9

lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Language: Jupyter Notebook | License: MIT | Stargazers: 228 | Issues: 18 | Issues: 26

MU-LLaMA

MU-LLaMA: Music Understanding Large Language Model

Language: Python | License: GPL-3.0 | Stargazers: 219 | Issues: 9 | Issues: 22

WavCaps

This repository contains metadata for the WavCaps dataset and code for downstream tasks.

Awesome-Evaluation-of-Visual-Generation

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

Language: Python | License: Apache-2.0 | Stargazers: 85 | Issues: 3 | Issues: 4

MWAFM

Multi-Scale Attention for Audio Question Answering

Certifiable-Robust-Multi-modal-Training

A Python implementation of Certifiable Robust Multi-modal Training

Language: Python | Stargazers: 12 | Issues: 0 | Issues: 0

perception_test_iccv2023

Repository of champion solutions for the Perception Test challenges at the ICCV 2023 workshop.

Language: Python | License: MIT | Stargazers: 11 | Issues: 1 | Issues: 0