Rongjiehuang

Company: Facebook AI Research (FAIR)

Home Page: rongjiehuang.github.io


Organizations
AIGC-Audio

Rongjiehuang's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 63806 | Issues: 531 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 33605 | Issues: 339 | Issues: 2627

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Language: Python | License: NOASSERTION | Stargazers: 15603 | Issues: 135 | Issues: 615

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | Stargazers: 13860 | Issues: 159 | Issues: 169

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10500 | Issues: 139 | Issues: 328

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | Stargazers: 7959 | Issues: 77 | Issues: 492

llama-recipes

Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps to showcase Llama2 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 7850 | Issues: 68 | Issues: 227

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6341 | Issues: 61 | Issues: 77

lit-gpt

Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Language: Python | License: Apache-2.0 | Stargazers: 5189 | Issues: 63 | Issues: 476

aim

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

Language: Python | License: Apache-2.0 | Stargazers: 4928 | Issues: 43 | Issues: 993

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | Stargazers: 4185 | Issues: 61 | Issues: 91

motion-diffusion-model

The official PyTorch implementation of the paper "Human Motion Diffusion Model"

Language: Python | License: MIT | Stargazers: 2934 | Issues: 68 | Issues: 195

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 2602 | Issues: 35 | Issues: 132

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2568 | Issues: 31 | Issues: 149

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | Stargazers: 2218 | Issues: 40 | Issues: 60

consistencydecoder

Consistency Distilled Diff VAE

Language: Python | License: MIT | Stargazers: 2097 | Issues: 23 | Issues: 19

gigagan-pytorch

Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs

Language: Python | License: MIT | Stargazers: 1685 | Issues: 73 | Issues: 47

Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Language: Python | License: Apache-2.0 | Stargazers: 1460 | Issues: 33 | Issues: 34

LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1231 | Issues: 29 | Issues: 83

Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Language: Python | License: MIT | Stargazers: 914 | Issues: 16 | Issues: 40

LLaSM

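The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors it may introduce.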

Language: Python | License: Apache-2.0 | Stargazers: 493 | Issues: 13 | Issues: 6

UniAudio

The Open Source Code of UniAudio

Persona-Dialogue-Generation

The code of ACL 2020 paper "You Impress Me: Dialogue Generation via Mutual Persona Perception"

Language: Python | License: MIT | Stargazers: 308 | Issues: 8 | Issues: 34

lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Whispering-LLaMA

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

Language: Jupyter Notebook | License: MIT | Stargazers: 204 | Issues: 4 | Issues: 10

bigvsan

PyTorch implementation of BigVSAN

Language: Python | License: MIT | Stargazers: 188 | Issues: 28 | Issues: 5

cbtm

Code repository for the c-BTM paper

Language: Python | License: Apache-2.0 | Stargazers: 105 | Issues: 5 | Issues: 3

MULTI-AUDIODEC

This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.

Language: Python | Stargazers: 40 | Issues: 2 | Issues: 0

Look2hear

A toolkit for researchers in multimodal sound separation.