Yizhou Lu (luyizhou4)

Company: Shanghai Jiao Tong University

Location: Shanghai

Yizhou Lu's starred repositories

bert

TensorFlow code and pre-trained models for BERT

Language: Python | License: Apache-2.0 | Stargazers: 37439 | Issues: 999 | Issues: 1140

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20169 | Issues: 194 | Issues: 364

Awesome-LLM

Awesome-LLM: a curated list of Large Language Model

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | License: NOASSERTION | Stargazers: 9889 | Issues: 131 | Issues: 48

autocut

Cut videos with a text editor

Language: Python | License: Apache-2.0 | Stargazers: 6404 | Issues: 49 | Issues: 82

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | License: Apache-2.0 | Stargazers: 5632 | Issues: 65 | Issues: 623

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | Stargazers: 5344 | Issues: 63 | Issues: 93

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5148 | Issues: 38 | Issues: 35

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 3955 | Issues: 48 | Issues: 841

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

FateZero

[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"

Language: Jupyter Notebook | License: MIT | Stargazers: 1070 | Issues: 14 | Issues: 33

tango

A family of diffusion models for text-to-audio generation.

Language: Python | License: NOASSERTION | Stargazers: 950 | Issues: 25 | Issues: 43

llm-hallucination-survey

Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"

IguanaTex

A PowerPoint add-in allowing you to insert LaTeX equations into PowerPoint presentations on Windows and Mac

Language: VBA | License: NOASSERTION | Stargazers: 790 | Issues: 14 | Issues: 64

Long-Context

This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.

Language: Python | License: Apache-2.0 | Stargazers: 567 | Issues: 13 | Issues: 6

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language: Jupyter Notebook | Stargazers: 543 | Issues: 23 | Issues: 28

AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Language: Python | License: NOASSERTION | Stargazers: 503 | Issues: 34 | Issues: 27

distrifuser

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Language: Python | License: MIT | Stargazers: 490 | Issues: 8 | Issues: 16

libri-light

dataset for lightly supervised training using the librivox audio book recordings. https://librivox.org/.

Language: Python | License: MIT | Stargazers: 460 | Issues: 21 | Issues: 15

tiny-training

On-Device Training Under 256KB Memory [NeurIPS'22]

Language: Python | License: MIT | Stargazers: 414 | Issues: 17 | Issues: 8

Large-Audio-Models

Keep track of big models in audio domain, including speech, singing, music etc.

Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 268 | Issues: 5 | Issues: 29

CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

Language: Python | License: MIT | Stargazers: 184 | Issues: 9 | Issues: 52

retraining-free-pruning

[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers

KeSpeech

The repo provides information about KeSpeech dataset.

ReinMax

Beyond Straight-Through

Language: Python | License: MIT | Stargazers: 78 | Issues: 4 | Issues: 1

patch_conv

Patch convolution to avoid large GPU memory usage of Conv2D

Language: Python | License: MIT | Stargazers: 68 | Issues: 8 | Issues: 1

PSL

Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"

Language: Python | License: GPL-3.0 | Stargazers: 30 | Issues: 4 | Issues: 3

public_talks

Materials of public talks given by SJTU X-LANCE members