Hannieliao

Company: Tsinghua University

Location: Shenzhen, China

Hannieliao's starred repositories

self-llm

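"A Guide to Consuming Open-Source LLMs": a tutorial for quickly deploying open-source large models in a Linux environment, a deployment guide better suited to ** beginners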

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7328 | Issues: 0

piano-a2s

End-to-end real-world polyphonic piano audio-to-score transcription with hierarchical decoding (IJCAI 2024)

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 0

Baton

Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"

Language: Python | Stargazers: 12 | Issues: 0

Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

Language: HTML | Stargazers: 243 | Issues: 0

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | Stargazers: 2431 | Issues: 0

hello-algo

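"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; an English version is in progress.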

Language: Java | License: NOASSERTION | Stargazers: 93224 | Issues: 0

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 24883 | Issues: 0

youtube-8m-videos-downloader

Download videos from the YouTube-8M dataset for testing

Language: Python | Stargazers: 6 | Issues: 0

audiosetdl

Scripts for downloading AudioSet

Language: Jupyter Notebook | Stargazers: 64 | Issues: 0

Fast-Audioset-Download

Download AudioSet data very quickly with youtube-dl, ffmpeg, and Python multiprocessing

Language: Python | License: BSD-3-Clause | Stargazers: 26 | Issues: 0

CVPR-2024-Speech_Audio_Music-Papers

A curated collection of papers related to speech, audio, and music in CVPR 2024.

Stargazers: 6 | Issues: 0

MLQuestions

Machine Learning and Computer Vision Engineer - Technical Interview Questions

Stargazers: 2808 | Issues: 0

REPARO

The official implementation of the work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment".

Stargazers: 36 | Issues: 0

Seeing-and-Hearing

[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Language: Python | License: NOASSERTION | Stargazers: 106 | Issues: 0

lightning-hydra-template

PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

Language: Python | Stargazers: 3986 | Issues: 0

awesome-mlss

🤖 Machine Learning Summer School deadlines

Language: HTML | License: MIT | Stargazers: 2634 | Issues: 0

Awesome-Video-Diffusion-Models

[Arxiv] A Survey on Video Diffusion Models

Stargazers: 1623 | Issues: 0

Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

Language: Python | License: Apache-2.0 | Stargazers: 141 | Issues: 0

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Language: Python | License: MIT | Stargazers: 283 | Issues: 0

mfa-models

Collection of pretrained models for the Montreal Forced Aligner

Language: Python | License: CC-BY-4.0 | Stargazers: 105 | Issues: 0

audio-dataset

Audio Dataset for training CLAP and other models

Language: Python | Stargazers: 610 | Issues: 0

ImageSelect

Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"

Language: Python | License: MIT | Stargazers: 27 | Issues: 0

d3po

[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"

Language: Python | License: MIT | Stargazers: 149 | Issues: 0

AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.

Language: Python | License: MIT | Stargazers: 181 | Issues: 0

Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Language: Python | License: MIT | Stargazers: 730 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4386 | Issues: 0

CoMoSpeech

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Language: Python | License: MIT | Stargazers: 172 | Issues: 0

UniAudio

The Open Source Code of UniAudio

Language: Python | Stargazers: 498 | Issues: 0

ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Language: Python | License: Apache-2.0 | Stargazers: 1066 | Issues: 0

audiocaps-download

This package aims at simplifying the download of the AudioCaps dataset.

Language: Python | Stargazers: 28 | Issues: 0