DanielLin94144

National Taiwan University

Taiwan

https://daniellin94144.github.io/

Guan-Ting (Daniel) Lin's starred repositories

leetcode-master

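《代码随想录》 LeetCode practice guide: a recommended solving order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages. No more feeling lost when learning algorithms! 🔥🔥 Take a look; you will wish you had found it sooner! 🚀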

Language: Shell | 49546 378 225

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 25177 207 215

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | 21052 179 424

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | 13972 108 309

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | 11076 163 240

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

LLMsPracticalGuide

A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook | License: NOASSERTION | 7295 89 114

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | 2929 47 61

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | 2575 37 52

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | 2397 42 77

Awesome-Graph-LLM

A collection of AWESOME things about Graph-Related LLMs.

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | 1143 57 50

acl-style-files

Official style files for papers submitted to venues of the Association for Computational Linguistics

Language: TeX | 641 9 26

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

natural_voice_assistant

Language: Python | License: MIT | 437 21 18

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | 290 26 13

snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Language: Python | License: MIT | 258 6 16

CPED

CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI | 中文个性情感对话数据集

Language: Python | License: Apache-2.0 | 194 4 6

control-vc

This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"

Language: Python | License: NOASSERTION | 125 9 12

agc

Audiogen Codec

Language: Python | License: MIT | 107 3 1

EAT

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Language: Python | License: MIT | 94 5 5

DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Language: Python | License: MIT | 82 4 2

SpeechAgents

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

last

A JAX library for building lattice-based speech transducer models

Language: Python | License: Apache-2.0 | 38 7 1

Interspeech2024_DiscreteSpeechChallenge

This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.

Spatial-AST

🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)

Language: Python | License: NOASSERTION | 2600

PyToBI

A Toolkit for ToBI Labeling with Python Data Structures

Language: Python | License: GPL-3.0 | 24 2 7

emphassess

This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAssess paper (de Seyssel et al., 2023).

Language: Python | License: NOASSERTION | 11 5 2

Wav2ToBI

Language: Python | License: MIT | 4 10