Wen-Yi Hsiao (wayne391)

Company: @ailabstw

Location: Taiwan

Twitter: @wenyihsiao


Organizations
twmusicai
YatingMusic

Wen-Yi Hsiao's starred repositories

fucking-algorithm

Cracking algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.

yt-dlp

A feature-rich command-line audio/video downloader

Language: Python | License: Unlicense | Stargazers: 80829 | Issues: 497 | Issues: 7602

ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

Language: Python | License: AGPL-3.0 | Stargazers: 27631 | Issues: 154 | Issues: 8431

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 13085 | Issues: 115 | Issues: 983

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | Stargazers: 10476 | Issues: 156 | Issues: 3647

wavesurfer.js

Audio waveform player

Language: TypeScript | License: BSD-3-Clause | Stargazers: 8567 | Issues: 167 | Issues: 2094

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7994 | Issues: 87 | Issues: 1739

GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python | License: Apache-2.0 | Stargazers: 6076 | Issues: 37 | Issues: 292

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | Stargazers: 4260 | Issues: 63 | Issues: 93

Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 2744 | Issues: 51 | Issues: 153

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Language: Python | License: MIT | Stargazers: 1699 | Issues: 22 | Issues: 127

waveform-playlist

Multitrack Web Audio editor and player with canvas waveform preview. Set cues, fades and shift multiple tracks in time. Record audio tracks or provide audio annotations. Export your mix to AudioBuffer or WAV! Add effects from Tone.js. Project inspired by Audacity.

Language: JavaScript | License: MIT | Stargazers: 1448 | Issues: 65 | Issues: 132

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1305 | Issues: 28 | Issues: 86

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stargazers: 1084 | Issues: 26 | Issues: 72

MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶

Language: Python | License: MIT | Stargazers: 646 | Issues: 8 | Issues: 85
Language: Python | License: AGPL-3.0 | Stargazers: 487 | Issues: 17 | Issues: 36

all-in-one

All-In-One Music Structure Analyzer

Language: Python | License: MIT | Stargazers: 391 | Issues: 9 | Issues: 12

BeatNet

BeatNet is a state-of-the-art real-time and offline system for joint music beat, downbeat, tempo, and meter tracking using a CRNN and particle filtering (implementation of the ISMIR 2021 paper).

Language: Python | License: CC-BY-4.0 | Stargazers: 311 | Issues: 9 | Issues: 27

llark

Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 287 | Issues: 7 | Issues: 7

lp-music-caps

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

WavCaps

This repository contains metadata for the WavCaps dataset and code for downstream tasks.

ffmpeg-scripts

ffmpeg shell scripts

Language: Shell | License: BSD-3-Clause | Stargazers: 188 | Issues: 11 | Issues: 6

360monodepth

Code release for 360monodepth. Our framework performs monocular depth estimation for high-resolution 360° images by aligning and blending perspective depth maps.

Language: Python | License: MIT | Stargazers: 147 | Issues: 5 | Issues: 27

AQUA-Tk

AQUA-Tk = Audio QUality Assessment-Toolkit. (In development)

Language: Python | License: GPL-3.0 | Stargazers: 93 | Issues: 3 | Issues: 3

demucs.cpp

C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3

Language: C++ | License: MIT | Stargazers: 82 | Issues: 4 | Issues: 12

hFT-Transformer

PyTorch implementation of an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture (hFT-Transformer).

Language: Python | License: MIT | Stargazers: 70 | Issues: 3 | Issues: 2

nansypp

Unofficial implementation of NANSY++ in PyTorch Lightning

Language: Python | License: MIT | Stargazers: 45 | Issues: 8 | Issues: 3

coco-mulla-repo

Official source code of coco-mulla

music-modeling-time-duration

Code of the paper "Impact of time and note duration tokenizations on deep learning symbolic music modeling" (ISMIR 2023)

Language: Python | Stargazers: 9 | Issues: 1 | Issues: 0