Jiayu DU's repositories
.tmux
🇫🇷 Oh my tmux! My self-contained, pretty & versatile tmux configuration made with ❤️
audio-dataset
Audio Dataset for training CLAP and other models
C-Macro-Collections
Easy-to-use, modular, header-only, macro-based, generic and type-safe data structures in C
cc2dataset
Easily convert Common Crawl to a dataset of captions and documents. Image/text, audio/text, video/text, ...
chatgpt-retrieval-plugin
The ChatGPT Retrieval Plugin lets you easily search and find personal or work documents by asking questions in everyday language.
cJSON
Ultralightweight JSON parser in ANSI C
datasette
An open source multi-tool for exploring and publishing data
emhash
Fast and memory-efficient C++ flat hash map/set
faster-whisper
Faster Whisper transcription with CTranslate2
GigaSpeech
Large, modern dataset for speech recognition
hamt
A hash array-mapped trie implementation in C
highway
Performance-portable, length-agnostic SIMD with runtime dispatch
ipa-dict
Monolingual wordlists with pronunciation information in IPA
kaldi-native-fbank
Kaldi-compatible online fbank extractor without external dependencies
llama-dl
High-speed download of LLaMA, Facebook's 65B parameter GPT model
lossless-cut
The Swiss Army knife of lossless video/audio editing
mediamtx
Ready-to-use RTSP / RTMP / LL-HLS / WebRTC server and proxy that lets you read, publish, and proxy video and audio streams. Formerly known as rtsp-simple-server.
mimalloc
mimalloc is a compact general purpose allocator with excellent performance.
parallel-hashmap
A family of header-only, very fast and memory-friendly hashmap and btree containers.
pysubs2
A Python library for editing subtitle files
qoa
The “Quite OK Audio Format” for fast, lossy audio compression
re2
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
sectorc
A C Compiler that fits in the 512 byte boot sector of an x86 machine
tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
unblob
Extract files from any kind of container format
visidata
A terminal spreadsheet multitool for discovering and arranging data
yt-dlp
A youtube-dl fork with additional features and fixes