tuanio

I2R, A*Star Group

Singapore

https://tuanio.github.io

Organizations

AI-CLUB-IUH

Nguyễn Văn Anh Tuấn's starred repositories

attorch

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Language: Python | MIT | 42800

crewai-experiments

Experiments with local models as well as models available through an API

Language: Python | 81200

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | MIT | 2244700

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

SSSL

Language: JavaScript | GPL-3.0 | 20200

ABigSurvey

A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).

GPL-3.0 | 197000

audio-captioning

Audio captioning - DCASE challenge 2023 task 6a

Language: Jupyter Notebook | MIT | 1800

LLocalSearch

LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.

Language: Go | Apache-2.0 | 544500

elm-implementation

Language: Jupyter Notebook | 200

Pytorch_mixture-of-experts

PyTorch implementation of MoE (mixture of experts)

Language: Python | 3100

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook | MIT | 561200

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | MIT | 105600

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook | MIT | 1123700

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive benchmarking platform for Automatic Speech Recognition.

Language: Python | 41800

VQSA

CVPR2023: Vector Quantization with Self-Attention for Quality-Independent Representation Learning.

Language: Python | MIT | 1200

paperlib

An open-source academic paper management tool.

Language: TypeScript | GPL-3.0 | 143700

dscore

Diarization scoring tools.

Language: Python | BSD-2-Clause | 20800

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | MIT | 564000

dihard3_baseline

Language: Perl | BSD-2-Clause | 2600

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | Apache-2.0 | 2104800

Diffusion-GAN

Official PyTorch implementation for paper: Diffusion-GAN: Training GANs with Diffusion

Language: Python | MIT | 58600

semantic-router

Superfast AI decision making and intelligent processing of multi-modal data.

Language: Python | MIT | 175600

MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶

Language: Python | MIT | 63500

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | Apache-2.0 | 833900

Conv-TasNet

Language: Python | 27700

TS-VAD

Language: Python | 3800

Conv-Tasnet-for-speech-enchancement-and-seperation

A state-of-the-art time-domain network for speech separation that also performs well on speech enhancement and music separation

Language: Python | 4100

DeepXi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.

Language: MATLAB | MPL-2.0 | 49400

Robust-E2E-ASR

This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.

Language: Python | MIT | 4400

kaldi-gop

Computes the GMM-based Goodness of Pronunciation (GOP). Based on Kaldi.

Language: C++ | NOASSERTION | 13900