Nguyễn Văn Anh Tuấn (tuanio)

Company: I2R, A*Star Group

Location: Singapore

Home Page: https://tuanio.github.io


Organizations
AI-CLUB-IUH

Nguyễn Văn Anh Tuấn's starred repositories

attorch

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Language: Python, License: MIT, Stargazers: 428, Issues: 0

crewai-experiments

Experiments with local models as well as models available through an API

Language: Python, Stargazers: 812, Issues: 0

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda, License: MIT, Stargazers: 22447, Issues: 0

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

Stargazers: 527, Issues: 0

Language: JavaScript, License: GPL-3.0, Stargazers: 202, Issues: 0

ABigSurvey

A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).

License: GPL-3.0, Stargazers: 1970, Issues: 0

audio-captioning

Audio captioning - DCASE challenge 2023 task 6a

Language: Jupyter Notebook, License: MIT, Stargazers: 18, Issues: 0

LLocalSearch

LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.

Language: Go, License: Apache-2.0, Stargazers: 5445, Issues: 0

Language: Jupyter Notebook, Stargazers: 2, Issues: 0

Pytorch_mixture-of-experts

PyTorch implementation of MoE (mixture of experts)

Language: Python, Stargazers: 31, Issues: 0

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook, License: MIT, Stargazers: 5612, Issues: 0

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python, License: MIT, Stargazers: 1056, Issues: 0

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook, License: MIT, Stargazers: 11237, Issues: 0

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Language: Python, Stargazers: 418, Issues: 0

VQSA

CVPR2023: Vector Quantization with Self-Attention for Quality-Independent Representation Learning.

Language: Python, License: MIT, Stargazers: 12, Issues: 0

paperlib

An open-source academic paper management tool.

Language: TypeScript, License: GPL-3.0, Stargazers: 1437, Issues: 0

dscore

Diarization scoring tools.

Language: Python, License: BSD-2-Clause, Stargazers: 208, Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook, License: MIT, Stargazers: 5640, Issues: 0

Language: Perl, License: BSD-2-Clause, Stargazers: 26, Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python, License: Apache-2.0, Stargazers: 21048, Issues: 0

Diffusion-GAN

Official PyTorch implementation for paper: Diffusion-GAN: Training GANs with Diffusion

Language: Python, License: MIT, Stargazers: 586, Issues: 0

semantic-router

Superfast AI decision making and intelligent processing of multi-modal data.

Language: Python, License: MIT, Stargazers: 1756, Issues: 0

MidiTok

MIDI / symbolic music tokenizers for Deep Learning models 🎶

Language: Python, License: MIT, Stargazers: 635, Issues: 0

speechbrain

A PyTorch-based Speech Toolkit

Language: Python, License: Apache-2.0, Stargazers: 8339, Issues: 0

Language: Python, Stargazers: 277, Issues: 0

Language: Python, Stargazers: 38, Issues: 0

Conv-Tasnet-for-speech-enchancement-and-seperation

A state-of-the-art time-domain network for speech separation that also performs well on speech enhancement and music separation

Language: Python, Stargazers: 41, Issues: 0

DeepXi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.

Language: MATLAB, License: MPL-2.0, Stargazers: 494, Issues: 0

Robust-E2E-ASR

This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.

Language: Python, License: MIT, Stargazers: 44, Issues: 0

kaldi-gop

Computes the GMM-based Goodness of Pronunciation (GOP). Based on Kaldi.

Language: C++, License: NOASSERTION, Stargazers: 139, Issues: 0