Jason's Lab (li563042811)

Jason's Lab's repositories

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

License: MIT | Stargazers: 0 | Issues: 0

lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

LLaMA-Adapter

Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

License: GPL-3.0 | Stargazers: 0 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

License: MIT | Stargazers: 0 | Issues: 0

Multimodal-GPT

Multimodal-GPT

License: Apache-2.0 | Stargazers: 0 | Issues: 0

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

voxpopuli

A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation

License: NOASSERTION | Stargazers: 0 | Issues: 0

sherpa-onnx

Real-time speech recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin

License: Apache-2.0 | Stargazers: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

License: MIT | Stargazers: 0 | Issues: 0

Conference-Acceptance-Rate

Acceptance rates for the major AI conferences

License: MIT | Stargazers: 0 | Issues: 0

ColossalAI

Making big AI models cheaper, easier, and more scalable

License: Apache-2.0 | Stargazers: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

sherpa

Speech-to-text server framework with next-gen Kaldi

License: NOASSERTION | Stargazers: 0 | Issues: 0

mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

License: NOASSERTION | Stargazers: 0 | Issues: 0

youtube-dl

Command-line program to download videos from YouTube.com and other video sites

License: Unlicense | Stargazers: 0 | Issues: 0

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

avsr-conformer

AVSR with NIA

Stargazers: 0 | Issues: 0

e2e_lfmmi

E2E system with LF-MMI; word N-gram for Mandarin

Stargazers: 0 | Issues: 0

AVSR_papers

This repository mainly collects papers on conversion between three modalities: audio, visual, and text.

Stargazers: 0 | Issues: 0

Leveraging-Self-Supervised-Learning-for-AVSR

Official PyTorch implementation of the paper "Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition"

License: MIT | Stargazers: 0 | Issues: 0

av_hubert

A self-supervised learning framework for audio-visual speech

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0

voxceleb_trainer

In defence of metric learning for speaker recognition

License: MIT | Stargazers: 0 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0

rnn-transducer

A PyTorch implementation of the Transducer model for end-to-end speech recognition

Language: Python | Stargazers: 0 | Issues: 0

PaddleSpeech

An Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

hugo

The world’s fastest framework for building websites.

License: Apache-2.0 | Stargazers: 0 | Issues: 0