Mickey's starred repositories

cc_net

Tools to download and cleanup Common Crawl data

Language: Python | License: MIT | 96500

arthas

Alibaba Java Diagnostic Tool Arthas

Language: Java | License: Apache-2.0 | 3553700

LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Language: Python | License: Apache-2.0 | 242100

UnsupervisedMT

Phrase-Based & Neural Unsupervised Machine Translation

Language: Python | License: NOASSERTION | 150600

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | 1900

ConST

Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)

Language: Python | License: MIT | 6200

MooER

MooER: Open-sourced LLM for audio understanding trained on 80,000 hours of data

Language: Python | License: NOASSERTION | 13500

simul_whisper

Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Language: Python | 4200

whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

Language: Python | License: MIT | 194700

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language: Python | License: Apache-2.0 | 565300

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION | 307700

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | 53900

SpeechGen

"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"

7400

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | 1086100

asrp

ASR text preprocessing utility

Language: Python | License: Apache-2.0 | 2000

LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | 3267500

axolotl

Go ahead and axolotl questions

Language: Python | License: Apache-2.0 | 775300

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

746700

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 144600

Qwen2.5

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell | 894800

ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)

Language: Python | License: Apache-2.0 | 389600

Mickey-Stone

FDRL-MHAD

cc_net

arthas

moshi

LLaMA-Omni

UnsupervisedMT

fairseq

ConST

MooER

simul_whisper

whisper_streaming

CosyVoice

SenseVoice

TeleSpeech-ASR

SLAM-LLM

SpeechGen

seamless_communication

asrp

LLaMA-Factory

axolotl

EMO

Qwen-Audio

Qwen2.5

ms-swift

TransformerCompression

silero-models

silero-vad

unilm

bolt

bark