Beast code in Giters

view1234567's starred repositories

Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Language: Python, License: Apache-2.0, 35500

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Language: Python, License: NOASSERTION, 42400

LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Language: Python, License: Apache-2.0, 213000

moshi

Language: Python, License: Apache-2.0, 595700

g1

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Language: Python, License: MIT, 340700

LeCo

This is the implementation of LeCo.

Language: Python, 2400

LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Language: Python, License: Apache-2.0, 27600

TAG-Bench

TAG-Bench: A benchmark for table-augmented generation (TAG)

Language: Python, License: MIT, 45700

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

Language: Python, License: MIT, 118000

CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Language: Python, License: NOASSERTION, 19500

MambaInLlama

Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Language: Python, License: Apache-2.0, 14400

QAnything

Question and Answer based on Anything.

Language: Python, License: AGPL-3.0, 1153300

MaxKB

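🚀 A knowledge-base question-answering system built on large language models and RAG. Ready to use out of the box, model-agnostic, and flexibly orchestrated, with support for quick embedding into third-party business systems.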

Language: Python, License: GPL-3.0, 1050600

icefall

Language: Python, License: Apache-2.0, 90200

GitHubDaily

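Regularly shares high-quality, interesting, and practical open-source technical tutorials, developer tools, programming websites, and tech news from GitHub. A list of cool, interesting projects on GitHub.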

3209000

VideoLingo

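Netflix-quality subtitle segmentation, translation, alignment, and even dubbing: a one-click, fully automated AI subtitle team for reposting videos.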

Language: Python, License: Apache-2.0, 296600

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python, License: MIT, 1769700

GPT-SoVITS-Inference

Inference Specialization

Language: Python, License: MIT, 31700

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python, License: MIT, 3347800

whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

Language: Python, License: MIT, 187400

whisper-medusa

Whisper with Medusa heads

Language: Python, License: MIT, 78500

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python, License: AGPL-3.0, 1187600

llama-cpp-python

Python bindings for llama.cpp

Language: Python, License: MIT, 783200

llama-assistant

Language: Python, 16800

BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Language: Python, License: MIT, 155400

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell, License: NOASSERTION, 1417000

vosk

VOSK Speech Recognition Toolkit

Language: C, License: Apache-2.0, 37800

vosk-android-demo

Offline speech recognition for Android with Vosk library.

Language: Java, License: Apache-2.0, 74000

piper

A fast, local neural text to speech system

Language: C++, License: MIT, 599400

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python, License: Apache-2.0, 1213900