DrakeYang1's starred repositories

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook | License: BSD-2-Clause | Stargazers: 3373 | Issues: 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stargazers: 1576 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 6133 | Issues: 0

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

Stargazers: 197 | Issues: 0

Deep-Live-Cam

Real-time face swap and one-click video deepfake with only a single image

Language: Python | License: AGPL-3.0 | Stargazers: 37209 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 5726 | Issues: 0

Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Language: Python | License: Apache-2.0 | Stargazers: 328 | Issues: 0

Language: Python | Stargazers: 37 | Issues: 0

ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Language: Python | License: Apache-2.0 | Stargazers: 6541 | Issues: 0

Qwen2-VL-Finetune

An open-source implementation for fine-tuning the Qwen2-VL series by Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 57 | Issues: 0

Language: Python | License: MIT | Stargazers: 109 | Issues: 0

SillyTavern

LLM Frontend for Power Users.

Language: JavaScript | License: AGPL-3.0 | Stargazers: 7653 | Issues: 0

KoboldAI-Client

For GGUF support, see KoboldCPP: https://github.com/LostRuins/koboldcpp

Language: Python | License: AGPL-3.0 | Stargazers: 3477 | Issues: 0

GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Language: Python | Stargazers: 4599 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Language: Python | License: MIT | Stargazers: 5623 | Issues: 0

Efficient-Live-Portrait

Fast running Live Portrait with TensorRT and ONNX models

Language: Python | License: MIT | Stargazers: 125 | Issues: 0

Q-Align

③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.

Language: Python | License: NOASSERTION | Stargazers: 249 | Issues: 0

Language: C++ | License: MIT | Stargazers: 5661 | Issues: 0

twitter

AI Agent for Twitter Personality Analysis

Language: TypeScript | Stargazers: 1213 | Issues: 0

MeshAnythingV2

From anything to mesh like human artists. Official impl. of "MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization"

Language: Python | License: NOASSERTION | Stargazers: 554 | Issues: 0

torchchat

Run PyTorch LLMs locally on servers, desktop and mobile

Language: Python | License: BSD-3-Clause | Stargazers: 3197 | Issues: 0

Stable-Hair

Stable-Hair: Real-World Hair Transfer via Diffusion Model

License: Apache-2.0 | Stargazers: 344 | Issues: 0

Husky-v1

Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and knowledge-based reasoning tasks.

Language: Python | Stargazers: 315 | Issues: 0

exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Language: Python | License: GPL-3.0 | Stargazers: 7314 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 27546 | Issues: 0

album-ai

AI-First Album: Chat with your gallery using plain language! LLM Vision + RAG + Album/Gallery.

Language: TypeScript | License: Apache-2.0 | Stargazers: 770 | Issues: 0

InternLM

Official release of InternLM2.5 base and chat models. 1M context support

Language: Python | License: Apache-2.0 | Stargazers: 6281 | Issues: 0

OpenGlass

Turn any glasses into AI-powered smart glasses

Language: C | License: MIT | Stargazers: 3276 | Issues: 0

LivePortrait

Bring portraits to life!

Language: Python | License: NOASSERTION | Stargazers: 12011 | Issues: 0

mem0

The Memory layer for your AI apps

Language: Python | License: Apache-2.0 | Stargazers: 22015 | Issues: 0