Mustard Bean's repositories
ADer
ADer is an open source visual anomaly detection toolbox based on PyTorch, which supports multiple popular AD datasets and approaches.
Chat-UniVi
[CVPR 2024🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
chatgpt_system_prompt
Stores system prompts for all agents
DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
face_recognition
The world's simplest facial recognition API for Python and the command line
GPTs
Leaked prompts of GPTs
HuggingFists
A low-code data flow tool that allows for convenient use of LLM and HuggingFace models, with some features considered as a low-code version of Langchain.
insightface
State-of-the-art 2D and 3D Face Analysis Project
instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source model approaching GPT-4V performance.
LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
MiniCPM-V
MiniCPM-V 2.0: An Efficient End-side MLLM with Strong OCR and Understanding Capabilities
MiniGPT4Qwen
Personal Project: MPP-Qwen14B (Multimodal Pipeline Parallel-Qwen14B). Don't let poverty limit your imagination! Train your own 14B LLaVA-like MLLM on RTX3090/4090 24GB.
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
OneLLM
OneLLM: One Framework to Align All Modalities with Language
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
prismatic-vlms
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
RWKV-Infer
A large-scale RWKV v6 inference wrapper using the CUDA backend. Easy to deploy on Docker. Supports multi-batch generation and dynamic State switching. Let's spread RWKV, which combines RNN technology with impressively low inference costs!
Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
TikTokDownload
Batch download of watermark-free Douyin content: a user's posted works, likes, favorites, image posts, and audio
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Valley
The official repository of "Video assistant towards large language model makes everything easy"
Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks