Beast code in Giters

Mustard Bean's repositories

QAnything

Question and Answer based on Anything.

Language: Python | License: Apache-2.0

ADer

ADer is an open source visual anomaly detection toolbox based on PyTorch, which supports multiple popular AD datasets and approaches.

Language: Python

Chat-UniVi

[CVPR 2024🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

License: Apache-2.0

DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

License: MIT

face_recognition

The world's simplest facial recognition API for Python and the command line

License: MIT

HuggingFists

A low-code data flow tool for conveniently using LLMs and HuggingFace models; some of its features can be considered a low-code version of LangChain.

insightface

State-of-the-art 2D and 3D Face Analysis Project

instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

License: Apache-2.0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source model with performance approaching GPT-4V.

License: MIT

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

License: Apache-2.0

MiniCPM-V

MiniCPM-V 2.0: An Efficient End-side MLLM with Strong OCR and Understanding Capabilities

License: Apache-2.0

MiniGPT4Qwen

Personal Project: MPP-Qwen14B (Multimodal Pipeline Parallel-Qwen14B). Don't let poverty limit your imagination! Train your own 14B LLaVA-like MLLM on an RTX 3090/4090 with 24 GB.

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

License: Apache-2.0

OneLLM

OneLLM: One Framework to Align All Modalities with Language

License: NOASSERTION

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning models, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.

License: Apache-2.0

prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

License: MIT

RWKV-Infer

A large-scale RWKV v6 inference wrapper using the CUDA backend. Easy to deploy on Docker. Supports multi-batch generation and dynamic State switching. Let's spread RWKV, which combines RNN technology with impressively low inference costs!

License: Apache-2.0

Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any object in videos, either automatically or interactively. The primary algorithms used are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.

License: AGPL-3.0

quduoduo

QAnything

ADer

quduoduo.github.io

Chat-UniVi

chatgpt_system_prompt

DataDreamer

face_recognition

GPT4V-Image-Captioner

GPTs

HuggingFists

IG-VLM

insightface

instructor-embedding

InternVL

LISA

MiniCPM-V

MiniGPT4Qwen

mlc-llm

multihieve

OneLLM

PaddleSpeech

prismatic-vlms

RWKV-Infer

Segment-and-Track-Anything

TikTokDownload

transformers

Valley

VideoRecap

vision_transformer

Youku-mPLUG