Beast code in Giters

jzyztzn's starred repositories

StableDiffusionOnDevice

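This project generates images from text. Based on the open-source Stable Diffusion V1.5 model, it produces models that can run on a phone's CPU and NPU, together with the accompanying model runtime framework.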

Language: C++ | License: MIT | 68 | 0 | 0

UniVL

An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"

Language: Python | License: MIT | 335 | 0 | 0

faiss

A library for efficient similarity search and clustering of dense vectors.

Language: C++ | License: MIT | 29710 | 0 | 0

CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Language: Python | License: MIT | 823 | 0 | 0

re2

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

Language: C++ | License: BSD-3-Clause | 8808 | 0 | 0

CLIP_benchmark

CLIP-like model evaluation

Language: Jupyter Notebook | License: MIT | 547 | 0 | 0

EVA

EVA Series: Visual Representation Fantasies from BAAI

Language: Python | License: MIT | 2138 | 0 | 0

clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Language: Python | License: NOASSERTION | 12311 | 0 | 0

Text2Image-Retrieval

Computer vision course project: an image-text retrieval system based on Chinese-CLIP

Language: Python | 31 | 0 | 0

datacomp

DataComp: In search of the next generation of multimodal datasets

Language: Python | License: NOASSERTION | 622 | 0 | 0

LocalLM

Android app for running transformers locally using LLama.cpp & Whisper.cpp

Language: Kotlin | License: GPL-3.0 | 13 | 0 | 0

clip-image-search

A simple image search engine using CLIP features.

Language: Python | License: MIT | 47 | 0 | 0

CLIP-ImageSearch-NCNN

CLIP⚡NCNN⚡Natural-language image search (Image Search)⚡Search images by text⚡x86⚡Android

Language: C++ | 200 | 0 | 0

clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them

Language: Jupyter Notebook | License: MIT | 2288 | 0 | 0

maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Language: Dart | License: MIT | 1080 | 0 | 0

CLIP-Chinese

Chinese CLIP pre-trained model

Language: Python | 371 | 0 | 0

ollama-app

A modern and easy-to-use client for Ollama

Language: Dart | License: Apache-2.0 | 301 | 0 | 0

OllamaDroid

An Ollama client for Android!

Language: Java | 69 | 0 | 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | 27869 | 0 | 0

DiffSynth-Studio

Enjoy the magic of Diffusion models!

Language: Python | License: Apache-2.0 | 6011 | 0 | 0

mnn-segment-anything

segment-anything based on MNN

Language: C++ | 31 | 0 | 0

mobileSAM-Android-MNN

Language: C++ | 21 | 0 | 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

Language: Svelte | License: MIT | 33144 | 0 | 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | 3109 | 0 | 0

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Language: Python | 429 | 0 | 0

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | 2899 | 0 | 0

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | 1156 | 0 | 0

InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Language: Python | License: MIT | 2427 | 0 | 0

big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Language: Jupyter Notebook | License: Apache-2.0 | 2069 | 0 | 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: MIT | 11022 | 0 | 0