tuofeilun (tuofeilunhifi)


Company: Li Auto

Location: Hangzhou, China


tuofeilun's starred repositories

ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.
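A minimal sketch of calling a locally running Ollama server from Python over its HTTP API. The /api/generate route and "response" field follow Ollama's documented REST interface; the model name and prompt here are placeholder assumptions.

```python
# Sketch: query a local Ollama server (assumes `ollama serve` is running and a
# model such as "llama3" has already been pulled with `ollama pull llama3`).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",   # Ollama's default local endpoint
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])               # generated completion text
```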

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 13270 | Issues: 115 | Issues: 210
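To make the KAN idea concrete, here is a small sketch using pykan on a toy regression target. The KAN(width=[2, 5, 1], grid=5, k=3) constructor follows the project's README; the plain torch training loop is a generic stand-in, not pykan's built-in trainer.

```python
# Sketch (assumes `pip install pykan`): fit a small KAN to y = sin(pi*x1) + x2^2.
import torch
from kan import KAN

x = torch.rand(256, 2) * 2 - 1
y = torch.sin(torch.pi * x[:, [0]]) + x[:, [1]] ** 2

model = KAN(width=[2, 5, 1], grid=5, k=3)    # 2 inputs -> 5 hidden units -> 1 output
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):
    opt.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)   # simple MSE regression objective
    loss.backward()
    opt.step()
```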

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 7257 | Issues: 74 | Issues: 227

llama-cpp-python

Python bindings for llama.cpp

Language: Python | License: MIT | Stargazers: 6977 | Issues: 67 | Issues: 944
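A hedged usage sketch of llama-cpp-python's Llama class; the model path is a placeholder, and the keyword arguments shown are the commonly documented ones.

```python
# Sketch: run a local GGUF model through llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(model_path="./models/your-model.gguf", n_ctx=2048)  # placeholder path
out = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=64,
    stop=["Q:"],           # stop before the model starts a new question
    echo=False,
)
print(out["choices"][0]["text"])   # generated continuation
```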

IC-Light

More relighting!

Language: Python | License: Apache-2.0 | Stargazers: 3780 | Issues: 40 | Issues: 58

efficient-kan

An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

Language: Python | License: MIT | Stargazers: 3344 | Issues: 28 | Issues: 33

SupContrast

PyTorch implementation of "Supervised Contrastive Learning" (and, incidentally, SimCLR)

Language: Python | License: BSD-2-Clause | Stargazers: 2954 | Issues: 18 | Issues: 132
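To make the objective concrete, below is a minimal PyTorch sketch of the SupCon loss with a single view per sample. It is not the SupContrast repository's implementation (which handles multiple views and other details), just the core log-probability-over-positives computation.

```python
# Minimal sketch of the supervised contrastive (SupCon) objective, one view per sample.
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.07):
    z = F.normalize(features, dim=1)                 # unit-norm embeddings
    logits = z @ z.t() / temperature                 # pairwise similarities as logits
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(self_mask, float("-inf"))  # never contrast with self

    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(1).clamp(min=1)
    # Average log-probability over each anchor's positives; anchors without positives are skipped.
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_counts
    return per_anchor[pos_mask.any(1)].mean()
```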

ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Language: Python | License: MIT | Stargazers: 2607 | Issues: 42 | Issues: 251
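The core of ColBERT is late-interaction ("MaxSim") scoring: each query token embedding is matched to its best document token embedding and the maxima are summed. A tiny sketch of that scoring rule, not the repository's indexing or training code:

```python
# Sketch of ColBERT-style late-interaction (MaxSim) scoring for one query/document pair;
# assumes token embeddings are already computed and L2-normalized.
import torch

def maxsim_score(Q, D):
    """Q: (num_query_tokens, dim), D: (num_doc_tokens, dim)."""
    sim = Q @ D.t()                        # cosine similarity for every token pair
    return sim.max(dim=1).values.sum()     # best doc token per query token, summed
```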

PyContrast

PyTorch implementation of Contrastive Learning methods

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1806 | Issues: 17 | Issues: 41

CogVLM2

GPT-4V-level open-source multimodal model based on Llama3-8B

Language: Python | License: Apache-2.0 | Stargazers: 1333 | Issues: 23 | Issues: 90

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1065 | Issues: 27 | Issues: 81

VILA

VILA: a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 824 | Issues: 19 | Issues: 62

pytorch-randaugment

Unofficial PyTorch reimplementation of RandAugment.

Language: Python | License: MIT | Stargazers: 621 | Issues: 15 | Issues: 30
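For context, RandAugment samples a fixed number of augmentation ops per image at a shared magnitude. The sketch below uses torchvision's built-in transforms.RandAugment (available since torchvision 0.11) rather than this repository's own class; the num_ops/magnitude values are illustrative.

```python
# Sketch: a RandAugment training pipeline via torchvision's built-in implementation.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandAugment(num_ops=2, magnitude=9),  # N=2 ops, shared magnitude M=9
    transforms.ToTensor(),
])
```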

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 557 | Issues: 8 | Issues: 72

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | Stargazers: 519 | Issues: 10 | Issues: 23

Groma

Grounded Multimodal Large Language Model with Localized Visual Tokenization

Language: Python | License: Apache-2.0 | Stargazers: 455 | Issues: 36 | Issues: 14

TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

Language: Python | License: Apache-2.0 | Stargazers: 420 | Issues: 11 | Issues: 75

MultimodalOCR

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Language: Python | License: MIT | Stargazers: 273 | Issues: 10 | Issues: 31

scaling_on_scales

When do we not need larger vision models?

Language: Python | License: MIT | Stargazers: 243 | Issues: 4 | Issues: 11

ScreenAI

Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"

Language: Python | License: MIT | Stargazers: 220 | Issues: 8 | Issues: 3

OmniFusion

OmniFusion: a multimodal model for communicating with text and images

Language: Python | License: Apache-2.0 | Stargazers: 211 | Issues: 5 | Issues: 3

Retrieval-Augmented-Visual-Question-Answering

This is the official repository for Retrieval Augmented Visual Question Answering

Language: Python | License: GPL-3.0 | Stargazers: 117 | Issues: 4 | Issues: 38

MMBench

Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"

MoVA

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Awesome-Vision-Mamba

✨✨Latest Papers on Vision Mamba and Related Areas

fld

PyTorch code for FLD (Feature Likelihood Divergence), FID, KID, Precision, Recall, etc. using DINOv2, InceptionV3, CLIP, etc.

CLIP-KD

[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation