lz's starred repositories

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34570 | Issues: 343 | Issues: 2700

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 14372 | Issues: 108 | Issues: 350

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 11412 | Issues: 98 | Issues: 484

bertviz

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Language: Python | License: Apache-2.0 | Stargazers: 6727 | Issues: 71 | Issues: 123

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Language: Python | License: MIT | Stargazers: 5341 | Issues: 49 | Issues: 502

Transformer-Explainability

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

Language: Jupyter Notebook | License: MIT | Stargazers: 1752 | Issues: 21 | Issues: 61

awesome-grounding

awesome grounding: A curated list of research papers in visual grounding

onnxmltools

ONNXMLTools enables conversion of models to ONNX

Language: Python | License: Apache-2.0 | Stargazers: 988 | Issues: 43 | Issues: 290

BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Language: Python | License: NOASSERTION | Stargazers: 826 | Issues: 14 | Issues: 27

Transformer-MM-Explainability

[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

Language: Jupyter Notebook | License: MIT | Stargazers: 767 | Issues: 8 | Issues: 35
Language: Python | License: NOASSERTION | Stargazers: 726 | Issues: 8 | Issues: 64

Convolutional-KANs

This project extends the Kolmogorov-Arnold Network (KAN) architecture to convolutional layers, replacing the classic linear transformation of the convolution with learnable nonlinear activations at each pixel.

Language: Jupyter Notebook | License: MIT | Stargazers: 689 | Issues: 13 | Issues: 12

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python | License: NOASSERTION | Stargazers: 584 | Issues: 23 | Issues: 31

DenseCL

Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021 Oral.

Language: Python | License: GPL-3.0 | Stargazers: 544 | Issues: 7 | Issues: 35

Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Language: Python | License: Apache-2.0 | Stargazers: 530 | Issues: 35 | Issues: 22

all-seeing

[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World

MultimodalOCR

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Language: Python | License: MIT | Stargazers: 417 | Issues: 13 | Issues: 28

rho

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

Visual-Instruction-Tuning

SVIT: Scaling up Visual Instruction Tuning

Language: Python | License: MIT | Stargazers: 159 | Issues: 5 | Issues: 15

Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

torchkan

An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations

Language: Python | License: MIT | Stargazers: 140 | Issues: 3 | Issues: 7

Awesome-Open-Vocabulary-Detection-and-Segmentation

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

VimTS

VimTS: A Unified Video and Image Text Spotter

Language: Python | License: GPL-3.0 | Stargazers: 70 | Issues: 2 | Issues: 5

KANs

🕹️The toy examples of Kolmogorov-Arnold Network (Get Started Quickly)

Language: Python | Stargazers: 68 | Issues: 2 | Issues: 0

multimodal_cognitive_ai

Research work on multimodal cognitive AI

Language: Python | License: Apache-2.0 | Stargazers: 42 | Issues: 4 | Issues: 5

onnx-donut

Export the Donut model to ONNX and run it with onnxruntime

Language: Python | License: Apache-2.0 | Stargazers: 23 | Issues: 1 | Issues: 1

EST-VQA

[CVPR2020] EST-VQA Dataset

Language: Python | Stargazers: 5 | Issues: 1 | Issues: 0