Beast code in Giters

hekaijie123's starred repositories

MPP-LLaVA

Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on an RTX 3090/4090 with 24 GB.

Language: Jupyter Notebook | 37400

multimodal-prompt-learning

[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".

Language: Python | MIT | 65700

mlx-examples

Examples in the MLX framework

Language: Python | MIT | 612200

Cream

This is a collection of our NAS and Vision Transformer work.

Language: Python | MIT | 167500

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

1242800

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | BSD-3-Clause | 984800

sd-webui-controlnet

WebUI extension for ControlNet

Language: Python | GPL-3.0 | 1700900

paper-reading

Paragraph-by-paragraph close readings of classic and new deep learning papers

Apache-2.0 | 2691300

zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Language: Jupyter Notebook | MIT | 293400

SiamTrackers

(2020-2022) The PyTorch version of SiamFC, SiamRPN, DaSiamRPN, UpdateNet, SiamDW, SiamRPN++, SiamMask, SiamFC++, SiamCAR, SiamBAN, Ocean, LightTrack, TrTr, NanoTrack; visual object tracking based on deep learning

Language: Python | Apache-2.0 | 132900

gpt-3

GPT-3: Language Models are Few-Shot Learners

1568500

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | MIT | 1062500

slambook2

Edition 2 of the slambook

Language: C++ | MIT | 551100

slambook

Language: C++ | MIT | 691700

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | AGPL-3.0 | 14207300

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | Apache-2.0 | 3528300

gpt-neo

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Language: Python | MIT | 823200

OpenSeeD

[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"

Language: Python | Apache-2.0 | 65100

mfpsg

mask2former psg

Language: Python | Apache-2.0 | 2200

VITA

VITA: Video Instance Segmentation via Object Token Association (NeurIPS 2022)

Language: Python | Apache-2.0 | 10200

Awesome-Transformer-Attention

A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites

462200

SOTDrawRect

Draw the bounding boxes from a tracker's result file (txt format) onto the image and save it. The author also fixes some problems in pysot-toolkit from time to time.

Language: Python | 7900

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | Apache-2.0 | 3207700
