Beast code in Giters

qiaolm's starred repositories

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language:PythonMIT391600

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language:PythonApache-2.02117100

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language:PythonApache-2.02876700

MobileVLM

Strong and Open Vision Language Assistant for Mobile Devices

Language:PythonApache-2.091900

litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Language:PythonApache-2.0931600

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language:PythonApache-2.0236200

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Language:PythonApache-2.0169200

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language:PythonApache-2.0747900

BeMapNet

Language:PythonNOASSERTION18300

segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language:Jupyter NotebookApache-2.04614200

DeFRCN

Language:PythonMIT18200

Models

采用MegEngine实现的各种主流深度学习模型

Language:PythonNOASSERTION30200

MegEngine

MegEngine 是一个快速、可拓展、易于使用且支持自动求导的深度学习框架

Language:C++Apache-2.0474600

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language:PythonApache-2.03106100

CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Language:Jupyter NotebookMIT2411900