mrswang1

mrswang1

Geek Repo

Github PK Tool:Github PK Tool

mrswang1's starred repositories

llama-agentic-system

Agentic components of the Llama Stack APIs

Language:PythonLicense:NOASSERTIONStargazers:2604Issues:0Issues:0

gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Language:PythonLicense:Apache-2.0Stargazers:31278Issues:0Issues:0

LLaVA-Plus-Codebase

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Language:PythonLicense:Apache-2.0Stargazers:674Issues:0Issues:0

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language:PythonStargazers:768Issues:0Issues:0

llama3

The official Meta Llama 3 GitHub site

Language:PythonLicense:NOASSERTIONStargazers:24973Issues:0Issues:0

llama

Inference code for Llama models

Language:PythonLicense:NOASSERTIONStargazers:54766Issues:0Issues:0

Emu

Emu Series: Generative Multimodal Models from BAAI

Language:PythonLicense:Apache-2.0Stargazers:1577Issues:0Issues:0

unicom

[ICLR 2023] Unicom: Universal and Compact Representation Learning for Image Retrieval

Language:PythonStargazers:209Issues:0Issues:0

LeetCode-Go

✅ Solutions to LeetCode by Go, 100% test coverage, runtime beats 100% / LeetCode 题解

Language:GoLicense:MITStargazers:32506Issues:0Issues:0

mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Language:PythonLicense:MITStargazers:276Issues:0Issues:0

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language:PythonStargazers:22193Issues:0Issues:0

Efficient-Multimodal-LLMs-Survey

Efficient Multimodal Large Language Models: A Survey

License:Apache-2.0Stargazers:195Issues:0Issues:0

Awesome_Multimodel_LLM

Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancement.

Stargazers:221Issues:0Issues:0

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models,高质量中文预训练模型&大模型&多模态模型&大语言模型集合

Language:PythonLicense:MITStargazers:4599Issues:0Issues:0

VLM_survey

Collection of AWESOME vision-language models for vision tasks

Stargazers:2053Issues:0Issues:0

CM3Leon

An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal AI that uses just a decoder to generate both text and images

Language:PythonLicense:MITStargazers:350Issues:0Issues:0

Subspace-Tuning

A generalized framework for subspace tuning methods in parameter efficient fine-tuning.

Language:PythonLicense:Apache-2.0Stargazers:43Issues:0Issues:0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language:PythonLicense:NOASSERTIONStargazers:4464Issues:0Issues:0

SEED-X

Multimodal Models in Real World

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:345Issues:0Issues:0

VLE

VLE: Vision-Language Encoder (VLE: 视觉-语言多模态预训练模型)

Language:PythonLicense:Apache-2.0Stargazers:176Issues:0Issues:0

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language:PythonLicense:NOASSERTIONStargazers:1166Issues:0Issues:0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language:PythonLicense:Apache-2.0Stargazers:18437Issues:0Issues:0
Language:PythonLicense:NOASSERTIONStargazers:350Issues:0Issues:0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language:PythonLicense:NOASSERTIONStargazers:1610Issues:0Issues:0

LAION-Face

The human face subset of LAION-400M for large-scale face pretraining.

Language:PythonStargazers:265Issues:0Issues:0

facer

Face analysis tools for modern research, equipped with state-of-the-art Face Parsing and Face Alignment

Language:PythonLicense:MITStargazers:318Issues:0Issues:0

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language:PythonStargazers:557Issues:0Issues:0

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language:PythonLicense:Apache-2.0Stargazers:348Issues:0Issues:0

stable-diffusion

A latent text-to-image diffusion model

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:66917Issues:0Issues:0

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language:PythonLicense:MITStargazers:37858Issues:0Issues:0