Yicong's starred repositories
ml-visuals
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
awesome-3D-gaussian-splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
Awesome-AIGC-3D
A curated list of awesome AIGC 3D papers
awesome-3d-diffusion
A collection of papers on diffusion models for 3D generation.
Awesome-Multimodal-Large-Language-Models
✨✨ Latest papers and datasets on Multimodal Large Language Models, and their evaluation.
awesome-vlm-architectures
Famous Vision Language Models and Their Architectures
awesome-3D-generation
A curated list of awesome 3D generation papers
Awesome-LLMs-for-Video-Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
Awesome-LLM
Awesome-LLM: a curated list of Large Language Model resources
Awesome-LLM-3D
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
PointMetaBase
This is a PyTorch implementation of PointMetaBase, proposed in our paper "Meta Architecture for Point Cloud Analysis".
Point-Transformers
Point Transformers
Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
MultiModal_BigModels_Survey
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
ChatReviewer
ChatReviewer: uses ChatGPT to analyze a paper's strengths and weaknesses and suggest improvements
uvadlc_notebooks
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
google-research
Google Research
InternVideo
Video Foundation Models & Data for Multimodal Understanding
In-the-wild-QA
In-the-wild Question Answering
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
paper-reading
Paragraph-by-paragraph close readings of classic and recent deep learning papers