pean1128's repositories
awesome-3d-reconstruction-papers
A collection of 3D reconstruction papers in the deep learning era.
Awesome-MVS
Awesome list of multi-view stereo papers
canvas-vae
Implementation of CanvasVAE: Learning to Generate Vector Graphic Documents, ICCV 2021
ChatPaper
Use ChatGPT to summarize the arXiv papers.
Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Cream
This is a collection of our NAS and Vision Transformer work.
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
FreeReg
[Arxiv 2023] FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators
GLIGEN
Open-Set Grounded Text-to-Image Generation
gptrpg
A demo of a GPT-based agent existing in an RPG-like environment
GroundingDINO
The official implementation of "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
kapture
kapture is a file format as well as a set of tools for manipulating datasets, and in particular Visual Localization and Structure from Motion data.
nerf-learn
Notes documenting the process of learning various NeRF algorithms, applications, software, and more
psd.js
A Photoshop PSD file parser for NodeJS and browsers
PSD2UGUI_X
Convert PSD files to UGUI prefabs: text, image, raw image, button, slider, scroll view, dropdown, toggle, TextMeshPro...
RGC
[ACM MM 2023] Official source code for the paper "Reinforcement Graph Clustering with Unknown Cluster Number".
rico_semantics
Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations between selected general UI elements and their text labels. Annotations also include human-annotated bounding boxes, which are more accurate and have greater coverage of UI elements.
screen_qa
The ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K question-answer pairs collected by human annotators for ~35K screenshots from Rico. It is intended for training and evaluating models capable of screen content understanding via question answering.
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
SimCLR
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
StreamRF
Official implementation of our NeurIPS paper "Streaming Radiance Fields for 3D Video Synthesis"
SuperGlobal
Official repository for the ICCV 2023 paper "Global Features are All You Need for Image Retrieval and Reranking"
unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
visual-chatgpt
VisualChatGPT