Haonan Zhang's repositories
Awesome-Embodied-Agent-with-LLMs
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates!
S2-Transformer
[IJCAI 2022] Official Pytorch code for paper “S2 Transformer for Image Captioning”
Multi-Modal-Large-Language-Learning
Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.
3D-Vision-and-Language
Collection of recent 3D Vision and Language research
annotated_deep_learning_paper_implementations
🧑🏫 59 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
Generalization-Causality
关于domain generalization,domain adaptation,causality,robutness,prompt,optimization,generative model各式各样研究的阅读笔记
LMaaS-Papers
Awesome papers on Language-Model-as-a-Service (LMaaS)
Vision-and-Language-Benchmark
Codebase for research of vision&language, including various multimodal task pipline (e.g., image captioning, VQA, video-text retrieval), customizable dataset (e.g., MS-COCO, ActivityNet, MSR-VTT), pre-trained model acquire (e.g., CLIP, BLIP-2)
zchoi.github.io
My personal homepage
distribuuuu
The pure and clear PyTorch Distributed Training Framework.