Haonan Zhang (zchoi)

Company: UESTC | TongYi Laboratory

Location: Chengdu, Sichuan

Home Page: https://zchoi.github.io/

Haonan Zhang's repositories

Awesome-Embodied-Agent-with-LLMs

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates!

S2-Transformer

[IJCAI 2022] Official PyTorch code for the paper “S2 Transformer for Image Captioning”

Language: Python | License: MIT | Stargazers: 77 | Issues: 2 | Issues: 12

PKOL

[TIP 2022] Official code for the paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”

Language: Python | License: MIT | Stargazers: 44 | Issues: 2 | Issues: 1

Multi-Modal-Large-Language-Learning

A curated list of multi-modal large language model papers and projects, plus a collection of popular training strategies (e.g., PEFT, LoRA).

Stargazers: 11 | Issues: 0 | Issues: 0

SPT

[TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".

Language: Python | License: Apache-2.0 | Stargazers: 9 | Issues: 0 | Issues: 0

3D-Vision-and-Language

Collection of recent 3D Vision and Language research

Stargazers: 7 | Issues: 0 | Issues: 0

SNLC

[PR23] The implementation of the paper "Learning Visual Question Answering on Controlled Semantic Noisy Labels"

Language: Python | Stargazers: 7 | Issues: 0 | Issues: 0

DAST

[MM23] Code for the paper "Depth-Aware Sparse Transformer for Video-Language Learning"

Language: Python | Stargazers: 6 | Issues: 1 | Issues: 0

GLSCL

Code for "Text-Video Retrieval with Global-Local Semantic Consistent Learning"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6 | Issues: 0 | Issues: 0
Stargazers: 6 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 5 | Issues: 1 | Issues: 1

AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 59 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0
Language: TypeScript | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0
Language: TypeScript | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

EMCL

[NeurIPS 2022] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

Generalization-Causality

Reading notes on a wide range of research on domain generalization, domain adaptation, causality, robustness, prompts, optimization, and generative models

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

LMaaS-Papers

Awesome papers on Language-Model-as-a-Service (LMaaS)

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

McQuic

Repository of CVPR'22 paper "Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression"

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0
Stargazers: 2 | Issues: 0 | Issues: 0

rich

Rich is a Python library for rich text and beautiful formatting in the terminal.

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

sam

SAM: Sharpness-Aware Minimization (PyTorch)

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

Vision-and-Language-Benchmark

Codebase for vision-and-language research, including pipelines for various multimodal tasks (e.g., image captioning, VQA, video-text retrieval), customizable datasets (e.g., MS-COCO, ActivityNet, MSR-VTT), and pre-trained model acquisition (e.g., CLIP, BLIP-2)

Stargazers: 2 | Issues: 0 | Issues: 0

zchoi.github.io

My personal homepage

Language: SCSS | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

distribuuuu

The pure and clear PyTorch Distributed Training Framework.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0