hzhang57

hzhang57's repositories

hzhang57.github.io

Language: HTML

2prime.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

MIT

multimodal-maestro

Effective prompting for Large Multimodal Models like GPT-4 Vision or LLaVA. 🔥

MIT

Paper-Implementation-Template

A simple reproducible template to implement AI research papers

MIT

chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

lightning-sam

Fine-tune Segment-Anything Model with Lightning Fabric.

Apache-2.0

GSS

[CVPR 2023] Official repository of Generative Semantic Segmentation

Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: MSC, CeCo (CVPR 2023)

MIT

openai-cookbook

Examples and guides for using the OpenAI API

X-Decoder

Official Implementation of X-Decoder for generalized decoding for pixel, image and language

MIT

LaViLa

Code release for "Learning Video Representations from Large Language Models"

MIT

mega

Sequence modeling with Mega.

NOASSERTION

GLIP

Grounded Language-Image Pre-training

MIT

coyo-dataset

COYO-700M: Large-scale Image-Text Pair Dataset

awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

pytorch_scatter

PyTorch Extension Library of Optimized Scatter Operations

MIT

SimCLR

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. Chen et al.

MIT

METER

METER: A Multimodal End-to-end TransformER Framework

MIT

CogVideo

Text-to-video generation.

Awesome-CLIP

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

CLIP

Contrastive Language-Image Pretraining

MIT

Neighborhood-Attention-Transformer

[Preprint] Neighborhood Attention Transformer, 2022

MIT

VideoMAE

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

NOASSERTION

HowToLiveLonger

A programmer's guide to living longer

Unlicense

vidt

Apache-2.0

qna

[CVPR2022 - Oral] Official Jax Implementation of Learned Queries for Efficient Local Attention

MIT

behave-dataset

Code to access the BEHAVE dataset

Group-Contextualization

[CVPR22] Group Contextualization for Video Recognition

Apache-2.0

Mask2Former

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

NOASSERTION

HowToCook

A guide for programmers on cooking at home.

Unlicense