Huiimin5

HKUST

HuiminWu's starred repositories

gdGPT

Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.

Language: Python | Apache-2.0 | 8900

xlstm-resources

Resources about xLSTM by Sepp Hochreiter

SAMFLow

A repository of AAAI 2024 paper 'SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model'

Language: Python | 400

FlowDiffusion_pytorch

Unofficial PyTorch implementation of DDVM.

Language: Python | Apache-2.0 | 7300

opticalflow-autoflow

Language: Jupyter Notebook | Apache-2.0 | 11500

ezflow

A modular PyTorch library for optical flow estimation using neural networks

Language: Python | MIT | 13100

MIM-Depth-Estimation

This is an official implementation of our CVPR 2023 paper "Revealing the Dark Secrets of Masked Image Modeling" on Depth Estimation.

Language: Python | MIT | 15900

Awesome_Prompting_Papers_in_Computer_Vision

A curated list of prompt-based papers in computer vision and vision-language learning.

nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Language: Python | MIT | 855400

pytorchviz

A small package to create visualizations of PyTorch execution graphs

Language: Jupyter Notebook | MIT | 311900

Transformer_Relative_Position_PyTorch

Implement the paper "Self-Attention with Relative Position Representations"

Language: Python | 11800

video2dataset

Easily create large video datasets from video URLs

Language: Python | MIT | 51700

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | Apache-2.0 | 4588400

Painter

Painter & SegGPT Series: Vision Foundation Models from BAAI

Language: Python | MIT | 248100

DropMAE

Language: Python | 5800

gpt_academic

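Provides a practical interactive interface for large language models such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.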

Language: Python | GPL-3.0 | 6284400

MonoViT

Self-supervised monocular depth estimation with a vision transformer

Language: Python | MIT | 14900

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | Apache-2.0 | 116300

MFM

Code for the paper "Masked Frequency Modeling for Self-Supervised Visual Pre-Training" (https://arxiv.org/pdf/2206.07706.pdf)

Language: Python | MIT | 2300

openmixup

CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark

Language: Python | Apache-2.0 | 60000

ConMIM

Official codes for ConMIM (ICLR 2023)

Language: Python | NOASSERTION | 5600

mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Language: Python | MIT | 49600

Transformer-in-Vision

Recent Transformer-based CV and related works.

CAE

This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"

Language: Python | 18800

mmselfsup

OpenMMLab Self-Supervised Learning Toolbox and Benchmark

Language: Python | Apache-2.0 | 314000

random_quantize

A novel data augmentation method across data modalities

Language: Python | MIT | 7100

demystifyssl

Language: Python | 700

dino

PyTorch code for training Vision Transformers with the self-supervised learning method DINO

Language: Python | Apache-2.0 | 611600

mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Language: Python | NOASSERTION | 702700

mae_st

Official open-source code for "Masked Autoencoders As Spatiotemporal Learners"

Language: Python | NOASSERTION | 30400