Dennis Chew's repositories
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
CelebV-Text
(CVPR 2023) CelebV-Text: A Large-Scale Facial Text-Video Dataset
coco-viewer
Simple COCO Viewer in Tkinter
CSWinTT
Transformer Tracking with Cyclic Shifting Window Attention (CSWinTT)
dalai
The simplest way to run LLaMA on your local machine
datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
dlib
A toolkit for making real-world machine learning and data analysis applications in C++
GLIGEN
Open-Set Grounded Text-to-Image Generation
hands-segmentation-pytorch
A repo for training and fine-tuning models for hand segmentation.
mmdetection
OpenMMLab Detection Toolbox and Benchmark
monodepth
Unsupervised single-image depth prediction with CNNs
Multitarget-tracker
Multiple object tracker based on the Hungarian algorithm + Kalman filter.
ov-seg
This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.
PaddleOCR
Awesome multilingual OCR toolkit based on PaddlePaddle (a practical, ultra-lightweight OCR system supporting recognition of 80+ languages, with data annotation and synthesis tools, and training and deployment on server, mobile, embedded, and IoT devices)
segmentation_models.pytorch
PyTorch segmentation models with pretrained backbones.
SOTS
Single object tracking and segmentation.
Stark
[ICCV'21] Learning Spatio-Temporal Transformer for Visual Tracking
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
viper
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
visualDet3D
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training
Yolo-to-COCO-format-converter
Yolo to COCO annotation format converter
yolo_deepstream
YOLO model QAT and deployment with DeepStream & TensorRT
YOLOP
You Only Look Once for Panoptic Driving Perception. (https://arxiv.org/abs/2108.11250)
yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information