zhoukang's starred repositories
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ai-comic-factory
Generate comic panels using an LLM + SDXL. Powered by Hugging Face 🤗
Story-to-comic-AI
Create any comic page using state-of-the-art text-to-image and large language models with your limitless imagination
PSL-InstanceNav
Official implementation of the ECCV 2024 paper "Prioritized Semantic Learning for Zero-shot Instance Navigation"
Pixel-Navigator
Official GitHub Repository for Paper "Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill", ICRA 2024
Layout-based-sTDE
Layout-based Causal Inference for Object Navigation (CVPR 2023)
3DAwareNav
[CVPR 2023] We propose a framework for the challenging 3D-aware ObjectNav task based on two straightforward sub-policies. The two sub-policies, a corner-guided exploration policy and a category-aware identification policy, run simultaneously using online fused 3D points as observations.
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
VisualGLM-6B
Chinese and English multimodal conversational language model
ClipCap-Chinese
An image captioning model based on ClipCap (Chinese)
MatterSim_BEVBert_Docker
A Docker image containing both MatterSim and BEVBert.
Matterport3DSimulator
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
Demand-driven-navigation
"Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation"
visualnav-transformer
Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
BEV-Scene-Graph
[ICCV23] Bird’s-Eye-View Scene Graph for Vision-Language Navigation
spatial_attention
Visual Navigation with Spatial Attention