Haoran Duan's repositories
Awesome-Human-Activity-Recognition
An up-to-date, curated list of awesome IMU-based Human Activity Recognition (Ubiquitous Computing) papers, methods & resources. Please note that most of the collected research is based on IMU data.
Awesome-Embodied-AI
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Awesome-Text-to-Video-Generation
A list of Text-to-Video and Image-to-Video works
3DTopia
Text-to-3D Generation within 5 Minutes
all-seeing
[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"
Awesome-CVPR2024-Low-Level-Vision
A collection of papers and code in CVPR2023/2022 about low-level vision
awesome-described-object-detection
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently; pull requests welcome.
Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Awesome-Generative-Image-Composition
A curated list of papers, code, and resources pertaining to generative image composition.
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
EasyVolcap
[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research
FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
gpt-researcher
GPT-based autonomous agent that does comprehensive online research on any given topic
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
LMDrive
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
MonoGS
[CVPR'24] Gaussian Splatting SLAM
Mora
Mora: More like Sora for Generalist Video Generation
Neural-Network-Diffusion
We introduce a novel approach for parameter generation, named neural network diffusion (p-diff, where p stands for parameter), which employs a standard latent diffusion model to synthesize a new set of parameters.
pykan
Kolmogorov Arnold Networks
self-rag
This includes the original implementation of "SELF-RAG: Learning to Retrieve, Generate and Critique through Self-Reflection" by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
V3D
V3D: Video Diffusion Models are Effective 3D Generators
ViDAR
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
VMamba
VMamba: Visual State Space Models; code is based on Mamba
World-Models-Autonomous-Driving-Latest-Survey
A curated list of world models for autonomous driving. Kept up to date.
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection