ShenXiaolei's starred repositories
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
zero123plus
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
Startup-CTO-Handbook
The Startup CTO's Handbook, a book covering leadership, management and technical topics for leaders of software engineering teams
AnimatedDrawings
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
recognize-anything
Open-source and strong foundation image recognition models.
lean-side-bussiness
Lean Side Business: how programmers can run a side business gracefully
clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations.
Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
emoji-semantic-search
Search the most relevant emojis given a natural language query
VisualGLM-6B
Chinese and English multimodal conversational language model
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
free-for-dev
A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev