sleepwalkeryw's starred repositories

insightface

State-of-the-art 2D and 3D Face Analysis Project

Language:PythonStargazers:22211Issues:0Issues:0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language:PythonLicense:Apache-2.0Stargazers:2320Issues:0Issues:0

yolov10

YOLOv10: Real-Time End-to-End Object Detection

Language:PythonLicense:AGPL-3.0Stargazers:8629Issues:0Issues:0

Vitron

A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Language:PythonStargazers:266Issues:0Issues:0

swift

ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language:PythonLicense:Apache-2.0Stargazers:2672Issues:0Issues:0

Yi

A series of large language models trained from scratch by developers @01-ai

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:7527Issues:0Issues:0

NGD-SLAM

NGD-SLAM: Towards Real-Time SLAM for Dynamic Environments without GPU.

Language:C++License:GPL-3.0Stargazers:72Issues:0Issues:0

ml-4m

4M: Massively Multimodal Masked Modeling

Language:PythonLicense:Apache-2.0Stargazers:1464Issues:0Issues:0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language:PythonLicense:Apache-2.0Stargazers:1617Issues:0Issues:0

VisualGLM-6B

Chinese and English multimodal conversational language model | 多模态中英双语对话语言模型

Language:PythonLicense:Apache-2.0Stargazers:4048Issues:0Issues:0

ER-NeRF

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Language:PythonLicense:MITStargazers:929Issues:0Issues:0

ShareGPT4Video

An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Language:PythonStargazers:1185Issues:0Issues:0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language:PythonLicense:MITStargazers:1914Issues:0Issues:0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language:PythonLicense:Apache-2.0Stargazers:8146Issues:0Issues:0

GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型

Language:PythonLicense:Apache-2.0Stargazers:4016Issues:0Issues:0
Language:PythonLicense:MITStargazers:4298Issues:0Issues:0

Unique3D

Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Language:PythonLicense:MITStargazers:2632Issues:0Issues:0

ChatTTS

A generative speech model for daily dialogue.

Language:PythonLicense:AGPL-3.0Stargazers:28590Issues:0Issues:0

MuseTalk

MuseTalk: Real-Time High Quality Lip Synchorization with Latent Space Inpainting

Language:PythonLicense:NOASSERTIONStargazers:2073Issues:0Issues:0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language:PythonLicense:Apache-2.0Stargazers:10725Issues:0Issues:0

fastbm25

The fast python bm25 algorithm implemented with reverted index

Language:PythonLicense:Apache-2.0Stargazers:39Issues:0Issues:0

PaddleRec

Recommendation Algorithm大规模推荐算法库,包含推荐系统经典及最新算法LR、Wide&Deep、DSSM、TDM、MIND、Word2Vec、Bert4Rec、DeepWalk、SSR、AITM,DSIN,SIGN,IPREC、GRU4Rec、Youtube_dnn、NCF、GNN、FM、FFM、DeepFM、DCN、DIN、DIEN、DLRM、MMOE、PLE、ESMM、ESCMM, MAML、xDeepFM、DeepFEFM、NFM、AFM、RALM、DMR、GateNet、NAML、DIFM、Deep Crossing、PNN、BST、AutoInt、FGCNN、FLEN、Fibinet、ListWise、DeepRec、ENSFM,TiSAS,AutoFIS等,包含经典推荐系统数据集criteo 、movielens等

Language:PythonLicense:Apache-2.0Stargazers:4191Issues:0Issues:0

Fooocus

Focus on prompting and generating

Language:PythonLicense:GPL-3.0Stargazers:38989Issues:0Issues:0

RecAI

Bridging LLM and Recommender System.

Language:Jupyter NotebookLicense:MITStargazers:476Issues:0Issues:0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language:PythonLicense:NOASSERTIONStargazers:4473Issues:0Issues:0

label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

Language:JavaScriptLicense:Apache-2.0Stargazers:17707Issues:0Issues:0

joytag

The JoyTag Image Tagging Model

Language:PythonLicense:Apache-2.0Stargazers:377Issues:0Issues:0

segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:45958Issues:0Issues:0
Language:JavaScriptStargazers:376Issues:0Issues:0

cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Language:TypeScriptLicense:MITStargazers:11949Issues:0Issues:0