NNNNAI

Naiyuan Liu's starred repositories

MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Language: Python | License: MIT | 44761 895 668

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | 35300 343 2795

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | 34997 210 1292

LLM101n

LLM101n: Let's build a Storyteller

29583 22860

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 26933 222 260

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | 18665 119 489

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: MIT | 11507 154 344

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

License: MIT | 9469 287 45

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | 6247 44 80

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | 5213 62 390

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | 4607 39 450

torchtune

PyTorch native finetuning library

Language: Python | License: BSD-3-Clause | 4244 47 680

Person_reID_baseline_pytorch

Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial: https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial

Language: Python | License: MIT | 4119 77 382

T2I-Adapter

Language: Python | License: Apache-2.0 | 3462 39 113

speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Language: Python | License: Apache-2.0 | 3452 45 84

BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Language: Python | License: Apache-2.0 | 3329 70 267

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | License: Apache-2.0 | 2837 27 157

EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Language: Python | License: Apache-2.0 | 2822 41 169

DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Language: Python | License: Apache-2.0 | 2244 31 264

DWPose

"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)

Language: Python | License: Apache-2.0 | 2229 29 95

Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Language: Python | 1803 54 132

occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"

Language: Python | License: MIT | 1523 32 130

Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Language: Jupyter Notebook | License: Apache-2.0 | 1008 8 45

BaiduImageSpider

A super-lightweight Baidu image crawler

Language: Python | License: MIT | 874 24 28

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Language: Python | 600 13 37

PicImageSearch

Aggregator for Reverse Image Search API

Language: Python | License: MIT | 437 8 43

clip_dinoiser

Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.

Language: Jupyter Notebook | License: Apache-2.0 | 204 10 12

COMM

PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"

License: MIT | 186 20 5

TextGenerator

Generator for OCR, text-detection, and font-classification datasets

Language: Python | License: MIT | 137 7 20

ocr_synth_text_chinese

Generates training datasets for text detection

Language: Python | License: MIT | 9 10