Naiyuan Liu (NNNNAI)

Company: University of Technology Sydney

Naiyuan Liu's starred repositories

MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Language: Python | License: MIT | Stargazers: 44761 | Issues: 895 | Issues: 668

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 35300 | Issues: 343 | Issues: 2795

GPT-SoVITS

One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT | Stargazers: 34997 | Issues: 210 | Issues: 1292

LLM101n

LLM101n: Let's build a Storyteller

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26933 | Issues: 222 | Issues: 260

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | Stargazers: 18665 | Issues: 119 | Issues: 489

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: MIT | Stargazers: 11507 | Issues: 154 | Issues: 344

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 6247 | Issues: 44 | Issues: 80

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 5213 | Issues: 62 | Issues: 390

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | Stargazers: 4607 | Issues: 39 | Issues: 450

torchtune

PyTorch-native fine-tuning library

Language: Python | License: BSD-3-Clause | Stargazers: 4244 | Issues: 47 | Issues: 680

Person_reID_baseline_pytorch

PyTorch ReID: a tiny, friendly, strong PyTorch implementation of a person re-ID / vehicle re-ID baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial

Language: Python | License: MIT | Stargazers: 4119 | Issues: 77 | Issues: 382

T2I-Adapter

T2I-Adapter

Language: Python | License: Apache-2.0 | Stargazers: 3462 | Issues: 39 | Issues: 113

speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Language: Python | License: Apache-2.0 | Stargazers: 3452 | Issues: 45 | Issues: 84

BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Language: Python | License: Apache-2.0 | Stargazers: 3329 | Issues: 70 | Issues: 267

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2837 | Issues: 27 | Issues: 157

EchoMimic

Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Language: Python | License: Apache-2.0 | Stargazers: 2822 | Issues: 41 | Issues: 169

DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Language: Python | License: Apache-2.0 | Stargazers: 2244 | Issues: 31 | Issues: 264

DWPose

"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)

Language: Python | License: Apache-2.0 | Stargazers: 2229 | Issues: 29 | Issues: 95

Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"

Language: Python | License: MIT | Stargazers: 1523 | Issues: 32 | Issues: 130

Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1008 | Issues: 8 | Issues: 45

BaiduImageSpider

A super lightweight Baidu image crawler

Language: Python | License: MIT | Stargazers: 874 | Issues: 24 | Issues: 28

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

PicImageSearch

Aggregator for reverse image search APIs, used to find the source of an image

Language: Python | License: MIT | Stargazers: 437 | Issues: 8 | Issues: 43

clip_dinoiser

Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 204 | Issues: 10 | Issues: 12

COMM

PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"

TextGenerator

Generator for OCR, text-detection, and font-classification datasets

Language: Python | License: MIT | Stargazers: 137 | Issues: 7 | Issues: 20

ocr_synth_text_chinese

Generates training datasets for text detection

Language: Python | License: MIT | Stargazers: 9 | Issues: 1 | Issues: 0