Naiyuan Liu (NNNNAI)

Company: University of Technology Sydney

Naiyuan Liu's starred repositories

ComfyUI

The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | Stargazers: 35639 | Issues: 294 | Issues: 2237

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 33089 | Issues: 335 | Issues: 2557

GPT-SoVITS

One minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | Stargazers: 25418 | Issues: 170 | Issues: 809

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 22689 | Issues: 242 | Issues: 262

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 20878 | Issues: 165 | Issues: 151

ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Language: Python | License: MIT | Stargazers: 15313 | Issues: 147 | Issues: 1112

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: MIT | Stargazers: 10329 | Issues: 152 | Issues: 156

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | License: MIT | Stargazers: 9314 | Issues: 112 | Issues: 578

AnimateDiff

Official implementation of AnimateDiff.

Language: Python | License: Apache-2.0 | Stargazers: 9078 | Issues: 97 | Issues: 317

bypy

Python client for Baidu Yun / Baidu Netdisk (Personal Cloud Storage)

Language: Python | License: MIT | Stargazers: 7540 | Issues: 299 | Issues: 566

Douyin_TikTok_Download_API

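🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.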

Language: Python | License: Apache-2.0 | Stargazers: 7235 | Issues: 59 | Issues: 354

TikTokDownloader

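Completely free and open source, built on the AIOHTTP module: a data collection tool for TikTok profiles/videos/image sets/original sounds and for Douyin profiles/videos/image sets/favorites/live streams/original sounds/collections/comments/accounts/search/trending lists.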

Language: Python | License: GPL-3.0 | Stargazers: 6070 | Issues: 40 | Issues: 200

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Language: Python | License: Apache-2.0 | Stargazers: 5797 | Issues: 71 | Issues: 211

OutfitAnyone

Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4144 | Issues: 57 | Issues: 320

Person_reID_baseline_pytorch

⛹️ PyTorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial

Language: Python | License: MIT | Stargazers: 3973 | Issues: 77 | Issues: 375

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | Stargazers: 3587 | Issues: 33 | Issues: 291

PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Language: Python | License: NOASSERTION | Stargazers: 2858 | Issues: 71 | Issues: 300

DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Language: Python | License: Apache-2.0 | Stargazers: 2013 | Issues: 32 | Issues: 254

DWPose

"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)

Language: Python | License: Apache-2.0 | Stargazers: 1936 | Issues: 28 | Issues: 80

Vary

Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

CTCDecoder

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.

Language: Python | License: MIT | Stargazers: 803 | Issues: 25 | Issues: 23

IP_LAP

CVPR 2023 talking face implementation for "Identity-Preserving Talking Face Generation With Landmark and Appearance Priors"

Language: Python | License: Apache-2.0 | Stargazers: 595 | Issues: 18 | Issues: 54

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

PicImageSearch

Aggregator for reverse image search APIs, used to find the source of an image

Language: Python | License: MIT | Stargazers: 370 | Issues: 6 | Issues: 36

Language: Python | Stargazers: 195 | Issues: 2 | Issues: 0

llms_paper

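This repository mainly records reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT)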

COMM

PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"

clip_dinoiser

Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 138 | Issues: 9 | Issues: 8

CLRerNet

The official implementation of "CLRerNet: Improving Confidence of Lane Detection with LaneIoU"

Language: Python | License: Apache-2.0 | Stargazers: 133 | Issues: 1 | Issues: 22