xiaohongzhong's starred repositories

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | License: AGPL-3.0 | Stargazers: 129439 | Issues: 1026 | Issues: 7312

llama

Inference code for LLaMA models

Language: Python | License: NOASSERTION | Stargazers: 50895 | Issues: 499 | Issues: 872

annotated_deep_learning_paper_implementations

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | License: MIT | Stargazers: 47989 | Issues: 431 | Issues: 119

styleguide

Style guides for Google-originated open-source projects

Language: HTML | License: Apache-2.0 | Stargazers: 36548 | Issues: 1298 | Issues: 324

leveldb

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

Language: C++ | License: BSD-3-Clause | Stargazers: 35044 | Issues: 1314 | Issues: 743

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 32636 | Issues: 328 | Issues: 2504

serenity

The Serenity Operating System 🐞

Language: C++ | License: BSD-2-Clause | Stargazers: 28547 | Issues: 349 | Issues: 4100

mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

Language: C++ | License: Apache-2.0 | Stargazers: 25455 | Issues: 493 | Issues: 4856

video2x

A lossless video/GIF/image upscaler achieved with waifu2x, Anime4K, SRMD and RealSR. Started in Hack the Valley II, 2018.

Language: Python | License: AGPL-3.0 | Stargazers: 8617 | Issues: 121 | Issues: 951

Language: Python | License: Apache-2.0 | Stargazers: 4162 | Issues: 42 | Issues: 752

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2412 | Issues: 30 | Issues: 135

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

Language: C++ | License: NOASSERTION | Stargazers: 2191 | Issues: 47 | Issues: 133

TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.

Language: C++ | License: NOASSERTION | Stargazers: 1442 | Issues: 41 | Issues: 118

mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

Language: Python | License: MIT | Stargazers: 966 | Issues: 11 | Issues: 11

Video-ChatGPT

"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 896 | Issues: 12 | Issues: 96

EGVSR

Efficient & Generic Video Super-Resolution

Language: Python | License: MIT | Stargazers: 883 | Issues: 19 | Issues: 24

RealBasicVSR

Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"

Language: Python | License: Apache-2.0 | Stargazers: 836 | Issues: 15 | Issues: 84

hdrnet

An implementation of 'Deep Bilateral Learning for Real-Time Image Enhancement', SIGGRAPH 2017

Language: Python | License: Apache-2.0 | Stargazers: 791 | Issues: 34 | Issues: 18

All-In-One-Deflicker

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

RVRT

Recurrent Video Restoration Transformer with Guided Deformable Attention (NeurIPS 2022, official repository)

Language: Python | License: NOASSERTION | Stargazers: 326 | Issues: 23 | Issues: 28

MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

Language: C++ | License: MIT | Stargazers: 179 | Issues: 24 | Issues: 224

VR-Baseline

Video Restoration Toolbox including FGST (ICML 2022), S2SVR (ICML 2022), etc.

Language: Python | License: Apache-2.0 | Stargazers: 145 | Issues: 12 | Issues: 24

acuity-models

Acuity Model Zoo

WACV2024-SAFA

WACV2024 - Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution

Language: Python | License: MIT | Stargazers: 88 | Issues: 8 | Issues: 3

Shift-Net

A Simple Baseline for Video Restoration with Grouped Spatial-temporal Shift

winner-ntire22-vqe

Method and experience of winning the NTIRE'22 VQE challenge.

Real-Time-Multiple-Person-Recognition-and-Tracking-for-CCTV-Camera

A surveillance system for CCTV cameras that recognizes multiple selected target individuals and tracks them in real time across multiple cameras, using detection, recognition, and kernel-based tracking modules. Facial recognition uses HOG features and image embeddings from OpenFace. The system performs simultaneous tracking and recognition of multiple individuals across multiple cameras in real time. Winning project, Smart India Hackathon 2019.

Language: Python | Stargazers: 51 | Issues: 0 | Issues: 0