DC-Zhou

Chengdu, China

Zhouzhou's starred repositories

interview

📚 A summary of fundamental C/C++ technical interview knowledge for job seekers and beginners, covering the language, libraries, data structures, algorithms, systems, networking, and linking/loading libraries, along with interview experience, job postings, and referral information.

Language: C++ | License: NOASSERTION | 34708 863 63

simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

Language: C++ | License: Apache-2.0 | 19273 240 833

jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Language: C++ | License: MIT | 7791 272 1819

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

Language: C++ | License: MIT | 6938 104 1306

MobileSAM

This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!

Language: Jupyter Notebook | License: Apache-2.0 | 4762 43 125

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

Language: C++ | License: NOASSERTION | 2361 46 162

jetson_stats

📊 Simple package for monitoring and controlling your NVIDIA Jetson [Orin, Xavier, Nano, TX] series

Language: Python | License: AGPL-3.0 | 2149 33 421

CUDALibrarySamples

CUDA Library Samples

Language: Cuda | License: NOASSERTION | 1578 30 197

CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Language: Cuda | License: GPL-3.0 | 1299 14 6

Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Language: Python | License: NOASSERTION | 1247 16 107

MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Language: C++ | License: BSD-3-Clause | 1204 24 197

stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Language: Python | License: MIT | 1167 16 123

Learn-CUDA-Programming

Learn CUDA Programming, published by Packt

Language: Cuda | License: MIT | 1005 27 12

trt_pose

Real-time pose estimation accelerated with NVIDIA TensorRT

Language: Python | License: MIT | 974 42 171

Image-Denoising-State-of-the-art

rwkv-cpp-accelerated

A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies

Language: C++ | License: MIT | 306 10 21

ruapu

Detect CPU features with a single file

Language: C | License: MIT | 292 6 29

jetson_dla_tutorial

A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson

Language: Python | License: NOASSERTION | 279 7 3

ncnn-models

Awesome AI models with ncnn, and how they were converted ✨✨✨

Language: C++ | License: MIT | 250 10 21

Parallel-Computing-Cuda-C

CUDA Learning guide

Language: Cuda | 224 80

RWKV-CUDA

The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)

Language: Cuda | 212 3 7

cuDLA-samples

YOLOv5 on Orin DLA

Language: Python | License: NOASSERTION | 184 7 42

Deep-Learning-Accelerator-SW

NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.

Language: Python | License: NOASSERTION | 179 9 10

faster-rwkv

Language: C++ | 123 4 7

YOLOX

MegEngine implementation of YOLOX

Language: Python | License: Apache-2.0 | 106 5 10

Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.

Language: C | License: GPL-3.0 | 104 4 1

CodeFormer-ncnn

ncnn version of CodeFormer

Language: C++ | 97 4 11

llama2.c-to-ncnn

A converter for llama2.c legacy models to ncnn models.

Language: Python | License: MIT | 82 2 3

cute-gemm

Language: C++ | 74 2 4

arm64-linux-debugging-disassembling-reversing

Source Code for 'Foundations of ARM64 Linux Debugging, Disassembling, and Reversing' by Dmitry Vostokov

Language: C++ | License: NOASSERTION | 11 30