Zhouzhou (DC-Zhou)

Location: Chengdu, China

Zhouzhou's starred repositories

interview

📚 A summary of basic knowledge for C/C++ technical interviews, aimed at job seekers and beginners. It covers the language, libraries, data structures, algorithms, systems, networking, and linking and loading of libraries, along with interview experience, recruitment, and referral information.

Language: C++ | License: NOASSERTION | Stargazers: 34708 | Issues: 863 | Issues: 63

simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

Language: C++ | License: Apache-2.0 | Stargazers: 19273 | Issues: 240 | Issues: 833

jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

MobileSAM

This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4762 | Issues: 43 | Issues: 125

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

Language: C++ | License: NOASSERTION | Stargazers: 2361 | Issues: 46 | Issues: 162

jetson_stats

📊 Simple package for monitoring and controlling your NVIDIA Jetson [Orin, Xavier, Nano, TX] series

Language: Python | License: AGPL-3.0 | Stargazers: 2149 | Issues: 33 | Issues: 421

CUDALibrarySamples

CUDA Library Samples

Language: Cuda | License: NOASSERTION | Stargazers: 1578 | Issues: 30 | Issues: 197

CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Language: Cuda | License: GPL-3.0 | Stargazers: 1299 | Issues: 14 | Issues: 6

Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Language: Python | License: NOASSERTION | Stargazers: 1247 | Issues: 16 | Issues: 107

MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Language: C++ | License: BSD-3-Clause | Stargazers: 1204 | Issues: 24 | Issues: 197

stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Language: Python | License: MIT | Stargazers: 1167 | Issues: 16 | Issues: 123

Learn-CUDA-Programming

Learn CUDA Programming, published by Packt

Language: Cuda | License: MIT | Stargazers: 1005 | Issues: 27 | Issues: 12

trt_pose

Real-time pose estimation accelerated with NVIDIA TensorRT

Language: Python | License: MIT | Stargazers: 974 | Issues: 42 | Issues: 171

rwkv-cpp-accelerated

A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependencies

Language: C++ | License: MIT | Stargazers: 306 | Issues: 10 | Issues: 21

ruapu

Detect CPU features with a single file

Language: C | License: MIT | Stargazers: 292 | Issues: 6 | Issues: 29

jetson_dla_tutorial

A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson

Language: Python | License: NOASSERTION | Stargazers: 279 | Issues: 7 | Issues: 3

ncnn-models

Awesome AI models with NCNN, and how they were converted ✨✨✨

Language: C++ | License: MIT | Stargazers: 250 | Issues: 10 | Issues: 21

Parallel-Computing-Cuda-C

CUDA learning guide

Language: Cuda | Stargazers: 224 | Issues: 8 | Issues: 0

RWKV-CUDA

The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)

cuDLA-samples

YOLOv5 on Orin DLA

Language: Python | License: NOASSERTION | Stargazers: 184 | Issues: 7 | Issues: 42

Deep-Learning-Accelerator-SW

NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.

Language: Python | License: NOASSERTION | Stargazers: 179 | Issues: 9 | Issues: 10

YOLOX

MegEngine implementation of YOLOX

Language: Python | License: Apache-2.0 | Stargazers: 106 | Issues: 5 | Issues: 10

Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F

Stepwise optimizations of DGEMM on CPU, eventually reaching performance faster than Intel MKL, even under multithreading.

Language: C | License: GPL-3.0 | Stargazers: 104 | Issues: 4 | Issues: 1

CodeFormer-ncnn

ncnn version of CodeFormer

llama2.c-to-ncnn

A converter for llama2.c legacy models to ncnn models.

Language: Python | License: MIT | Stargazers: 82 | Issues: 2 | Issues: 3

arm64-linux-debugging-disassembling-reversing

Source Code for 'Foundations of ARM64 Linux Debugging, Disassembling, and Reversing' by Dmitry Vostokov

Language: C++ | License: NOASSERTION | Stargazers: 11 | Issues: 3 | Issues: 0