royinx

Company: Dayta.ai

Location: Hong Kong

Home Page: royinx.github.io

royinx's starred repositories

PyMacroRecord

Free and Open Source Macro Recorder with a modern GUI using Python

Language: Python | License: GPL-3.0 | Stargazers: 150 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 1002 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 3247 | Issues: 0

compute-sanitizer-samples

Samples demonstrating how to use the Compute Sanitizer Tools and Public API

Language: Cuda | License: BSD-3-Clause | Stargazers: 47 | Issues: 0

Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Language: Rust | License: Apache-2.0 | Stargazers: 2956 | Issues: 0

candle

Minimalist ML framework for Rust

Language: Rust | License: Apache-2.0 | Stargazers: 14428 | Issues: 0

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 54200 | Issues: 0

deployment

RAPIDS Deployment Documentation

Language: Jupyter Notebook | Stargazers: 9 | Issues: 0

bitsandbytes

8-bit CUDA functions for PyTorch, modified to build on NVIDIA Jetson

Language: Python | License: MIT | Stargazers: 5 | Issues: 0

linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.

Language: Go | License: Apache-2.0 | Stargazers: 10486 | Issues: 0

gpt-migrate

Easily migrate your codebase from one framework or language to another.

Language: Python | License: MIT | Stargazers: 6758 | Issues: 0

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python | License: MIT | Stargazers: 5756 | Issues: 0

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

Language: C++ | License: NOASSERTION | Stargazers: 2260 | Issues: 0

dbscan-cuda

Parallel computing assignment: DBSCAN algorithm with C++ and CUDA

Language: Cuda | Stargazers: 6 | Issues: 0

Competitive-Programming

Competitive Programming problem solutions.

Language: C++ | Stargazers: 4 | Issues: 0

Nsight-Systems-Docker-Image

Nsight Systems in Docker

Language: Dockerfile | License: MIT | Stargazers: 16 | Issues: 0
Language: Python | Stargazers: 2 | Issues: 0

nvjpeg-python

nvjpeg for python

Language: C++ | License: MIT | Stargazers: 86 | Issues: 0

How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

Language: Cuda | License: Apache-2.0 | Stargazers: 756 | Issues: 0

nvdiffrec

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

Language: Python | License: NOASSERTION | Stargazers: 2096 | Issues: 0

dmls-book

Summaries and resources for Designing Machine Learning Systems book (Chip Huyen, O'Reilly 2022)

Stargazers: 1986 | Issues: 0

transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

Language: Python | License: Apache-2.0 | Stargazers: 1633 | Issues: 0

cpp-cheat-sheet

C++ Syntax, Data Structures, and Algorithms Cheat Sheet

Language: C++ | Stargazers: 4871 | Issues: 0

open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Language: C | License: NOASSERTION | Stargazers: 14187 | Issues: 0
Language: Python | Stargazers: 3 | Issues: 0

instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Language: Cuda | License: NOASSERTION | Stargazers: 15608 | Issues: 0

yolov7_d2

🔥🔥🔥🔥 (An earlier YOLOv7, not the official one) YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥

Language: Python | License: GPL-3.0 | Stargazers: 3130 | Issues: 0

kernel_tuner

Kernel Tuner

Language: Python | License: Apache-2.0 | Stargazers: 259 | Issues: 0

triton_ensemble_model_demo

triton server ensemble model demo

Language: Python | Stargazers: 29 | Issues: 0

DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Language: C++ | License: Apache-2.0 | Stargazers: 4984 | Issues: 0