botbw

Company: None

Location: Singapore

Home Page: https://botbw.github.io/


Organizations
AoTTG-2
hpcaitech

botbw's starred repositories

alphafold3

AlphaFold 3 inference pipeline.

Language: Python | License: NOASSERTION | Stargazers: 4807 | Issues: 0

Triton-Puzzles-Lite

Puzzles for learning Triton, play it with minimal environment configuration!

Language: Python | License: Apache-2.0 | Stargazers: 113 | Issues: 0

Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Language: Python | License: Apache-2.0 | Stargazers: 125 | Issues: 0

spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Language: Python | License: NOASSERTION | Stargazers: 4401 | Issues: 0

mpich

Official MPICH Repository

Language: C | License: NOASSERTION | Stargazers: 560 | Issues: 0

ompi

Open MPI main development repository

Language: C | License: NOASSERTION | Stargazers: 2169 | Issues: 0

nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Language: Python | License: Apache-2.0 | Stargazers: 4823 | Issues: 0

semiring-einsum

Generic PyTorch implementation of einsum that supports different semirings

Language: Python | License: MIT | Stargazers: 46 | Issues: 0

SpringShell

Spring4Shell - Spring Core RCE - CVE-2022-22965

Language: Python | Stargazers: 127 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8668 | Issues: 0

DCGM

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs

Language: C++ | License: Apache-2.0 | Stargazers: 411 | Issues: 0

AI-System-School

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.

License: MIT | Stargazers: 2691 | Issues: 0

mase

Machine-Learning Accelerator System Exploration Tools

Language: Python | License: NOASSERTION | Stargazers: 123 | Issues: 0

Cute-Learning

Examples of CUDA implementations by Cutlass CuTe

Language: Makefile | License: MIT | Stargazers: 97 | Issues: 0

models

The best OSS video generation models

Language: Python | License: Apache-2.0 | Stargazers: 2009 | Issues: 0

pybind11

Seamless operability between C++11 and Python

Language: C++ | License: NOASSERTION | Stargazers: 15772 | Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

Language: Python | License: Apache-2.0 | Stargazers: 2637 | Issues: 0

Spring4Shell-POC

Dockerized Spring4Shell (CVE-2022-22965) PoC application and exploit

Language: Python | Stargazers: 312 | Issues: 0

CVE-2024-6387

Remote Unauthenticated Code Execution Vulnerability in OpenSSH server (CVE-2024-6387)

Language: Python | License: MIT | Stargazers: 45 | Issues: 0

Language: C++ | License: BSD-3-Clause | Stargazers: 216 | Issues: 0

ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

Language: C | License: NOASSERTION | Stargazers: 1155 | Issues: 0

EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Language: Python | License: Apache-2.0 | Stargazers: 264 | Issues: 0

mirage

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

Language: C++ | License: Apache-2.0 | Stargazers: 632 | Issues: 0

gemlite

Simple and fast low-bit matmul kernels in CUDA / Triton

Language: Python | License: Apache-2.0 | Stargazers: 141 | Issues: 0

how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Language: Cuda | Stargazers: 1591 | Issues: 0

gpu-benches

collection of benchmarks to measure basic GPU capabilities

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 265 | Issues: 0

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook | License: MIT | Stargazers: 9913 | Issues: 0

thread-pool

BS::thread_pool: a fast, lightweight, and easy-to-use C++17 thread pool library

Language: C++ | License: MIT | Stargazers: 2207 | Issues: 0

TensorNVMe

A Python library that transfers PyTorch tensors between CPU and NVMe

Language: C++ | Stargazers: 98 | Issues: 0