imaginary-person

imaginary-person's repositories

ArchBenchSuite

Low-level kernels to benchmark peak compute, cache bandwidth at various levels, memory bandwidth, and some basic compute routines

Language: C++ | License: BSD-3-Clause

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

Language: C++ | License: Apache-2.0

charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

Language: C++ | License: NOASSERTION

FasterTransformer

Transformer-related optimization, including BERT and GPT

Language: C++ | License: Apache-2.0

FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Language: C++ | License: NOASSERTION

gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

Language: C++ | License: MIT

gemm

leveldb

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

Language: C++ | License: BSD-3-Clause

likwid

Performance monitoring and benchmarking suite

Language: C | License: GPL-3.0

llama2.c

Andrej Karpathy's Llama 2 inference in C

License: MIT

loop_tool

A thin, highly portable C++ intermediate representation for dense loop-based computation.

Language: C++ | License: MIT

madrona

License: MIT

MonetDB

This is the official mirror of the MonetDB Mercurial repository. Please note that we do not accept pull requests on GitHub. The regression test results can be found on the MonetDB Testweb: http://monetdb.cwi.nl/testweb/web/status.php. For contributions, please see: https://www.monetdb.org/Developers

Language: C | License: NOASSERTION

nanoGPT

Andrej Karpathy's nanoGPT

License: MIT

obs-studio

OBS Studio - Free and open source software for live streaming and screen recording

Language: C | License: GPL-2.0

pytorch-1

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: C++ | License: NOASSERTION

qBittorrent

qBittorrent BitTorrent client

Language: C++ | License: NOASSERTION

rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.

Language: C++ | License: GPL-2.0

Stanford_CS348K_readings

This is a list of readings for Stanford CS348K.

stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

Language: C++ | License: Apache-2.0

stylegan3

Official PyTorch implementation of StyleGAN3

License: NOASSERTION

TensorRT

TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

Language: C++ | License: Apache-2.0

torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

Language: C++ | License: NOASSERTION

torcharrow

A torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format

Language: Python | License: BSD-3-Clause

torchdistx

Torch Distributed Experimental

Language: C++ | License: BSD-3-Clause

torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Language: Python | License: BSD-3-Clause

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0

tuplex

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.

Language: C++ | License: Apache-2.0

tvm

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators

Language: Python | License: Apache-2.0