Tal Ben-Nun's repositories

cudnn-training

A minimal cuDNN deep learning training code sample using LeNet.

cpyke

Easy integrated Python scripting embedded in C++

Language: C++ | License: BSD-3-Clause | Stargazers: 23 | Issues: 5 | Issues: 0

dfsim

Dataflow architecture simulator in Python

Language: Python | License: BSD-3-Clause | Stargazers: 7 | Issues: 2 | Issues: 1

pylulesh

Python port of LULESH

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 4 | Issues: 0

dace

DaCe - Data Centric Parallel Programming

Language: Python | License: BSD-3-Clause | Stargazers: 1 | Issues: 3 | Issues: 0

astunparse

An AST unparser for Python

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

capital

Distributed-memory implementations of novel Cholesky and QR matrix factorizations

Language: C++ | License: BSD-2-Clause | Stargazers: 0 | Issues: 2 | Issues: 0

convNet.pytorch

ConvNet training using PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 3 | Issues: 0

CUDALibrarySamples

CUDA Library Samples

Language: Cuda | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

dill

Serialize all of Python

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

Elemental

Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

fv3core

This repository has moved, please visit https://github.com/ai2cm/pace for the latest development of fv3core.

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 2 | Issues: 0

gt4py

Python API to develop performance portable applications for weather and climate.

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 2 | Issues: 0

hipTT

HIP port of the fast GPU tensor transpose library cuTT

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

lbann

Livermore Big Artificial Neural Network Toolkit

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

llvmlite

A lightweight LLVM Python binding for writing JIT compilers

Language: Python | License: BSD-2-Clause | Stargazers: 0 | Issues: 1 | Issues: 0

MIOpen

AMD's Machine Intelligence Library

Language: Assembly | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

nvbench

CUDA Kernel Benchmarking Library

Language: Cuda | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

omniperf

Advanced Profiling and Analytics for AMD Hardware

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis

Language: C++ | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Language: C++ | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

pyhpc-benchmarks

A suite of benchmarks to test the sequential CPU and GPU performance of the most popular high-performance libraries for Python.

Language: Python | License: Unlicense | Stargazers: 0 | Issues: 2 | Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

rocBLAS

Next generation BLAS implementation for ROCm platform

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

roctracer

ROCm Tracer Callback/Activity Library for performance tracing of AMD GPUs

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

seti-ui

A subtle dark colored UI theme for Atom.

Language: CSS | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 2 | Issues: 0

vision

Datasets, Transforms and Models specific to Computer Vision

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 2 | Issues: 0