Marco Barbone's repositories
cpu-performance-tests
This repository contains the code to benchmark CPU cache miss latency and branch misprediction penalty
morton-span
This repository implements a Morton transform for mdspan
aocl-libm-ose
AMD LIBM
arrayfire
ArrayFire: a general purpose GPU library.
benchmark-elementary-functions
This repo aims to test the performance and accuracy of different elementary functions (e.g. log, sin, cos)
cpp-learning
This repo tests assorted C++ features
chebtest
Basic nanobench project for messing around with polynomial evaluation
cmake-minimal
A minimal CMake-based C++ project setup
cuda-variant
variant type for CUDA
ducc
Fork of https://gitlab.mpcdf.mpg.de/mtr/ducc to simplify external contributions
fft_bench
More benchmarks of various FFT implementations
fftw3
DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
highway
Performance-portable, length-agnostic SIMD with runtime dispatch
nanobind_example
A nanobind example project
online-alt-min
Source code for paper Choromanska et al. -- Beyond Backprop: Online Alternating Minimization with Auxiliary Variables -- http://proceedings.mlr.press/v97/choromanska19a.html
optimized-routines
Optimized implementations of various library functions for ARM architecture processors
xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
yagit
Library for efficient comparison of 2D and 3D DICOM images using the gamma index