Andrei Pokrovsky's repositories
adityaatluri.github.io
https://adityaatluri.github.io
atomicAddBench
Benchmarking project for atomicAdd in CUDA.
blocksparse
Efficient GPU kernels for block-sparse matrix multiplication and convolution
DeepBench
Benchmarking Deep Learning operations on different hardware
demo-cuda-pybind11
How to use CUDA with Python NumPy
extension-cpp
C++ extensions in PyTorch
faster-rcnn.pytorch
A faster PyTorch implementation of Faster R-CNN
flownet2-pytorch
PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
mmap_benchmark
Benchmark for different mmap prefault/prefetch methods
MutexShootout
A benchmark to measure lock overhead and compare mutex performance under varying levels of contention.
pyinn
CuPy fused PyTorch neural networks ops
pytorch-custom-cuda-tutorial
Tutorial for building a custom CUDA function for PyTorch
pytorch-mobilenet
PyTorch MobileNet Implementation of "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"
pytorch2caffe
Convert PyTorch model to Caffemodel
pytorch_knn_cuda
K-Nearest Neighbor in PyTorch
scipy-2017-codegen-tutorial
SymPy code generation tutorial at SciPy 2017
TF-deformable-conv
Implementation of deformable convolution as an operation in TensorFlow
TF_Deformable_Net
Deformable convolutional net on TensorFlow
yolo_cpp
C++ version of YOLO