Can Goksen's starred repositories
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
stable-diffusion
A latent text-to-image diffusion model
openai-cookbook
Examples and guides for using the OpenAI API
flash-attention
Fast and memory-efficient exact attention
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
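The core operation from the paper is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal PyTorch sketch of that one operation (illustrative shapes, not the repo's actual module):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # softmax(Q K^T / sqrt(d_k)) V, per "Attention is All You Need".
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Illustrative shapes: batch 2, sequence length 5, model dimension 64.
q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)  # -> (2, 5, 64)
```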
cuda-samples
Samples for CUDA developers demonstrating features in the CUDA Toolkit
linux-surface
Linux Kernel for Surface Devices
calib_challenge
The comma.ai Calibration Challenge!
pytorch-lr-finder
A learning rate range test implementation in PyTorch
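The technique behind it (Smith's learning rate range test) sweeps the learning rate exponentially over a run of batches while recording the loss, then picks a rate just below where the loss diverges. A minimal sketch of the idea, not the package's actual API:

```python
import torch

def lr_range_test(model, optimizer, criterion, loader,
                  start_lr=1e-7, end_lr=1.0, num_iters=100):
    # Exponentially increase the LR each batch and record (lr, loss).
    # Assumes `loader` yields at least num_iters (input, target) batches.
    gamma = (end_lr / start_lr) ** (1.0 / num_iters)
    lr, history, batches = start_lr, [], iter(loader)
    for _ in range(num_iters):
        for group in optimizer.param_groups:
            group["lr"] = lr
        x, y = next(batches)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        history.append((lr, loss.item()))
        lr *= gamma
    return history  # choose an LR just below the divergence point
```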
Efficient-3DCNNs
PyTorch implementation of "Resource Efficient 3D Convolutional Neural Networks", with code and pretrained models.
fft-conv-pytorch
Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
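The speedup comes from the convolution theorem: convolution in the signal domain is pointwise multiplication in the frequency domain, so a direct O(n·k) convolution becomes an O(n log n) FFT round trip. A minimal 1D sketch with torch.fft (the actual library also handles 2D/3D, stride, padding, and bias):

```python
import torch
from torch.fft import rfft, irfft

def fft_conv1d(signal, kernel):
    # Full linear convolution: zero-pad both inputs to the output length,
    # multiply their spectra, and transform back.
    n = signal.size(-1) + kernel.size(-1) - 1
    return irfft(rfft(signal, n=n) * rfft(kernel, n=n), n=n)

x = torch.randn(4096)
k = torch.randn(512)   # large kernel: where FFT convolution pays off
y = fft_conv1d(x, k)   # length 4096 + 512 - 1 = 4607
```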
awesome-lifelong-continual-learning
A list of papers, blogs, datasets and software in the field of lifelong/continual machine learning
knowledge-distillation-for-unet
An implementation of knowledge distillation for segmentation: a small (student) UNet is trained from a larger (teacher) UNet, reducing the size of the network while achieving performance similar to the heavier model.
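A common formulation is Hinton-style soft-target distillation: the student matches the teacher's temperature-softened output distribution in addition to the ground-truth labels. A minimal per-pixel sketch for segmentation logits of shape (N, C, H, W); the repo's exact loss and weighting may differ:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, target,
                      temperature=4.0, alpha=0.5):
    # Hard term: ordinary cross-entropy against the labels (N, H, W).
    hard = F.cross_entropy(student_logits, target)
    # Soft term: KL divergence from the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits.detach() / temperature, dim=1),
        reduction="batchmean",  # note: divides by N, not N*H*W
    ) * temperature ** 2  # rescale gradients as in Hinton et al.
    return alpha * hard + (1 - alpha) * soft
```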
fftw-cufftw-benchmark
Benchmark for popular FFT libraries: fftw | cufftw | cufft
GPU-research-FFT-OpenACC-CUDA
Case studies are a valuable interdisciplinary teaching practice that plays a critical role in developing new skills and forming new knowledge. This research studies the behavior and performance of two widely adopted scientific kernels, the Fast Fourier Transform and matrix multiplication, implemented in the two currently most popular many-core programming models, CUDA and OpenACC. A Fast Fourier Transform (FFT) computes the Discrete Fourier Transform (DFT) of a sequence, decomposing a signal sampled over a period of time into its frequency components. Unlike the direct approach to computing a DFT, FFT algorithms reduce the complexity of the problem from O(n²) to O(n log₂ n). Matrix multiplication is a cornerstone routine in mathematics, artificial intelligence, and machine learning. This research also shows that the nature of the problem plays a crucial role in determining which many-core model will provide the greatest performance benefit.
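To make the complexity claim concrete: a direct DFT is an n×n matrix-vector product (n² multiply-adds), while the FFT computes the same result through a recursive factorization. A minimal NumPy sketch verifying the two agree:

```python
import numpy as np

def naive_dft(x):
    # Direct O(n^2) DFT: X[k] = sum_m x[m] * exp(-2j*pi*k*m / n).
    n = len(x)
    k = np.arange(n).reshape(-1, 1)
    return np.exp(-2j * np.pi * k * np.arange(n) / n) @ x

x = np.random.randn(1024)
# Same output, but np.fft.fft uses Cooley-Tukey in O(n log2 n)
# rather than the O(n^2) matrix product above.
assert np.allclose(naive_dft(x), np.fft.fft(x))
```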
DeepGalerkinMethod
Based on: https://arxiv.org/abs/1811.08782. Our writeup: https://github.com/Dahoas/DeepGalerkinMethod/blob/master/DPDEs.pdf