PyTorch implementation of Spatial Transformer Networks (STN) and CoordConv for convolutional layers, with some experiments on toy datasets.
- Spatial Transformer Networks
- An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
- STN-OCR: A single Neural Network for Text Detection and Text Recognition
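As a rough illustration of the CoordConv idea from the second paper, the sketch below appends two normalized coordinate channels to the input before a regular convolution. The class name `CoordConv2d` and its signature are hypothetical and are not taken from this repository; the actual implementation here may differ.

```python
import torch
import torch.nn as nn


class CoordConv2d(nn.Module):
    """Conv2d preceded by concatenation of x/y coordinate channels.

    Hypothetical name and signature, for illustration only.
    """

    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__()
        # Two extra input channels carry the x and y coordinates.
        self.conv = nn.Conv2d(in_channels + 2, out_channels, kernel_size, **kwargs)

    def forward(self, x):
        n, _, h, w = x.shape
        # Coordinates are scaled to the range [-1, 1].
        ys = torch.linspace(-1.0, 1.0, h, device=x.device, dtype=x.dtype)
        xs = torch.linspace(-1.0, 1.0, w, device=x.device, dtype=x.dtype)
        yy = ys.view(1, 1, h, 1).expand(n, 1, h, w)
        xx = xs.view(1, 1, 1, w).expand(n, 1, h, w)
        # Concatenate along the channel dimension, then convolve as usual.
        return self.conv(torch.cat([x, xx, yy], dim=1))
```

A layer built this way can stand in for `nn.Conv2d`, for example `CoordConv2d(1, 8, kernel_size=3, padding=1)` applied to a batch of single-channel images.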
The experiments were performed using Python 3.8.5 with the following Python packages:
- numpy == 1.18.5
- torch == 1.5.1
- torchvision == 0.6.1
- matplotlib == 3.3.3
To try the implementation, run the following command in your terminal, adjusting the parameters as needed:
python3 main.py [--seed SEED] [--use_cuda USE_CUDA]
[--batch_size BATCH_SIZE] [--lr LR]
[--num_workers NUM_WORKERS]
[--num_epochs NUM_EPOCHS]
[--optimizer OPTIMIZER] [--beta1 BETA1] [--beta2 BETA2]
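
For orientation, here is a minimal sketch of how the flags listed above could be declared with `argparse`. Only the flag names come from the usage text; the types, the default values, and the `str2bool` and `build_parser` helpers are assumptions made for illustration, and the real `main.py` may define them differently.

```python
import argparse


def str2bool(value):
    """Parse common true/false spellings (hypothetical helper)."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # All types and defaults below are assumed, not taken from main.py.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--use_cuda", type=str2bool, default=False)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--num_workers", type=int, default=0)
    parser.add_argument("--num_epochs", type=int, default=10)
    parser.add_argument("--optimizer", type=str, default="adam")
    parser.add_argument("--beta1", type=float, default=0.9)
    parser.add_argument("--beta2", type=float, default=0.999)
    return parser
```

With a parser like this, `build_parser().parse_args(["--lr", "0.01", "--use_cuda", "true"])` yields a namespace whose `lr` is `0.01` and whose `use_cuda` is `True`, while the remaining flags fall back to their assumed defaults.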