Computer vision, broadly speaking, is a research field that aims to enable computers to process and interpret visual data as sighted humans can. It is one of the most exciting areas of research in computing science and among the fastest-growing technologies in industry today. This course introduces the fundamental principles and applications of computer vision.
- Introduction to Python for computer vision
- Convolution filtering by hand
- Gaussian filtering
- Hybrid images
- Denoising filters
- Scaled representations
- Object/face detection with normalized cross-correlation
- Gaussian and Laplacian image pyramids
- Image blending
- Texture synthesis by non-parametric sampling
- Hole filling using texture synthesis
- SIFT keypoint matching
- RANSAC sampling and model fitting for noisy image data
- Keypoint projections using homography matrices
- Panorama stitching with RANSAC homography
- Bags of features: bag of words using SIFT descriptors
- Clustering SIFT descriptors with K-means
- Representing images as histograms of visual words
- Scene recognition with k-nearest neighbors (KNN)
- Scene recognition with 1-vs-all linear SVMs
- Implementing PyTorch layers from scratch (ReLU, MaxPool2d, Linear, Conv2d)
- CNN hyperparameter tuning (epochs, channel sizes, hidden layers, activations)
- Image segmentation with Mask R-CNN
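
As a rough illustration of the early filtering topics in the list above (convolution by hand and Gaussian filtering), a minimal NumPy sketch might look like the following. The function names `gaussian_kernel` and `convolve2d`, the zero-padding at the borders, and the use of NumPy are assumptions made for this example; they are not taken from the course materials.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a normalized size x size Gaussian kernel (hypothetical helper; size should be odd)."""
    # Sample positions centered on zero, e.g. [-2, -1, 0, 1, 2] for size 5.
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    # The 2D Gaussian is separable, so the outer product of two 1D Gaussians suffices.
    kernel = np.outer(g, g)
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return kernel / kernel.sum()


def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Convolve a 2D grayscale image with an odd-sized 2D kernel, keeping the input size."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Assumption: zero-pad the borders so the output matches the input shape.
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="constant")
    # True convolution flips the kernel in both axes (unlike cross-correlation).
    flipped = kernel[::-1, ::-1]
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * flipped)
    return out


# Example usage: blur a small random image with a 5x5 Gaussian, sigma = 1.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
blurred = convolve2d(image, gaussian_kernel(5, 1.0))
```

The explicit double loop is slow but mirrors the definition of convolution directly. With zero-padding, pixels near the border come out darker because part of each window falls outside the image.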