Computer Vision (CS 763) - Spring 2018

Course Information

Instructor: Arjun Jain
Office: 216, CSE New Building
Email: ajain@cse DOT iitb DOT ac DOT in
Teaching Assistants: Rishabh Dabral, Safeer Afaque
Instructor Office Hours (in room 216 CSE New Building): Arjun is on campus only on Thursdays and Fridays. Meet him after class or fix an appointment over email.

Please note that CS663 is a hard prerequisite for this course.

Topics to be covered (tentative)

Camera geometry, camera calibration, vanishing points, important transformations, homographies
Image registration: RANSAC for point-matching, SIFT overview
Deep Learning in computer vision: the data-driven paradigm, feed-forward networks, back-propagation and chain rule; CNNs and their building blocks, generative adversarial networks (GANs)
Deep Learning applications including face detection, CNN compression, siamese and triplet networks and applications to face recognition
Algorithms for: shape from shading, optical flow, Kanade-Lucas-Tomasi algorithm, applications of optical flow
Photometric stereo - deriving shape from multiple images of an object taken under different lighting conditions; applications to illumination invariant face recognition, face relighting
Stereo (geometric binocular): epipolar geometry and fundamental matrix, the correspondence problem and shape from stereo; structure from motion

Learning materials and textbooks

Lecture slides that will be regularly posted
Computer Vision: Algorithms and Applications, by Richard Szeliski
Fundamentals of Computer Vision, by Mubarak Shah
Deep Learning, by Ian Goodfellow and Yoshua Bengio and Aaron Courville
All iTorch notebooks for topics covered in class can be found here

Grading Policy

Mid-sem exam: 20%
Final exam (cumulative): 20%
Assignments (five or six): 35% (all to be done in groups of 2-3 students)
Course project: 20% (to be done in the same group of 2-3 students)
Class participation: 5%
Course project work will be presented by the student group during a viva at the end of the course. During this viva, each student in the group will be separately questioned, not only on the project work, but also the assignments. Each student is expected to contribute to each and every assignment and the course project.
Audit requirements: You must write both exams, submit all assignments and the project, and score at least 40% to get an AU.

Other Policies

Assignments will be given out (typically) once every two or three weeks. They must be submitted on or before the deadline. No late assignments will be accepted. The programming components of the assignments will typically involve MATLAB and Lua, so you must be willing to learn them quickly.
We will adopt a zero-tolerance policy against plagiarism and any other form of cheating. Just don't do it! In cases of plagiarism, givers and takers will both be considered equally responsible.
This course is (inherently) cumulative. The syllabus for the final exam will include everything taught during the semester.

Course Projects

[02/02/2018] Course projects have now been finalized.

Go to this link for the finalized list.

Assignments

[12-Jan-18] Assignment 1 has been released. The due date for submission is Friday, January 26, 2018.
[27-Jan-18] Assignment 2 has been released. The due date for submission is Sunday, February 4, 2018.
[09-Feb-18] Assignment 3 has been released. The due date for submission is Wednesday, February 21, 2018. Corresponding Kaggle competition link
[06-Mar-18] Assignment 4 has been released. The due date for submission is Monday, March 19, 2018. Corresponding Kaggle competition link
[24-March-18] Assignment 5 on Tracking has been released. Due date: April 2, 2018. Download the necessary files from here
[11-April-18] Assignment 6 on Multiview Geometry has been released. Due date: April 19, 2018.

Lecture Schedule:

Date	Topics	Slides	iTorch Notebooks	Extra Reading
4th Jan. 2018	Introduction to computer vision, applications and course overview	Slides	--	--
5th Jan. 2018	Camera Geometry: Homogeneous coordinates and projective geometry; Vanishing points, ideal line, point-line duality in P2; Important 2D and 3D transformations using homogeneous coordinates; Introduction to the pin-hole camera model	Slides	--	Homogeneous Representations of Points, Lines and Planes
12th Jan. 2018	Modeling the pinhole camera analytically, intrinsic and extrinsic parameters; World, camera, image plane and sensor plane coordinate systems and transformations between them; Linear and non-linear (lens distortion) errors; Homography, planar world and pure rotation of the camera	Slides	--	--
13th Jan. 2018	Iterative solutions for dealing with non-linear (lens distortion) errors; Normalized, ideal, Euclidean, affine and general camera models; Orthographic and weak-perspective camera models; Cross ratios and their applications; Camera calibration using DLT (known 3D control points)	Slides	--	Resource on SVD, how/why it can be used to solve equation systems of type Ax=0, |x|=1
18th Jan. 2018	Zhang's camera calibration method, mention of a few DL based calibration methods. Image Alignment: problem statement, physically and digitally corresponding points; Motion models and degrees of freedom; non-rigid/deformable/non-parametric image alignment; Control point based image alignment using least squares - derivation for pseudo-inverse; Introduction to the SIFT algorithm; Forward and reverse image warping - bilinear and nearest-neighbor interpolation; Mention of DL based image patch descriptors	Slides(1) Slides(2)	--	--
19th Jan. 2018	Image alignment using image similarity measures: mean squared error, normalized cross-correlation; Concept of field of view in image alignment using image similarity measures; Monomodal and multimodal image alignment; Concept of joint histograms and behaviour of joint histograms in multi-modal image alignment; Concept of entropy and joint entropy, algorithm for multimodal registration by minimizing joint entropy; Aspects of image registration: 2D/3D, motion model, monomodal or multimodal; Application scenarios for image alignment: template matching, video stabilization, panorama generation, face recognition, 3D to 2D alignment. Robust Methods in Computer Vision: Least squares problems and their relation to the Gaussian distribution on the noise; Examples of outliers in computer vision; Explanation of why the Gaussian distribution is unsuited to handling outliers; Introduction to the Laplacian distribution; The importance of heavy-tailed distributions in robust statistics; RANSAC (random sample consensus) algorithm	Slides(1) Slides(2)	--	--
25th Jan. 2018	Recognizing images, objects, scenes (Prof. Suyash P. Awate): Texture modeling and classification; Image classification, challenges; Bag of words model, dictionary learning; Defining image similarity, pyramid match kernel (PMK)	Slides	--	--
1st Feb. 2018	Recognizing images, objects, scenes (Prof. Suyash P. Awate): Pyramid match kernel (PMK); Kernel coding, local coding, vector quantization, sparse coding, LcLC	Slides	--	--
2nd Feb. 2018	Robust Methods in Computer Vision: RANSAC time complexity and expected no. of iterations; Using RANSAC for homography estimation; Introduction to the Laplacian distribution; Mean versus median: L2 fit versus L1 fit; LMeds: Least Median of Squares. Deep Learning for Computer Vision: History, introduction; Data driven paradigm; K-NN on CIFAR 10; Hyperparameters, choice of loss function, cross-validation	Slides(1) Slides(2)	KNN	Matrix calculus reminder
8th Feb. 2018	Softmax classifier, cross-entropy loss function, regularization; Optimization: vanilla gradient descent, stochastic gradient descent; Vanilla momentum, Nesterov momentum, AdaGrad, RMSProp, ADAM; Second order optimization methods, their issues with deep learning; Good learning rate, learning rate decay	Slides	Gradient Check	ADAM, Nesterov DL optimization algorithms overview
9th Feb. 2018	Feed forward, back-propagation; Fully connected layer; Activation functions: sigmoid, tanh, ReLU, LeakyReLU, ELU, etc.	Slides	Linear Layer, ReLU	--
15th Feb. 2018	Convolutions: transposed, dilated, fully-connected as convolution, sliding window as convolution; Max-pooling, Dropout	Slides	MaxPool, Convolution, Transposed convolution, Dropout	Convolution arithmetic for deep learning
16th Feb. 2018	SoftMax, Cross Entropy; Data Augmentation, hyperparameter selection; Weight initialization	Slides	Cross Entropy, Weight Initialization	--
22nd Feb. 2018	ConvNet applications; ConvNet case studies	Slides	--	--
23rd Feb. 2018	RNNs, LSTMs; Visualizing and understanding ConvNets	Slides	--	--
8th March 2018	Visualizing and understanding ConvNets; Images that maximize ConvNet class scores, reconstructing images from ConvNet codes; Deep Dream, Neural Art, Adversarial Examples; Dimensionality reduction: siamese and triplet networks	Slides	--	--
9th March 2018	Other vision tasks: semantic segmentation, object localization, object detection, instance segmentation; R-CNN, Mask R-CNN, Autoencoders; Generative modeling: VAEs, GANs; Case studies: pix2pix, CycleGAN, UNIT	Slides Slides	MNIST Vanilla GAN	--
15th March 2018	Deep Reinforcement Learning (Prof. Shivaram Kalyanakrishnan)	Slides	--	--
16th March 2018	Structure from Motion: Motion as a cue to inference of 3D structure from images; Motion factorization algorithm by Tomasi and Kanade for inference of (sparse) 3D structure of a fixed object being observed by a moving orthographic camera (or a rigidly moving object, being observed by a fixed orthographic camera); Aspects of the above algorithm: Rank theorem, metric constraints for inference of motion parameters and 3D structure	Slides	--	--
22nd March 2018	Kanade-Lucas-Tomasi Feature Tracking (KLT): Tracking feature-points from a template by estimating motion parameters; Finding good features to track	Slides	--	Lucas-Kanade 20 Years On: A Unifying Framework
23rd March 2018	Geometric Stereo: Orientation parameters for the camera pair and relative orientation; Coplanarity constraint for corresponding points; Derivation and key properties of the Fundamental matrix	Slides	--	--
5th April 2018	Introduction to epipolar geometry; Essential matrix; Popular parameterizations for the relative orientation; Generating the normalized stereo case from arbitrary views	Slides	--	--
6th April 2018	Direct Solutions for Computing Fundamental and Essential Matrix; 8-point algorithm; Triangulation	Slides(1) Slides(2)	--	--
12th April 2018	Absolute Orientation; Multi-View Geometry and Bundle Adjustment	Slides	--	--
19th April 2018	Shape from Shading: Introduction; Reflectance Models	Slides	--	--
20th April 2018	Photometric Stereo	Slides	--	--
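
The 2nd Feb. 2018 lecture lists the number of iterations RANSAC needs. As a rough illustration only (this is the commonly quoted textbook formula, not taken from the course slides, and may differ from the derivation used in class), the sketch below computes how many random samples are needed so that, with a chosen confidence, at least one sample contains only inliers. The function name ransac_iterations and its parameter names are hypothetical.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Samples needed so that, with probability `confidence`,
    at least one drawn sample consists only of inliers.

    Uses N = log(1 - p) / log(1 - w**s), where w is the inlier ratio,
    s the minimal sample size and p the desired confidence.
    """
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must lie in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")

    # Probability that a single random sample is free of outliers.
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample >= 1.0:
        return 1  # no outliers at all: one sample is enough
    if p_good_sample == 0.0:
        raise ValueError("inlier ratio too small for this sample size")

    # log1p(-x) evaluates log(1 - x) accurately for small x.
    n = math.log(1.0 - confidence) / math.log1p(-p_good_sample)
    return max(1, math.ceil(n))
```

For example, a homography needs 4 point correspondences per sample, so with half of the matches being inliers, ransac_iterations(0.5, 4) gives 72 samples at 99% confidence.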
