Introduction: evaluating and improving your neural network
- Building a network involves trial and error (for example, when tuning hyperparameters)
- Explain the use of a training set, a hold-out cross-validation (CV, or dev) set and a test set, plus common proportions
- Explain bias (high = underfitting) and variance (high = overfitting)
- When overfitting → use regularization
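The train / dev / test split mentioned above could be sketched as follows. This is only an illustration: the helper name `split_indices` and the 60/20/20 proportions are assumptions (60/20/20 is one commonly quoted choice for small datasets; with very large datasets the dev and test shares are often much smaller).

```python
import numpy as np

def split_indices(n_examples, dev_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle example indices and split them into train / dev / test.

    The fractions are illustrative defaults, not a recommendation.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_examples)          # random order, no repeats
    n_test = int(round(n_examples * test_frac))
    n_dev = int(round(n_examples * dev_frac))
    test_idx = idx[:n_test]
    dev_idx = idx[n_test:n_test + n_dev]
    train_idx = idx[n_test + n_dev:]           # everything that is left
    return train_idx, dev_idx, test_idx

# Example: 1000 examples split into 600 / 200 / 200
train_idx, dev_idx, test_idx = split_indices(1000)
```

Tune hyperparameters against the dev set only; the test set is kept aside for the final, unbiased estimate.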
Regularization
- Explain L2 regularization in the context of logistic regression
- L2 regularization in a neural network
- L2 regularization changes the expression used in backpropagation ("weight decay")
- Intuition of what happens (depending on the size of lambda)
- Dropout regularization: randomly removing nodes in your network
- How to implement it? Set a dropout probability and drop nodes at random according to it
- Scale the remaining activations by 1 / (1 - dropout probability) so that their expected value doesn’t change (inverted dropout)
- At test time, don’t apply dropout when making predictions
- Max norm technique for regularization: enforce an absolute upper bound on the magnitude of the weight vector for every neuron
- Other regularization techniques?
- Add more data, e.g. by slightly altering existing images (data augmentation)
- Early stopping: plot training error vs dev set error and stop iterating when the dev set error starts to rise while the training error keeps falling
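The L2 ("weight decay"), dropout and max norm bullets above could be sketched roughly like this in NumPy. All function names are hypothetical. Further assumptions: `lam` is lambda, `m` is the number of training examples, `lr` is the learning rate, `keep_prob` is 1 minus the dropout probability, and each row of `W` holds the incoming weights of one neuron.

```python
import numpy as np

def l2_penalty(weights, lam, m):
    """L2 term added to the cost: (lam / (2m)) * sum of all squared weights."""
    return (lam / (2 * m)) * sum(np.sum(W ** 2) for W in weights)

def l2_grad_step(W, dW, lam, m, lr):
    """One gradient step with the extra (lam / m) * W term from the L2 penalty.

    Algebraically the same as W * (1 - lr * lam / m) - lr * dW,
    which is why it is called weight decay.
    """
    return W - lr * (dW + (lam / m) * W)

def inverted_dropout(A, keep_prob, rng, training=True):
    """Zero out activations at random and rescale the survivors."""
    if not training:
        return A                               # no dropout when predicting
    mask = rng.random(A.shape) < keep_prob     # True = keep this node
    return A * mask / keep_prob                # keeps the expected value of A

def max_norm(W, c):
    """Rescale each row of W so that its L2 norm is at most c."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return W * scale
```

In a training loop, `inverted_dropout` would be applied to each hidden layer's activations in the forward pass (reusing the same mask in the backward pass), and `max_norm` after each weight update.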