Tensorflow-probability
benoitLebreton-perso opened this issue · comments
Description of Problem:
For the existing deterministic neural networks, the predict_proba method gives only a basic estimate of the probability per class.
melusine/melusine/models/train.py
Line 357 in 4a0a181
With a specific type of neural network, we can compute a better uncertainty estimate on the outputs of the models.
For users who care about uncertainty estimation (especially useful for datasets with label errors), this type of model may give the same performance as deterministic neural nets while providing better uncertainty estimates.
The only drawbacks are that we need to choose a prior on the weights of the neural net and that training requires more computation.
Overview of the Solution:
Using the tensorflow-probability package, we can set up a neural network that returns a distribution over the outputs (and not only a point estimate).
For each prediction, this estimated distribution gives us:
- A point estimate (the mean of the distribution, for example): much the same as the existing predict_proba method
- An estimate of the uncertainty around this prediction (for example, a standard deviation under a Gaussian assumption)
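As a rough illustration of these two quantities, the sketch below summarizes several sampled outputs for a single input. It assumes the probabilistic model can be called repeatedly to draw per-class probability vectors; the `samples` values and the `summarize_samples` helper are made up for the example and are not part of Melusine or tensorflow-probability.

```python
from statistics import mean, stdev


def summarize_samples(samples):
    """Return per-class (means, stds) from sampled probability vectors.

    `samples` is a list of probability vectors, one per sampled forward
    pass, all for the same input (hypothetical data layout).
    """
    per_class = list(zip(*samples))  # regroup the values by class
    means = [mean(values) for values in per_class]
    stds = [stdev(values) for values in per_class]  # sample standard deviation
    return means, stds


# Three made-up samples for a two-class problem.
samples = [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]]
means, stds = summarize_samples(samples)
print(means, stds)
```

The mean plays the role of the usual predict_proba output, while the standard deviation is the extra information a deterministic network does not provide.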
Examples:
Using the Melusine tutorial, instead of only getting the point estimate from the predict or predict_proba method, we can get upper and lower bounds on the estimated probabilities.
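One possible way to derive such bounds, assuming the model yields a mean and a standard deviation per class and that a Gaussian approximation is acceptable, is a symmetric interval clipped to [0, 1]. The `probability_bounds` helper, the 1.96 multiplier (roughly a 95% interval under the Gaussian assumption) and the input values are assumptions for illustration, not the Melusine API.

```python
def probability_bounds(means, stds, z=1.96):
    """Return a (lower, upper) pair per class as mean -/+ z * std.

    Bounds are clipped to [0, 1] so that they remain valid probabilities.
    """
    bounds = []
    for m, s in zip(means, stds):
        lower = max(0.0, m - z * s)
        upper = min(1.0, m + z * s)
        bounds.append((lower, upper))
    return bounds


# Made-up per-class means and standard deviations.
bounds = probability_bounds([0.7, 0.3], [0.1, 0.1])
for lower, upper in bounds:
    print(f"[{lower:.3f}, {upper:.3f}]")
```

Clipping makes the interval asymmetric near 0 or 1; quantiles of the sampled predictions would be an alternative that avoids the Gaussian assumption.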
Blockers:
- Warning about the tensorflow-probability dependency. In my environment tf-probability is already present thanks to tensorflow, but we could make this dependency optional for users who don't want it in their environment.
- The tf-probability versions of cnn_model, rnn_model, transformers_model... will look very much like the existing architectures. To be compatible with NeuralModel, I can just propose new functions that look exactly the same but with small modifications. If the architectures were split into macro-blocks (embedding/Conv/RNN/Transformer/Outputs), we could avoid the copy-paste that I'm about to do.
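For the first blocker, one common pattern for an optional dependency is to check for the package only when a probabilistic model is requested, and to fail with a clear message if it is missing. A minimal sketch; `is_installed` and `require_tfp` are hypothetical helper names, not existing Melusine functions.

```python
import importlib
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_tfp():
    """Import and return tensorflow_probability, or raise a helpful error."""
    if not is_installed("tensorflow_probability"):
        raise ImportError(
            "tensorflow-probability is required for the probabilistic models; "
            "install it with `pip install tensorflow-probability`."
        )
    return importlib.import_module("tensorflow_probability")
```

The deterministic architectures would never call `require_tfp`, so users who do not install the package would be unaffected.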
Definition of Done:
Add new models in neural_architecture (like cnn_model but with tf-probability capabilities)
melusine/melusine/models/train.py
Line 24 in 4a0a181
The predict_proba method of this new type of model will provide a better estimate of the predicted probability, along with upper and lower bounds.
I'm currently working on it. Happy to discuss this topic.