Improving performance of deep learning models with axiomatic attribution priors and expected gradients
Incorporating priors with feature attribution on text classification
Robust attribution regularization
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge