AyushSomani001 / IDL

This repository contains the practice modules for the book 'Interpretability in Deep Learning'. It is recommended for AI practitioners and anyone who wants an overview of techniques for making deep learning models more interpretable. To benefit fully from the book, readers should be acquainted with basic deep learning terminology; however, the book also introduces fundamental concepts for the readers' convenience, and the intuitive explanation at the beginning of each chapter should be understandable without the mathematics.

Interpretable DL Playground

This repository is a comprehensive resource for AI practitioners and enthusiasts to explore and practice interpretable deep learning techniques. It features a curated collection of practice modules and intuitive explanations accompanying the book "Interpretability in Deep Learning" (Springer, 2023).

While we do recommend basic deep learning knowledge, we have also included fundamental concepts for the convenience of our readers. Our goal is to help you understand the most prevalent practices in explainable AI and to provide, at the beginning of each chapter, an intuitive explanation of the techniques without delving too deeply into the mathematics.

About the Repository

Are you tired of black-box deep learning models that are difficult to interpret and explain? Look no further! This repository contains a curated collection of practice modules, expanded over time, for interpretable AI and trustworthy model development.

Whether you're an AI practitioner or simply interested in learning more about interpretable deep learning techniques, this repository is a great place to start. But that's not all! We've also included resources to deepen your understanding of the topic. The practice modules will be expanded over time to cover more advanced topics, so you can apply the techniques hands-on and see the benefits of interpretable AI for yourself.

Research Papers - To Read

| Paper Title | Conference/Journal | Author(s) | Description |
| --- | --- | --- | --- |
| Towards Robust Interpretability with Self-Explaining Neural Networks | NeurIPS 2018 | D. Alvarez-Melis and T. S. Jaakkola | Proposes self-explaining neural networks, which learn to provide interpretable explanations for their predictions by incorporating explicit explanatory factors into the model architecture. |
| Sanity Checks for Saliency Maps | NeurIPS 2018 | Adebayo et al. | Highlights the limitations of saliency methods and proposes sanity checks to ensure that the explanations provided by these methods are meaningful and reliable (see the saliency sketch after this list). |
| On the (In)fidelity and Sensitivity of Explanations | NeurIPS 2019 | Yeh et al. | Introduces two metrics, (in)fidelity and sensitivity, to evaluate the quality of explanations provided by various interpretability methods, including saliency maps and CAMs. |
| Invariant Risk Minimization | arXiv 2019 | Arjovsky et al. | Introduces Invariant Risk Minimization, a learning framework that encourages models to rely on features that are invariant across different environments, leading to more robust and interpretable predictions. |
| Explanation by Progressive Exaggeration | ICLR 2020 | Singla et al. | A method that iteratively exaggerates the most important features in the input to generate more robust and interpretable explanations. GitHub |
| Relevance-CAM: Your Model Already Knows Where to Look | CVPR 2021 | Lee et al. | A method to generate class-discriminative visual explanations using pre-trained deep neural networks without additional training or modification. |
| Neural Prototype Trees for Interpretable Fine-Grained Image Recognition | CVPR 2021 | Nauta et al. | A hierarchical approach to interpretable fine-grained image recognition that combines neural networks with decision trees. |
| Concept-Monitor: Understanding DNN training through individual neurons | arXiv 2023 | Khan et al. | A framework for demystifying black-box training processes using a unified embedding space and a concept-diversity metric, enabling interpretable visualization, improved training performance, and application to various training paradigms. |
Recent and interesting arXiv papers on ChatGPT for research:

[1] Differentiate ChatGPT-generated and Human-written Medical Texts | Liao et al. (2023)
[2] In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT | Shen et al. (2023)
[3] Toxicity in ChatGPT: Analyzing Persona-assigned Language Models | Deshpande et al. (2023)

Intriguing model-improvement application paper:

[1] Attention-based Dropout Layer for Weakly Supervised Object Localization | Junsuk Choe and Hyunjung Shim (CVPR 2019) | GitHub
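
Several of the papers above (e.g., Sanity Checks for Saliency Maps, Relevance-CAM) revolve around saliency-style attributions. As a minimal sketch of the basic idea, the following PyTorch snippet computes a vanilla gradient saliency map for a pretrained ImageNet classifier. The model choice (resnet18) and the input image path are illustrative assumptions, not something prescribed by this repository or the book.

```python
# A minimal sketch (not part of this repository): vanilla gradient saliency
# in PyTorch. The model (resnet18) and the image path are illustrative only.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "example.jpg" is a placeholder input image.
x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
x.requires_grad_(True)

# Gradient of the top-class score with respect to the input pixels.
scores = model(x)
top_class = scores.argmax(dim=1).item()
scores[0, top_class].backward()

# Saliency map: per-pixel maximum absolute gradient over the color channels.
saliency = x.grad.abs().max(dim=1)[0].squeeze(0)  # shape: (224, 224)
```

A map like this can then be subjected to the sanity checks proposed by Adebayo et al., for example by randomizing the model weights and verifying that the saliency map degrades accordingly.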

Citation

If you wish to cite the book "Interpretability in Deep Learning", feel free to use this BibTeX reference:

@book{somani2023interpretability,
  title={Interpretability in Deep Learning},
  author={Somani, Ayush and Horsch, Alexander and Prasad, Dilip K},
  year={2023},
  publisher={Springer Nature}
}

Book Cover

Contributing

Would you like to extend the range of interpretability methods and help make AI more trustworthy? Or perhaps submit a paper implementation? Any sort of contribution is greatly appreciated!

Languages

Jupyter Notebook: 96.9%, Python: 3.1%