flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Home Page: https://flairnlp.github.io/flair/

Explainable AI for Flair

krzysztoffiok opened this issue · comments

Hi,

has anyone tried, or posted an idea about, integrating an Explainable AI (XAI) tool with Flair?

There are more or less universal solutions that also work for DL and NLP, like LIME or, better, SHAP.

It would really be great to be able to explain why a given model predicts what it predicts, either separately for each text instance or model-wide.

Homepage of SHAP: https://github.com/slundberg/shap

Would be interested in this as well. For character language models, there are some visualizations of hidden states that give some indication of what's happening (see for instance this blog post by Andrej Karpathy).

I've demonstrated that something like this is feasible with word embeddings in a repo on my GitHub called "active explainable classification", which uses Flair for embeddings and ELI5 for LIME.
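For anyone curious, the general shape of that approach looks roughly like this (a sketch, not the exact repo code; the embedding choice, toy data, and classifier are just placeholders):

```python
# Sketch: wrap Flair document embeddings + an sklearn classifier behind a
# predict_proba function so ELI5's LIME-based TextExplainer can probe it.
import numpy as np
from flair.data import Sentence
from flair.embeddings import TransformerDocumentEmbeddings
from sklearn.linear_model import LogisticRegression
from eli5.lime import TextExplainer

embedder = TransformerDocumentEmbeddings("distilbert-base-uncased")  # any document embedding works

def embed(texts):
    """Turn raw strings into fixed-size Flair document embeddings."""
    sentences = [Sentence(t) for t in texts]
    embedder.embed(sentences)
    return np.stack([s.embedding.detach().cpu().numpy() for s in sentences])

# toy training data, just to make the sketch self-contained
train_texts = ["great movie, loved it", "terrible and boring", "wonderful acting", "awful plot"]
train_labels = [1, 0, 1, 0]
clf = LogisticRegression().fit(embed(train_texts), train_labels)

def predict_proba(texts):
    """Black-box prediction function that LIME perturbs and queries."""
    return clf.predict_proba(embed(texts))

te = TextExplainer(random_state=42)
te.fit("the plot was awful but the acting was wonderful", predict_proba)
te.show_prediction()  # highlights which tokens pushed the prediction (in a notebook)
```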

I would be interested too (preferring SHAP).

Is similar work being done for NER? If something has been done, can you point me in the right direction?

Hi again @alanakbik, I've done a quick review of the literature (I wouldn't call it systematic), and what I found about XAI in NLP classification is:

If you analyze text using features that are understandable by humans, e.g. those provided by lexicon-based methods like LIWC, SEANCE, or term frequency, and feed them into an ML model, then it is easy to use out-of-the-box packages like LIME or SHAP (see the sketch below). With these packages you can achieve either instance-level or model-level explanations.
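As a concrete illustration of this first case, a minimal sketch with toy data (term-frequency features plus a linear model; the exact SHAP plotting calls can vary a bit between library versions):

```python
# Sketch: explain a classifier trained on human-readable term features with SHAP,
# both instance-level (one document) and model-level (aggregated over the dataset).
import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great movie", "boring and slow", "wonderful cast", "awful script"]
labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()   # each column is an interpretable term
clf = LogisticRegression().fit(X, labels)

explainer = shap.LinearExplainer(clf, X)
shap_values = explainer.shap_values(X)          # per-document, per-term contributions
feature_names = vectorizer.get_feature_names_out()  # requires a recent scikit-learn

# instance-level: which terms pushed document 0 towards its predicted class
shap.force_plot(explainer.expected_value, shap_values[0], feature_names=feature_names)

# model-level: aggregated term importance over the whole dataset
shap.summary_plot(shap_values, X, feature_names=feature_names)
```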

For text representations created by LSTMs based on LMs that provide simple static word embeddings (i.e. not changing with the context of the token in a sentence), it is possible to create instance-level visualizations of the rationale for model predictions, as shown in [Li, J., Chen, X., Hovy, E., & Jurafsky, D. (2015). Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066.] and [Arras, L., Montavon, G., Müller, K. R., & Samek, W. (2017). Explaining recurrent neural network predictions in sentiment analysis. arXiv preprint arXiv:1706.07206.]. Also, [Karpathy, A. (2015). The unreasonable effectiveness of recurrent neural networks. Andrej Karpathy blog, 21, 23.] showed that this is possible for character-level LMs with recurrent neural networks. Unfortunately, these prediction models do not provide state-of-the-art performance, and there are no ready-to-use packages to try these methods on your own model. In all these cases, instance-level explanations are presented.

Also, I found that if a more complex, context-aware method of creating token representations is used (like a transformer model), there are no methods that allow presenting the model's rationale for its predictions. The features they produce are not interpretable, and I haven't found any methods to map those embeddings back to tokens.

Do you think what I wrote is true? Did I miss something obvious?

Thanks for sharing the overview! I think there are a few tools for visualizing attention in transformers, such as https://github.com/jessevig/bertviz - maybe they can also be used for visualizing attention in transformers that have been fine-tuned to certain tasks?
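For reference, basic bertviz usage looks roughly like this (a sketch; the checkpoint name is just an example, and for a fine-tuned model you would load that checkpoint instead):

```python
# Sketch: visualize self-attention heads with bertviz's head_view (runs in a notebook).
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

model_name = "bert-base-uncased"  # example checkpoint; swap in a fine-tuned one
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

inputs = tokenizer.encode("My wonderful Dad wrote the eulogy.", return_tensors="pt")
outputs = model(inputs)
attention = outputs.attentions              # one attention tensor per layer (recent transformers versions)
tokens = tokenizer.convert_ids_to_tokens(inputs[0])

head_view(attention, tokens)                # interactive per-layer, per-head view
```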

Hmm, thanks a lot for this link, I'll definitely check it out.

@alanakbik thanks again. I see the tool is nice because, first of all, it works out of the box, and it allows a very detailed inspection of what is going on in the model. At the same time, it doesn't offer any aggregated view; it is rather impossible to get an answer from this tool to the question "why did my model label this sentence as class x?" Actually, I don't see any relation to the classification task here, it only shows what the attention mechanism focuses on. So maybe the future will bring some sort of aggregation/reasoning on top of this extracted knowledge...

Or can the tokens that are linked to the [CLS] token be considered as strongly influencing the classification output? For instance, in the figure below the more distant tokens "wonderful" and "Dad" seem to be strongly connected with [CLS]. Do you think this might be the right way to interpret this?

[figure3: bertviz attention visualization of the example sentence, head 0, layer 0]

I think so - if the CLS token is used for classification and the model is fine-tuned then maybe it could be interpreted this way. Of course normally there are many layers of self-attention, so I am not sure how this visualization deals with that.

This figure was the output of head 0, layer 0; if I select differently, then nothing reasonable is output (see below). Also, it's funny that the tokenizer divided "eulogy" into "e ul ogy"...

[layer3head3: attention visualization for a different layer/head selection]

Hello,

I am new to using FLAIR. Is this still an active endeavor for the contributors and developers of FLAIR?

Other NLP toolkits already have simple gradient visualization and other interpretation methods implemented (e.g. https://allennlp.org/interpret, with a demo at https://demo.allennlp.org/sentiment-analysis/). Links to the specific literature can be found through the second link. I think these methods could be a valuable asset if integrated into FLAIR.
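For reference, the AllenNLP Interpret saliency API looks roughly like this (a sketch; the model archive path below is a placeholder, not a specific released model):

```python
# Sketch: per-token saliency via AllenNLP Interpret's simple-gradient method.
from allennlp.predictors.predictor import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

predictor = Predictor.from_path("path/to/text_classifier_model.tar.gz")  # placeholder archive
interpreter = SimpleGradient(predictor)

# returns gradient-based importance scores for every input token
saliency = interpreter.saliency_interpret_from_json({"sentence": "This movie was wonderful."})
print(saliency)
```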

If these are already implemented in FLAIR, could you please explain how I could use them?

Thanks!

It would be cool to have this in Flair - we ourselves don't currently have the capacity to integrate visualization options, but maybe someone in the community is interested to do this?

I think that explainable AI would be great!

Recently, a tool from Google called LIT (the Language Interpretability Tool) was released.

While this repository looks nice, it is still in its infancy.

There is some documentation on how to add models to the LIT framework here.

However, I don't really have a grasp of whether the implementation of adding new models will be scalable, nor whether this process will differ greatly between all the available (fine-tuned) models in FLAIR.

Adding to the discussion: CAPTUM has been used with FLAIR. I have not yet achieved this myself, but it should be possible.

pytorch/captum#414 (comment)

I added my work-in-progress of using Captum to explain my Flair model in this repository.

Given that I had to create a model wrapper and reverse engineer the forward function to make it work, I am not sure if the route I have taken is the optimal one or the correct one.
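Very roughly, the wrapper idea looks like this (a sketch only, not the exact code in that repository; attribute names such as document_embeddings.model and decoder assume a transformer-based TextClassifier in a recent Flair version, and the model path is a placeholder):

```python
# Sketch: expose a Flair TextClassifier to Captum by wrapping it in a plain
# nn.Module whose forward() maps token ids to class probabilities.
import torch
from captum.attr import LayerIntegratedGradients
from flair.models import TextClassifier

class FlairModelWrapper(torch.nn.Module):
    def __init__(self, classifier: TextClassifier):
        super().__init__()
        self.classifier = classifier
        self.embeddings = classifier.document_embeddings   # assumed TransformerDocumentEmbeddings
        self.model = self.embeddings.model                 # underlying Hugging Face transformer
        self.tokenizer = self.embeddings.tokenizer

    def forward(self, input_ids, attention_mask=None):
        # roughly what Flair does internally: use the [CLS] hidden state as the
        # document embedding and push it through the classifier's linear decoder
        hidden_states = self.model(input_ids, attention_mask=attention_mask)[0]
        doc_embedding = hidden_states[:, 0, :]
        return torch.softmax(self.classifier.decoder(doc_embedding), dim=-1)

classifier = TextClassifier.load("path/to/model.pt")        # placeholder path
wrapper = FlairModelWrapper(classifier).eval()

encoded = wrapper.tokenizer("I really enjoyed this film.", return_tensors="pt")
input_ids, attention_mask = encoded["input_ids"], encoded["attention_mask"]
baseline_ids = torch.full_like(input_ids, wrapper.tokenizer.pad_token_id)

# attribute the predicted class back to the input token embeddings
lig = LayerIntegratedGradients(wrapper, wrapper.model.embeddings)
target = wrapper(input_ids, attention_mask).argmax(dim=-1).item()
attributions = lig.attribute(input_ids, baselines=baseline_ids,
                             additional_forward_args=(attention_mask,),
                             target=target)
```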

@alanakbik If you have any pointers, then it would be greatly appreciated. 👍

I will also try to upload my trained text-classifier model in order to make the repo run end-to-end. I'm unsure whether GitHub LFS will be suitable, as my model.pt file is around 1 GB.

@robinvanschaik thanks for sharing! I'm super swamped this week but I'll try to go through at the beginning of next week!

@alanakbik Thank you very much. There is no rush on my end, so feel free to pick a moment which suits you.

@robinvanschaik we checked it out and it's really helpful!

I wonder if there's a way to create a wrapper so that any Flair tagger works, not only those that use transformers? Also, I think this approach would be great for the new TARS zero-shot classifier we just released - explainability for zero-shot predictions would be a cool feature!
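For context, zero-shot prediction with TARS looks roughly like this (a sketch; the exact import path and model name may differ between Flair versions), so an explainability wrapper would have to attribute predictions made this way:

```python
# Sketch: zero-shot classification with TARS -- the kind of prediction one
# would want to explain with Captum or SHAP.
from flair.data import Sentence
from flair.models import TARSClassifier   # import path may vary by Flair version

tars = TARSClassifier.load("tars-base")

sentence = Sentence("I am so glad you liked it!")
candidate_labels = ["happy", "sad"]

# the candidate labels are supplied at prediction time, with no task-specific training
tars.predict_zero_shot(sentence, candidate_labels)
print(sentence.labels)
```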

Hi @alanakbik ,

Thanks for the feedback. Much appreciated!

I like the idea of adding CAPTUM to the TARS classifier.
Given that the FLAIR team has released a pre-trained model, it will be easier to run the examples end-to-end.

Regarding the other options, I might pick that up after TARS. I am not really experienced with the other types of models that FLAIR offers, but I think it is doable based on the tutorials that the Captum team has released.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I have just found that https://github.com/slundberg/shap#natural-language-example-transformers presents an example of XAI for transformer models that is far more interpretable than the earlier discussed bertviz and similar tools.

Did anyone here try to use this new feature of SHAP on fine-tuned models from Flair? Does it work?
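The SHAP example in question boils down to roughly this (a sketch; a Flair fine-tuned transformer would first have to be exposed as a Hugging Face pipeline or an equivalent callable):

```python
# Sketch: SHAP's text explainer on a Hugging Face sentiment pipeline.
import transformers
import shap

classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)
shap_values = explainer(["What a wonderful eulogy my Dad wrote!"])

# token-level contributions to the predicted sentiment, rendered as highlighted text
shap.plots.text(shap_values)
```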

@robinvanschaik I have tried out your solution and it is great. This is exactly something I was looking for! Thank you for contributing this.