flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Home Page: https://flairnlp.github.io/flair/

Explainable AI for Flair

krzysztoffiok opened this issue · comments

Hi,

has anyone tried, or posted an idea about, integrating an Explainable AI (XAI) tool with Flair?

There are more or less universal solutions that also work for DL and NLP, like LIME or, better, SHAP.

It would really be great to be able to explain why a given model predicts what it predicts, either separately for each text instance or model-wide.

Homepage of SHAP: https://github.com/slundberg/shap

Would be interested in this as well. For character language models, there are some visualizations of hidden states that give some indication of what's happening (see for instance this blog post by Andrej Karpathy).

I've demonstrated that something like this is feasible with word embeddings in a repo on my GitHub called "active explainable classification", which uses Flair for embeddings and ELI5 for LIME.
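For anyone curious, the general shape of that approach looks roughly like this (a sketch, not the exact repo code; the embedding choice, toy data, and classifier are just placeholders):

```python
# Sketch: wrap Flair document embeddings + an sklearn classifier behind a
# predict_proba function so ELI5's LIME-based TextExplainer can probe it.
import numpy as np
from flair.data import Sentence
from flair.embeddings import TransformerDocumentEmbeddings
from sklearn.linear_model import LogisticRegression
from eli5.lime import TextExplainer

embedder = TransformerDocumentEmbeddings("distilbert-base-uncased")  # any document embedding works

def embed(texts):
    """Turn raw strings into fixed-size Flair document embeddings."""
    sentences = [Sentence(t) for t in texts]
    embedder.embed(sentences)
    return np.stack([s.embedding.detach().cpu().numpy() for s in sentences])

# toy training data, just to make the sketch self-contained
train_texts = ["great movie, loved it", "terrible and boring", "wonderful acting", "awful plot"]
train_labels = [1, 0, 1, 0]
clf = LogisticRegression().fit(embed(train_texts), train_labels)

def predict_proba(texts):
    """Black-box prediction function that LIME perturbs and queries."""
    return clf.predict_proba(embed(texts))

te = TextExplainer(random_state=42)
te.fit("the plot was awful but the acting was wonderful", predict_proba)
te.show_prediction()  # highlights which tokens pushed the prediction (in a notebook)
```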

I would be interested too (preferring SHAP).

Is similar work being done for NER? If something has been done, can you point me in the right direction?

Hi again @alanakbik, I've done a quick review of the literature (I wouldn't call it systematic), and what I found about XAI in NLP classification is:

If you analyze text using features that are understandable by humans, e.g. those provided by lexicon-based methods like LIWC, SEANCE, or term frequency, and feed them into an ML model, then it is easy to use out-of-the-box packages like LIME or SHAP (see the sketch below). With these packages you can achieve either instance-level or model-level explanations.
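As a concrete illustration of this first case, a minimal sketch with toy data (term-frequency features plus a linear model; the exact SHAP plotting calls can vary a bit between library versions):

```python
# Sketch: explain a classifier trained on human-readable term features with SHAP,
# both instance-level (one document) and model-level (aggregated over the dataset).
import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great movie", "boring and slow", "wonderful cast", "awful script"]
labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()   # each column is an interpretable term
clf = LogisticRegression().fit(X, labels)

explainer = shap.LinearExplainer(clf, X)
shap_values = explainer.shap_values(X)          # per-document, per-term contributions
feature_names = vectorizer.get_feature_names_out()  # requires a recent scikit-learn

# instance-level: which terms pushed document 0 towards its predicted class
shap.force_plot(explainer.expected_value, shap_values[0], feature_names=feature_names)

# model-level: aggregated term importance over the whole dataset
shap.summary_plot(shap_values, X, feature_names=feature_names)
```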

For text representations created by LSTMs based on LMs that provide simple static word embeddings (i.e. not changing with the context of the token in a sentence), it is possible to create instance-level visualizations of the rationale for model predictions, as shown in [Li, J., Chen, X., Hovy, E., & Jurafsky, D. (2015). Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066.] and [Arras, L., Montavon, G., Müller, K. R., & Samek, W. (2017). Explaining recurrent neural network predictions in sentiment analysis. arXiv preprint arXiv:1706.07206.]. Also, [Karpathy, A. (2015). The unreasonable effectiveness of recurrent neural networks. Andrej Karpathy blog, 21, 23.] showed that this is possible for character-level LMs with recurrent neural networks. Unfortunately, these prediction models do not provide state-of-the-art performance, and there are no ready-to-use packages to try these methods on your own model. In all these cases, instance-level explanations are presented.

Also, I found that if a more complex, context-aware method of creating token representations is used (like a transformer model), there are no methods that allow presenting the model's rationale for its predictions. The features they produce are not interpretable, and I haven't found any methods to map those embeddings back to tokens.

Do you think what I wrote is true? Did I miss something obvious?

Thanks for sharing the overview! I think there are a few tools for visualizing attention in transformers, such as https://github.com/jessevig/bertviz - maybe they can also be used for visualizing attention in transformers that have been fine-tuned to certain tasks?
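For reference, basic bertviz usage looks roughly like this (a sketch; the checkpoint name is just an example, and for a fine-tuned model you would load that checkpoint instead):

```python
# Sketch: visualize self-attention heads with bertviz's head_view (runs in a notebook).
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

model_name = "bert-base-uncased"  # example checkpoint; swap in a fine-tuned one
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

inputs = tokenizer.encode("My wonderful Dad wrote the eulogy.", return_tensors="pt")
outputs = model(inputs)
attention = outputs.attentions              # one attention tensor per layer (recent transformers versions)
tokens = tokenizer.convert_ids_to_tokens(inputs[0])

head_view(attention, tokens)                # interactive per-layer, per-head view
```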

Hmm, thanks a lot for this link, I'll definitely check it out.

@alanakbik thanks again. I see the tool is nice because, first of all, it works out of the box, and it allows a very detailed inspection of what is going on in the model. At the same time, it doesn't offer any aggregated view; it is rather impossible to get an answer from this tool to the question "why did my model label this sentence as class x?" Actually, I don't see any relation to the classification task here, it only shows what the attention mechanism focuses on. So maybe the future will bring some sort of aggregation/reasoning on top of this extracted knowledge...

Or can the tokens that are linked to the [CLS] token be considered as strongly influencing the classification output? For instance, in the figure below the more distant tokens "wonderful" and "Dad" seem to be strongly connected with [CLS]. Do you think this might be the right way to interpret this?

[figure3: bertviz attention visualization of the example sentence, head 0, layer 0]

I think so - if the CLS token is used for classification and the model is fine-tuned then maybe it could be interpreted this way. Of course normally there are many layers of self-attention, so I am not sure how this visualization deals with that.

This figure was the output of head 0, layer 0; if I select differently, then nothing reasonable is output (see below). Also, it's funny that the tokenizer divided "eulogy" into "e ul ogy"...

[layer3head3: attention visualization for a different layer/head selection]

Hello,

I am new to using FLAIR. Is this still an active endeavor for the contributors and developers of FLAIR?

Other NLP toolkits already have simple gradient visualization and other interpretation methods implemented (e.g. https://allennlp.org/interpret, with a demo at https://demo.allennlp.org/sentiment-analysis/). Links to the specific literature can be found through the second link. I think these methods could be a valuable asset if integrated into FLAIR.
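For reference, the AllenNLP Interpret saliency API looks roughly like this (a sketch; the model archive path below is a placeholder, not a specific released model):

```python
# Sketch: per-token saliency via AllenNLP Interpret's simple-gradient method.
from allennlp.predictors.predictor import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

predictor = Predictor.from_path("path/to/text_classifier_model.tar.gz")  # placeholder archive
interpreter = SimpleGradient(predictor)

# returns gradient-based importance scores for every input token
saliency = interpreter.saliency_interpret_from_json({"sentence": "This movie was wonderful."})
print(saliency)
```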

If these are already implemented in FLAIR, could you please explain how I could use them?

Thanks!

It would be cool to have this in Flair - we ourselves don't currently have the capacity to integrate visualization options, but maybe someone in the community is interested to do this?

I think that explainable AI would be great!

Recently, a tool from Google called LIT (the Language Interpretability Tool) was released.

While this repository looks nice, it is still in its infancy.

There is some documentation on how to add models to the LIT framework here.

However, I don't really have a grasp of whether the implementation of adding new models will be scalable, nor whether this process will differ greatly between all the available (fine-tuned) models in FLAIR.

Adding to the discussion: CAPTUM has been used with FLAIR. I have not yet achieved this myself, but it should be possible.

pytorch/captum#414 (comment)

I added my work-in-progress of using Captum to explain my Flair model in this repository.

Given that I had to create a model wrapper and reverse engineer the forward function to make it work, I am not sure if the route I have taken is the optimal one or the correct one.
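Very roughly, the wrapper idea looks like this (a sketch only, not the exact code in that repository; attribute names such as document_embeddings.model and decoder assume a transformer-based TextClassifier in a recent Flair version, and the model path is a placeholder):

```python
# Sketch: expose a Flair TextClassifier to Captum by wrapping it in a plain
# nn.Module whose forward() maps token ids to class probabilities.
import torch
from captum.attr import LayerIntegratedGradients
from flair.models import TextClassifier

class FlairModelWrapper(torch.nn.Module):
    def __init__(self, classifier: TextClassifier):
        super().__init__()
        self.classifier = classifier
        self.embeddings = classifier.document_embeddings   # assumed TransformerDocumentEmbeddings
        self.model = self.embeddings.model                 # underlying Hugging Face transformer
        self.tokenizer = self.embeddings.tokenizer

    def forward(self, input_ids, attention_mask=None):
        # roughly what Flair does internally: use the [CLS] hidden state as the
        # document embedding and push it through the classifier's linear decoder
        hidden_states = self.model(input_ids, attention_mask=attention_mask)[0]
        doc_embedding = hidden_states[:, 0, :]
        return torch.softmax(self.classifier.decoder(doc_embedding), dim=-1)

classifier = TextClassifier.load("path/to/model.pt")        # placeholder path
wrapper = FlairModelWrapper(classifier).eval()

encoded = wrapper.tokenizer("I really enjoyed this film.", return_tensors="pt")
input_ids, attention_mask = encoded["input_ids"], encoded["attention_mask"]
baseline_ids = torch.full_like(input_ids, wrapper.tokenizer.pad_token_id)

# attribute the predicted class back to the input token embeddings
lig = LayerIntegratedGradients(wrapper, wrapper.model.embeddings)
target = wrapper(input_ids, attention_mask).argmax(dim=-1).item()
attributions = lig.attribute(input_ids, baselines=baseline_ids,
                             additional_forward_args=(attention_mask,),
                             target=target)
```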

@alanakbik If you have any pointers, then it would be greatly appreciated. 👍

I will also try to upload my trained text-classifier model in order to make the repo run end-to-end. I'm unsure whether GitHub LFS will be suitable, as my model.pt file is around 1 GB.

@robinvanschaik thanks for sharing! I'm super swamped this week but I'll try to go through at the beginning of next week!

@alanakbik Thank you very much. There is no rush on my end, so feel free to pick a moment which suits you.

@robinvanschaik we checked it out and it's really helpful!

I wonder if there's a way to create a wrapper so that any Flair tagger works, not only those that use transformers? Also, I think this approach would be great for the new TARS zero-shot classifier we just released - explainability for zero-shot predictions would be a cool feature!
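For context, zero-shot prediction with TARS looks roughly like this (a sketch; the exact import path and model name may differ between Flair versions), so an explainability wrapper would have to attribute predictions made this way:

```python
# Sketch: zero-shot classification with TARS -- the kind of prediction one
# would want to explain with Captum or SHAP.
from flair.data import Sentence
from flair.models import TARSClassifier   # import path may vary by Flair version

tars = TARSClassifier.load("tars-base")

sentence = Sentence("I am so glad you liked it!")
candidate_labels = ["happy", "sad"]

# the candidate labels are supplied at prediction time, with no task-specific training
tars.predict_zero_shot(sentence, candidate_labels)
print(sentence.labels)
```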

Hi @alanakbik ,

Thanks for the feedback. Much appreciated!

I like the idea of adding CAPTUM to the TARS classifier.
Given that the FLAIR team has released a pre-trained model, it will be easier to run the examples end-to-end.

Regarding the other options, I might pick that up after TARS. I am not really experienced with the other types of models that FLAIR offers, but I think it is doable based on the tutorials that the Captum team has released.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I have just found that https://github.com/slundberg/shap#natural-language-example-transformers presents an example of XAI for transformer models that is far more interpretable than the earlier discussed bertviz and similar tools.

Did anyone here try to use this new feature of SHAP on fine-tuned models from Flair? Does it work?
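The SHAP example in question boils down to roughly this (a sketch; a Flair fine-tuned transformer would first have to be exposed as a Hugging Face pipeline or an equivalent callable):

```python
# Sketch: SHAP's text explainer on a Hugging Face sentiment pipeline.
import transformers
import shap

classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)
shap_values = explainer(["What a wonderful eulogy my Dad wrote!"])

# token-level contributions to the predicted sentiment, rendered as highlighted text
shap.plots.text(shap_values)
```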

@robinvanschaik I have tried out your solution and it is great. This is exactly something I was looking for! Thank you for contributing this.