
Explainable AI for Flair #1504

Closed
krzysztoffiok opened this issue Apr 1, 2020 · 22 comments
Labels
wontfix This will not be worked on

Comments

@krzysztoffiok

krzysztoffiok commented Apr 1, 2020

Hi,

has anyone tried integrating an Explainable AI (XAI) tool with Flair, or posted an idea for doing so?

There are more or less universal solutions that also work for DL and NLP, such as LIME or, better yet, SHAP.

It would really be great to be able to explain why a given model predicts what it predicts, either separately for each text instance or at the model level.

Homepage of SHAP: https://github.com/slundberg/shap

@alanakbik
Collaborator

Would be interested in this as well. For character language models, there are some visualizations of hidden states that give some indication of what's happening (see for instance this blog post by Andrej Karpathy).

@Hellisotherpeople

I've demonstrated that something like this is feasible with word embeddings in a repo on my GitHub called "active explainable classification", which uses Flair for embeddings and ELI5 for LIME.
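
For anyone curious, here is a minimal sketch of that kind of pipeline (not the actual code from that repo; the GloVe embeddings, the tiny placeholder training set and the scikit-learn classifier are just assumptions to keep it self-contained):

```python
import numpy as np
import eli5
from eli5.lime import TextExplainer
from flair.data import Sentence
from flair.embeddings import DocumentPoolEmbeddings, WordEmbeddings
from sklearn.linear_model import LogisticRegression

# Pooled static word embeddings as document vectors (placeholder choice).
embedder = DocumentPoolEmbeddings([WordEmbeddings("glove")])

def embed(texts):
    """Return an (n_texts, dim) matrix of pooled Flair embeddings."""
    vectors = []
    for text in texts:
        sentence = Sentence(text)
        embedder.embed(sentence)
        vectors.append(sentence.get_embedding().detach().cpu().numpy())
    return np.vstack(vectors)

# Tiny placeholder training set, just to make the sketch runnable end-to-end.
train_texts = ["great movie, loved it", "terrible plot and awful acting",
               "wonderful, touching story", "boring and badly written"]
train_labels = [1, 0, 1, 0]
clf = LogisticRegression().fit(embed(train_texts), train_labels)

def predict_proba(texts):
    return clf.predict_proba(embed(texts))

# LIME perturbs the input text and fits a local surrogate model around it.
te = TextExplainer(random_state=42)
te.fit("loved the story but the acting was terrible", predict_proba)
print(eli5.format_as_text(te.explain_prediction()))
```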

@djstrong
Contributor

djstrong commented Apr 2, 2020

I would be interested too (preferring SHAP).

@kdk2612

kdk2612 commented Apr 27, 2020

Is similar work being done for NER? If something has been done already, can you point me in the right direction?

@krzysztoffiok
Author

krzysztoffiok commented Jun 1, 2020

Hi again. @alanakbik, I've done a quick review of the literature (I wouldn't call it systematic), and what I found about XAI in NLP classification is:

If you analyze text using features that are understandable by humans, e.g. those provided by lexicon-based methods like LIWC, SEANCE, or term frequency, and feed them into an ML model, then it is easy to use out-of-the-box packages like LIME or SHAP. With these packages you can achieve either instance-level or model-level explanations.
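
A minimal sketch of this route, with TF-IDF standing in for the lexicon-based features and a random forest as the model (the corpus, labels and model choice are placeholders); `shap.force_plot` would give the corresponding instance-level view:

```python
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder corpus and labels; in practice the columns would be LIWC/SEANCE
# scores or term frequencies, i.e. features a human can read directly.
texts = ["good service and friendly staff",
         "rude staff and a very long wait",
         "friendly people, great experience",
         "long wait and a bad experience"]
labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()
feature_names = vectorizer.get_feature_names_out()

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)

# TreeExplainer computes SHAP values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # typically one array per class for classifiers

# Model-level view: which features matter across the whole corpus (class 1 here).
shap.summary_plot(shap_values[1], X, feature_names=feature_names)
```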

For text representations created by LSTMs on top of LMs that provide simple static word embeddings (i.e. embeddings that do not change with the context of the token in a sentence), it is possible to create instance-level visualizations of the rationale for model predictions, as shown in [Li, J., Chen, X., Hovy, E., & Jurafsky, D. (2015). Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066] and [Arras, L., Montavon, G., Müller, K. R., & Samek, W. (2017). Explaining recurrent neural network predictions in sentiment analysis. arXiv preprint arXiv:1706.07206]. Also, [Karpathy, A. (2015). The unreasonable effectiveness of recurrent neural networks. Andrej Karpathy blog, 21, 23] showed that this is possible for character-level LMs with recurrent neural networks. Unfortunately, these prediction models do not provide state-of-the-art performance, and there are no ready-to-use packages to try them on your own model. In all these cases, instance-level explanations are presented.

Also, I found that if a more complex, context-aware method for creating token representations is used (like a transformer model), there are no methods that allow presentation of the model's rationale for predictions. The features they produce are not interpretable, and I haven't found any methods to map those embeddings back to tokens.

Do you think what I wrote is true? Did I miss something obvious?

@alanakbik
Collaborator

Thanks for sharing the overview! I think there are a few tools for visualizing attention in transformers, such as https://github.com/jessevig/bertviz - maybe they can also be used for visualizing attention in transformers that have been fine-tuned on certain tasks?
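
For reference, the basic usage is roughly what the bertviz README shows (it renders in a Jupyter notebook); plugging in a model fine-tuned through Flair would mean pulling out the underlying HuggingFace model and its attention weights in the same way:

```python
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

model_name = "bert-base-uncased"  # placeholder; any (fine-tuned) BERT-style model works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

sentence = "It was a wonderful eulogy for Dad."
inputs = tokenizer.encode(sentence, return_tensors="pt")
attention = model(inputs).attentions  # tuple: one (batch, heads, seq, seq) tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs[0])

head_view(attention, tokens)  # interactive per-layer/per-head attention view
```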

@krzysztoffiok
Author

Hmm, thanks a lot for this link, I'll definitely check it out.

@krzysztoffiok
Author

@alanakbik thanks again. I see the tool is nice because, first of all, it works out of the box. It allows a very detailed inspection of what is going on in the model, which is nice. At the same time, it doesn't offer any aggregated view; it is rather impossible to get an answer from this tool to the question "why did my model label this sentence as class x?" Actually, I don't see any relation to the classification task here; it only shows what the attention mechanism focuses on. So maybe the future will bring some sort of aggregation/reasoning on top of this extracted knowledge...

@krzysztoffiok
Author

Or is it that the tokens linked to the [CLS] token can be considered as strongly influencing the classification output? For instance, in the figure below, the more distant tokens "wonderful" and "Dad" seem to be strongly connected with [CLS]. Do you think this might be the right way to interpret it?

[figure3: bertviz attention view, head 0 / layer 0]

@alanakbik
Collaborator

I think so - if the CLS token is used for classification and the model is fine-tuned then maybe it could be interpreted this way. Of course normally there are many layers of self-attention, so I am not sure how this visualization deals with that.
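
One naive way to deal with the many layers and heads would be to average the attention that [CLS] pays to each token over all of them, as in the sketch below. To be clear, this is just an illustration of the aggregation idea, not an established attribution method, and attention weights are only a rough proxy for what actually drives the classification:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

inputs = tokenizer("It was a wonderful eulogy for Dad.", return_tensors="pt")
attentions = model(**inputs).attentions   # tuple of (1, heads, seq, seq), one per layer
stacked = torch.stack(attentions)         # (layers, 1, heads, seq, seq)
cls_to_tokens = stacked[:, 0, :, 0, :]    # attention from [CLS] (position 0) to every token
scores = cls_to_tokens.mean(dim=(0, 1))   # average over layers and heads -> (seq,)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in sorted(zip(tokens, scores.tolist()), key=lambda x: -x[1]):
    print(f"{token:>12s}  {score:.3f}")
```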

@krzysztoffiok
Author

krzysztoffiok commented Jun 4, 2020

This figure was the output of head 0, layer 0; if I select differently, nothing reasonable is output (see below). Also, it's funny that the tokenizer divided "eulogy" into e ul ogy...

[layer3head3: bertviz attention view, layer 3 / head 3]

@HiyaToki

HiyaToki commented Aug 9, 2020

Hello,

I am new to using FLAIR. Is this still an active endeavor for the contributors and developers of FLAIR?

Other NLP toolkits already have simple gradient visualization and other interpretation methods implemented (e.g. https://allennlp.org/interpret, with a demo at https://demo.allennlp.org/sentiment-analysis/). Links to the specific literature can be found through the second link. I think these methods could be a valuable asset if integrated into FLAIR.
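
For reference, this is roughly how AllenNLP Interpret is invoked on one of its own predictors (AllenNLP 1.x/2.x API; the archive path is a placeholder, and nothing here is wired into FLAIR yet):

```python
from allennlp.predictors import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

# Load any archived AllenNLP text classifier (placeholder path).
predictor = Predictor.from_path("path/to/sentiment-model.tar.gz")
interpreter = SimpleGradient(predictor)

# Returns per-token gradient saliency scores for the model's prediction.
saliency = interpreter.saliency_interpret_from_json({"sentence": "a wonderful eulogy for Dad"})
print(saliency)
```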

If these are already implemented in FLAIR, could you please explain how I could use them?

Thanks!

@alanakbik
Collaborator

It would be cool to have this in Flair - we ourselves don't currently have the capacity to integrate visualization options, but maybe someone in the community is interested in doing this?

@robinvanschaik
Contributor

I think that explainable AI would be great!

Recently, Google released a tool called LIT (the Language Interpretability Tool).

While this repository looks nice, it is still in its infancy.

There is some documentation on how to add models to the LIT framework here.

However, I don't really have a grasp of whether the implementation of adding new models will be scalable, nor whether this process will differ greatly between all the available (fine-tuned) models in FLAIR.

@robinvanschaik
Contributor

Adding to the discussion: Captum has been used with FLAIR. I have not yet achieved this myself, but it should be possible.

pytorch/captum#414 (comment)

@robinvanschaik
Contributor

I added my work-in-progress of using Captum to explain my Flair model in this repository.

Given that I had to create a model wrapper and reverse-engineer the forward function to make it work, I am not sure whether the route I have taken is the optimal or correct one.
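
For context, the rough shape of the wrapper idea is sketched below. The attribute names (`document_embeddings.model`, `.tokenizer`, `.decoder`) depend on the Flair version and on the embedding class, so treat them as assumptions rather than a stable API:

```python
import torch
from captum.attr import LayerIntegratedGradients
from flair.models import TextClassifier

classifier = TextClassifier.load("path/to/model.pt")   # placeholder path
hf_model = classifier.document_embeddings.model        # underlying HuggingFace transformer (assumed attribute)
tokenizer = classifier.document_embeddings.tokenizer   # its tokenizer (assumed attribute)

def forward_func(input_ids):
    # Re-implements the classification head: transformer -> [CLS] vector -> Flair's linear decoder.
    outputs = hf_model(input_ids)
    cls_embedding = outputs.last_hidden_state[:, 0, :]
    return classifier.decoder(cls_embedding)

# Attribute through the transformer's embedding layer.
lig = LayerIntegratedGradients(forward_func, hf_model.embeddings)

text = "a wonderful eulogy for Dad"
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
baseline = torch.full_like(input_ids, tokenizer.pad_token_id)

attributions = lig.attribute(input_ids, baselines=baseline, target=0)  # target 0 is just an example class index
token_scores = attributions.sum(dim=-1).squeeze(0)
print(list(zip(tokenizer.convert_ids_to_tokens(input_ids[0]), token_scores.tolist())))
```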

@alanakbik If you have any pointers, then it would be greatly appreciated. 👍

I will also try to upload my trained text-classifier model in order to make the repo run end-to-end. I am unsure whether GitHub LFS will be suitable, as my model.pt file is around 1 GB.

@alanakbik
Collaborator

@robinvanschaik thanks for sharing! I'm super swamped this week but I'll try to go through at the beginning of next week!

@robinvanschaik
Contributor

@alanakbik Thank you very much. There is no rush on my end, so feel free to pick a moment which suits you.

@alanakbik
Collaborator

@robinvanschaik we checked it out and it's really helpful!

I wonder if there's a way to create a wrapper so that any Flair tagger works and not only those that use transformers? Also, I think this approach would be great for the new TARS zero-shot classifier we just released. Explainability for zero-shot predictions would be a cool feature!

@robinvanschaik
Contributor

robinvanschaik commented Dec 3, 2020

Hi @alanakbik ,

Thanks for the feedback. Much appreciated!

I like the idea of adding CAPTUM to the TARS classifier.
Given that the FLAIR team has released a pre-trained model, it will be easier to run the examples end-to-end.

Regarding the other options, I might pick that up after TARS. I am not really experienced with the other types of models that FLAIR offers, but I think it is doable based on the tutorials that the Captum team has released.

@stale

stale bot commented Apr 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label on Apr 2, 2021
@stale stale bot closed this as completed on Apr 10, 2021
@krzysztoffiok
Author

krzysztoffiok commented May 11, 2021

I have just found that https://github.com/slundberg/shap#natural-language-example-transformers presents an example of XAI for transformer models that is far more interpretable than the earlier-discussed bertviz and similar tools.

Has anyone here tried to use this new feature in SHAP on fine-tuned models from Flair? Does it work?
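
For reference, the README example looks roughly like this, with an off-the-shelf HuggingFace sentiment pipeline; pointing it at a Flair TextClassifier would presumably need a small wrapper exposing a text-to-class-probabilities function instead:

```python
import shap
import transformers

# Off-the-shelf sentiment pipeline (not a Flair model).
classifier = transformers.pipeline("sentiment-analysis", return_all_scores=True)

explainer = shap.Explainer(classifier)
shap_values = explainer(["It was a wonderful eulogy for Dad."])

# Token-level contributions to each class, rendered as highlighted text (notebook output).
shap.plots.text(shap_values)
```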

@robinvanschaik I have tried out your solution and it is great. This is exactly something I was looking for! Thank you for contributing this.
