A repository of recent explainable AI / interpretable ML approaches.
Title | Venue | Year | Code | Keywords | Summary |
---|---|---|---|---|---|
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission | KDD | 2015 | N/A | `` | |
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model | arXiv | 2015 | N/A | `` | |
Interpretable Decision Sets: A Joint Framework for Description and Prediction | KDD | 2016 | N/A | `` | |
"Why Should I Trust You?": Explaining the Predictions of Any Classifier | KDD | 2016 | N/A | `` | |
Towards A Rigorous Science of Interpretable Machine Learning | arXiv | 2017 | N/A | Review Paper | |
Transparency: Motivations and Challenges | arXiv | 2017 | N/A | Review Paper | |
A Unified Approach to Interpreting Model Predictions | NeurIPS | 2017 | N/A | `` | |
SmoothGrad: removing noise by adding noise | ICML (Workshop) | 2017 | Github | `` | |
Axiomatic Attribution for Deep Networks | ICML | 2017 | N/A | `` | |
Learning Important Features Through Propagating Activation Differences | ICML | 2017 | N/A | `` | |
Understanding Black-box Predictions via Influence Functions | ICML | 2017 | N/A | `` | |
Network Dissection: Quantifying Interpretability of Deep Visual Representations | CVPR | 2017 | N/A | `` | |
Explainable Prediction of Medical Codes from Clinical Text | NAACL | 2018 | N/A | `` | |
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) | ICML | 2018 | N/A | `` | |
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | HJTL | 2018 | N/A | `` | |
Sanity Checks for Saliency Maps | NeurIPS | 2018 | N/A | `` | |
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions | AAAI | 2018 | N/A | `` | |
The Mythos of Model Interpretability | arXiv | 2018 | N/A | Review Paper | |
Human Evaluation of Models Built for Interpretability | AAAI | 2019 | N/A | Human in the loop | |
Data Shapley: Equitable Valuation of Data for Machine Learning | ICML | 2019 | N/A | `` | |
Attention is not Explanation | NAACL | 2019 | N/A | `` | |
Actionable Recourse in Linear Classification | FAccT | 2019 | N/A | `` | |
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead | Nature Machine Intelligence | 2019 | N/A | `` | |
Explanations can be manipulated and geometry is to blame | NeurIPS | 2019 | N/A | `` | |
Learning Optimized Risk Scores | JMLR | 2019 | N/A | `` | |
Explain Yourself! Leveraging Language Models for Commonsense Reasoning | ACL | 2019 | N/A | `` | |
Deep Neural Networks Constrained by Decision Rules | AAAI | 2019 | N/A | `` | |
Towards Automatic Concept-based Explanations | NeurIPS | 2019 | Github | `` | |
A Learning Theoretic Perspective on Local Explainability | ICLR (Poster) | 2021 | N/A | `` | |
Do Input Gradients Highlight Discriminative Features? | NeurIPS | 2021 | N/A | `` | |
Explaining by Removing: A Unified Framework for Model Explanation | JMLR | 2021 | N/A | `` | |
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience | PACMHCI | 2021 | N/A | `` | |
Towards Robust and Reliable Algorithmic Recourse | NeurIPS | 2021 | N/A | `` | |
A Framework to Learn with Interpretation | NeurIPS | 2021 | N/A | `` | |
Algorithmic Recourse: from Counterfactual Explanations to Interventions | FAccT | 2021 | N/A | `` | |
Manipulating and Measuring Model Interpretability | CHI | 2021 | N/A | `` | |
Explainable Reinforcement Learning via Model Transforms | NeurIPS | 2021 | N/A | `` | |
Aligning Artificial Neural Networks and Ontologies towards Explainable AI | AAAI | 2021 | N/A | `` | |