A repository of recent explainable AI / interpretable ML approaches.
Title | Venue | Year | Code | Keywords | Summary |
---|---|---|---|---|---|
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission | KDD | 2015 | N/A | `` | |
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model | arXiv | 2015 | N/A | `` | |
Interpretable Decision Sets: A Joint Framework for Description and Prediction | KDD | 2016 | N/A | `` | |
"Why Should I Trust You?": Explaining the Predictions of Any Classifier | KDD | 2016 | N/A | `` | |
Towards A Rigorous Science of Interpretable Machine Learning | arXiv | 2017 | N/A | Review Paper | |
Transparency: Motivations and Challenges | arXiv | 2017 | N/A | Review Paper | |
A Unified Approach to Interpreting Model Predictions | NeurIPS | 2017 | N/A | `` | |
SmoothGrad: removing noise by adding noise | ICML (Workshop) | 2017 | Github | `` | |
Axiomatic Attribution for Deep Networks | ICML | 2017 | N/A | `` | |
Learning Important Features Through Propagating Activation Differences | ICML | 2017 | N/A | `` | |
Understanding Black-box Predictions via Influence Functions | ICML | 2017 | N/A | `` | |
Network Dissection: Quantifying Interpretability of Deep Visual Representations | CVPR | 2017 | N/A | `` | |
Explainable Prediction of Medical Codes from Clinical Text | NAACL | 2018 | N/A | `` | |
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) | ICML | 2018 | N/A | `` | |
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | HJTL | 2018 | N/A | `` | |
Sanity Checks for Saliency Maps | NeurIPS | 2018 | N/A | `` | |
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions | AAAI | 2018 | N/A | `` | |
The Mythos of Model Interpretability | arXiv | 2018 | N/A | Review Paper | |
Human Evaluation of Models Built for Interpretability | AAAI | 2019 | N/A | Human in the loop | |
Data Shapley: Equitable Valuation of Data for Machine Learning | ICML | 2019 | N/A | `` | |
Attention is not Explanation | NAACL | 2019 | N/A | `` | |
Actionable Recourse in Linear Classification | FAccT | 2019 | N/A | `` | |
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead | Nature Machine Intelligence | 2019 | N/A | `` | |
Explanations can be manipulated and geometry is to blame | NeurIPS | 2019 | N/A | `` | |
Learning Optimized Risk Scores | JMLR | 2019 | N/A | `` | |
Explain Yourself! Leveraging Language Models for Commonsense Reasoning | ACL | 2019 | N/A | `` | |
Deep Neural Networks Constrained by Decision Rules | AAAI | 2019 | N/A | `` | |
Towards Automatic Concept-based Explanations | NeurIPS | 2019 | Github | `` | |
A Learning Theoretic Perspective on Local Explainability | ICLR (Poster) | 2021 | N/A | `` | |
Do Input Gradients Highlight Discriminative Features? | NeurIPS | 2021 | N/A | `` | |
Explaining by Removing: A Unified Framework for Model Explanation | JMLR | 2021 | N/A | `` | |
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience | PACMHCI | 2021 | N/A | `` | |
Towards Robust and Reliable Algorithmic Recourse | NeurIPS | 2021 | N/A | `` | |
A Framework to Learn with Interpretation | NeurIPS | 2021 | N/A | `` | |
Algorithmic Recourse: from Counterfactual Explanations to Interventions | FAccT | 2021 | N/A | `` | |
Manipulating and Measuring Model Interpretability | CHI | 2021 | N/A | `` | |
Explainable Reinforcement Learning via Model Transforms | NeurIPS | 2021 | N/A | `` | |
Aligning Artificial Neural Networks and Ontologies towards Explainable AI | AAAI | 2021 | N/A | `` | |