Fit interpretable models. Explain blackbox machine learning.
Model interpretability and understanding for PyTorch
TrustyAI Explainability Toolkit
Sparse Autoencoder (SAE) research code from the OpenMOSS Mechanistic Interpretability Team.
A game theoretic approach to explain the output of any machine learning model.
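The game-theoretic idea behind such SHAP-style explainers can be sketched in pure Python: each feature's attribution is its Shapley value, the average marginal contribution of that feature across all orderings in which features are revealed to the model. A minimal illustrative sketch follows; the toy value function `f` and the feature names are hypothetical, and real SHAP libraries approximate this computation rather than enumerating every ordering.

```python
from itertools import permutations

def f(features_present):
    """Toy 'model': maps a set of present features to a prediction.

    Hypothetical values: feature "a" adds 2, "b" adds 1, and their
    interaction adds 1 more when both are present.
    """
    v = 0.0
    if "a" in features_present:
        v += 2.0
    if "b" in features_present:
        v += 1.0
    if "a" in features_present and "b" in features_present:
        v += 1.0  # interaction term, to be split between a and b
    return v

def shapley_values(players, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    contrib = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        present = set()
        for p in order:
            before = value_fn(present)
            present.add(p)
            contrib[p] += value_fn(present) - before
    return {p: c / len(orderings) for p, c in contrib.items()}

phi = shapley_values(["a", "b"], f)
print(phi)  # the 1.0 interaction is split evenly: a -> 2.5, b -> 1.5
```

Note the efficiency property: the attributions sum exactly to `f({"a", "b"})`, which is why Shapley values are attractive for explaining individual predictions. Exact enumeration is factorial in the number of features, which is why practical libraries rely on sampling or model-specific shortcuts.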
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning models.
An attribution library for LLMs
A JAX research toolkit for building, editing, and visualizing neural networks.
Responsible AI Toolbox is a suite of user interfaces and libraries for model and data exploration and assessment, enabling a better understanding of AI systems and empowering developers and stakeholders to develop and monitor AI more responsibly and to take better data-driven actions.
The website for NDIF, the National Deep Inference Fabric
For calculating global feature importance using Shapley values.
PyTorch library to compare similarity between NN representations
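A common metric for comparing representations across networks is linear CKA (centered kernel alignment). The sketch below is an illustrative pure-Python implementation of that metric, not the API of the library listed above; the toy matrices `X` and `Y` (rows = examples, columns = features) are hypothetical.

```python
def _center(X):
    """Subtract each column's mean (rows = examples, cols = features)."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[x - means[j] for j, x in enumerate(row)] for row in X]

def _gram(A, B):
    """Compute A^T B as a nested list."""
    n, p, q = len(A), len(A[0]), len(B[0])
    return [[sum(A[k][i] * B[k][j] for k in range(n)) for j in range(q)]
            for i in range(p)]

def _fro(M):
    """Frobenius norm of a nested-list matrix."""
    return sum(x * x for row in M for x in row) ** 0.5

def linear_cka(X, Y):
    """Linear CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F), after centering."""
    Xc, Yc = _center(X), _center(Y)
    num = _fro(_gram(Yc, Xc)) ** 2
    den = _fro(_gram(Xc, Xc)) * _fro(_gram(Yc, Yc))
    return num / den

# Two toy "representations" of four examples with two features each:
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
Y = [[2.0, 0.1], [0.1, 2.0], [2.1, 2.1], [0.0, 0.0]]
print(linear_cka(X, X))  # identical representations score 1.0
print(linear_cka(X, Y))  # a similar representation scores close to 1.0
```

Linear CKA lies in [0, 1] by the Cauchy-Schwarz inequality and is invariant to orthogonal transformations and isotropic scaling of either representation, which is what makes it suitable for comparing layers of different networks.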
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
ReFT: Representation Finetuning for Language Models
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
Implementation of Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning (Kohler, Delfosse, et al., 2024).
Implementations of different SHAP algorithms
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
👋 Xplique is a Neural Networks Explainability Toolbox
Robust multimodal image registration via keypoints