Removes ACV from shapash and fixes dependencies #505

Merged: 6 commits, Nov 2, 2023
2 changes: 0 additions & 2 deletions README.md
@@ -42,7 +42,6 @@
| 2.0.x | Refactoring Shapash <br> | Refactoring attributes of compile methods and init. Refactoring implementation for new backends | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/modular.png" width="50" title="modular">](https://github.com/MAIF/shapash/blob/master/tutorial/explainer_and_backend/tuto-expl06-Shapash-custom-backend.ipynb)
| 1.7.x | Variabilize Colors <br> | Giving possibility to have your own colour palette for outputs adapted to your design | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/variabilize-colors.png" width="50" title="variabilize-colors">](https://github.com/MAIF/shapash/blob/master/tutorial/common/tuto-common02-colors.ipynb)
| 1.6.x | Explainability Quality Metrics <br> [Article](https://towardsdatascience.com/building-confidence-on-explainability-methods-66b9ee575514) | To help increase confidence in explainability methods, you can evaluate the relevance of your explainability using 3 metrics: **Stability**, **Consistency** and **Compacity** | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/quality-metrics.png" width="50" title="quality-metrics">](https://github.com/MAIF/shapash/blob/master/tutorial/explainability_quality/tuto-quality01-Builing-confidence-explainability.ipynb)
| 1.5.x | ACV Backend <br> | A new way of estimating Shapley values using ACV. [More info about ACV here](https://towardsdatascience.com/the-right-way-to-compute-your-shapley-values-cfea30509254). | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/wheel.png" width="50" title="wheel-acv-backend">](tutorial/explainer_and_backend/tuto-expl03-Shapash-acv-backend.ipynb) |
| 1.4.x | Groups of features <br> [Demo](https://shapash-demo2.ossbymaif.fr/) | You can now regroup features that share common properties together. <br>This option can be useful if your model has a lot of features. | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/groups_features.gif" width="120" title="groups-features">](https://github.com/MAIF/shapash/blob/master/tutorial/common/tuto-common01-groups_of_features.ipynb) |
| 1.3.x | Shapash Report <br> [Demo](https://shapash.readthedocs.io/en/latest/report.html) | A standalone HTML report that constitutes a basis of an audit document. | [<img src="https://raw.githubusercontent.com/MAIF/shapash/master/docs/_static/report-icon.png" width="50" title="shapash-report">](https://github.com/MAIF/shapash/blob/master/tutorial/generate_report/tuto-shapash-report01.ipynb) |

@@ -287,7 +286,6 @@ This github repository offers many tutorials to allow you to easily get started

- [Compute Shapley Contributions using **Shap**](tutorial/explainer_and_backend/tuto-expl01-Shapash-Viz-using-Shap-contributions.ipynb)
- [Use **Lime** to compute local explanation, Summarize-it with **Shapash**](tutorial/explainer_and_backend/tuto-expl02-Shapash-Viz-using-Lime-contributions.ipynb)
- [Use **ACV backend** to compute Active Shapley Values and SDP global importance](tutorial/explainer_and_backend/tuto-expl03-Shapash-acv-backend.ipynb)
- [Compile faster Lime and consistency of contributions](tutorial/explainer_and_backend/tuto-expl04-Shapash-compute-Lime-faster.ipynb)
- [Use **FastTreeSHAP** or add contributions from another backend](tutorial/explainer_and_backend/tuto-expl05-Shapash-using-Fasttreeshap.ipynb)
- [Use Class Shapash Backend](tutorial/explainer_and_backend/tuto-expl06-Shapash-custom-backend.ipynb)
2 changes: 1 addition & 1 deletion docs/index.html
@@ -64,7 +64,7 @@
<div class="details">
<h1>Features</h1>
<ul>
<li>Compatible with Shap, Lime and ACV</li>
<li>Compatible with Shap and Lime</li>
<li>Uses shap backend to display results in a few lines of code</li>
<li>Encoders objects and features dictionaries used for clear results</li>
<li>Compatible with category_encoders & Sklearn ColumnTransformer</li>
5 changes: 2 additions & 3 deletions requirements.dev.txt
@@ -1,5 +1,5 @@
pip>=23.2.0
numpy==1.21.6
numpy>1.18.0
dash==2.3.1
catboost>=1.0.1
category-encoders>=2.6.0
@@ -32,13 +32,12 @@ numba>=0.53.1
nbconvert>=6.0.7
papermill>=2.0.0
matplotlib>=3.3.0
seaborn>=0.12.2
seaborn==0.12.2
scipy>=0.19.1
notebook>=6.0.0
jupyter-client<8.0.0
Jinja2>=2.11.0
phik>=0.12.0
skranger>=0.8.0
acv-exp>=1.2.3
lime>=0.2.0.0
regex
3 changes: 1 addition & 2 deletions setup.py
@@ -44,7 +44,7 @@
'nbconvert>=6.0.7',
'papermill>=2.0.0',
'jupyter-client>=7.4.0',
'seaborn>=0.12.2',
'seaborn==0.12.2',
'notebook',
'Jinja2>=2.11.0',
'phik'
@@ -53,7 +53,6 @@
extras['xgboost'] = ['xgboost>=1.0.0']
extras['lightgbm'] = ['lightgbm>=2.3.0']
extras['catboost'] = ['catboost>=1.0.1']
extras['acv'] = ['acv-exp>=1.2.0']
extras['lime'] = ['lime>=0.2.0.0']

setup_requirements = ['pytest-runner', ]
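With the `acv` extra removed from setup.py, the remaining optional dependency groups can still be installed as pip extras. A sketch of the resulting install commands (extra names taken from the hunk above; assumes the package is installed from PyPI under the name `shapash`):

```shell
# Optional backends remain installable as pip extras after this PR;
# the former "acv" extra is no longer recognized.
pip install shapash               # core install (shap backend)
pip install "shapash[lime]"       # adds lime>=0.2.0.0
pip install "shapash[xgboost]"    # adds xgboost>=1.0.0
pip install "shapash[catboost]"   # adds catboost>=1.0.1
```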
1 change: 0 additions & 1 deletion shapash/backend/__init__.py
@@ -3,7 +3,6 @@

from .base_backend import BaseBackend
from .shap_backend import ShapBackend
from .acv_backend import AcvBackend
from .lime_backend import LimeBackend


122 changes: 0 additions & 122 deletions shapash/backend/acv_backend.py

This file was deleted.

4 changes: 0 additions & 4 deletions shapash/decomposition/contributions.py
@@ -29,10 +29,6 @@ def inverse_transform_contributions(contributions, preprocessing=None, agg_colum
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.

Returns
-------
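The trimmed docstring above keeps only the Shap behaviour: with `agg_columns='sum'`, the contributions of one-hot encoded columns are summed back into the original variable. A small illustration with invented data (column names are hypothetical, not from shapash):

```python
import pandas as pd

# Contributions for a one-hot encoded "color" variable plus a numeric one.
contrib = pd.DataFrame({
    "color_red":  [0.10, 0.00],
    "color_blue": [0.00, 0.25],
    "size":       [0.30, -0.10],
})

# With agg_columns='sum' (the only mode left after removing ACV), the
# one-hot columns are summed back into one contribution per variable.
color = contrib[["color_red", "color_blue"]].sum(axis=1)
inv = pd.DataFrame({"color": color, "size": contrib["size"]})
print(inv["color"].tolist())  # [0.1, 0.25]
```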
70 changes: 8 additions & 62 deletions shapash/explainer/consistency.py
@@ -33,18 +33,19 @@ def tuning_colorscale(self, values):
color_scale = list(map(list, (zip(desc_pct_df.values.flatten(), self._style_dict["init_contrib_colorscale"]))))
return color_scale

def compile(self, x=None, model=None, preprocessing=None, contributions=None, methods=["shap", "acv", "lime"]):
"""If not provided, compute contributions according to provided methods (default are shap, acv, lime).
If provided, check whether they respect the correct format:
def compile(self, contributions, x=None, preprocessing=None):
"""Check whether the contributions respect the correct format:
contributions = {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame

Parameters
----------
contributions : dict
Contributions provided by the user.
Format must be {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame
x : DataFrame, optional
Dataset on which to compute consistency metrics, by default None
model : model object, optional
Model used to compute contributions, by default None
preprocessing : category_encoders, ColumnTransformer, list, dict, optional (default: None)
--> Different types of preprocessing are available:

@@ -54,72 +55,17 @@ def compile(self, x=None, model=None, preprocessing=None, contributions=None, me
- A list with a single ColumnTransformer with optional (dict, list of dict)
- A dict
- A list of dict
contributions : dict, optional
Contributions provided by the user if no compute is required.
Format must be {"method_name_1": contrib_1, "method_name_2": contrib_2, ...}
where each contrib_i is a pandas DataFrame. By default None
methods : list
Methods used to compute contributions, by default ["shap", "acv", "lime"]
"""
self.x = x
self.preprocessing = preprocessing
if contributions is None:
if (self.x is None) or (model is None):
raise ValueError('If no contributions are provided, parameters "x" and "model" must be defined')
contributions = self.compute_contributions(self.x, model, methods, self.preprocessing)
else:
if not isinstance(contributions, dict):
raise ValueError('Contributions must be a dictionary')
if not isinstance(contributions, dict):
raise ValueError('Contributions must be a dictionary')
self.methods = list(contributions.keys())
self.weights = list(contributions.values())

self.check_consistency_contributions(self.weights)
self.index = self.weights[0].index

def compute_contributions(self, x, model, methods, preprocessing):
"""
Compute contributions based on specified methods

Parameters
----------
x : pandas.DataFrame
Prediction set.
IMPORTANT: this should be the raw prediction set, whose values are seen by the end user.
x is a preprocessed dataset: Shapash can apply the model to it
model : model object
Model used to consistency check. model object can also be used by some method to compute
predict and predict_proba values
methods : list, optional
When contributions is None, list of methods to use to calculate contributions, by default ["shap", "acv"]
preprocessing : category_encoders, ColumnTransformer, list, dict
--> Different types of preprocessing are available:

- A single category_encoders (OrdinalEncoder/OnehotEncoder/BaseNEncoder/BinaryEncoder/TargetEncoder)
- A single ColumnTransformer with scikit-learn encoding or category_encoders transformers
- A list with multiple category_encoders with optional (dict, list of dict)
- A list with a single ColumnTransformer with optional (dict, list of dict)
- A dict
- A list of dict

Returns
-------
contributions : dict
Dict whose keys are method names and values are the corresponding contributions
"""
contributions = {}

for backend in methods:
xpl = SmartExplainer(model=model, preprocessing=preprocessing, backend=backend)
xpl.compile(x=x)
if xpl._case == "classification" and len(xpl._classes) == 2:
contributions[backend] = xpl.contributions[1]
elif xpl._case == "classification" and len(xpl._classes) > 2:
raise AssertionError("Multi-class classification is not supported")
else:
contributions[backend] = xpl.contributions

return contributions

def check_consistency_contributions(self, weights):
"""
Assert contributions calculated from different methods are dataframes
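With `compute_contributions` removed, `Consistency.compile` now expects the caller to supply contributions directly. A minimal sketch of the expected input format, mirroring the dict-of-DataFrames validation shown in the new method body (feature names and values are invented):

```python
import pandas as pd

# Two hypothetical contribution matrices, one per explainability method.
# Each is (n_samples, n_features) with a shared index; this is the format
# the new compile(contributions, ...) signature validates.
contributions = {
    "shap": pd.DataFrame({"age": [0.2, -0.1], "income": [0.4, 0.3]}),
    "lime": pd.DataFrame({"age": [0.1, -0.2], "income": [0.5, 0.2]}),
}

# Mirrors the check performed by the new compile() body.
if not isinstance(contributions, dict):
    raise ValueError("Contributions must be a dictionary")

methods = list(contributions.keys())
weights = list(contributions.values())
index = weights[0].index  # shared sample index across methods
print(methods)  # ['shap', 'lime']
```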
2 changes: 1 addition & 1 deletion shapash/explainer/smart_explainer.py
@@ -42,7 +42,7 @@ class SmartExplainer:
predict and predict_proba values
backend : str or shapash.backend object (default: 'shap')
Select which computation method to use in order to compute contributions
and feature importance. Possible values are 'shap', 'acv' or 'lime'. Default is 'shap'.
and feature importance. Possible values are 'shap' or 'lime'. Default is 'shap'.
It is also possible to pass a backend class inherited from shapash.backend.BaseBackend.
preprocessing : category_encoders, ColumnTransformer, list, dict, optional (default: None)
--> Different types of preprocessing are available:
4 changes: 0 additions & 4 deletions shapash/explainer/smart_state.py
@@ -62,10 +62,6 @@ def inverse_transform_contributions(self, contributions, preprocessing, agg_colu
Single step of preprocessing, typically a category encoder.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.

Returns
-------
4 changes: 0 additions & 4 deletions shapash/utils/category_encoder_backend.py
@@ -198,10 +198,6 @@ def calc_inv_contrib_ce(x_contrib, encoding, agg_columns):
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.

Returns
-------
4 changes: 0 additions & 4 deletions shapash/utils/columntransformer_backend.py
@@ -195,10 +195,6 @@ def calc_inv_contrib_ct(x_contrib, encoding, agg_columns):
The preprocessing applied to the original data.
agg_columns : str (default: 'sum')
Type of aggregation performed. For Shap we want to sum contributions of one hot encoded variables.
For ACV we want to take any value as ACV computes contributions of coalition of variables (like
one hot encoded variables) differently from Shap and then give the same value to each variable of the
coalition. As a result we just need to take the value of one of these variables to get the contribution
value of the group.

Returns
-------
2 changes: 1 addition & 1 deletion shapash/utils/utils.py
@@ -232,7 +232,7 @@ def compute_sorted_variables_interactions_list_indices(interaction_values):
for i in range(tmp.shape[0]):
tmp[i, i:] = 0

interaction_contrib_sorted_indices = np.dstack(np.unravel_index(np.argsort(tmp.ravel()), tmp.shape))[0][::-1]
interaction_contrib_sorted_indices = np.dstack(np.unravel_index(np.argsort(tmp.ravel(), kind="stable"), tmp.shape))[0][::-1]
return interaction_contrib_sorted_indices


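The one-line change in utils.py passes `kind="stable"` to `np.argsort`: NumPy's default introsort is not stable, so tied interaction values could be returned in different orders across runs, platforms, or NumPy versions. A self-contained illustration with an invented interaction matrix (only the change's effect is shown, not shapash itself):

```python
import numpy as np

# Hypothetical interaction-strength matrix; only the strictly lower
# triangle carries values, mirroring the zeroing loop in the function above.
tmp = np.array([[0.0, 0.0, 0.0],
                [0.5, 0.0, 0.0],
                [0.5, 0.2, 0.0]])

# kind="stable" guarantees that equal values (the two 0.5s here) keep
# their original row-major order, making the output deterministic.
flat_order = np.argsort(tmp.ravel(), kind="stable")
pairs = np.dstack(np.unravel_index(flat_order, tmp.shape))[0][::-1]

# Strongest interactions first; the tie is resolved deterministically.
print(pairs[:3].tolist())  # [[2, 0], [1, 0], [2, 1]]
```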