This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Bias Mitigation and Direction Methods #5130

Merged · 25 commits · May 11, 2021

Conversation

ArjunSubramonian (Contributor):

Additions proposed in this pull request:

  • Added four bias direction methods: PCABiasDirection, PairedPCABiasDirection, TwoMeansBiasDirection, ClassificationNormalBiasDirection
  • Added four bias mitigation methods: LinearBiasMitigator, HardBiasMitigator, INLPBiasMitigator, OSCaRBiasMitigator

@ArjunSubramonian ArjunSubramonian self-assigned this Apr 20, 2021

with torch.set_grad_enabled(self.requires_grad):
    # pca_lowrank centers the embeddings by default
    _, _, V = torch.pca_lowrank(seed_embeddings, q=2)
Contributor:

Why do we set q=2?

Contributor (Author):

I followed the VERB implementation and paper. The intuition is that applying PCA to definitionally-gendered words yields two salient components: (1) the gender direction and (2) all remaining variation, with the gender direction being the principal component.
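The intuition above can be sketched as a small function; this is a simplified illustration of the idea (the helper name is ours), not the actual PCABiasDirection implementation:

```python
import torch

def pca_bias_direction(seed_embeddings: torch.Tensor) -> torch.Tensor:
    # pca_lowrank centers the embeddings by default; q=2 keeps the two
    # leading components: the (principal) bias direction plus one
    # residual direction, per the VERB paper's intuition.
    _, _, V = torch.pca_lowrank(seed_embeddings, q=2)
    # The first principal component captures the dominant variation,
    # taken here to be the bias direction.
    direction = V[:, 0]
    return direction / torch.linalg.norm(direction)

# Toy usage: 6 "definitionally gendered" embeddings of dimension 4.
direction = pca_bias_direction(torch.randn(6, 4))
```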

Contributor (Author):

Added a comment in the file itself

bias_direction : `torch.Tensor`
A unit tensor of size (dim, ) representing the concept subspace. The words
that are used to define the bias direction are considered definitionally
gendered and not modified.
Contributor:

"definitionally gendered" is for the specific example of concept "gender", right? Words like "king", "queen", "he", "she", etc.?

Contributor (Author):

Yes!
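For intuition, the two-means family builds the direction straight from two such definitionally-gendered groups (e.g. {"king", "he"} vs. {"queen", "she"}). A hypothetical sketch of the idea, not the exact TwoMeansBiasDirection API:

```python
import torch

def two_means_direction(group1: torch.Tensor, group2: torch.Tensor) -> torch.Tensor:
    # Unit vector pointing from the centroid of one definitionally-gendered
    # word group to the centroid of the other.
    diff = group1.mean(dim=0) - group2.mean(dim=0)
    return diff / torch.linalg.norm(diff)

# Toy usage: two groups of three 4-dimensional embeddings each.
direction = two_means_direction(torch.randn(3, 4), torch.randn(3, 4))
```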


class HardBiasMitigator(BiasMitigator):
"""
Hard bias mitigator. Mitigates bias in embeddings by:
Contributor:

Perhaps we should mention explicitly that this is applicable for binary concepts?

Contributor (Author):

Added note at top of both mitigator and direction files.


2. Equalizing: ensuring that protected variable-related words are averaged
out to have the same norm.
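In code, neutralizing and equalizing can be sketched as follows, in the spirit of hard debiasing (Bolukbasi et al.); a simplified illustration assuming unit-norm embeddings and a unit bias direction, not the actual HardBiasMitigator implementation:

```python
import torch

def neutralize(v: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    # Remove the component of v along the unit bias direction g, so a
    # bias-neutral word (e.g. "doctor") carries no gender component.
    return v - (v @ g) * g

def equalize(a: torch.Tensor, b: torch.Tensor, g: torch.Tensor):
    # Make a definitionally-gendered pair (e.g. "king"/"queen") share the
    # same bias-free component and the same norm, differing only by
    # opposite-sign components along g.
    mu = (a + b) / 2
    nu = neutralize(mu, g)                       # shared bias-free part
    z = torch.sqrt(torch.clamp(1.0 - nu @ nu, min=0.0))
    a_bias = (a @ g) * g - (mu @ g) * g          # a's offset along g
    a_new = nu + z * a_bias / torch.linalg.norm(a_bias)
    b_new = nu - z * a_bias / torch.linalg.norm(a_bias)
    return a_new, b_new

# Toy usage with a unit bias direction along the first coordinate.
g = torch.tensor([1.0, 0.0, 0.0, 0.0])
king = torch.tensor([0.6, 0.8, 0.0, 0.0])
queen = torch.tensor([-0.6, 0.8, 0.0, 0.0])
king_eq, queen_eq = equalize(king, queen, g)
```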

Contributor:

Can we add some conceptual examples of what "Neutralizing" and "Equalizing" mean? It makes sense mathematically, but for someone getting started and looking to use this, it might be more helpful to give practical examples for making it "click". The examples in the VERB paper are good.

Contributor (Author):

For each mitigation method, I just linked the appropriate figure in the VERB paper, as I think the pictures are the most helpful.

All tensors are expected to be on the same device.

!!! Note
This bias direction method is NOT differentiable.
Contributor:

If we intend to allow users to specify bias direction (and mitigator) methods in config, perhaps we should make "is_differentiable" a field, so that the list of methods which can be used can be obtained programmatically?

Contributor (Author):

Yes, this is part of the bias mitigators and direction wrappers PR - this PR is just the functional API.
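As a hypothetical sketch of the idea (illustrative names only, not the wrapper API from that PR), an `is_differentiable` class attribute would make the usable methods discoverable:

```python
# Hypothetical base class exposing differentiability as data, so a config
# system could filter methods programmatically. Class names are invented
# stand-ins, not the real allennlp wrappers.
class BiasDirectionMethod:
    is_differentiable: bool = True

class PCADirection(BiasDirectionMethod):
    pass

class ClassificationNormalDirection(BiasDirectionMethod):
    # Relies on a non-differentiable classifier fit.
    is_differentiable = False

def differentiable_methods(methods):
    return [m for m in methods if m.is_differentiable]

names = [m.__name__ for m in differentiable_methods(
    [PCADirection, ClassificationNormalDirection])]
```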

expected_bias_mitigated_embeddings
).reshape(2, 2, -1)

def teardown_method(self):
Contributor:

Why do we do this?

Contributor (Author):

we shouldn't :) I just forgot to call the parent setup_method(), so the tmp dir wasn't being deleted.
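The fix amounts to calling the parent's setup_method so its temp-dir bookkeeping runs; a minimal stand-alone sketch, where BaseTestCase is a stand-in for the project's real test base class:

```python
import pathlib
import shutil
import tempfile

class BaseTestCase:
    # Stand-in for the real test base class that manages a temp dir.
    def setup_method(self):
        self.TEST_DIR = pathlib.Path(tempfile.mkdtemp())

    def teardown_method(self):
        shutil.rmtree(self.TEST_DIR)

class BiasMitigatorsTest(BaseTestCase):
    def setup_method(self):
        # Without this call, teardown_method has no TEST_DIR to remove
        # and the temp dir leaks.
        super().setup_method()
        self.embeddings = [1.0, 2.0]

t = BiasMitigatorsTest()
t.setup_method()
exists_before = t.TEST_DIR.exists()
t.teardown_method()
exists_after = t.TEST_DIR.exists()
```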

# Want to adjust first 2 coordinates and leave d - 2
# other orthogonal components fixed
fixed_rotated_evaluation_embeddings = rotated_evaluation_embeddings[..., 2:]
# Restrict attention to subspace S
Contributor:

where subspace S is ...?

Contributor (Author):

the subspace spanned by the bias directions (made a comment in the file)
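The restriction to S can be sketched as splitting each embedding into its coordinates inside the plane spanned by the two bias directions plus a fixed orthogonal remainder; a simplified illustration of that step, not the full OSCaR implementation:

```python
import torch

def split_into_subspace(emb: torch.Tensor, d1: torch.Tensor, d2: torch.Tensor):
    # Orthonormal basis for S via Gram-Schmidt on the two bias directions.
    b1 = d1 / torch.linalg.norm(d1)
    v = d2 - (d2 @ b1) * b1
    b2 = v / torch.linalg.norm(v)
    basis = torch.stack([b1, b2], dim=1)   # (d, 2)
    in_S = emb @ basis                     # coordinates inside S, to be rotated
    fixed = emb - in_S @ basis.T           # d - 2 orthogonal components, untouched
    return in_S, fixed, basis

# Toy usage: 5 embeddings of dimension 4, two random bias directions.
emb = torch.randn(5, 4)
in_S, fixed, basis = split_into_subspace(emb, torch.randn(4), torch.randn(4))
# Any rotation applied to in_S before adding `fixed` back leaves the
# orthogonal complement unchanged; with no rotation we recover the input.
recon = in_S @ basis.T + fixed
```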


AkshitaB commented May 8, 2021

@ArjunSubramonian I've left some comments; mostly regarding docs (which are fairly extensive, btw; great job!)

@AkshitaB AkshitaB merged commit d9b19b6 into main May 11, 2021
@AkshitaB AkshitaB deleted the arjuns/post-processing-debiasing branch May 11, 2021 17:44
Abhishek-P pushed a commit to Abhishek-P/allennlp that referenced this pull request Aug 11, 2021
* added linear and hard debiasers

* worked on documentation

* committing changes before branch switch

* committing changes before switching branch

* finished bias direction, linear and hard debiasers, need to write tests

* finished bias direction test

* Commiting changes before switching branch

* finished hard and linear debiasers

* finished OSCaR

* bias mitigators tests and bias metrics remaining

* added bias mitigator tests

* added bias mitigator tests

* finished tests for bias mitigation methods

* fixed gpu issues

* fixed gpu issues

* fixed gpu issues

* resolve issue with count_nonzero not being differentiable

* added more references

* responded to Akshita's comments

Co-authored-by: Arjun Subramonian <[email protected]>
Co-authored-by: Michael Schmitz <[email protected]>
Co-authored-by: Akshita Bhagia <[email protected]>