Bias Mitigation and Direction Methods #5130
Conversation
```python
with torch.set_grad_enabled(self.requires_grad):
    # pca_lowrank centers the embeddings by default
    _, _, V = torch.pca_lowrank(seed_embeddings, q=2)
```
Why do we set `q=2`?
I followed the VERB implementation and paper. I think the intuition is that, when applying PCA to definitionally-gendered words, there will be two kinds of dimensions: 1) the gender direction, and 2) all other directions, with the gender direction being the principal component.
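To make that intuition concrete, here is a toy sketch (synthetic data and illustrative names, not the PR's actual seed embeddings): when points vary mostly along a single "gender" axis, the first column of `V` from `torch.pca_lowrank` recovers that axis, and the first singular value dominates the second.

```python
import torch

# Toy illustration of the q=2 intuition (synthetic data, hypothetical names):
# seed embeddings of definitionally-gendered words vary mostly along a single
# principal direction, so two PCA components are enough to capture it.
torch.manual_seed(0)
dim = 10
gender_dir = torch.randn(dim)
gender_dir /= gender_dir.norm()

# Embeddings spread along the gender direction, plus a little noise
coeffs = torch.linspace(-1.0, 1.0, steps=8).unsqueeze(1)
seed_embeddings = coeffs * gender_dir + 0.01 * torch.randn(8, dim)

# pca_lowrank centers the embeddings by default (center=True)
_, S, V = torch.pca_lowrank(seed_embeddings, q=2)
bias_direction = V[:, 0]  # first principal component ~= the gender direction

print(S[0] / S[1])                             # first singular value dominates
print(torch.abs(bias_direction @ gender_dir))  # |cosine| close to 1
```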
Added a comment in the file itself
```python
bias_direction : `torch.Tensor`
    A unit tensor of size (dim, ) representing the concept subspace. The words
    that are used to define the bias direction are considered definitionally
    gendered and not modified.
```
"definitionally gendered" is for the specific example of concept "gender", right? Words like "king", "queen", "he", "she", etc.?
Yes!
```python
class HardBiasMitigator(BiasMitigator):
    """
    Hard bias mitigator. Mitigates bias in embeddings by:
```
Perhaps we should mention explicitly that this is applicable for binary concepts?
Added note at top of both mitigator and direction files.
```python
    2. Equalizing: ensuring that protected variable-related words are averaged
    out to have the same norm.
```
Can we add some conceptual examples of what "Neutralizing" and "Equalizing" mean? It makes sense mathematically, but for someone getting started and looking to use this, it might be more helpful to give practical examples for making it "click". The examples in the VERB paper are good.
For each mitigation method, I just linked the appropriate figure in the VERB paper, as I think the pictures are the most helpful.
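Alongside the figures, the two steps can be sketched in a few lines. This is a simplified, illustrative version (one-dimensional bias subspace, unit-normalized embeddings, hypothetical function names following Bolukbasi et al.'s hard debiasing, which VERB visualizes), not the AllenNLP implementation:

```python
import torch

def neutralize(vectors: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove each vector's component along the unit bias direction, so a
    neutral word like "doctor" carries no gender component."""
    return vectors - (vectors @ direction).unsqueeze(-1) * direction

def equalize(pair: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Recenter a definitionally-gendered pair (e.g. "he"/"she") so both words
    end up with the same norm and are equidistant from every neutralized word."""
    mu = pair.mean(dim=0)
    mu_b = (mu @ direction) * direction       # mean's component inside the bias subspace
    nu = mu - mu_b                            # shared component outside the subspace
    scale = torch.sqrt(1.0 - nu.norm() ** 2)  # assumes unit-normalized embeddings
    out = []
    for w in pair:
        w_b = (w @ direction) * direction
        out.append(nu + scale * (w_b - mu_b) / (w_b - mu_b).norm())
    return torch.stack(out)

direction = torch.tensor([1.0, 0.0, 0.0])     # toy unit bias direction
doctor = torch.tensor([[0.3, 0.5, 0.2]])
he_she = torch.tensor([[0.9, 0.2, 0.1], [-0.7, 0.2, 0.1]])
print(neutralize(doctor, direction))          # gender component zeroed out
print(equalize(he_she, direction))            # equal norms, opposite gender components
```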
```python
    All tensors are expected to be on the same device.

    !!! Note
        This bias direction method is NOT differentiable.
```
If we intend to allow users to specify bias direction (and mitigator) methods in config, perhaps we should make "is_differentiable" a field, so that the list of methods which can be used can be obtained programmatically?
Yes, this is part of the bias mitigators and direction wrappers PR - this PR is just the functional API.
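One way the suggestion could look (illustrative names only, not the actual AllenNLP wrapper API; the non-differentiable example is assumed here based on the note quoted above): expose differentiability as a class attribute, so the set of usable methods can be queried programmatically when validating a config.

```python
# Hypothetical sketch: a class-level flag that config validation can inspect.
class BiasDirection:
    is_differentiable: bool = True

class TwoMeansBiasDirection(BiasDirection):
    is_differentiable = True

class ClassificationNormalBiasDirection(BiasDirection):
    # Assumed non-differentiable here, matching the note quoted above
    is_differentiable = False

def differentiable_directions() -> list:
    """List the bias direction methods that support gradient flow."""
    return [
        cls.__name__
        for cls in BiasDirection.__subclasses__()
        if cls.is_differentiable
    ]

print(differentiable_directions())  # ['TwoMeansBiasDirection']
```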
```python
            expected_bias_mitigated_embeddings
        ).reshape(2, 2, -1)

    def teardown_method(self):
```
Why do we do this?
We shouldn't :) I just forgot to call the parent `setup_method()`, so the tmp dir wasn't being deleted.
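A minimal sketch of the fix being described (here `BaseTestCase` is a stand-in for the real base test class, whose `setup_method` creates a scratch directory that `teardown_method` removes):

```python
import os
import shutil
import tempfile

class BaseTestCase:
    """Stand-in for the parent test class that manages a temporary directory."""
    def setup_method(self):
        self.TEST_DIR = tempfile.mkdtemp()

    def teardown_method(self):
        shutil.rmtree(self.TEST_DIR, ignore_errors=True)

class BiasMitigatorsTest(BaseTestCase):
    def setup_method(self):
        # Without this super() call, TEST_DIR is never created, and the
        # inherited teardown_method has nothing to clean up.
        super().setup_method()

test = BiasMitigatorsTest()
test.setup_method()
print(os.path.isdir(test.TEST_DIR))   # True: directory exists during the test
test.teardown_method()
print(os.path.exists(test.TEST_DIR))  # False: removed afterwards
```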
allennlp/fairness/bias_mitigators.py
```python
# Want to adjust first 2 coordinates and leave d - 2
# other orthogonal components fixed
fixed_rotated_evaluation_embeddings = rotated_evaluation_embeddings[..., 2:]
# Restrict attention to subspace S
```
where subspace S is ...?
The subspace spanned by the bias directions (I added a comment in the file).
@ArjunSubramonian I've left some comments; mostly regarding docs (which are fairly extensive, btw; great job!)
* added linear and hard debiasers
* worked on documentation
* committing changes before branch switch
* committing changes before switching branch
* finished bias direction, linear and hard debiasers, need to write tests
* finished bias direction test
* Commiting changes before switching branch
* finished hard and linear debiasers
* finished OSCaR
* bias mitigators tests and bias metrics remaining
* added bias mitigator tests
* added bias mitigator tests
* finished tests for bias mitigation methods
* fixed gpu issues
* fixed gpu issues
* fixed gpu issues
* resolve issue with count_nonzero not being differentiable
* added more references
* responded to Akshita's comments

Co-authored-by: Arjun Subramonian <arjuns@Arjuns-MacBook-Pro.local>
Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-106.us-west-2.compute.internal>
Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-108.us-west-2.compute.internal>
Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-1-108.us-west-2.compute.internal>
Co-authored-by: Michael Schmitz <MichaelS@allenai.org>
Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com>
Additions proposed in this pull request: four bias direction methods (`PCABiasDirection`, `PairedPCABiasDirection`, `TwoMeansBiasDirection`, `ClassificationNormalBiasDirection`) and four bias mitigation methods (`LinearBiasMitigator`, `HardBiasMitigator`, `INLPBiasMitigator`, `OSCaRBiasMitigator`).