
Social bias and knowledge of social bias #5

Closed
StellaAthena opened this issue Jan 1, 2022 · 12 comments

@StellaAthena
Member

Pretty straightforward… we need to implement the following in the eval harness:

  1. WinoGender Schemas (also called AX-g under SuperGLUE)
  2. CrowS-Pairs
  3. WinoBias
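All three benchmarks above share the same pairwise structure, so here is a minimal sketch of how the scoring could look once implemented in the harness. The `loglikelihood` argument stands in for a real model's (pseudo-)log-likelihood function; the example sentences and the `toy_ll` scorer are made up for illustration, not taken from the datasets.

```python
# Hypothetical sketch of pairwise bias scoring in the style of
# CrowS-Pairs/WinoBias: each example is a (stereotypical,
# anti-stereotypical) sentence pair, and the metric is the fraction
# of pairs where the model prefers the stereotypical variant.

def bias_score(pairs, loglikelihood):
    """Fraction of pairs where the stereotypical sentence gets the
    higher log-likelihood (0.5 would indicate no preference)."""
    preferred = sum(
        1 for stereo, anti in pairs
        if loglikelihood(stereo) > loglikelihood(anti)
    )
    return preferred / len(pairs)

# Toy stand-in scorer (NOT a real model): longer strings score lower.
def toy_ll(sentence):
    return -len(sentence)

# Illustrative sentence pairs (invented, not from the datasets).
pairs = [
    ("The nurse said she was tired.", "The nurse said he was tired."),
    ("The engineer fixed his code.", "The engineer fixed her code."),
]
print(bias_score(pairs, toy_ll))
```

In the real harness, `loglikelihood` would be the model's summed token log-probabilities over the sentence (masking the shared context, per the CrowS-Pairs paper's pseudo-likelihood formulation).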

For a discussion of “recognizing bias” vs “reproducing bias” check out here. The primary goal is to look at how the development of understanding of bias correlates with the development of tendency to produce biased content.

It would also be interesting to look at correlations across categories of bias, e.g., does the model learn to reproduce and/or identify all types of bias at an equal rate? And if not, can we identify specific subsets of the Pile that are “biased in how they are biased” so to speak.

@StellaAthena StellaAthena added the feature request New feature or request label Jan 1, 2022
@aflah02
Contributor

aflah02 commented Dec 27, 2022

@StellaAthena Can I take this up?

@aflah02
Contributor

aflah02 commented Dec 27, 2022

Also, the CrowS-Pairs dataset has been deprecated by its authors after later work found some issues, described in the README. It can still be used, but the results probably shouldn't be trusted too heavily, I guess.

@StellaAthena
Member Author

@aflah02 Thanks for letting us know!

@haileyschoelkopf has already launched interventions on model trainings, where we change the pronoun distribution in the text for a tail portion of training. These should be done pretty soon, and we'll need help analyzing them.

Hailey: can you explain precisely the intervention you've run, and we can plan how precisely we are going to analyze the results given this new info?

@haileyschoelkopf
Collaborator

Yes! I have rerun the last 5k steps = ~15% of training of our Pythia-1.3b-deduped model with specific tokens in the GPT-NeoX-20b tokenizer for male pronouns swapped out to female pronouns (see full mapping here: EleutherAI/gpt-neox@df1bdca )
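The intervention described above can be sketched as a token-ID remapping applied to the training stream. The actual ID mapping lives in the linked gpt-neox commit; the IDs and string renderings below are invented placeholders, not real GPT-NeoX-20b tokenizer IDs.

```python
# Sketch of a pronoun-swap intervention: remap specific tokenizer IDs
# for male pronouns to their female counterparts before batching.
# The IDs below are made up for illustration; the real mapping is in
# the referenced EleutherAI/gpt-neox commit.

# Hypothetical mapping: {male_pronoun_id: female_pronoun_id}
PRONOUN_SWAP = {
    339: 673,   # e.g. " he"  -> " she"
    607: 617,   # e.g. " his" -> " her"
    516: 2135,  # e.g. " him" -> " her"
}

def swap_pronouns(token_ids):
    """Apply the swap to one sequence of token IDs; IDs not in the
    mapping pass through unchanged."""
    return [PRONOUN_SWAP.get(t, t) for t in token_ids]

print(swap_pronouns([50, 339, 607, 99]))  # -> [50, 673, 617, 99]
```

Applying this only to the last ~15% of training steps, as described, leaves the bulk of training untouched and changes only the tail-of-training pronoun distribution.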

I was running into issues because of a bug affecting evals, which has now been fixed. I intend to run evaluations on this intervened model on:

  • general 0 and few-shot performance
  • Winobias, CrowS-Pairs, AXg
  • Potentially Winogenerated, a new Winogender-style dataset from Anthropic which is much larger than Winogender

Would love to get input on any other evaluations that might be useful, or any similar papers worth reading! Particularly interested in anything to run with this model that is not just a 0-shot benchmark.

@aflah02
Contributor

aflah02 commented Dec 31, 2022

Thanks @haileyschoelkopf and sorry for the late reply

I'm not particularly aware of non-zero-shot benchmarks, but I can help out with the zero-shot ones if there's any need!

Also, Happy New Year and Happy Holidays @StellaAthena @haileyschoelkopf 🎉

@StellaAthena
Member Author

I did some poking around, and it seems that some of the leading (aka only) work on this topic is The Birth of Bias: A case study on the evolution of gender bias in an English language model. A couple of notable papers about bias evaluation and amplification include Trustworthy Social Bias Measurement and Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. See also the critical paper Undesirable biases in NLP: Averting a crisis of measurement.

This is a complicated and nuanced issue, and our goal is not to meaningfully advance the body of knowledge. It's to motivate why this question is interesting and pitch Pythia as a platform for studying it. This changes how we frame things: we can embrace criticism of particular measurement techniques, and calls for better analytic methods, that would undermine papers seeking to solve the problem directly. We also don't need to get an answer, let alone the answer. We just need to show that someone with more time and subject-matter expertise should seriously consider this as a platform for doing the work.

The first order of business is a lit review: we want to make a list of papers talking about bias amplification in NLP, or bias and training dynamics in deep learning. We especially want to note any papers that argue for studying how bias evolves over time, call it out as an interesting or worthwhile problem, or flag it as an avenue for future work. Extra bonus points for papers that mention this but say they can't study it due to insufficient model/data access (since that's the actual contribution of the paper).

@aflah02 I know this is a bit different from what you probably envisioned, but would you be interested in taking the lead on this lit review?

@aflah02
Contributor

aflah02 commented Jan 1, 2023

Hi @StellaAthena
This looks really interesting and I would be up for it!
Can you share additional details too, like the expected timeline and any references to previous lit surveys, so that I can format mine in a similar way?

@StellaAthena
Member Author

@aflah02 I don’t care about the format at all. It can be bullets, it can be paragraphs, it can be delivered orally. What matters is that you’re able to take the research you read and explain it to @haileyschoelkopf so she can decide what the exact experiment we are going to run is, and then to me so I can help write about why people interested in this question should use our model suite to answer it.

This would need to have a pretty quick turnaround. We are targeting submission to ICML, which has a deadline on Jan 26th, and I think that we'll need this info by the 15th to be able to incorporate it properly. Do you think you can do that?

@aflah02
Contributor

aflah02 commented Jan 1, 2023

@StellaAthena That seems reasonable! I think I can do it. I'll get to work on this and share periodic updates here after I cover each paper. I also wrote a survey paper on debiasing methods a few months ago for a writing class; it's not written in the best way, but it should give some ideas, so I'll share that as well.

@StellaAthena
Member Author

StellaAthena commented Jan 1, 2023

Awesome! Again, I want to stress that the goal is not to write a survey paper: it’s to survey the field so we can pitch our model suite to people in the field effectively.

Let’s plan on syncing up on Friday about your progress? Can you post in #interp-across-time on the Discord about setting that up?

@haileyschoelkopf
Collaborator

Hey @aflah02, sorry, I've been partially AFK the past 2 days due to the holiday, so I haven't caught up with the literature Stella found so far!

Would be great to follow up with you in our discord so we can discuss this further.

@aflah02
Contributor

aflah02 commented Jan 2, 2023

@StellaAthena Sure! I'll make a post there

@haileyschoelkopf No worries, hope you had fun! I'll post updates there.
