Social bias and knowledge of social bias #5
@StellaAthena Can I take this up?
Also, the CrowS-Pairs dataset has been deprecated by its authors after subsequent work found issues, described in the README. It can still be used, but just not trusted too heavily, I guess.
@aflah02 Thanks for letting us know! @haileyschoelkopf has already launched intervention runs on model training, where we change the pronoun distribution in the text for the tail portion of training. These should be done pretty soon, and we'll need help analyzing them. Hailey: can you explain precisely the intervention you've run, and then we can plan exactly how we are going to analyze the results given this new info?
Yes! I have rerun the last 5k steps (~15% of training) of our Pythia-1.3b-deduped model with specific tokens in the GPT-NeoX-20b tokenizer for male pronouns swapped out to female pronouns (see the full mapping here: EleutherAI/gpt-neox@df1bdca). I was running into issues because of a bug affecting evals, which has now been fixed. I intend to run evaluations on this intervened model on:
Would love to get input on any other evaluations that might be useful, or any similar papers worth reading! I'm particularly interested in anything to run with this model that is not just a 0-shot benchmark.
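The pronoun-swap intervention described above can be sketched roughly as follows. This is a minimal illustration only: the real intervention operates on GPT-NeoX-20b tokenizer token IDs, and the actual mapping is the one in the linked gpt-neox commit; the strings and swap set below are assumptions for illustration.

```python
# Illustrative pronoun-swap map (NOT the actual mapping from the linked
# commit, which works on tokenizer token IDs rather than strings).
PRONOUN_SWAP = {
    "he": "she",
    "He": "She",
    "him": "her",
    "his": "her",
    "His": "Her",
    "himself": "herself",
}

def swap_pronouns(tokens):
    """Replace male-pronoun tokens with their female counterparts,
    leaving all other tokens unchanged."""
    return [PRONOUN_SWAP.get(tok, tok) for tok in tokens]

print(swap_pronouns(["He", "said", "his", "dog", "bit", "him"]))
# → ['She', 'said', 'her', 'dog', 'bit', 'her']
```

Applied to the last ~15% of training data, a swap like this shifts the pronoun distribution the model sees in the tail of training, which is what the intervened Pythia-1.3b-deduped run then measures the effect of.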
Thanks @haileyschoelkopf, and sorry for the late reply. I'm not particularly aware of non-zero-shot benchmarks, but I can help out with the zero-shot ones if there's any need! Also, Happy New Year and Happy Holidays @StellaAthena @haileyschoelkopf 🎉
I did some poking around, and it seems that some of the leading (aka only) work on this topic is The Birth of Bias: A case study on the evolution of gender bias in an English language model. A couple of notable papers about bias evaluation and amplification include Trustworthy Social Bias Measurement and Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. See also the critical paper Undesirable biases in NLP: Averting a crisis of measurement.

This is a complicated and nuanced issue, and our goal is not to meaningfully advance the body of knowledge. It's to motivate why this question is interesting and pitch Pythia as a platform for studying it. This changes how we frame things: we can embrace critiques of particular measurement techniques, and calls for better analytic methods, that would undermine papers seeking to solve the problem directly. We also don't need to get an answer, let alone the answer. We just need to show that someone with more time and subject-matter expertise should seriously consider this as a platform for doing the work.

The first order of business is a lit review: we want to make a list of papers talking about bias amplification in NLP, or about bias and training dynamics in deep learning. We especially want to note any papers that argue for studying how bias evolves over time, note it as an interesting or worthwhile problem, or flag it as an avenue for future work. Extra bonus points for papers that mention this but say they can't study it due to insufficient model / data access (since that's the actual contribution of the paper).

@aflah02 I know this is a bit different from what you probably envisioned, but would you be interested in taking the lead on this lit review?
Hi @StellaAthena |
@aflah02 I don’t care about the format at all. It can be bullets, it can be paragraphs, it can be delivered orally. What matters is that you’re able to take the research you read and explain it to @haileyschoelkopf so she can decide what exact experiment we are going to run, and then to me so I can help write about why people interested in this question should use our model suite to answer it. This would need to have a pretty quick turnaround. We are targeting submission to ICML, which has a deadline of Jan 26th, and I think we’ll need this info by the 15th to be able to incorporate it properly. Do you think you can do that?
@StellaAthena That seems reasonable! I think I can do it. I'll get to work on this and share periodic updates here after I cover each paper. I also wrote a survey paper on debiasing methods some months ago for a writing class; it's not written in the best way, but it should give some ideas, so I'll share that as well.
Awesome! Again, I want to stress that the goal is not to write a survey paper: it’s to survey the field so we can pitch our model suite to people in the field effectively. Let’s plan on syncing up on Friday about your progress? Can you post in #interp-across-time on the Discord about setting that up?
Hey @aflah02, sorry, I've been partially AFK the past 2 days due to the holiday, so I haven't caught up with the literature Stella found so far! Would be great to follow up with you on our Discord so we can discuss this further.
@StellaAthena Sure! I'll make a post there. @haileyschoelkopf No worries, hope you had fun! I'll post updates there.
Pretty straightforward… we need to implement the following in the eval harness:
For a discussion of “recognizing bias” vs “reproducing bias” check out here. The primary goal is to look at how the development of understanding of bias correlates with the development of tendency to produce biased content.
It would also be interesting to look at correlations across categories of bias, e.g., does the model learn to reproduce and/or identify all types of bias at an equal rate? And if not, can we identify specific subsets of the Pile that are “biased in how they are biased” so to speak.
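The "reproducing bias" side of this could be measured with a CrowS-Pairs-style pairwise metric: score each stereotyping sentence against its minimally edited anti-stereotype counterpart and report how often the model prefers the stereotype (50% = no systematic preference). This is a hedged sketch under assumptions: the function names, scoring interface, and toy numbers below are illustrative stand-ins, not the eval harness's actual API.

```python
def bias_score(pairs, loglikelihood):
    """Fraction of (stereotype, anti-stereotype) sentence pairs for which
    the model assigns higher log-likelihood to the stereotyping sentence.
    A score of 0.5 means no systematic preference."""
    preferred = sum(
        1 for stereo, anti in pairs
        if loglikelihood(stereo) > loglikelihood(anti)
    )
    return preferred / len(pairs)

# Toy stand-in for a real LM scoring function (hypothetical numbers).
toy_scores = {
    "stereo_1": -10.0, "anti_1": -12.0,  # model prefers the stereotype
    "stereo_2": -8.0,  "anti_2": -7.5,   # model prefers the anti-stereotype
}
pairs = [("stereo_1", "anti_1"), ("stereo_2", "anti_2")]
print(bias_score(pairs, toy_scores.get))  # → 0.5
```

Computing this score per bias category at each training checkpoint would give the "reproducing bias" curve to correlate against a "recognizing bias" curve, as proposed above.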