Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable word2vec and add explanations #1239

Merged
merged 4 commits into from
Oct 6, 2021
Merged

Disable word2vec and add explanations #1239

merged 4 commits into from
Oct 6, 2021

Conversation

bomanimc
Copy link
Member

@bomanimc bomanimc commented Oct 3, 2021

We are disabling the word2Vec function to reduce the risk that it produces harmful language using model files we included with our examples. See issue #1238 for additional context on this change.

Reviewers: Please help me catch any spelling or grammatical issues in the notices. I've also left some inline questions.

Images of Updates

Screen Shot 2021-10-03 at 8 23 27 AM

Screen Shot 2021-10-03 at 8 41 07 AM

Copy link
Member

@shiffman shiffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I added the word "pre-trained" in one spot as a suggestion but this is minor point, it is fine either way! Thank you for your thoughtful approach to this critical work!

docs/reference/word2vec.md Outdated Show resolved Hide resolved
@bomanimc bomanimc requested a review from tlsaeger October 4, 2021 12:41
@tlsaeger
Copy link
Member

tlsaeger commented Oct 4, 2021

We could maybe add the link to the Twitter thread instead of the whole Twitter account. It's a little bit a snake biting its own tail, so we need to wait till we have this thread so we can update again? Does that make sense?

Copy link
Contributor

@joeyklee joeyklee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bomanimc ! Whoa - thanks for catching this and for taking action. I didn't realize that these were parts of the pre-trained models!

✅ on my side to disabling and reevaluating how we might continue.

A few non-blocking comments/questions:

  1. Would it make sense at all to delete the pre-model files and prompt the user: "in order to use word2vec you must supply your own pre-trained models?" instead of removing the word2vec feature all together? -- this assumes we keep the word2vec feature as is with the assumption that it is the models that are problematic and not the ml5 functionality itself.
  2. Another provocation would be to imagine what word2vec looks like w/ some filtering built in where we use something like https://github.com/dariusk/wordfilter to remove words we don't really want to show up in a "bot" like environment. This could be something we integrate on the ml5 side 🤔

Generally speaking, I think this approach of disabling a feature > making space to think and evaluate > reapproaching the topic/feature feels like it reflects the values of the project to forefront the "thinking and processing" part of this work over the "just doing" part.

Thank you! 🙏

Co-authored-by: Daniel Shiffman <[email protected]>
@bomanimc
Copy link
Member Author

bomanimc commented Oct 6, 2021

Thanks for comments!

@tlsaeger the suggestion to link directly to the thread makes sense! Maybe we can come back and update these links after posting, since it'd be good to have something in place before we post.

@joeyklee thanks for these comments. These were ideas we consider as well (the first discussed in the Discord #maintenance channel) and the second @c-dacanay and I discussed with Allison Parrish. Overall, it seems like the team leans to this approach of disabling to reduce the immediate harm that could be posed by cases where people are using the example pre-trained model that includes slurs, but would love to continue the discussion with your helpful thoughts!

@bomanimc bomanimc merged commit 99c70f4 into main Oct 6, 2021
@bomanimc bomanimc deleted the bm-disable-word2vec branch October 6, 2021 14:01
@joeyklee
Copy link
Contributor

@bomanimc -- Thanks so much for the update and the explanation. Apologies on my part for not being tuning in! This sounds good! 🙏 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants