Remove the NeoX implementation of GPT2Tokenizer #1042

Merged: 3 commits merged into main from 999-tokenizer-double-import on Sep 25, 2023

Conversation

dashstander (Contributor)
As noted in #999, the name GPT2Tokenizer was shadowed: it was imported both from HuggingFace transformers and from our own implementation of the same class. Our version would only have been used if someone set "tokenizer_type": "GPT2BPETokenizer"; otherwise the HuggingFace "fast" tokenizer is used. In my tests the two appear to be equivalent, so removing ours means one less file to maintain.
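For illustration, here is a minimal sketch (not the PR's actual test) of how one might check that HuggingFace's "slow" GPT2Tokenizer and GPT2TokenizerFast produce the same token IDs; the sample strings are made up for the example.

```python
# Sketch: compare HuggingFace's GPT2Tokenizer ("slow") with GPT2TokenizerFast
# to confirm they produce identical token IDs for a few sample inputs.
from transformers import GPT2Tokenizer, GPT2TokenizerFast

slow = GPT2Tokenizer.from_pretrained("gpt2")
fast = GPT2TokenizerFast.from_pretrained("gpt2")

samples = [
    "Hello world!",
    "GPT-NeoX uses byte-pair encoding.",
    "  leading spaces and unicode: café",
]

for text in samples:
    assert slow.encode(text) == fast.encode(text), f"Mismatch on: {text!r}"

print("Slow and fast GPT-2 tokenizers agree on all samples.")
```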

@dashstander dashstander requested a review from a team as a code owner September 25, 2023 13:59
@dashstander dashstander linked an issue Sep 25, 2023 that may be closed by this pull request
@Quentin-Anthony Quentin-Anthony merged commit 2ab05be into main Sep 25, 2023
2 checks passed
@Quentin-Anthony Quentin-Anthony deleted the 999-tokenizer-double-import branch September 25, 2023 23:12
Successfully merging this pull request may close: The class with the same name was imported twice (#999)