Commit

default get_tokenizer to set padding token
sdtblck committed Jan 13, 2021
1 parent 0b1a862 commit 2252c90
Showing 1 changed file with 1 addition and 1 deletion.
gpt_neox/data_utils.py (2 changes: 1 addition & 1 deletion)
@@ -28,7 +28,7 @@ def natural_sort(l):
     return sorted(l, key=alphanum_key)


-def get_tokenizer(tokenizer_type=None, from_pretrained=True, add_padding_token=False):
+def get_tokenizer(tokenizer_type=None, from_pretrained=True, add_padding_token=True):
     if tokenizer_type is None or (tokenizer_type.lower() == "hf_gpt2tokenizerfast" and from_pretrained):
         tok = GPT2TokenizerFast.from_pretrained('gpt2')
         if add_padding_token:
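
Note on the change: the pretrained GPT-2 tokenizer ships without a padding token, so batching sequences of different lengths fails unless one is registered first. The sketch below illustrates why defaulting add_padding_token to True is convenient. It does not reproduce the body of the `if add_padding_token:` branch, which is not visible in this diff. ToyTokenizer, get_toy_tokenizer, pad_batch, and the "<pad>" string are hypothetical names invented for illustration; they are not part of gpt_neox or the Hugging Face API.

```python
from typing import List, Optional


class ToyTokenizer:
    """Hypothetical stand-in for a GPT-2 style tokenizer (not the Hugging Face API)."""

    def __init__(self) -> None:
        self.vocab = {"hello": 0, "world": 1, "again": 2}
        # Like pretrained GPT-2, start without any padding token.
        self.pad_token_id: Optional[int] = None

    def add_pad_token(self, token: str = "<pad>") -> None:
        # Register the pad token under the next free id.
        self.vocab[token] = len(self.vocab)
        self.pad_token_id = self.vocab[token]

    def encode(self, text: str) -> List[int]:
        return [self.vocab[word] for word in text.split()]


def get_toy_tokenizer(add_padding_token: bool = True) -> ToyTokenizer:
    # Mirrors the new default in the diff: a pad token is set unless the caller opts out.
    tok = ToyTokenizer()
    if add_padding_token:
        tok.add_pad_token()
    return tok


def pad_batch(tok: ToyTokenizer, texts: List[str]) -> List[List[int]]:
    # Right-pad every sequence to the length of the longest one in the batch.
    if tok.pad_token_id is None:
        raise ValueError("tokenizer has no padding token; cannot pad a batch")
    encoded = [tok.encode(t) for t in texts]
    width = max(len(ids) for ids in encoded)
    return [ids + [tok.pad_token_id] * (width - len(ids)) for ids in encoded]


if __name__ == "__main__":
    batch = pad_batch(get_toy_tokenizer(), ["hello world again", "hello"])
    print(batch)  # [[0, 1, 2], [0, 3, 3]]
```

With the old default (add_padding_token=False), the same pad_batch call would raise ValueError unless the caller remembered to pass add_padding_token=True explicitly.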
