Commit

add descriptive errors for unimplemented class methods
haileyschoelkopf committed Jan 31, 2023
1 parent 9879ef2 commit 3463ca4
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions megatron/tokenizer/tokenizer.py
@@ -368,11 +368,12 @@ def vocab_size(self):

     @property
     def vocab(self):
-        raise NotImplementedError
+        raise NotImplementedError("TiktokenTokenizer does not implement vocabulary access.")

     @property
     def inv_vocab(self):
-        raise NotImplementedError
+        raise NotImplementedError("TiktokenTokenizer does not implement vocabulary access. \
+        To get the idx-th token in vocabulary, use tokenizer.decode([idx]) .")

     def tokenize(self, text: str):
         return self.tokenizer.encode(text) #, allowed_special="all")
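
A minimal sketch of how a caller might react to the new error message, assuming a toy stand-in for the wrapped encoder. `ToyEncoding`, `ToyTokenizer`, and `token_at` are hypothetical names used only for illustration; they are not part of megatron or tiktoken. The sketch builds the message from adjacent string literals rather than a backslash continuation, because a backslash continuation inside a string literal keeps the next line's leading whitespace in the message.

```python
class ToyEncoding:
    """Hypothetical stand-in for the wrapped encoder (not the tiktoken API)."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._ids = {tok: i for i, tok in enumerate(self._tokens)}

    def encode(self, text):
        # Whitespace-split lookup; a real encoder would use BPE merges.
        return [self._ids[tok] for tok in text.split()]

    def decode(self, ids):
        return " ".join(self._tokens[i] for i in ids)


class ToyTokenizer:
    """Mirrors the commit's pattern: vocabulary access raises a descriptive error."""

    def __init__(self, encoding):
        self.tokenizer = encoding

    @property
    def inv_vocab(self):
        # Adjacent string literals are concatenated by the parser, so no
        # indentation whitespace ends up inside the message.
        raise NotImplementedError(
            "ToyTokenizer does not implement vocabulary access. "
            "To get the idx-th token in vocabulary, use tokenizer.decode([idx])."
        )

    def tokenize(self, text):
        return self.tokenizer.encode(text)

    def decode(self, ids):
        return self.tokenizer.decode(ids)


def token_at(tokenizer, idx):
    """Return the idx-th token, falling back to decode([idx]) as the error suggests."""
    try:
        # Evaluating the property raises before any indexing happens.
        return tokenizer.inv_vocab[idx]
    except NotImplementedError:
        return tokenizer.decode([idx])


tok = ToyTokenizer(ToyEncoding(["hello", "world"]))
print(token_at(tok, 1))
```

Catching `NotImplementedError` keeps the helper usable with tokenizers that do expose `inv_vocab` as well as those that only support `decode`.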