Commit

update LazyPretrainDataset
yangjianxin1 committed Jan 8, 2024
1 parent a93978f commit f65243d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion component/dataset.py
@@ -502,7 +502,7 @@ def __iter__(self):
load_from_cache_file=True,
keep_in_memory=False,
cache_file_names={k: os.path.join(self.cache_dir, file_name, 'tokenized.arrow') for k in raw_dataset},
- desc=f"Tokenizing data: {file}",
+ desc=f"Tokenizing data",
)

# Concatenate all tokens
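
For context on the unchanged `cache_file_names` argument in this hunk, here is a minimal sketch of how that dict comprehension resolves. The values of `cache_dir` and `file_name` and the split name are hypothetical stand-ins, and a plain dict stands in for `raw_dataset`; only the comprehension itself comes from the diff.

```python
import os

# Hypothetical stand-ins; in the commit these come from the surrounding class.
cache_dir = "cache"
file_name = "part-0"
raw_dataset = {"train": None}  # placeholder for a split-name -> dataset mapping

# Same comprehension as in the diff: one entry per key in raw_dataset.
cache_file_names = {
    k: os.path.join(cache_dir, file_name, "tokenized.arrow") for k in raw_dataset
}

print(cache_file_names)
```

Note that the path does not depend on `k`, so every key in `raw_dataset` receives the same cache path.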