Merge pull request #783 from Keith-Hon/patch-1

Update README.md
EleutherAI · Feb 7, 2023 · e48b0c4
2 parents 26ef16d + c9bc330
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -181,7 +181,7 @@ The tokenized data will be saved out to two files: `[data-dir]/[dataset-name]/[d
 
 ## Using Custom Data
 
-To prepare your own dataset for training with custom data, format it as one large [jsonl](https://jsonlines.org/)-formatted file with each item in the list of dictionaries being a separate document. The document text should be grouped under one JSON key, i.e `"text"`. Any auxiliary data stored in other fields will not be
+To prepare your own dataset for training with custom data, format it as one large [jsonl](https://jsonlines.org/)-formatted file with each item in the list of dictionaries being a separate document. The document text should be grouped under one JSON key, i.e `"text"`. Any auxiliary data stored in other fields will not be used.
 
 Next make sure to download the GPT2 tokenizer vocab, and merge files from the following links:
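
Note on the hunk above: the JSONL layout described under "Using Custom Data" can be sketched roughly as follows. This is only an illustration, not code from the repository. The helper names `write_jsonl` and `read_texts` and the file name `my_dataset.jsonl` are hypothetical; the only detail taken from the README is that each line is one JSON object and the document text lives under a single key such as `"text"`.

```python
import json


def write_jsonl(documents, path, key="text"):
    """Write one JSON object per line, storing each document's text under `key`."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in documents:
            # json.dumps escapes embedded newlines, so each document stays on one line.
            f.write(json.dumps({key: doc}) + "\n")


def read_texts(path, key="text"):
    """Read the file back, keeping only the value stored under `key`."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line)[key] for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    write_jsonl(["First document.\nSecond line.", "Another document."], "my_dataset.jsonl")
    print(read_texts("my_dataset.jsonl"))
```

Reading back only the `"text"` value mirrors the README's statement that auxiliary fields are not used.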