Commit

remove enwik8 data from repository
lucidrains committed Dec 26, 2020
1 parent f38c371 commit f376295
Showing 4 changed files with 4 additions and 9 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
## Data
data/

## Python
__pycache__/

6 changes: 0 additions & 6 deletions README.md
@@ -7,12 +7,6 @@ An implementation of model parallel GPT-3-like models on GPUs, based on the Deep
$ pip install -r requirements.txt
```

Test locally

```bash
$ python train_enwik8.py
```

Test deepspeed locally

```bash
3 changes: 0 additions & 3 deletions data/README.md

This file was deleted.

1 change: 1 addition & 0 deletions gpt_neox/utils.py
@@ -13,6 +13,7 @@

def prepare_enwik8_data():
    if not os.path.isfile('./data/enwik8.gz'):
        os.system('mkdir -p ./data')
        os.system('wget http://eaidata.bmk.sh/data/enwik8.gz -O ./data/enwik8.gz')

    with gzip.open('./data/enwik8.gz') as file:
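
For reference, the download step in this hunk shells out to `mkdir` and `wget`, which assumes both are available on the host. A minimal sketch of the same check-then-download logic using only the standard library might look like the following. The helper name `ensure_enwik8`, its parameters, and the module-level constants are hypothetical and not part of this commit; the URL and path are assumed to be the ones used in the hunk.

```python
import os
import urllib.request

# Assumption: same URL and local path as in the hunk above.
ENWIK8_URL = 'http://eaidata.bmk.sh/data/enwik8.gz'
ENWIK8_PATH = './data/enwik8.gz'


def ensure_enwik8(path=ENWIK8_PATH, url=ENWIK8_URL,
                  fetch=urllib.request.urlretrieve):
    """Download the archive to `path` unless it already exists.

    Hypothetical helper, not the repository's code. Returns True if a
    download was performed and False if the file was already present.
    """
    if os.path.isfile(path):
        return False  # already cached, nothing to do

    # Counterpart of `mkdir -p`: create the parent directory if needed.
    os.makedirs(os.path.dirname(path) or '.', exist_ok=True)

    # Counterpart of `wget URL -O path`. `fetch` is a parameter so the
    # network call can be replaced with a stub when testing offline.
    fetch(url, path)
    return True
```

Calling `ensure_enwik8()` with no arguments would fetch the archive into `./data/` on first use and do nothing on later calls. Unlike `os.system`, a failed download raises an exception here instead of being silently ignored.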
