Mistake in readme #81

Closed
zplizzi opened this issue Mar 28, 2023 · 1 comment

Comments

@zplizzi

zplizzi commented Mar 28, 2023

In the README it says:

To download and use the deduplicated Pile training data, run:

git lfs clone https://huggingface.co/datasets/EleutherAI/pythia_pile_idxmaps

python utils/unshard_memmap.py --input_file ./pythia_pile_idxmaps/pile_0.87_deduped_text_document-00000-of-00082.bin --num_shards 83 --output_dir ./pythia_pile_idxmaps/

But the git lfs clone URL should actually point to EleutherAI/pythia_deduped_pile_idxmaps, since these instructions are for the deduplicated data.
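For illustration, here is a rough sketch of what the corrected commands would presumably look like. It only prints the commands rather than running them. The repo name comes from this issue; the assumption (not verified here) is that the shard file names inside the deduplicated repo match the ones quoted from the README, and that the local paths follow the name of the cloned directory. The variable names are made up for this sketch.

```shell
# Assumption: repo name as suggested in this issue.
REPO=pythia_deduped_pile_idxmaps
NUM_SHARDS=83

# Shards are numbered from 00000, so the last index is NUM_SHARDS - 1,
# which is why the file name says "of-00082" while --num_shards is 83.
LAST=$(printf '%05d' $((NUM_SHARDS - 1)))

# Assumption: file name prefix is the same as in the README snippet.
FIRST_SHARD="./${REPO}/pile_0.87_deduped_text_document-00000-of-${LAST}.bin"

# Print the commands instead of executing them (dry run).
echo "git lfs clone https://huggingface.co/datasets/EleutherAI/${REPO}"
echo "python utils/unshard_memmap.py --input_file ${FIRST_SHARD} --num_shards ${NUM_SHARDS} --output_dir ./${REPO}/"
```

Note that git names the local directory after the repo, so the paths passed to unshard_memmap.py would likely need to change along with the URL.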

@haileyschoelkopf
Collaborator

Thanks for raising this :)! Resolved in the latest README update.
