Update documentation #392

Merged
merged 32 commits into from
Aug 21, 2021

StellaAthena
Member

No description provided.

@StellaAthena StellaAthena requested a review from a team as a code owner August 20, 2021 05:38
@StellaAthena StellaAthena linked an issue Aug 20, 2021 that may be closed by this pull request
@ShivanshuPurohit ShivanshuPurohit merged commit 1d46283 into main Aug 21, 2021
@ShivanshuPurohit ShivanshuPurohit deleted the documentation branch August 21, 2021 07:23
sdtblck added a commit that referenced this pull request Aug 21, 2021
* optimize data preprocessing

semaphore is a little too small and slows down tokenizing

* Make killall.sh less bruteforce

* [temporary] fix to index errors

* [temporary] fix to index errors

* print sizes of tensors when inspecting checkpoint (#382)

Co-authored-by: Samuel Weinbach <[email protected]>

* Use lru_cache for GPT2Tokenizer.bpe (#383)

GPT2Tokenizer currently uses an unbounded cache, which causes very
high memory usage with tools/preprocess_data.py
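
A minimal sketch of the idea, not the actual GPT2Tokenizer change: `ToyTokenizer`, `_bpe`, and `cache_size` are hypothetical names, and the "merge" step is a placeholder. It only illustrates how `functools.lru_cache` bounds the number of cached results, so memory stops growing once the cache is full.

```python
from functools import lru_cache


class ToyTokenizer:
    """Hypothetical stand-in for a BPE tokenizer (not the real GPT2Tokenizer)."""

    def __init__(self, cache_size=1024):
        self.calls = 0  # counts how often the uncached work actually runs
        # Wrap the bound method so each instance gets its own bounded cache.
        # Once cache_size entries exist, the least recently used one is evicted.
        self.bpe = lru_cache(maxsize=cache_size)(self._bpe)

    def _bpe(self, token):
        self.calls += 1
        # Placeholder for the real merge loop: split the token into characters.
        return " ".join(token)


if __name__ == "__main__":
    tok = ToyTokenizer(cache_size=2)
    for word in ["ab", "ab", "cd", "ef", "ab"]:
        tok.bpe(word)
    # cache_info() reports hits, misses, maxsize and the current size.
    print(tok.bpe.cache_info())
```

With an unbounded dict cache, every distinct token seen during preprocessing would stay in memory; with a bounded cache, rarely used tokens are recomputed instead.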

* Fix bug with number of evaluation steps (#384)

We were running way too many evaluation steps when the model is pipe parallel and has gradient accumulation steps (g.a.s.) on, because of this line:

```python
            for _ in range(neox_args.gradient_accumulation_steps):
```

- Setting the loop count to 1 when the model is pipe parallel fixes the issue, as .eval_batch() already takes gradient accumulation steps into account.
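
A rough sketch of the arithmetic behind this fix, assuming (as the note above says) that one `.eval_batch()` call on a pipe parallel model already consumes `gradient_accumulation_steps` micro-batches. The helper names below are hypothetical and are not part of the codebase.

```python
def eval_loop_count(is_pipe_parallel, gradient_accumulation_steps):
    """Inner-loop iterations per evaluation step after the fix."""
    # Pipe parallel: eval_batch() handles accumulation itself, so loop once.
    return 1 if is_pipe_parallel else gradient_accumulation_steps


def micro_batches_evaluated(is_pipe_parallel, gradient_accumulation_steps, fixed):
    """Micro-batches consumed per evaluation step, before or after the fix."""
    if fixed:
        loops = eval_loop_count(is_pipe_parallel, gradient_accumulation_steps)
    else:
        # Old behaviour: always loop gradient_accumulation_steps times.
        loops = gradient_accumulation_steps
    # Assumption: a pipe parallel eval_batch() call consumes
    # gradient_accumulation_steps micro-batches; otherwise one per call.
    per_call = gradient_accumulation_steps if is_pipe_parallel else 1
    return loops * per_call


if __name__ == "__main__":
    gas = 4
    print("pipe parallel, before fix:", micro_batches_evaluated(True, gas, fixed=False))
    print("pipe parallel, after fix: ", micro_batches_evaluated(True, gas, fixed=True))
    print("no pipe parallel:         ", micro_batches_evaluated(False, gas, fixed=True))
```

Under these assumptions the old loop evaluated the square of the intended number of micro-batches in the pipe parallel case, while the non pipe parallel path is unchanged.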

* Create CITATION.cff

* Update CITATION.cff

* Update documentation (#392)

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* add info about installing fused kernels

* Update README.md

* Update README.md

* sparsity + minor typos

add the instructions to install triton

* change path to ssd-1

* typo

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

Co-authored-by: Shivanshu Purohit <[email protected]>

Co-authored-by: Stella Biderman <[email protected]>
Co-authored-by: Samuel Weinbach <[email protected]>
Co-authored-by: Samuel Weinbach <[email protected]>
Co-authored-by: iczero <[email protected]>
Co-authored-by: Shivanshu Purohit <[email protected]>
Successfully merging this pull request may close these issues.

Write Sampling Documentation
2 participants