forked from EleutherAI/gpt-neox
[pull] main from EleutherAI:main #1
Merged
Conversation
* Code formatting
* Import it differently
* Bump up triton dependency
* No kwargs
* Get the signature right
* Add dim for num heads to bias
* Think I have the shape right now?
* blegh shapes
* Need to get the triton version just right
* Remove debug print statements
* Need to permute the dimensions before returning
* Update SparseAttention signature
* Clean up code.
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Pass iteration to setup_model_and_optimizer
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix list[tensor] typing in both scripts
* Add bf16 saving to conversion scripts
* Make precision check more complex for v1.0
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
gh: move `CODEOWNERS` inside the `.github/` dir
This PR makes a few tiny changes to improve the overall quality of the Docker image 🐳. For reference, more annotations can be found [here](https://github.com/opencontainers/image-spec/blob/main/annotations.md).
feat(docker): Add opencontainers image-spec to `Dockerfile`
Pythia configs
Fix README formatting
* Update README.md to fix a broken link
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix bugs so we can use bf16 with zero > 0
* Typo
* With the DeepSpeed updates there may be no need to do grad_accum in fp32
* Add warning about necessity of fp32 grad_accum with bf16, pp > 0, and zero1
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
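The commit above warns that gradient accumulation may need to stay in fp32 when training in bf16 with pipeline parallelism and ZeRO stage 1. The commit message gives no code, but the numerical reason is easy to demonstrate: bf16 keeps only 8 mantissa bits, so a bf16 accumulator stops advancing once each small gradient contribution falls below half a unit in the last place. This minimal sketch (not the repo's implementation) simulates bf16 by truncating a float32 mantissa:

```python
import struct

def to_bf16(x: float) -> float:
    """Round a float to bfloat16 precision by truncating the
    float32 mantissa to bf16's 7 explicit bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# Accumulate 10,000 tiny gradient contributions.
grad = 1e-4
steps = 10_000

acc_bf16 = 0.0
acc_fp32 = 0.0
for _ in range(steps):
    # bf16 accumulator: stalls once its ulp exceeds the addend
    acc_bf16 = to_bf16(acc_bf16 + to_bf16(grad))
    # fp32 (here: Python float) accumulator stays accurate
    acc_fp32 += grad

exact = grad * steps  # 1.0; acc_bf16 ends up orders of magnitude short
```

The bf16 sum stalls around 0.016 while the high-precision sum reaches 1.0, which is why frameworks keep a high-precision "master" accumulator even when parameters and activations are bf16.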
* Remove lazy dataset implementation option
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix SequentialGeneration
* Fix register_buffer parameter
* Add flash 2.x message to README.md
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
* Add s3 checkpoint syncing
* Remove CPCargo requirement
* Make s3 imports try-except and separate requirements to s3 file
* Announce feature
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
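One of the commits above makes the s3 imports try-except so the optional dependency lives in a separate requirements file. A common shape for that pattern is sketched below; it is an illustration, not the repo's code, and the `boto3` client and `upload_checkpoint` helper are assumptions:

```python
# Optional-dependency guard: import at module load, but only fail
# (with a helpful message) when the feature is actually used.
try:
    import boto3  # assumed S3 client, listed in a separate s3 requirements file
    HAVE_S3 = True
except ImportError:
    boto3 = None
    HAVE_S3 = False

def upload_checkpoint(local_path: str, bucket: str, key: str) -> None:
    """Hypothetical helper: sync one checkpoint file to S3."""
    if not HAVE_S3:
        raise RuntimeError(
            "s3 checkpoint syncing requires boto3; "
            "install the separate s3 requirements file to enable it."
        )
    boto3.client("s3").upload_file(local_path, bucket, key)
```

The benefit over a bare `import boto3` at the top of the training script is that users who never touch S3 syncing pay no install cost and see no import error.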
* Try out just using the HF implementation
* Rely solely on HF tokenizer.
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
* Pre-commit
* Sequence dimension is 0
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Ensure that LR annealing is correct even after loading from checkpoint (patch from Eric Nguyen)
* Test whether we need the whole patch
* Turns out we do not need the entire patch, just one line
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: Eric Nguyen <[email protected]>
Co-authored-by: github-actions <[email protected]>
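The LR-annealing fix above is about resuming from a checkpoint without distorting the schedule. The commit does not show the one-line change, but the underlying idea is standard: make the learning rate a pure function of the loaded iteration, so a resumed run computes exactly the values an uninterrupted run would. A framework-free sketch (the warmup/decay constants are illustrative, not the repo's defaults):

```python
import math

def lr_at(step: int, max_lr: float = 6e-4, min_lr: float = 6e-5,
          warmup: int = 100, total: int = 1000) -> float:
    """Stateless schedule: linear warmup, then cosine decay to min_lr."""
    if step < warmup:
        return max_lr * step / warmup
    frac = (step - warmup) / (total - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * min(frac, 1.0)))

# Because the LR depends only on the step number, "resuming" at step 500
# from a checkpoint reproduces the tail of the uninterrupted schedule.
fresh = [lr_at(s) for s in range(1000)]
resumed = [lr_at(s) for s in range(500, 1000)]
assert fresh[500:] == resumed
```

Schedulers that instead mutate internal state each step must have that state restored (or recomputed from the iteration) at load time, which is exactly the class of bug the patch addresses.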
* Use Megatron-DeepSpeed flops calculation
* Direct comparison of FLOPS calculations
* Remove test logging
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
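The FLOPs commit above switches to the Megatron-DeepSpeed calculation. The commonly cited formula from the Megatron-LM line of work is sketched below; whether it matches the repo's version byte-for-byte is an assumption, and the 4x-forward factor applies only with activation recomputation (3x without):

```python
def train_flops_per_iteration(batch, seq, layers, hidden, vocab,
                              checkpoint_activations=True):
    """Approximate dense-transformer training FLOPs per iteration.

    Forward cost is ~24*B*s*l*h^2 times the parenthesized correction for
    attention (s/6h) and the output logits (V/16lh); training multiplies
    that by 4 with activation recomputation (fwd + recompute + bwd) or
    by 3 without.
    """
    passes = 4 if checkpoint_activations else 3
    return (passes * 24 * batch * seq * layers * hidden ** 2
            * (1 + seq / (6 * hidden) + vocab / (16 * layers * hidden)))
```

Dividing this by iteration time and device count gives the per-GPU model FLOPS figure that such "direct comparison" commits typically validate.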
* Add boilerplate Coverity scan to submit to public analysis
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Add documentation about kicking off distributed jobs
* Add more info on run command modification and clean up a bit
* Slight cleanup
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix README typo
* More typos
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
* Update CITATION.cff
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
* Re-organize the folder
* Add README.md files for each subdirectory
* Clarify the difference between HF scripts
* Fix tools paths, including the megatron imports
* Flesh out the ckpts README, the bash tools README, and the datasets README
* Delete tools/ckpts/merge_mp_partitions.py since it's based on a very old Megatron
* Formatting
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: Stella Biderman <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Add documentation and an informative error
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
* Add lr_scheduler check for inference
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
* Initial commit
* Add test set; fix README and docstring
* Refactor Lion implementation
---------
Co-authored-by: kamathis4 <[email protected]>
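The last commit set above lands a Lion optimizer implementation. The published Lion update rule (Chen et al., 2023) is compact enough to sketch on plain Python lists; this is an illustration of the algorithm, not the repo's refactored implementation:

```python
def lion_step(params, grads, momentum, lr=1e-4,
              beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update on parallel lists of scalars.

    Lion applies the *sign* of an interpolated momentum as the update:
        u = sign(beta1 * m + (1 - beta1) * g)
        p <- p - lr * (u + wd * p)      # decoupled weight decay
        m <- beta2 * m + (1 - beta2) * g
    """
    sign = lambda x: (x > 0) - (x < 0)
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        update = sign(beta1 * m + (1 - beta1) * g)
        new_params.append(p - lr * (update + wd * p))
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum
```

Because the update magnitude is always exactly `lr` per coordinate (plus weight decay), Lion is typically run with a learning rate several times smaller than AdamW's.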
Created by pull[bot]