forked from EleutherAI/gpt-neox
Update branch #38
Merged
Conversation
Reqs correction
* llama
* spm tokenizer
* pipeline
* llama to neox conversion script
* llama checkin
* weights script update and pp reversion
* revert for PR
* configs
* 7B-specific tweak
* LLaMA updates
* PR feedback
* initialize multiple_of in calculate_derived

Co-authored-by: Quentin-Anthony <[email protected]>
Fix minor issues
Disable row-parallelism for now
* [bug-fix] enable finetuning option (set optimizer params correctly)
* change load_checkpoint

Co-authored-by: logan.eo <[email protected]>
[Bug] Make Configs Consistent
* fix list[tensor] typing in both scripts
* add bf16 saving to conversion scripts
* make precision check more complex for v1.0
* Update NeoXArgs docs automatically

Co-authored-by: haileyschoelkopf <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
remove password based login for root
* add bf16 configuration
* pre commit
* Rework deriving precision
* Belt and suspenders
* Make the default setup (of only using fp16 dict) work
* Got rid of bf16 argument
* Re-add detailed bf16 message
* Remove unused import
* remove useless newline
* re-add detailed bf16 message to deepspeed_args
* Update NeoXArgs docs automatically (repeated after each change above)

Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
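The precision rework above reduces to one rule: derive `precision` from the DeepSpeed-style `fp16`/`bf16` dicts rather than from a separate bf16 argument. A minimal sketch of that rule with illustrative names (`derive_precision` is not the actual gpt-neox helper):

```python
def derive_precision(fp16_conf=None, bf16_conf=None):
    # Illustrative only: mirror the "fp16 dict wins, then bf16,
    # else fp32" derivation described in the commit above.
    if fp16_conf and fp16_conf.get("enabled", False):
        return "fp16"
    if bf16_conf and bf16_conf.get("enabled", False):
        return "bfloat16"
    return "fp32"

# The default setup, with only an fp16 dict, still works:
assert derive_precision(fp16_conf={"enabled": True}) == "fp16"
assert derive_precision() == "fp32"
```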
* update torch and cuda
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
Remove duplicate deepspeed config and allow forced multinode
* Pre-commit
* Do not check for overflow if not using fp16
* Update NeoXArgs docs automatically

Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Fix yml error
* added a simple script for multi-node data preparation
* fixed minor bugs regarding prefixing of the .bin and .idx files
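The `.bin`/`.idx` prefixing refers to the Megatron-style indexed-dataset convention, where a single output prefix names a data/index file pair. A sketch of that convention (the `_text_document` suffix is the common default; treat the helper itself as hypothetical):

```python
from pathlib import Path

def indexed_dataset_paths(output_prefix: str, key: str = "text",
                          level: str = "document"):
    # e.g. "mydata" -> mydata_text_document.bin / mydata_text_document.idx
    base = f"{output_prefix}_{key}_{level}"
    return Path(base + ".bin"), Path(base + ".idx")

bin_path, idx_path = indexed_dataset_paths("mydata")
print(bin_path, idx_path)
```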
…heck (#959)
* update conversion script instructions in readme
* rename v1.0 script (now default for 2.0) to module_to_hf
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* fill in minimal possible mask values
* initialize tensor on the target device

Co-authored-by: Quentin Anthony <[email protected]>
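"Fill in minimal possible mask values" most likely means filling masked attention positions with the most negative value representable in the score dtype, rather than a hardcoded constant like -10000 that misbehaves in fp16/bf16. A hedged sketch of that idea, not the conversion script's exact code:

```python
import torch

def additive_attention_mask(keep: torch.Tensor, dtype=torch.float16):
    # keep: boolean tensor, True where attention is allowed.
    # Masked positions get the dtype's most negative finite value.
    min_val = torch.finfo(dtype).min
    mask = torch.full(keep.shape, min_val, dtype=dtype)
    return torch.where(keep, torch.zeros_like(mask), mask)

keep = torch.tensor([[True, True, False]])
print(additive_attention_mask(keep))  # [[0., 0., -65504.]] in fp16
```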
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* added fast GeLU for the HF model, added barriers to enable conversion across multiple nodes, removed partially hardcoded pythia model name
* commented out unnecessary logging and timers

Co-authored-by: Quentin Anthony <[email protected]>
* add an optional `label` field passed in parallel with training data
* minor fix; add docs
* fix: data can be None
* prevent loading optimizer
* add script
* remove some print() statements, make mask documentation clearer
* add documentation for preprocess_data_with_mask.py

Co-authored-by: Hailey Schoelkopf <[email protected]>
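The parallel `label` stream exists so that loss is computed only on chosen token positions (e.g. completions but not prompts). A generic sketch of masked cross-entropy under that assumption; this is the standard pattern, not necessarily the script's exact wiring:

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, loss_mask):
    # logits: [batch, seq, vocab]; targets, loss_mask: [batch, seq].
    # Positions with loss_mask == 0 contribute nothing to the loss.
    losses = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    )
    mask = loss_mask.reshape(-1).float()
    return (losses * mask).sum() / mask.sum().clamp(min=1.0)
```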
* fix tensorboard version
* add wandb/tensorboard setup
* Update gpt2_dataset.py
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
Dataload fix
For BPE tokenization, the `tokenizer_type` argument should be a string rather than a list.
Default tokenizer_type should be a string
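For illustration, the shape of the fix in terms of a parsed config (the checking helper is hypothetical, not gpt-neox API; `GPT2BPETokenizer` is one of the real `tokenizer_type` values):

```python
# Wrong: a one-element list where a plain string is expected.
bad_conf = {"tokenizer_type": ["GPT2BPETokenizer"]}

# Right: tokenizer_type is a plain string.
good_conf = {"tokenizer_type": "GPT2BPETokenizer"}

def check_tokenizer_type(conf):
    # Hypothetical validation: reject list-valued tokenizer_type.
    tt = conf["tokenizer_type"]
    if not isinstance(tt, str):
        raise TypeError(f"tokenizer_type must be a string, got {type(tt).__name__}")
    return tt
```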
Replaced all torch.concat with torch.cat
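`torch.concat` is only an alias for `torch.cat` added in newer PyTorch releases (around 1.10), so using `torch.cat` keeps the code compatible with older versions; the two behave identically where both exist:

```python
import torch

a = torch.ones(2, 3)
b = torch.zeros(2, 3)

# torch.cat exists on every PyTorch version; torch.concat is a
# later-added alias and is missing from older releases.
stacked = torch.cat([a, b], dim=0)
print(stacked.shape)  # torch.Size([4, 3])
```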