forked from EleutherAI/gpt-neox
Update branch #38
Merged
Conversation
Reqs correction
* llama
* spm tokenizer
* pipeline
* llama to neox conversion script
* llama checkin
* weights script update and pp reversion
* revert for PR
* configs
* 7B-specific tweak
* LLaMA updates
* PR feedback
* initialize multiple_of in calculate_derived

Co-authored-by: Quentin-Anthony <[email protected]>
Fix minor issues
Disable row-parallelism for now
* [bug-fix] enable finetuning option (set optimizer params correctly)
* change load_checkpoint

Co-authored-by: logan.eo <[email protected]>
[Bug] Make Configs Consistent
* fix list[tensor] typing in both scripts
* add bf16 saving to conversion scripts
* make precision check more complex for v1.0
* Update NeoXArgs docs automatically

Co-authored-by: haileyschoelkopf <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
remove password based login for root
* add bf16 configuration
* pre commit
* Rework deriving precision
* Belt and suspenders
* Make the default setup (of only using fp16 dict) work
* Got rid of bf16 argument
* Re-add detailed bf16 message
* Remove unused import
* remove useless newline
* re-add detailed bf16 message to deepspeed_args
* Update NeoXArgs docs automatically (repeated after each change above)

Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
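The precision rework above reduces to one rule: derive `precision` from the DeepSpeed-style `fp16`/`bf16` dicts rather than from a separate bf16 argument. A minimal sketch of that rule with illustrative names (`derive_precision` is not the actual gpt-neox helper):

```python
def derive_precision(fp16_conf=None, bf16_conf=None):
    # Illustrative only: mirror the "fp16 dict wins, then bf16,
    # else fp32" derivation described in the commit above.
    if fp16_conf and fp16_conf.get("enabled", False):
        return "fp16"
    if bf16_conf and bf16_conf.get("enabled", False):
        return "bfloat16"
    return "fp32"

# The default setup, with only an fp16 dict, still works:
assert derive_precision(fp16_conf={"enabled": True}) == "fp16"
assert derive_precision() == "fp32"
```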
* update torch and cuda
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
Remove duplicate deepspeed config and allow forced multinode
* Pre-commit
* Do not check for overflow if not using fp16
* Update NeoXArgs docs automatically

Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Fix yml error
* added a simple script for multi-node data preparation
* fixed minor bugs regarding prefixing of the .bin and .idx files
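The `.bin`/`.idx` prefixing refers to the Megatron-style indexed-dataset convention, where a single output prefix names a data/index file pair. A sketch of that convention (the `_text_document` suffix is the common default; treat the helper itself as hypothetical):

```python
from pathlib import Path

def indexed_dataset_paths(output_prefix: str, key: str = "text",
                          level: str = "document"):
    # e.g. "mydata" -> mydata_text_document.bin / mydata_text_document.idx
    base = f"{output_prefix}_{key}_{level}"
    return Path(base + ".bin"), Path(base + ".idx")

bin_path, idx_path = indexed_dataset_paths("mydata")
print(bin_path, idx_path)
```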
…heck (#959)
* update conversion script instructions in readme
* rename v1.0 script (now default for 2.0) to module_to_hf
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* fill in minimal possible mask values
* initialize tensor on the target device

Co-authored-by: Quentin Anthony <[email protected]>
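"Fill in minimal possible mask values" most likely means filling masked attention positions with the most negative value representable in the score dtype, rather than a hardcoded constant like -10000 that misbehaves in fp16/bf16. A hedged sketch of that idea, not the conversion script's exact code:

```python
import torch

def additive_attention_mask(keep: torch.Tensor, dtype=torch.float16):
    # keep: boolean tensor, True where attention is allowed.
    # Masked positions get the dtype's most negative finite value.
    min_val = torch.finfo(dtype).min
    mask = torch.full(keep.shape, min_val, dtype=dtype)
    return torch.where(keep, torch.zeros_like(mask), mask)

keep = torch.tensor([[True, True, False]])
print(additive_attention_mask(keep))  # [[0., 0., -65504.]] in fp16
```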
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* added fast GeLU for the HF model, added barriers to enable conversion across multiple nodes, removed partially hardcoded pythia model name
* commented out unnecessary logging and timers

Co-authored-by: Quentin Anthony <[email protected]>
* add an optional `label` field passed in parallel with training data
* minor fix; add docs
* fix: data can be None
* prevent loading optimizer
* add script
* remove some print() statements, make mask documentation clearer
* add documentation for preprocess_data_with_mask.py

Co-authored-by: Hailey Schoelkopf <[email protected]>
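The parallel `label` stream exists so that loss is computed only on chosen token positions (e.g. completions but not prompts). A generic sketch of masked cross-entropy under that assumption; this is the standard pattern, not necessarily the script's exact wiring:

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, loss_mask):
    # logits: [batch, seq, vocab]; targets, loss_mask: [batch, seq].
    # Positions with loss_mask == 0 contribute nothing to the loss.
    losses = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    )
    mask = loss_mask.reshape(-1).float()
    return (losses * mask).sum() / mask.sum().clamp(min=1.0)
```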
* fix tensorboard version
* add wandb/tensorboard setup
* Update gpt2_dataset.py
* Update NeoXArgs docs automatically

Co-authored-by: github-actions <[email protected]>
Dataload fix
For BPE tokenization, the `tokenizer_type` argument should be a string rather than a list.
Default tokenizer_type should be a string
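For illustration, the shape of the fix in terms of a parsed config (the checking helper is hypothetical, not gpt-neox API; `GPT2BPETokenizer` is one of the real `tokenizer_type` values):

```python
# Wrong: a one-element list where a plain string is expected.
bad_conf = {"tokenizer_type": ["GPT2BPETokenizer"]}

# Right: tokenizer_type is a plain string.
good_conf = {"tokenizer_type": "GPT2BPETokenizer"}

def check_tokenizer_type(conf):
    # Hypothetical validation: reject list-valued tokenizer_type.
    tt = conf["tokenizer_type"]
    if not isinstance(tt, str):
        raise TypeError(f"tokenizer_type must be a string, got {type(tt).__name__}")
    return tt
```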
Replaced all torch.concat with torch.cat
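`torch.concat` is only an alias for `torch.cat` added in newer PyTorch releases (around 1.10), so using `torch.cat` keeps the code compatible with older versions; the two behave identically where both exist:

```python
import torch

a = torch.ones(2, 3)
b = torch.zeros(2, 3)

# torch.cat exists on every PyTorch version; torch.concat is a
# later-added alias and is missing from older releases.
stacked = torch.cat([a, b], dim=0)
print(stacked.shape)  # torch.Size([4, 3])
```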