[pull] main from EleutherAI:main #2

Open · wants to merge 103 commits into base: main from EleutherAI:main
Conversation


@pull pull bot commented Oct 31, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

jahatef and others added 2 commits October 30, 2023 21:48
* fix lion optimizer documentation

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Fix preprocess_data.py link

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
@pull pull bot added the ⤵️ pull label Oct 31, 2023
haileyschoelkopf and others added 27 commits October 31, 2023 20:46
* edge-casing for multiGPU hf to sequential case

* cleanup whitespace

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Pin lm_eval version

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
…anks are not reflected, so strange results always appear when tp_ranks is greater than 1.
Fixing convert neox to huggingface bug
* Update neox_args.py

These attention configuration options were missing from the docs. This will fix that.

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Update README.md

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Use `.yml` extensions in README to reflect extensions used in `configs/` folder

* Rename `save_interval` -> `checkpoint_factor`

* Mark expected failures in existing tests

* Fix minor typos

* Allow creation of checkpoint at iteration 0 when `do_train=False`

Helpful for unit tests because it allows use of a randomly initialised model

* Delete duplicated `test_fused_kernels.py`

Primary version lives in `tests/model/test_fused_kernels.py`

* Avoid initializing CUDA whenever `megatron` is imported

Resolves `Cannot re-initialize CUDA in forked subprocess` error when running distributed unit tests (a lazy-initialization sketch follows this commit block)

* Extend suite of unit tests
* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

update build command to avert empty cwd in build metrics

* Update coverity_scan.yml

* Update coverity_scan.yml

adding verbose to debug curl

* Update coverity_scan.yml

debug print trace to examine build metrics xml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
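
The "Avoid initializing CUDA whenever `megatron` is imported" item above comes down to deferring any `torch.cuda` calls from import time to first use. Below is a minimal sketch of that pattern, under the assumption that device queries move into a lazily evaluated helper; the function name is illustrative, not the actual NeoX code.

```python
# Hedged sketch: query CUDA lazily instead of at module import, so that
# fork-started test subprocesses can still initialize CUDA themselves.
import functools

import torch


@functools.lru_cache(maxsize=None)
def get_cuda_device_count() -> int:
    # Nothing CUDA-related runs until this is first called, keeping
    # module import free of CUDA initialization.
    return torch.cuda.device_count() if torch.cuda.is_available() else 0
```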
* Update logging.py

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Remove myself as a code owner as I shouldn't be approving PRs.
Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.2 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.2...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Pins old DeeperSpeed until bug is fixed

There is a bug in upstream DeepSpeed, detailed [here](microsoft/DeepSpeed#4781), that we didn't catch before syncing with main. This pins the prior commit so the bug doesn't impact users.

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* add qk normalization

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
haileyschoelkopf and others added 30 commits March 12, 2024 21:33
* add cuda support for flash attn w/ alibi, warn of deprecation of triton

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* TP works!

* merge TP mamba changes with most current MambaLayer

* cleanup TP, confirmed working still

* make shapes with TP>1 work with conversion

* tested and PP works, so no need for assert blocking it in arguments

* update comment

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* added ds zero.Init() to get_model

* Clean up conditional with block

* pre-commit

---------

Co-authored-by: Quentin Anthony <[email protected]>
* making PR triggered CPU test for changes to megatron

* Update NeoXArgs docs automatically

* pre-commit

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* initial JIT load functions

* passing neox_args to load() as optional for easy testing

* modified headers for correct copyright statements
… init (#1191)

* added ds zero.Init() to get_model

* Clean up conditional with block

* pre-commit

* ensured deepspeed configs are passed to init

---------

Co-authored-by: Quentin Anthony <[email protected]>
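
For context on the `zero.Init()` change above, here is a minimal sketch of wrapping model construction in DeepSpeed's `zero.Init()` context manager with the DeepSpeed config passed through; the builder function and layer sizes are placeholders, not the repository's actual `get_model`.

```python
# Assumed structure, not gpt-neox's get_model: build the model inside
# deepspeed.zero.Init() so ZeRO stage 3 can partition parameters as they are
# created, and hand the DeepSpeed config to the context manager.
import deepspeed
import torch.nn as nn


def build_model(ds_config: dict, zero3_enabled: bool = True) -> nn.Module:
    # enabled=False turns the context into a no-op when ZeRO-3 is not in use.
    with deepspeed.zero.Init(config_dict_or_path=ds_config, enabled=zero3_enabled):
        model = nn.Sequential(
            nn.Linear(1024, 4096),
            nn.GELU(),
            nn.Linear(4096, 1024),
        )
    return model
```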
* Fixes a weird typo

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.36.0...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* split PR and CPU tests into separate work; adjust references to env variables in workflow

* tweaking to pull compose file from CPU test dir

* adding post-cleanup for portability; adding workflow_dispatch to test

* fixing mapping

* forgot shell declaration in composite run

* make sure all steps run even if first CPU tests fail

* adding workflow dispatch to manually call workflow; removing httpserver

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Add megablocks dropless MoE

* pre-commit

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
…_size (#1209)

In tools/ckpts/convert_neox_to_hf.py, the 'intermediate_size' argument is not explicitly set for the NeoX architecture, so it defaults to 24576 from:

https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L48

Proposed solution: set intermediate_size to 4 * hidden_size
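
A hedged sketch of the proposed fix, deriving `intermediate_size` from `hidden_size` when the Hugging Face config is built during conversion; the `neox_args` attribute names are assumed for illustration and this is not the exact convert_neox_to_hf.py code.

```python
# Illustrative only; the real change lives in tools/ckpts/convert_neox_to_hf.py.
from transformers import GPTNeoXConfig


def make_hf_config(neox_args) -> GPTNeoXConfig:
    hidden_size = neox_args.hidden_size
    return GPTNeoXConfig(
        hidden_size=hidden_size,
        num_hidden_layers=neox_args.num_layers,
        num_attention_heads=neox_args.num_attention_heads,
        # NeoX MLPs use a 4x expansion, so derive the value from hidden_size
        # instead of relying on the HF default of 24576.
        intermediate_size=4 * hidden_size,
    )
```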
* add rwkv support

* Update init_functions.py

* rwkv model files

* configs

* kernels

* Cleanup

* Update 760M.yml

* remove preffn and mishglu

* Update NeoXArgs docs automatically

* Add RWKV parallelism assertions

* Update NeoXArgs docs automatically

* pre-commit and config cleanup

* Update NeoXArgs docs automatically

* rwkv logging

* Update NeoXArgs docs automatically

* Add rwkv version dirname, make hdim 3.5x

* pre-commit

* Update NeoXArgs docs automatically

* fix bug and set batch size to 32

* Update NeoXArgs docs automatically

---------

Co-authored-by: Quentin Anthony <[email protected]>
Co-authored-by: github-actions <[email protected]>
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* misc changes to neox_args

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* misc changes to neox_args

* allow rwkv pp

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* format: flagged on pre-commit

* feat: add pytorch profiling

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
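
For the "feat: add pytorch profiling" commit above, a minimal `torch.profiler` sketch of the kind of profiling being added; the wrapper name and schedule values are assumptions, not the actual NeoX integration.

```python
# Hedged sketch: profile a few training iterations with torch.profiler and
# write a TensorBoard trace. train_step and the schedule are illustrative.
from torch.profiler import (
    ProfilerActivity,
    profile,
    schedule,
    tensorboard_trace_handler,
)


def run_with_profiler(train_step, num_steps: int, logdir: str = "./profile"):
    with profile(
        activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
        schedule=schedule(wait=1, warmup=1, active=3, repeat=1),
        on_trace_ready=tensorboard_trace_handler(logdir),
    ) as prof:
        for _ in range(num_steps):
            train_step()
            prof.step()  # advance the profiler's wait/warmup/active schedule
```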
* Tolerate no fused kernels

* Fix requirements file syntax

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Update README.md

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* add workflow_dispatch to gh actions pr so we can run on command

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* init changes to README

* Update NeoXArgs docs automatically

* Update README.md

* Update NeoXArgs docs automatically

* Update README.md

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix changed behavior of pipe_parallel

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* update is_pipe_parallel logic; handle tied-embeddings case correctly

* Update NeoXArgs docs automatically

* revert PP to be consistent

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* fix python version and pytest install

* Update NeoXArgs docs automatically

* python3

* Update NeoXArgs docs automatically

* pip not pip3

* Update NeoXArgs docs automatically

* python3 pip

* Update NeoXArgs docs automatically

* python3 -m pip

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* add docker setup to workflow

* Update NeoXArgs docs automatically

* python setup

* Update NeoXArgs docs automatically

* python setup v2

* Update NeoXArgs docs automatically

* python setup v3

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Add hash back to DeepSpeed version

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>