forked from EleutherAI/gpt-neox
[pull] main from EleutherAI:main #2
Open — pull wants to merge 103 commits into ishandutta2007:main from EleutherAI:main
Conversation
* fix lion optimizer documentation * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* Fix preprocess_data.py link * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* edge-casing for multiGPU hf to sequential case * cleanup whitespace * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Pin lm_eval version * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
Fixing convert neox to huggingface bug: …anks are not reflected, so strange results always appear when tp_ranks is greater than 1.
* Update neox_args.py These attention configuration options were missing from the docs. This will fix that. * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* Update README.md * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* Use `.yml` extensions in README to reflect extensions used in `configs/` folder
* Rename `save_interval` -> `checkpoint_factor`
* Mark expected failures in existing tests
* Fix minor typos
* Allow creation of checkpoint at iteration 0 when `do_train=False` (helpful for unit tests because it allows use of a randomly initialised model)
* Delete duplicated `test_fused_kernels.py`; primary version lives in `tests/model/test_fused_kernels.py`
* Avoid initializing CUDA whenever `megatron` is imported; resolves `Cannot re-initialize CUDA in forked subprocess` error when running distributed unit tests
* Extend suite of unit tests
* Update coverity_scan.yml (repeated iterations, including: update build command to avert empty cwd in build metrics; adding verbose to debug curl; debug print trace to examine build metrics xml)
* Update NeoXArgs docs automatically
--------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Update logging.py * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
Remove myself as a code owner as I shouldn't be approving PRs.
Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.2 to 4.36.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](huggingface/transformers@v4.30.2...v4.36.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Pins old DeeperSpeed until bug is fixed There is a bug in upstream DeepSpeed detailed [here](microsoft/DeepSpeed#4781) that we didn't catch before synching with main. This pins the prior commit so the bug doesn't impact users. * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* add qk normalization * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* add cuda support for flash attn w/ alibi, warn of deprecation of triton * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* TP works! * merge TP mamba changes with most current MambaLayer * cleanup TP, confirmed working still * make shapes with TP>1 work with conversion * tested and PP works, so no need for assert blocking it in arguments * update comment * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* added ds zero.Init() to get_model * Clean up conditional with block * pre-commit --------- Co-authored-by: Quentin Anthony <[email protected]>
ENH Small typo in the README
* making PR triggered CPU test for changes to megatron * Update NeoXArgs docs automatically * pre-commit * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* initial JIT load functions * passing neox_args to load() as optional for easy testing * modified headers for correct copyright statements
… init (#1191) * added ds zero.Init() to get_model * Clean up conditional with block * pre-commit * ensured deepspeed configs are passed to init --------- Co-authored-by: Quentin Anthony <[email protected]>
* Fixes a weird typo * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](huggingface/transformers@v4.36.0...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* split PR and CPU tests into separate work; adjust references to env variables in workflow * tweaking to pull compose file from CPU test dir * adding post-cleanup for portability; adding workflow_dispatch to test * fixing mapping * forgot shell declaration in composite run * make sure all steps run even if first CPU tests fail * adding workflow dispatch to manually call workflow; removing httpserver * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Add megablocks dropless MoE * pre-commit --------- Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
…_size (#1209) In tools/ckpts/convert_neox_to_hf.py, for neox architecture the 'intermediate_size' argument is not explicitly set, so it defaults to 24576 from: https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L48 Proposed solution: set intermediate-size to 4 * hidden-size
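The arithmetic behind that fix can be sketched as follows. This is a minimal illustration of the proposed rule, not the actual conversion script: the helper name is hypothetical, and it only shows why the library default of 24576 is wrong for most model sizes (GPT-NeoX MLPs use a 4x expansion, so 24576 only matches a hidden size of 6144, i.e. the 20B config).

```python
def neox_intermediate_size(hidden_size: int) -> int:
    """Hypothetical helper: GPT-NeoX uses a 4x MLP expansion, so the
    Hugging Face config's intermediate_size should be 4 * hidden_size
    rather than the hard-coded default of 24576."""
    return 4 * hidden_size

# The default (24576) is only correct when hidden_size == 6144:
assert neox_intermediate_size(6144) == 24576
# For any other model size it would be wrong, e.g. a 2048-dim model:
assert neox_intermediate_size(2048) == 8192
```

Passing this value explicitly when building the converted config avoids silently inheriting the 24576 default for smaller models.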
* add rwkv support
* Update init_functions.py
* rwkv model files, configs, kernels
* Cleanup; Update 760M.yml
* remove preffn and mishglu
* Add RWKV parallelism assertions
* pre-commit and config cleanup
* rwkv logging
* Add rwkv version dirname, make hdim 3.5x
* fix bug and set batch size to 32
* Update NeoXArgs docs automatically (repeated throughout)
--------- Co-authored-by: Quentin Anthony <[email protected]> Co-authored-by: github-actions <[email protected]>
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](pallets/jinja@3.1.3...3.1.4) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* misc changes to neox_args * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* misc changes to neox_args * allow rwkv pp * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* format: flagged on pre-commit * feat: add pytorch profiling * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Tolerate no fused kernels * Fix requirements file syntax * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Update README.md * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* add workflow_dispatch to gh actions pr so we can run on command * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]>
* init changes to README * Update NeoXArgs docs automatically * Update README.md * Update NeoXArgs docs automatically * Update README.md * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* Fix changed behavior of pipe_parallel * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically * Update NeoXArgs docs automatically --------- Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* update is_pipe_parallel logic ; handle tied-embeddings case correctly * Update NeoXArgs docs automatically * revert PP to be consistent * Update NeoXArgs docs automatically --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
* fix python version and pytest install
* iterate on CI Python setup: python3; pip not pip3; python3 pip; python3 -m pip; add docker setup to workflow; many repeated "python setup" retries
* Add hash back to deep speed version
* Update NeoXArgs docs automatically (repeated throughout)
--------- Co-authored-by: github-actions <[email protected]> Co-authored-by: Quentin Anthony <[email protected]>
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )