[pull] main from EleutherAI:main #2

Open · wants to merge 103 commits into base: main from EleutherAI:main
Conversation


@pull pull bot commented Oct 31, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

jahatef and others added 2 commits October 30, 2023 21:48
* fix lion optimizer documentation

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Fix preprocess_data.py link

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
@pull pull bot added the ⤵️ pull label Oct 31, 2023
haileyschoelkopf and others added 27 commits October 31, 2023 20:46
* edge-casing for multiGPU hf to sequential case

* cleanup whitespace

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Pin lm_eval version

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
…anks are not reflected, so strange results always appear when tp_ranks is greater than 1.
Fixing convert neox to huggingface bug
* Update neox_args.py

These attention configuration options were missing from the docs. This will fix that.

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Update README.md

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* Use `.yml` extensions in README to reflect extensions used in `configs/` folder

* Rename `save_interval` -> `checkpoint_factor`

* Mark expected failures in existing tests

* Fix minor typos

* Allow creation of checkpoint at iteration 0 when `do_train=False`

Helpful for unit tests because it allows use of a randomly initialised model

* Delete duplicated `test_fused_kernels.py`

Primary version lives in `tests/model/test_fused_kernels.py`

* Avoid initializing CUDA whenever `megatron` is imported

Resolves `Cannot re-initialize CUDA in forked subprocess` error when running distributed unit tests (a lazy-initialization sketch follows this commit block)

* Extend suite of unit tests
* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

update build command to avert empty cwd in build metrics

* Update coverity_scan.yml

* Update coverity_scan.yml

adding verbose to debug curl

* Update coverity_scan.yml

debug print trace to examine build metrics xml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update coverity_scan.yml

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
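
The "Avoid initializing CUDA whenever `megatron` is imported" item above comes down to deferring any `torch.cuda` calls from import time to first use. Below is a minimal sketch of that pattern, under the assumption that device queries move into a lazily evaluated helper; the function name is illustrative, not the actual NeoX code.

```python
# Hedged sketch: query CUDA lazily instead of at module import, so that
# fork-started test subprocesses can still initialize CUDA themselves.
import functools

import torch


@functools.lru_cache(maxsize=None)
def get_cuda_device_count() -> int:
    # Nothing CUDA-related runs until this is first called, keeping
    # module import free of CUDA initialization.
    return torch.cuda.device_count() if torch.cuda.is_available() else 0
```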
* Update logging.py

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Remove myself as a code owner as I shouldn't be approving PRs.
Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.2 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.2...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Pins old DeeperSpeed until bug is fixed

There is a bug in upstream DeepSpeed, detailed [here](microsoft/DeepSpeed#4781), that we didn't catch before syncing with main. This pins the prior commit so the bug doesn't impact users.

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* add qk normalization

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
haileyschoelkopf and others added 30 commits March 12, 2024 21:33
* add cuda support for flash attn w/ alibi, warn of deprecation of triton

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* TP works!

* merge TP mamba changes with most current MambaLayer

* cleanup TP, confirmed working still

* make shapes with TP>1 work with conversion

* tested and PP works, so no need for assert blocking it in arguments

* update comment

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* added ds zero.Init() to get_model

* Clean up conditional with block

* pre-commit

---------

Co-authored-by: Quentin Anthony <[email protected]>
* making PR triggered CPU test for changes to megatron

* Update NeoXArgs docs automatically

* pre-commit

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* initial JIT load functions

* passing neox_args to load() as optional for easy testing

* modified headers for correct copyright statements
… init (#1191)

* added ds zero.Init() to get_model

* Clean up conditional with block

* pre-commit

* ensured deepspeed configs are passed to init

---------

Co-authored-by: Quentin Anthony <[email protected]>
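
For context on the `zero.Init()` change above, here is a minimal sketch of wrapping model construction in DeepSpeed's `zero.Init()` context manager with the DeepSpeed config passed through; the builder function and layer sizes are placeholders, not the repository's actual `get_model`.

```python
# Assumed structure, not gpt-neox's get_model: build the model inside
# deepspeed.zero.Init() so ZeRO stage 3 can partition parameters as they are
# created, and hand the DeepSpeed config to the context manager.
import deepspeed
import torch.nn as nn


def build_model(ds_config: dict, zero3_enabled: bool = True) -> nn.Module:
    # enabled=False turns the context into a no-op when ZeRO-3 is not in use.
    with deepspeed.zero.Init(config_dict_or_path=ds_config, enabled=zero3_enabled):
        model = nn.Sequential(
            nn.Linear(1024, 4096),
            nn.GELU(),
            nn.Linear(4096, 1024),
        )
    return model
```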
* Fixes a weird typo

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.36.0...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* split PR and CPU tests into separate work; adjust references to env variables in workflow

* tweaking to pull compose file from CPU test dir

* adding post-cleanup for portability; adding workflow_dispatch to test

* fixing mapping

* forgot shell declaration in composite run

* make sure all steps run even if first CPU tests fail

* adding workflow dispatch to manually call workflow; removing httpserver

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Add megablocks dropless MoE

* pre-commit

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
…_size (#1209)

In tools/ckpts/convert_neox_to_hf.py, the 'intermediate_size' argument is not explicitly set for the NeoX architecture, so it defaults to 24576 from:

https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L48

Proposed solution: set intermediate_size to 4 * hidden_size
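
A hedged sketch of the proposed fix, deriving `intermediate_size` from `hidden_size` when the Hugging Face config is built during conversion; the `neox_args` attribute names are assumed for illustration and this is not the exact convert_neox_to_hf.py code.

```python
# Illustrative only; the real change lives in tools/ckpts/convert_neox_to_hf.py.
from transformers import GPTNeoXConfig


def make_hf_config(neox_args) -> GPTNeoXConfig:
    hidden_size = neox_args.hidden_size
    return GPTNeoXConfig(
        hidden_size=hidden_size,
        num_hidden_layers=neox_args.num_layers,
        num_attention_heads=neox_args.num_attention_heads,
        # NeoX MLPs use a 4x expansion, so derive the value from hidden_size
        # instead of relying on the HF default of 24576.
        intermediate_size=4 * hidden_size,
    )
```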
* add rwkv support

* Update init_functions.py

* rwkv model files

* configs

* kernels

* Cleanup

* Update 760M.yml

* remove preffn and mishglu

* Update NeoXArgs docs automatically

* Add RWKV parallelism assertions

* Update NeoXArgs docs automatically

* pre-commit and config cleanup

* Update NeoXArgs docs automatically

* rwkv logging

* Update NeoXArgs docs automatically

* Add rwkv version dirname, make hdim 3.5x

* pre-commit

* Update NeoXArgs docs automatically

* fix bug and set batch size to 32

* Update NeoXArgs docs automatically

---------

Co-authored-by: Quentin Anthony <[email protected]>
Co-authored-by: github-actions <[email protected]>
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* misc changes to neox_args

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* misc changes to neox_args

* allow rwkv pp

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* format: flagged on pre-commit

* feat: add pytorch profiling

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
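
For the "feat: add pytorch profiling" commit above, a minimal `torch.profiler` sketch of the kind of profiling being added; the wrapper name and schedule values are assumptions, not the actual NeoX integration.

```python
# Hedged sketch: profile a few training iterations with torch.profiler and
# write a TensorBoard trace. train_step and the schedule are illustrative.
from torch.profiler import (
    ProfilerActivity,
    profile,
    schedule,
    tensorboard_trace_handler,
)


def run_with_profiler(train_step, num_steps: int, logdir: str = "./profile"):
    with profile(
        activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
        schedule=schedule(wait=1, warmup=1, active=3, repeat=1),
        on_trace_ready=tensorboard_trace_handler(logdir),
    ) as prof:
        for _ in range(num_steps):
            train_step()
            prof.step()  # advance the profiler's wait/warmup/active schedule
```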
* Tolerate no fused kernels

* Fix requirements file syntax

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Update README.md

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* add workflow_dispatch to gh actions pr so we can run on command

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
* init changes to README

* Update NeoXArgs docs automatically

* Update README.md

* Update NeoXArgs docs automatically

* Update README.md

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Fix changed behavior of pipe_parallel

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

---------

Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* update is_pipe_parallel logic; handle tied-embeddings case correctly

* Update NeoXArgs docs automatically

* revert PP to be consistent

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* fix python version and pytest install

* Update NeoXArgs docs automatically

* python3

* Update NeoXArgs docs automatically

* pip not pip3

* Update NeoXArgs docs automatically

* python3 pip

* Update NeoXArgs docs automatically

* python3 -m pip

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* add docker setup to workflow

* Update NeoXArgs docs automatically

* python setup

* Update NeoXArgs docs automatically

* python setup v2

* Update NeoXArgs docs automatically

* python setup v3

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* python setup v3

* Update NeoXArgs docs automatically

* Update NeoXArgs docs automatically

* Add hash back to DeepSpeed version

* Update NeoXArgs docs automatically

---------

Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>