
[pull] main from EleutherAI:main #2

Merged 105 commits on Jul 29, 2024

Commits on Oct 31, 2023

  1. fix lion optimizer documentation (#1067)

    * fix lion optimizer documentation
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored Oct 31, 2023
    SHA: e277bc7
  2. Fix preprocess_data.py link (#1064)

    * Fix preprocess_data.py link
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Quentin-Anthony and github-actions authored Oct 31, 2023
    SHA: f574f22

Commits on Nov 1, 2023

  1. Edge-casing for multi-GPU HF-to-NeoX conversion (#1065)

    * edge-casing for multiGPU hf to sequential case
    
    * cleanup whitespace
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Nov 1, 2023
    SHA: fcc5af5
  2. SHA: 8c9fc00
  3. Pin version of lm_eval (#1070)

    * Pin lm_eval version
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    haileyschoelkopf and github-actions authored Nov 1, 2023
    SHA: a10f69c
  4. SHA: 41f019e

Commits on Nov 5, 2023

  1. Update README.md

    StellaAthena authored Nov 5, 2023
    SHA: 90aa131

Commits on Nov 7, 2023

  1. When processing mlp.dense_4h_to_h.bias and attention.dense.bias, tp_ranks is not taken into account, so incorrect results always appear when tp_ranks is greater than 1.
    kyuheejang committed Nov 7, 2023
    SHA: 04dc2ba (see the sketch below)
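
As a rough illustration of the row-parallel bias issue this commit fixes: a minimal sketch, with illustrative shapes and a hypothetical `merge_row_parallel` helper (not the repository's actual conversion code). Each tensor-parallel rank holds a slice of a RowParallelLinear weight but a full replica of its bias, so a checkpoint merge must not sum the bias shards.

```python
import torch

def merge_row_parallel(weight_shards, bias_replicas):
    # Weight slices are concatenated along the input dimension...
    weight = torch.cat(weight_shards, dim=1)
    # ...but the bias is replicated on every rank: summing replicas would
    # scale it by tp_ranks. Any single replica (or their mean) is correct.
    bias = bias_replicas[0]
    return weight, bias

# Example with tp_ranks = 2:
shards = [torch.randn(8, 4) for _ in range(2)]
biases = [torch.ones(8), torch.ones(8)]
w, b = merge_row_parallel(shards, biases)
assert torch.equal(b, torch.ones(8))  # summing would have given 2.0 per element
```
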
  2. Merge pull request #1072 from kyuheejang/Fixing-neox-to-huggingface

    Fixing the NeoX to Hugging Face conversion bug
    StellaAthena authored Nov 7, 2023
    SHA: f214358

Commits on Nov 8, 2023

  1. SHA: d8028f8

Commits on Nov 16, 2023

  1. Update neox_args.py (#1081)

    * Update neox_args.py
    
    These attention configuration options were missing from the docs. This will fix that.
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored Nov 16, 2023
    SHA: 10bf788

Commits on Nov 22, 2023

  1. Update README.md (#1082)

    * Update README.md
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    StellaAthena and github-actions authored Nov 22, 2023
    SHA: f48d3a6

Commits on Nov 30, 2023

  1. Update README.md

    StellaAthena authored Nov 30, 2023
    SHA: efea81f

Commits on Dec 4, 2023

  1. Extend ci suite (#1080)

    * Use `.yml` extensions in README to reflect extensions used in `configs/` folder
    
    * Rename `save_interval` -> `checkpoint_factor`
    
    * Mark expected failures in existing tests
    
    * Fix minor typos
    
    * Allow creation of checkpoint at iteration 0 when `do_train=False`
    
    Helpful for unit tests because it allows use of a randomly initialised model
    
    * Delete duplicated `test_fused_kernels.py`
    
    Primary version lives in `tests/model/test_fused_kernels.py`
    
    * Avoid initializing CUDA whenever `megatron` is imported
    
    Resolves `Cannot re-initialize CUDA in forked subprocess` error when running distributed unit tests
    
    * Extend suite of unit tests
    mkerin authored Dec 4, 2023
    SHA: 3be59a4
  2. Patch coverity scan (#1090)

    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    update build command to avert empty cwd in build metrics
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    adding verbose to debug curl
    
    * Update coverity_scan.yml
    
    debug print trace to examine build metrics xml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update coverity_scan.yml
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Dec 4, 2023
    SHA: a2b2020

Commits on Dec 6, 2023

  1. Corrects FLOPs formula as per 1093 (#1094)

    * Update logging.py
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    StellaAthena and github-actions authored Dec 6, 2023
    SHA: 050f560

Commits on Dec 19, 2023

  1. Update CODEOWNERS

    Remove myself as a code owner as I shouldn't be approving PRs.
    StellaAthena authored Dec 19, 2023
    SHA: f19b2ec

Commits on Dec 20, 2023

  1. Bump transformers from 4.30.2 to 4.36.0 in /requirements (#1097)

    Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.2 to 4.36.0.
    - [Release notes](https://github.com/huggingface/transformers/releases)
    - [Commits](huggingface/transformers@v4.30.2...v4.36.0)
    
    ---
    updated-dependencies:
    - dependency-name: transformers
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Dec 20, 2023
    SHA: 07166da
  2. Pins old DeeperSpeed until bug is fixed (#1095)

    * Pins old DeeperSpeed until bug is fixed
    
    There is a bug in upstream DeepSpeed detailed [here](microsoft/DeepSpeed#4781) that we didn't catch before synching with main. This pins the prior commit so the bug doesn't impact users.
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    StellaAthena and github-actions authored Dec 20, 2023
    SHA: 9283eff

Commits on Dec 22, 2023

  1. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: 9eef954
  2. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: a48e09e
  3. Update NeoXArgs docs automatically

    github-actions committed Dec 22, 2023
    SHA: 613e5a6
  4. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: be7eeda
  5. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: 2117afc
  6. Update NeoXArgs docs automatically

    github-actions committed Dec 22, 2023
    SHA: 8dba5b6
  7. Add QK Normalization (#1100)

    * add qk normalization
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Dec 22, 2023
    SHA: f161245 (see the sketch below)
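
For context on what QK normalization does: a minimal sketch, assuming a hypothetical `QKNormAttention` module (not the layer names used in gpt-neox), where queries and keys are layer-normalized per head before the attention dot product.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QKNormAttention(nn.Module):
    def __init__(self, dim, n_heads):
        super().__init__()
        self.n_heads, self.head_dim = n_heads, dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.q_norm = nn.LayerNorm(self.head_dim)  # normalizes each query head
        self.k_norm = nn.LayerNorm(self.head_dim)  # normalizes each key head

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (b, t, self.n_heads, self.head_dim)
        # LayerNorm over head_dim bounds the scale of the q.k^T logits.
        q = self.q_norm(q.reshape(shape)).transpose(1, 2)
        k = self.k_norm(k.reshape(shape)).transpose(1, 2)
        v = v.reshape(shape).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)
        return out.transpose(1, 2).reshape(b, t, d)

y = QKNormAttention(dim=64, n_heads=4)(torch.randn(2, 10, 64))
```
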
  8. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: 7fb3b3c
  9. Update README.md

    StellaAthena authored Dec 22, 2023
    SHA: a7509f0
  10. SHA: 8eaac4e
  11. Update NeoXArgs docs automatically

    github-actions committed Dec 22, 2023
    SHA: 4d5a811
  12. SHA: 05cc29c
  13. SHA: e25446e
  14. Merge pull request #1102 from EleutherAI/StellaAthena-patch-4

    More readme updates
    StellaAthena authored Dec 22, 2023
    SHA: 287f9f7

Commits on Dec 23, 2023

  1. Lm eval 0.4.0 support (#1101)

    * add lm-eval v0.4.0
    
    * rename evaluate.py to avoid shadowing HF evaluate library
    
    * document new evaluate.py filename
    
    * Update NeoXArgs docs automatically
    
    * handle results format differently
    
    * Update NeoXArgs docs automatically
    
    * Update hanging evaluate.py scripts
    
    * Update NeoXArgs docs automatically
    
    * Add triviaqa to default eval_tasks
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Dec 23, 2023
    SHA: b27e409
  2. Update README.md

    StellaAthena authored Dec 23, 2023
    SHA: 1148a0f

Commits on Dec 26, 2023

  1. Update neox_args.py (#1107)

    * Update neox_args.py
    
    Changed some default values to correspond to values that we generally recommend people use.
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    StellaAthena and github-actions authored Dec 26, 2023
    SHA: e5a7ea7

Commits on Jan 4, 2024

  1. Fix repo for CI (#1106)

    * Fix syntax errors
    
    * Make pre-commit fixes across repo
    
    * Ensure correct version of clang-format in CI
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    yang and yang authored Jan 4, 2024
    SHA: eca6b1a
  2. Fix install, Dockerfile, CI (#1104)

    * Add missing jinja2 dep
    
    Missing transitive dep of lm_eval
    
    * Fix Dockerfile
    
    Only devel has nvcc, needed to build packages
    
    And don't rebuild fused kernels if no relevant change
    
    * Ensure Dockerfile builds in CI
    
    Also ensures that install actually works
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    yang and yang authored Jan 4, 2024
    SHA: 98716eb

Commits on Jan 5, 2024

  1. Fused Rotary Embeddings (fixed) (#1108)

    * Create fused_rotary_positional_embedding.cpp
    
    * Create fused_rotary_positional_embedding.h
    
    * Create fused_rotary_positional_embedding_cuda.cu
    
    * Update fused_rotary_positional_embedding.h
    
    Ports the fix from NVIDIA/apex#1750 into this branch.
    
    * Update neox_args.py
    
    * Update setup.py
    
    * Update initialize.py
    
    * Update setup.py
    
    * Update __init__.py
    
    * Update test_fused_kernels.py
    
    * Update setup.py
    
    * Create fused_rope.py
    
    * Update fused_rotary_positional_embedding.h
    
    * Update fused_rotary_positional_embedding.cpp
    
    * Update fused_rotary_positional_embedding.cpp
    
    * Update transformer.py
    
    * Update transformer.py
    
    Just checked and this should work for bf16. Or, at least, the reason I originally thought it wouldn't doesn't apply.
    
    * Update transformer.py
    
    * Create 125M_fused_rope.yml
    
    * Update 125M_fused_rope.yml
    
    * Update transformer.py
    
    Add `self.rope_fusion = neox_args.rope_fusion` so that `ParallelSelfAttention` knows if we're using rope fusion.
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Fix fused rope
    
    Just needed to bring in the latest headers/sources,
    and call into it the right way from transformers.py.
    
    * Add rope_fusion arg to all ymls
    
    ---------
    
    Co-authored-by: Stella Biderman <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    Co-authored-by: Yang Zhang <[email protected]>
    5 people authored Jan 5, 2024
    SHA: 77605ca (see the sketch below)
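
For reference, a sketch of the unfused computation that the fused rotary kernel accelerates, with illustrative shapes (the CUDA kernel fuses these elementwise ops into a single pass; this is not the kernel itself):

```python
import torch

def apply_rotary(x, cos, sin):
    # Rotate channel pairs: the standard unfused RoPE application.
    x1, x2 = x.chunk(2, dim=-1)
    rotated = torch.cat((-x2, x1), dim=-1)
    return x * cos + rotated * sin

seq, head_dim = 16, 8
inv_freq = 1.0 / (10000 ** (torch.arange(0, head_dim, 2).float() / head_dim))
freqs = torch.outer(torch.arange(seq).float(), inv_freq)
emb = torch.cat((freqs, freqs), dim=-1)
cos = emb.cos()[:, None, None, :]  # [seq, 1, 1, head_dim]
sin = emb.sin()[:, None, None, :]
q = torch.randn(seq, 2, 4, head_dim)  # [seq, batch, heads, head_dim]
q_rot = apply_rotary(q, cos, sin)
```
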
  2. Add pythia 14M and 31M configs (#1111)

    * Add pythia 14M config
    
    * Create 31M.yml
    segyges authored Jan 5, 2024
    SHA: f14782a

Commits on Jan 9, 2024

  1. Add docker compose and change containerized setup instructions to use it (#1113)
    
    * Add pythia 14M config
    
    * Create 31M.yml
    
    * Add docker compose, update readme docker instructions to utilize it
    
    * Add logging limits to docker-compose files
    
    * Change data mount from /gpt-neox/data to /data/
    
    This prevents possible errors if the user already has a /data/ directory in their /gpt-neox/ folder
    
    * Update README.md
    
    Formats the code in the changed parts as proper code blocks
    
    * Make the docker-compose spinup tidier
    
    * Avoid config bloat by only providing the updated paths
    
    * Apply precommit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    segyges and Quentin-Anthony authored Jan 9, 2024
    SHA: e6e944a

Commits on Jan 11, 2024

  1. SHA: 92b1b6f

Commits on Jan 13, 2024

  1. Bump jinja2 from 3.1.2 to 3.1.3 in /requirements (#1120)

    Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
    - [Release notes](https://github.com/pallets/jinja/releases)
    - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
    - [Commits](pallets/jinja@3.1.2...3.1.3)
    
    ---
    updated-dependencies:
    - dependency-name: jinja2
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jan 13, 2024
    SHA: 90f70ff

Commits on Jan 19, 2024

  1. Enable passing of --account to srun / SlurmLauncher (#1126)

    * add `account` to Deepspeed args
    
    * Add handling of `account` when `deepspeed_slurm` is set
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    haileyschoelkopf and github-actions authored Jan 19, 2024
    SHA: 6399155

Commits on Jan 24, 2024

  1. update copyrights (#1128)

    * update copyrights
    
    * Update NeoXArgs docs automatically
    
    * nvidia copyright years
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored Jan 24, 2024
    SHA: 7a8fa2f

Commits on Jan 26, 2024

  1. fused layernorm (#1105)

    * Add simple util for CUDA timings
    
    * Add fused layernorm kernel from Megatron
    
    Closes #952
    
    * change default fused layernorm to false
    
    * Update test_setup.yml
    
    * Update test_train_base.yml
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    Co-authored-by: jahatef <[email protected]>
    Co-authored-by: Jacob Hatef <[email protected]>
    4 people authored Jan 26, 2024
    SHA: 3d8fec0 (see the sketch below)
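
On the "simple util for CUDA timings" mentioned above: a minimal sketch of event-based GPU timing, assuming a hypothetical `cuda_timed` helper (the utility in the commit may differ). CUDA events bracket the kernels on-device, so the measurement is not distorted by asynchronous launches the way wall-clock timing would be.

```python
import torch

def cuda_timed(fn, *args, iters=10):
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    for _ in range(3):               # warmup, so one-time costs are excluded
        fn(*args)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        fn(*args)
    end.record()
    torch.cuda.synchronize()         # wait until both events are recorded
    return start.elapsed_time(end) / iters  # milliseconds per call

if torch.cuda.is_available():
    x = torch.randn(4096, 4096, device="cuda")
    print(cuda_timed(torch.nn.functional.layer_norm, x, (4096,)), "ms")
```
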

Commits on Jan 29, 2024

  1. Contributing Guide (#1138)

    * contributing guide
    
    * Update NeoXArgs docs automatically
    
    * Update CONTRIBUTING.md
    
    * Update NeoXArgs docs automatically
    
    * Remove microsoft references and link on main readme
    
    * Update NeoXArgs docs automatically
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Jan 29, 2024
    SHA: e5602c3

Commits on Jan 30, 2024

  1. SHA: 1c133bf

Commits on Feb 1, 2024

  1. Update lm_eval v0.4 to PyPI dependencies (#1141)

    * Update requirements.txt
    
    * Update requirements.txt
    
    * Update NeoXArgs docs automatically
    
    * add note to neox_args.py
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Feb 1, 2024
    SHA: 032ec8c

Commits on Feb 5, 2024

  1. Remove gas (beano) (#1144)

    * Remove 'gas' configuration variable
    
    * Remove gas from configs and config documentation
    
    * Update training.py
    segyges authored Feb 5, 2024
    SHA: 91c44bc

Commits on Feb 8, 2024

  1. Improve Conversion Utilities (#1124)

    * draft: unify sequential + PPModule conversion scripts
    
    * Update NeoXArgs docs automatically
    
    * draft: pull out model param names / model definition
    
    * Update NeoXArgs docs automatically
    
    * tested: neox models with TP = 1, PipelineModule, work
    
    * Update NeoXArgs docs automatically
    
    * draft: Llama + GQA QKV resharding
    
    * Update NeoXArgs docs automatically
    
    * update Llama conversion script to support Mistral and GQA
    
    * Update NeoXArgs docs automatically
    
    * test Mistral-7B conversion
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * push documentation on imports / Llama loading
    
    * push further readme updates (Mistral included)
    
    * Prevent conversions for unsupported features, disclaim in README
    
    * Update NeoXArgs docs automatically
    
    * revert PR#1072 RowParallel bias conversion error
    
    * remove sequential_to_hf and module_to_hf scripts, deprecated in favor of convert_neox_to_hf.py
    
    * Update NeoXArgs docs automatically
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Feb 8, 2024
    SHA: f7373f8

Commits on Feb 21, 2024

  1. Fixes distributed tests, and skips tests that are broken. (#1149)

    * Fixes distributed tests, and skips tests that are broken.
    
    * Update NeoXArgs docs automatically
    
    * improve pytest msgs and remove commented code
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Feb 21, 2024
    SHA: 412cf6e
  2. Memory profiling (#1153)

    * Fixes distributed tests, and skips tests that are broken.
    
    * memory profiling for gpt-neox. Only works for pp=0, pp=1+ needs DS commits.
    
    * Update NeoXArgs docs automatically
    
    * adds memory profiling for pipeline parallel
    
    * Update NeoXArgs docs automatically
    
    * fix spacing
    
    * Update NeoXArgs docs automatically
    
    * fix spacing again
    
    * Update NeoXArgs docs automatically
    
    * get rid of unwanted changes
    
    * Update NeoXArgs docs automatically
    
    * get rid of file
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * add nsight systems support
    
    * remove tests changes again
    
    * Update NeoXArgs docs automatically
    
    * add tests
    
    * Update NeoXArgs docs automatically
    
    * Update training.py
    
    * Update NeoXArgs docs automatically
    
    * Add assertion message
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Feb 21, 2024
    SHA: 46d179c (see the sketch below)
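
A minimal sketch of the kind of per-step memory reporting this feature enables, assuming a hypothetical `report_memory` helper (gpt-neox's actual logging hooks differ):

```python
import torch

def report_memory(tag):
    gib = 2**30
    print(f"[{tag}] "
          f"allocated={torch.cuda.memory_allocated() / gib:.2f} GiB "
          f"reserved={torch.cuda.memory_reserved() / gib:.2f} GiB "
          f"peak={torch.cuda.max_memory_allocated() / gib:.2f} GiB")

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(1024, 1024, device="cuda", requires_grad=True)
    (x @ x).sum().backward()
    report_memory("after backward")
```
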

Commits on Feb 23, 2024

  1. add profiling to readme (#1154)

    * add profiling to readme
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored Feb 23, 2024
    SHA: eee03b2
  2. Python version update (#1122)

    * Switch default command for docker image
    
    * Rename pythia paths docker file for clarity
    
    * Update docker build to use python 3.10
    
    * Update github workflows to use ubuntu 22.04 and python 3.10
    
    * Bump pytorch library patch versions
    
    * Add pytest-html for reasonably formatted test reports
    
    * Fix build after torch and cuda version bump
    
    * Fix apex install for newer version
    
    1) This, empirically, works, as tested by running the build and kicking off training.
    2) Apex documentation says it is incorrect syntax and deprecated.
    3) It takes so long to compile that it is probably, all by itself, something that needs fixing.
    4) I will probably pull the fused adamw out of apex.
    5) It has been building for twenty minutes so I am going to go do something else.
    
    * Fix pip version to ensure apex compilation remains good
    
    * Fix unit test for evaluate
    
    * Fix pip requirement
    
    Prevents possible build issues with apex especially across divergent pip versions
    
    * Update dockerfile to point to stripped-down apex repo
    
    * Revert "Update dockerfile to point to stripped-down apex repo"
    
    This reverts commit 40c7656.
    
    * Update apex version in dockerfile
    
    * Switch to downloading prebuilt apex wheel
    
    * Clean up docker copy commands
    
    * Have docker build conditionally get binaries or build apex
    
    * Apply precommit
    segyges authored Feb 23, 2024
    SHA: a7638a8
  3. Minor changes (#1125)

    * Switch default command for docker image
    
    * Rename pythia paths docker file for clarity
    
    * Fix unit test for evaluate
    
    * Update readme for testing to omit --forked argument
    
    * Add pytest-html to requirements-dev.txt
    
    * Revert "Update readme for testing to omit --forked argument"
    
    This reverts commit 19021fc.
    
    * Add data/ directory and .bin and .idx files in /tests/data to .gitignore
    
    This makes it so that git doesn't prompt you to commit (or force you to stash) data files
    
    * Make .gitignore for data files slightly more elegant
    
    * Add utility script for doing token counts on processed datasets
    
    * Run precommit hook
    
    * Fix token count script, run precommit
    segyges authored Feb 23, 2024
    SHA: 72d1803
  4. Draft PR Adding mistral 0.1 (#1131)

    * add support for flash attention 2
    
    * change cosine decay to chinchilla style
    
    * set default warmup to none so that warmup_iters can be set
    
    * fixed bug
    
    * fixed chinchilla lr
    
    * add s3 checkpoint syncing
    
    * rotary embedding in fp32
    
    * fix for seq_len < max_seq_len
    
    * some fixes, still not working
    
    * fix bugs; evaluate on step 0
    
    * first attempt at gqa
    
    * gqa works in kv_heads==query_heads case
    
    * gqa working
    
    * workaround for FSX quota
    
    * update with llemma
    
    * update with recent PR
    
    * README and requirements updated
    
    * Added Mistral config
    
    * Added sliding window through flash attention 2
    
    * Added sliding window
    
    * Mistral should likely use mp=2 like llama2
    
    * Update gitignore
    
    * Removed unused CPCargo import
    
    * Conversion script (WIP)
    
    * Fixed missing slurm environ vars
    
    * updated mistral config
    
    * updated job script
    
    * initial commit conversion mistral hf to sequential
    
    * Added stacking q, k, v appropriately for mp ranks
    
    * pp=0 support from end of 2023
    
    * Cleaning up config and removing Autoconfig in conversion script
    
    * Cleaned up conversion example script
    
    * cleanup: add back configs folder, discard Llemma readme
    
    * cleanup: remove llemma lr sched changes, re-add requirements/ folder
    
    * docs: add explanation of intermediate_size behavior
    
    * args: add argument checking for num_kv_heads, clean up usage syntax
    
    * args: prevent num KV heads < TP worldsize
    
    * readd triton flash attn func
    
    * cleanup: use tools/ dir from main
    
    * docs: re-add mistral , GQA as supported
    
    * cleanup: delete duplicate tools/ files
    
    * cleanup: use fp32 rope (non-fused) from main
    
    * cleanup: no longer block out GQA codepaths in conversion scripts
    
    * cleanup: gqa code a bit
    
    * add llama2, llemma configs
    
    * add non-flash GQA ; refactor modeling code
    
    * clean up mistral config for commit
    
    * further cleanup configs dir
    
    * remove slurm script from llemma
    
    * update seqlen params for codellama, llemma configs
    
    * add more comments to GQA code, and make reshapes more readable
    
    * make inv_freq non-persistent
    
    * actually, just ensure mistral has inv_freqs as a persistent buffer
    
    * non-flash GQA works, so ensure arguments.py permits it
    
    * no longer use our own copies of flash attention interface functions
    
    * remove unused mpu util fn
    
    * delete unused config file
    
    * fix diff on mpu/utils.py
    
    * remove slurm scripts that won't be in this PR
    
    * run pre-commit
    
    * update tests for conversion scripts
    
    * add flash version check for sliding window
    
    * pre-commit
    
    ---------
    
    Co-authored-by: zhangir-azerbayev <[email protected]>
    Co-authored-by: haileyschoelkopf <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    4 people authored Feb 23, 2024
    SHA: f36aed7 (see the sketch below)
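
On the GQA work in this PR: a minimal sketch of grouped-query attention's key idea, assuming a hypothetical `repeat_kv` helper (the repository's implementation reshapes differently). Each KV head is shared by a group of query heads, which shrinks the KV cache without changing the number of query heads.

```python
import torch

def repeat_kv(k, n_rep):
    # [batch, kv_heads, seq, head_dim] -> [batch, kv_heads * n_rep, seq, head_dim]
    b, h_kv, t, d = k.shape
    return k[:, :, None].expand(b, h_kv, n_rep, t, d).reshape(b, h_kv * n_rep, t, d)

q = torch.randn(2, 32, 16, 64)   # 32 query heads
k = torch.randn(2, 8, 16, 64)    # 8 KV heads; each serves 32 // 8 = 4 query heads
assert repeat_kv(k, n_rep=4).shape == q.shape
```
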

Commits on Feb 26, 2024

  1. [Bug?] Fix profiling argument names (#1155)

    * possibly fix profiling flag names
    
    * actually, profile_backward already exists
    
    * Update NeoXArgs docs automatically
    
    * neox_args.profile was also used some places, update that too
    
    * Update NeoXArgs docs automatically
    
    * profiling --> profile
    
    * Update NeoXArgs docs automatically
    
    * Revert neox_arguments.md changes
    
    * Update NeoXArgs docs automatically
    
    * Update gen_docs since __name__ only returns the Literal for string args with Python 3.10
    
    * Update NeoXArgs docs automatically
    
    * Another update to preserve non-literals
    
    * Update NeoXArgs docs automatically
    
    * add union
    
    * Update NeoXArgs docs automatically
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Feb 26, 2024
    SHA: 9663802

Commits on Feb 29, 2024

  1. Update cpu_ci.yml (#1159)

    * Update cpu_ci.yml
    
    Update the workflow to point the CPU tests at a self-hosted runner instead of GitHub-provided runners
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jaimemcc-intel and github-actions authored Feb 29, 2024
    SHA: 3c03fc7

Commits on Mar 2, 2024

  1. Improve argument validation for Flash-attn + SWA (#1162)

    * Improve argument validation for Flash-attn + SWA
    
    * Update NeoXArgs docs automatically
    
    * don't pass window_size if not necessary
    
    * Update NeoXArgs docs automatically
    
    * Update 7B.yml
    
    * Update NeoXArgs docs automatically
    
    * apply precommit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    haileyschoelkopf and github-actions authored Mar 2, 2024
    SHA: 19596b0 (see the sketch below)
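
A sketch of the validation pattern described here, with a hypothetical `build_flash_kwargs` helper (not the repository's code): the `window_size` kwarg only exists in newer flash-attn releases, so it is forwarded only when sliding-window attention is actually configured.

```python
def build_flash_kwargs(sliding_window=None, causal=True):
    # Only forward window_size when SWA is configured: older flash-attn
    # releases reject the kwarg, and (-1, -1) means "no windowing" anyway.
    kwargs = {"causal": causal}
    if sliding_window is not None:
        # (left, right) bounds; the exact values depend on the flash-attn
        # version's convention for causal sliding windows.
        kwargs["window_size"] = (sliding_window, 0)
    return kwargs

print(build_flash_kwargs())                     # {'causal': True}
print(build_flash_kwargs(sliding_window=4096))  # adds window_size=(4096, 0)
```
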

Commits on Mar 4, 2024

  1. Single node Pythia 14M training on ngc pytorch 24.02 container (#1170)

    * Pythia 14M training on ngc pytorch 24.02 container
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    tf-nv and Quentin-Anthony authored Mar 4, 2024
    SHA: 119950c
  2. Remove unnecessary fp32/bf16 conversion (#1169)

    * feat: remove unnecessary bf16 conversions since no collective op is performed
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    DayOfThePenguin and Quentin-Anthony authored Mar 4, 2024
    SHA: 7b8187a
  3. Ignore markdown for pre-commit (#1171)

    * ignore markdown for pre-commit
    
    * only ignore end of file and trailing whitespace
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Quentin-Anthony and github-actions authored Mar 4, 2024
    SHA: 31cfe52
  4. Make rotary freqs buffer non-persistent (#1168)

    * make inv_freq non-persistent by default
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Mar 4, 2024
    SHA: e109bf5 (see the sketch below)
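
What "non-persistent" means here, as a minimal sketch (illustrative module, not gpt-neox's rotary class): a buffer registered with `persistent=False` is recomputed at construction time and never written to or read from checkpoints.

```python
import torch
import torch.nn as nn

class Rotary(nn.Module):
    def __init__(self, dim, base=10000):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        # persistent=False: recomputed here, never saved to or loaded from
        # checkpoints, so old checkpoints can't clash on shape or dtype.
        self.register_buffer("inv_freq", inv_freq, persistent=False)

assert "inv_freq" not in Rotary(64).state_dict()
```
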
  5. Support Lion with Zero Optimizer (#1166)

    * feat: deepspeed zero lion support
    
    * feat: bump DeeperSpeed version to one that includes DeepSpeed FusedLion
    
    * feat: bump DeeperSpeed version to include pipeline logging fix
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    DayOfThePenguin and Quentin-Anthony authored Mar 4, 2024
    SHA: df8cf24

Commits on Mar 7, 2024

  1. Add MoE (#1129)

    * Add DeepSpeed MoE
    
    Thanks to dayofthepenguin for extensive testing
    
    Closes #479
    
    * Update NeoXArgs docs automatically
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    4 people authored Mar 7, 2024
    SHA: 86758c3 (see the sketch below)
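
For orientation, a minimal sketch of the top-k routing at the heart of an MoE layer, with hypothetical names (DeepSpeed-MoE adds load-balancing losses, capacity limits, and expert-parallel dispatch on top of this):

```python
import torch
import torch.nn.functional as F

def top_k_route(hidden, gate_weight, k=2):
    # hidden: [tokens, dim]; gate_weight: [dim, n_experts]
    probs = F.softmax(hidden @ gate_weight, dim=-1)
    topk_p, topk_idx = probs.topk(k, dim=-1)            # k experts per token
    topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)  # renormalize mix weights
    return topk_idx, topk_p

idx, p = top_k_route(torch.randn(8, 16), torch.randn(16, 4))
```
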

Commits on Mar 8, 2024

  1. remove best_download as dependency (#1179)

    * Update requirements.txt
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    haileyschoelkopf and github-actions authored Mar 8, 2024
    SHA: 63b9fa1
  2. SHA: 90d4cb3
  3. Clean up dockerfile (#1175)

    - Eliminate already-installed apt packages
    - The sparse attention requirement led to a triton downgrade
    - flash attn is already part of the ngc container (in another version
      that is compatible with TE)
    tf-nv authored Mar 8, 2024
    SHA: 8c13642
  4. When using kv cache and flash attention in conjunction, it's crucial to set the causal parameter of flash_varlen_qkv_fn to False. Failing to do so will lead to inaccurate results. (#1178)
    chaochen99 authored Mar 8, 2024
    SHA: c1fa994 (see the sketch below)
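
A sketch of the reasoning behind this fix, with a hypothetical `needs_causal_mask` helper (not the repository's code):

```python
def needs_causal_mask(query_len, cached_len):
    # Prefill (no cache, many query tokens): future positions must be masked.
    # Incremental decode (query holds only the newest token, all cached keys
    # precede it): nothing to mask, and a misaligned causal mask can hide
    # valid cached keys -- hence causal=False for the cached case.
    return cached_len == 0 and query_len > 1

assert needs_causal_mask(query_len=128, cached_len=0) is True
assert needs_causal_mask(query_len=1, cached_len=127) is False
```
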
  5. Remove gas from Pythia configs (#1181)

    Fixes #1165
    
    Co-authored-by: Yang Zhang <[email protected]>
    yang and yang authored Mar 8, 2024
    SHA: 1e7abe7
  6. Fix moe_loss in gpt_j_residual path (#1180)

    Fixes #1174
    
    Co-authored-by: Yang Zhang <[email protected]>
    yang and yang authored Mar 8, 2024
    SHA: 82ddc66

Commits on Mar 10, 2024

  1. Add Mamba Architecture (#1157)

    * initial mamba support (no kernels, no parallelism)
    
    * Mamba runs! Also, add flags for sel. scan and conv1d fused kernels
    
    * Update NeoXArgs docs automatically
    
    * add mamba_inner_fn ; try really hard to make A_log and D no-WD and stored in fp32
    
    * cleanup print statements
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * add draft conversion script (tested working TP=1)
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * update parallelism checks for mamba--partition activations works
    
    * add mamba requirements
    
    * clean up and better comment mamba code
    
    * clean up and better comment mamba code
    
    * update arg validation in mamba
    
    * more cleanup
    
    * add flag for fp32 Alog/D, add init_methods support for mamba
    
    * Update NeoXArgs docs automatically
    
    * update conversion script name, add docstring
    
    * name conversion script
    
    * Update NeoXArgs docs automatically
    
    * add demo configs
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * add arguments to control conv and (in,out)_proj biases in mamba separately
    
    * Update NeoXArgs docs automatically
    
    * make x_proj bias also controlled by flag
    
    * Update NeoXArgs docs automatically
    
    * pre-commit, add comments
    
    * Update NeoXArgs docs automatically
    
    * Add mamba import print
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Mar 10, 2024
    SHA: 6809bbc (see the sketch below)
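
On keeping A_log and D out of weight decay: a minimal sketch of optimizer parameter groups, assuming the Mamba parameter names from the commit message (the fp32-storage half of the change is not shown):

```python
import torch

def build_param_groups(model, weight_decay=0.1, no_wd_names=("A_log", "D")):
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        # State-space parameters like A_log and D are regularization-sensitive,
        # so they go into a zero-weight-decay group.
        (no_decay if any(n in name for n in no_wd_names) else decay).append(p)
    return [{"params": decay, "weight_decay": weight_decay},
            {"params": no_decay, "weight_decay": 0.0}]

opt = torch.optim.AdamW(build_param_groups(torch.nn.Linear(4, 4)))
```
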

Commits on Mar 13, 2024

  1. Switch to using Cuda Flash Attn for Alibi (#1183)

    * add cuda support for flash attn w/ alibi, warn of deprecation of triton
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    haileyschoelkopf and github-actions authored Mar 13, 2024
    SHA: 03186de

Commits on Mar 15, 2024

  1. Mamba + Tensor Parallel Support (#1184)

    * TP works!
    
    * merge TP mamba changes with most current MambaLayer
    
    * cleanup TP, confirmed working still
    
    * make shapes with TP>1 work with conversion
    
    * tested and PP works, so no need for assert blocking it in arguments
    
    * update comment
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Mar 15, 2024
    SHA: 277141e

Commits on Mar 19, 2024

  1. [ZeRO-3] Partitioned init with deepspeed.zero.Init() (#1190)

    * added ds zero.Init() to get_model
    
    * Clean up conditional with block
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    R0n12 and Quentin-Anthony authored Mar 19, 2024
    SHA: 7267a74 (see the sketch below)
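
A minimal sketch of partitioned initialization, assuming a toy model and a minimal ZeRO-3 config (gpt-neox wires this into get_model with its own DeepSpeed config, and this needs to run under a distributed launcher; the `config_dict_or_path` parameter name is per recent DeepSpeed releases):

```python
import deepspeed
import torch.nn as nn

ds_config = {"train_batch_size": 8, "zero_optimization": {"stage": 3}}

# Parameters are sharded across ranks as they are created, so the full
# model is never materialized on any single GPU.
with deepspeed.zero.Init(config_dict_or_path=ds_config):
    model = nn.Sequential(nn.Linear(1024, 4096), nn.Linear(4096, 1024))
# deepspeed.initialize(...) then attaches the engine/optimizer as usual.
```
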

Commits on Mar 26, 2024

  1. Small typo in the README

    edouardoyallon committed Mar 26, 2024
    SHA: e6b5261
  2. Merge pull request #1196 from edouardoyallon/typo_readme

    ENH Small typo in the README
    StellaAthena authored Mar 26, 2024
    SHA: 4085302
  3. Added more papers

    StellaAthena authored Mar 26, 2024
    SHA: 1960b66
  4. Update README.md

    StellaAthena authored Mar 26, 2024
    SHA: 3616658

Commits on Apr 1, 2024

  1. making PR triggered CPU test for changes to megatron (#1195)

    * making PR triggered CPU test for changes to megatron
    
    * Update NeoXArgs docs automatically
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Apr 1, 2024
    SHA: 977448e
  2. [AMD] Supporting fused kernels build using JIT (#1188)

    * initial JIT load functions
    
    * passing neox_arge to load() as optional for easy testing
    
    * modified headers for correct copyright statements
    R0n12 authored Apr 1, 2024
    SHA: 51a7de9 (see the sketch below)
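
A sketch of the JIT route, assuming illustrative source paths and module name (`torch.utils.cpp_extension.load` compiles at first use, which is what lets the same sources build under ROCm/HIP without a prebuilt wheel):

```python
from torch.utils.cpp_extension import load

# Compiles on first use; under ROCm, torch translates the CUDA sources via HIP.
fused_softmax = load(
    name="scaled_upper_triang_masked_softmax",  # hypothetical module name
    sources=[
        "megatron/fused_kernels/scaled_upper_triang_masked_softmax.cpp",
        "megatron/fused_kernels/scaled_upper_triang_masked_softmax_cuda.cu",
    ],
    extra_cuda_cflags=["-O3"],
    verbose=True,
)
```
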
  3. [ZeRO-3] Ensured passing neox deepspeed_config when using partitioned init (#1191)
    
    * added ds zero.Init() to get_model
    
    * Clean up conditional with block
    
    * pre-commit
    
    * ensured deepspeed configs are passed to init
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    R0n12 and Quentin-Anthony authored Apr 1, 2024
    SHA: 01657aa

Commits on Apr 24, 2024

  1. SHA: 703d02f

Commits on Apr 25, 2024

  1. Fixes a weird typo (#1207)

    * Fixes a weird typo
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    StellaAthena and github-actions authored Apr 25, 2024
    SHA: 838d5bf

Commits on May 4, 2024

  1. Bump transformers from 4.36.0 to 4.38.0 in /requirements (#1199)

    Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0.
    - [Release notes](https://github.com/huggingface/transformers/releases)
    - [Commits](huggingface/transformers@v4.36.0...v4.38.0)
    
    ---
    updated-dependencies:
    - dependency-name: transformers
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 4, 2024
    SHA: 9d9d7c8
  2. Jaimemcc intel/ci composite cpu tests (#1205)

    * split PR and CPU tests into separate work; adjust references to env variables in workflow
    
    * tweaking to pull compose file from CPU test dir
    
    * adding post-cleanup for portability; adding workflow_dispatch to test
    
    * fixing mapping
    
    * forgot shell declaration in composite run
    
    * make sure all steps run even if first CPU tests fail
    
    * adding workflow dispatch to manually call workflow; removing httpserver
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored May 4, 2024
    SHA: 06e5f0c
  3. Add megablocks dropless MoE (#1192)

    * Add megablocks dropless MoE
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored May 4, 2024
    SHA: 916c883
  4. Fix bug in tools/ckpts/convert_neox_to_hf.py for setting intermediate_size (#1209)
    
    In tools/ckpts/convert_neox_to_hf.py, for neox architecture the 'intermediate_size'
    argument is not explicitly set, so it defaults to 24576 from:
    
    https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/models/gpt_neox/configuration_gpt_neox.py#L48
    
    Proposed solution: set intermediate_size to 4 * hidden_size
    jvendrow authored May 4, 2024
    SHA: c814959 (see the sketch below)
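
A minimal sketch of the proposed fix, assuming illustrative sizes (see tools/ckpts/convert_neox_to_hf.py for the real converter):

```python
from transformers import GPTNeoXConfig

hidden_size = 2048
# Without an explicit value, GPTNeoXConfig defaults intermediate_size to
# 24576, which is only right for hidden_size=6144 (GPT-NeoX-20B).
config = GPTNeoXConfig(hidden_size=hidden_size,
                       intermediate_size=4 * hidden_size)
assert config.intermediate_size == 8192
```
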

Commits on May 6, 2024

  1. add rwkv support (#1198)

    * add rwkv support
    
    * Update init_functions.py
    
    * rwkv model files
    
    * configs
    
    * kernels
    
    * Cleanup
    
    * Update 760M.yml
    
    * remove preffn and mishglu
    
    * Update NeoXArgs docs automatically
    
    * Add RWKV parallelism assertions
    
    * Update NeoXArgs docs automatically
    
    * pre-commit and config cleanup
    
    * Update NeoXArgs docs automatically
    
    * rwkv logging
    
    * Update NeoXArgs docs automatically
    
    * Add rwkv version dirname, make hdim 3.5x
    
    * pre-commit
    
    * Update NeoXArgs docs automatically
    
    * fix bug and set batch size to 32
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    3 people authored May 6, 2024
    SHA: 4bc6670

Commits on May 13, 2024

  1. Bump jinja2 from 3.1.3 to 3.1.4 in /requirements (#1211)

    Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
    - [Release notes](https://github.com/pallets/jinja/releases)
    - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
    - [Commits](pallets/jinja@3.1.3...3.1.4)
    
    ---
    updated-dependencies:
    - dependency-name: jinja2
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored May 13, 2024
    SHA: 49cd41f

Commits on May 16, 2024

  1. Run document update again (#1216)

    * misc changes to neox_args
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored May 16, 2024
    SHA: d037756

Commits on May 21, 2024

  1. Rwkv pipeline parallelism (#1221)

    * misc changes to neox_args
    
    * allow rwkv pp
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored May 21, 2024
    SHA: 153e732
  2. Add Torch Profiler Support (#1226)

    * format: flagged on pre-commit
    
    * feat: add pytorch profiling
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored May 21, 2024
    SHA: 2746d43 (see the sketch below)
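
A minimal sketch of the torch.profiler pattern this feature wraps, assuming a toy model (in gpt-neox the equivalent knobs come from neox_args):

```python
import torch
from torch.profiler import ProfilerActivity, profile, schedule, tensorboard_trace_handler

model = torch.nn.Linear(512, 512)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],  # drop CUDA on CPU-only boxes
    schedule=schedule(wait=1, warmup=1, active=3),
    on_trace_ready=tensorboard_trace_handler("./profiler_logs"),
) as prof:
    for _ in range(6):
        model(torch.randn(32, 512)).sum().backward()
        opt.step()
        opt.zero_grad()
        prof.step()  # advance the profiler schedule once per training step
```
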
  3. SHA: 1d55708
  4. Small tidying (#1222)

    * Tolerate no fused kernels
    
    * Fix requirements file syntax
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    4 people authored May 21, 2024
    SHA: d3d59f2

Commits on May 26, 2024

  1. Fix markdown formatting error (#1217)

    * Update README.md
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored May 26, 2024
    SHA: dfc6722

Commits on Jun 4, 2024

  1. add workflow_dispatch to gh actions pr so we can run on command (#1233)

    * add workflow_dispatch to gh actions pr so we can run on command
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    jahatef and github-actions authored Jun 4, 2024
    SHA: b5c0afe

Commits on Jun 5, 2024

  1. init changes to README (#1232)

    * init changes to README
    
    * Update NeoXArgs docs automatically
    
    * Update README.md
    
    * Update NeoXArgs docs automatically
    
    * Update README.md
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Jun 5, 2024
    SHA: 4a34e0a

Commits on Jun 7, 2024

  1. SHA: 90a6cdb
  2. Fix changed behavior of pipe_parallel (#1219)

    * Fix changed behavior of pipe_parallel
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: Yang Zhang <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    4 people authored Jun 7, 2024
    SHA: 2382bd4
  3. Conversion script bugfixes (#1218)

    * update is_pipe_parallel logic ; handle tied-embeddings case correctly
    
    * Update NeoXArgs docs automatically
    
    * revert PP to be consistent
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Jun 7, 2024
    SHA: 4c426da

Commits on Jun 19, 2024

  1. fix python version and pytest install (#1234)

    * fix python version and pytest install
    
    * Update NeoXArgs docs automatically
    
    * python3
    
    * Update NeoXArgs docs automatically
    
    * pip not pip3
    
    * Update NeoXArgs docs automatically
    
    * python3 pip
    
    * Update NeoXArgs docs automatically
    
    * python3 -m pip
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * add docker setup to workflow
    
    * Update NeoXArgs docs automatically
    
    * python setup
    
    * Update NeoXArgs docs automatically
    
    * python setup v2
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Add hash back to deep speed version
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Jun 19, 2024
    SHA: 2608972

Commits on Jun 25, 2024

  1. Add a chat data preprocessing script (#1239)

    * Add a chat data preprocessing script
    
    * add EOT at end of a chat
    
    * update README.md
    
    * apply pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    dmahan93 and Quentin-Anthony authored Jun 25, 2024
    SHA: 0e5f6db (see the sketch below)
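
A minimal sketch of the "EOT at end of a chat" behavior, with a hypothetical `tokenize_chat` function and a dummy tokenizer (the real script lives in the repository's preprocessing tools):

```python
def tokenize_chat(turns, tokenizer, eot_id):
    ids = []
    for turn in turns:        # e.g. alternating user/assistant messages
        ids.extend(tokenizer.encode(turn))
    ids.append(eot_id)        # one EOT token terminates the whole conversation
    return ids

class DummyTokenizer:         # stand-in for a real tokenizer
    def encode(self, text):
        return [ord(c) for c in text]

print(tokenize_chat(["hi", "hello"], DummyTokenizer(), eot_id=0))
```
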

Commits on Jun 28, 2024

  1. SHA: 1cee5b7