Robust testing suite #957

StellaAthena · 2023-05-28T15:14:03Z

Recently we’ve been having issues with new features breaking old ones. We do not currently have a detailed testing suite that validates all of the features of GPT-NeoX, but the codebase is large and complicated enough that we definitely should. This is the start of a list, off the top of my head, of tests we should support. Feel free to post comments with additional ones and I’ll add them. I’ll also start going through things and building the list up systematically.

We have some tests here but they are missing notable new features and don’t seem to be updated as the library changes.

Data Processing

Download scripts run
Preprocessing with each supported tokenizer works
Training a new tokenizer

Primary Functions

Launcher scripts
Training (on one GPU, one node, and one pod)
Finetuning (especially loading and training without optimizer states)
Inference
Evaluation

Optimizations and Parallelizations

ZeRO works and memory usage is within prescribed limits
fp16 and bf16
Various MP and PP values
Flash Attention

Model Options

Conversion Scripts

NeoX -> HF transformers library
NeoX -> Megatron-DS
NeoX -> SafeTensors
NeoX V1 -> NeoX V2

Misc Features

Library installs correctly and packages don’t have conflicts
MuP (currently bugged, see #956: "The plot got from muP coord_check seems not horizontal, which may indicates there exits a bug in the muP implementation?")
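
As a rough illustration of how the "fp16 and bf16" and "Various MP and PP values" items could be turned into a concrete list of test cases, here is a minimal sketch in plain Python. The function name `build_test_matrix`, the dictionary keys, and the example values are all assumptions made for illustration; none of this is GPT-NeoX's actual config schema or test code.

```python
from itertools import product


def build_test_matrix(precisions, mp_sizes, pp_sizes, world_size):
    """Enumerate (precision, MP, PP) combinations that fit on world_size GPUs.

    A combination is kept only when MP * PP divides world_size, so the
    remaining factor can serve as the data-parallel degree.
    """
    matrix = []
    for precision, mp, pp in product(precisions, mp_sizes, pp_sizes):
        if world_size % (mp * pp) != 0:
            # This layout cannot be placed evenly on the available GPUs.
            continue
        matrix.append(
            {
                "precision": precision,
                "mp": mp,
                "pp": pp,
                "dp": world_size // (mp * pp),
            }
        )
    return matrix


# Hypothetical example: two precisions, a few MP/PP values, four GPUs.
cases = build_test_matrix(["fp16", "bf16"], [1, 2, 4], [1, 2], world_size=4)
for case in cases:
    print(case)
```

Each resulting dictionary could then be passed to a parametrized test (for example via `pytest.mark.parametrize`), so that every valid combination is exercised without writing the cases out by hand.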

Quentin-Anthony · 2023-05-28T19:20:20Z

The DeepSpeed unit test suite is very good, and I suggest that whoever picks this up use it as a template.

Once we have a solid test suite, it should be run on every PR as a GitHub action, similar to how DeepSpeed does it.

whiz-Tuhin · 2023-06-05T17:31:16Z

@StellaAthena @Quentin-Anthony I can pick this up. I'll need a day or two to go through the codebase as well as the DeepSpeed unit test suite. Will keep this thread updated.

StellaAthena · 2023-06-05T20:12:26Z

@whiz-Tuhin That's awesome, welcome to the team! I went ahead and edited the OP with some additional things that should probably be incorporated though I'm sure it's still not totally comprehensive. That said, don't get overwhelmed by the number of tests to write. A small and reliable test suite that doesn't cover every feature but does cover major ones would be a huge value add. We can then build on that over time to be more and more comprehensive.

Quentin knows what he's talking about, so I would definitely start by working on porting the DeepSpeed tests over, removing the stuff we don't need and adding tests for things DeepSpeed doesn't support.

I've sent you an invite to the EleutherAI Org that will allow you to work on a non-main branch without having to mess around with forking the library.

Quentin-Anthony · 2023-09-29T00:36:29Z

Looking again for someone to pick this up!

mkerin · 2023-11-07T14:00:06Z

Hi @Quentin-Anthony @StellaAthena - just checking if you're still looking for a volunteer here? I have good bandwidth to work on this over the next 2 weeks, which I think should be more than enough time (& I've been looking for an opportunity to contribute for a while!).

StellaAthena · 2023-11-07T14:19:07Z

Yes! That would be phenomenal

Quentin-Anthony · 2023-11-07T17:56:22Z

@mkerin -- Great to hear! Please reach out to me here or over Discord if you need any help or have any questions. A good first step would be to just run our existing CPU tests locally, then start with some of the simple model option tests.

mkerin · 2023-11-07T23:22:35Z

Hey Quentin - thank you for the tips!

I have got the existing CPU tests running & started working on integration tests for the data processing functionality (I didn’t see your message about focussing on model options until just now).

I have dropped you a line on discord. My handle there is mkez.

StellaAthena added the feature request, good first issue, and help wanted labels May 28, 2023

StellaAthena assigned whiz-Tuhin Jun 5, 2023

Quentin-Anthony unassigned whiz-Tuhin Sep 29, 2023

Quentin-Anthony assigned mkerin Nov 7, 2023

EleutherAI deleted a comment from rezaarefi Nov 7, 2023

mkerin mentioned this issue Nov 15, 2023

Extend ci suite #1080

Merged

Quentin-Anthony closed this as completed in #1080 Dec 4, 2023
