Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase test coverage #289

Merged
merged 67 commits into from
May 12, 2021
Merged

Increase test coverage #289

merged 67 commits into from
May 12, 2021

Conversation

sweinbach
Copy link
Contributor

This pull request does the following:

  • executing tests in a distributed environment
  • add test configs to test more configuration options
  • add requirement for text coverage report and a small test readme

@sweinbach sweinbach requested a review from a team as a code owner May 3, 2021 16:24
@sweinbach sweinbach linked an issue May 3, 2021 that may be closed by this pull request
@sweinbach sweinbach marked this pull request as draft May 3, 2021 16:26
@sweinbach
Copy link
Contributor Author

Converting the pull request to draft for the following reasons:

  • Test coverage is not optimal (though most likely not the best metric to assess test quality). I am happy to include other suggested configs.
    image

  • I am getting a failed testcase for loading the state dict for one bit adam. I would appreciate help in validating that the testcase works as intendend and (potentially) in the fix of the error.

checkpoint_name, state_dict = model.load_checkpoint(neox_args.load,
  File "/home/aleph/repos/neox/gpt-neox-copy/src/deepspeed/deepspeed/runtime/engine.py", line 1476, in load_checkpoint
    load_path, client_states = self._load_checkpoint(load_dir,
  File "/home/aleph/repos/neox/gpt-neox-copy/src/deepspeed/deepspeed/runtime/engine.py", line 1519, in _load_checkpoint
    self.optimizer.load_state_dict(checkpoint['optimizer'])
  File "/home/aleph/repos/neox/gpt-neox-copy/src/deepspeed/deepspeed/runtime/fp16/onebit/adam.py", line 285, in load_state_dict
    if self.state[self.param_groups[0]['params'][0]]['step'] < self.freeze_step:
KeyError: 'step'

@sweinbach
Copy link
Contributor Author

I am quite happy with the current status. Configuration files are used for a number of testcases. This should enable adding new testcases easily.

When reviewing please pay special attention to the commits starting with "fix ...". These fix minor issues and affect the codebase outside of test cases.

tests/Readme.md gives a short overview on how to run tests and also view the coverage report.

All testcases pass at the moment:
image

Coverage is far from100%. But a reasonable amount of code is covered.
image

@sweinbach sweinbach marked this pull request as ready for review May 11, 2021 18:09
@sdtblck sdtblck merged commit bb8222f into main May 12, 2021
@sdtblck sdtblck deleted the increase_test_coverage branch May 12, 2021 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Increase Test Coverage
3 participants