Remove global vars #277
Conversation
This is amazing! Great job.
logits = logits[:, -1].view(batch_size, -1).contiguous()

if args.greedy:
    # we have to use neox_args instead of kwargs here because deepspeed :|
This reverts the bugfix from #253.
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is None:
    print("utils.local_rank() environment variable LOCAL_RANK not set, defaulting to 0", flush=True)
I think we should default to -1 here? (That's torch / deepspeed's default when the local rank isn't set, i.e. we're not distributed.)
Some functions rely on the rank being equal to 0 (e.g. tensorboard initialization):
gpt-neox/megatron/neox_arguments/arguments.py
Line 106 in 319ad17

if self.tensorboard_dir and self.rank == 0:

Line 17 in 319ad17

def print_rank_0(*message):
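A minimal sketch of the default-to--1 suggestion, reconciled with the rank-0 gating quoted above. The function and variable names here are hypothetical illustrations, not the project's actual code:

```python
import os

def local_rank() -> int:
    # Hypothetical sketch: return -1 (torch / deepspeed's convention for
    # "not distributed") when LOCAL_RANK is unset, instead of 0.
    value = os.environ.get("LOCAL_RANK")
    if value is None:
        print("LOCAL_RANK not set, defaulting to -1 (not distributed)", flush=True)
        return -1
    return int(value)

def is_rank_0(rank: int) -> bool:
    # Callers that gate on rank == 0 (e.g. tensorboard setup, print_rank_0)
    # would then need to treat -1 (single process) like rank 0.
    return rank in (0, -1)
```

With this default, the rank-0 checks stay correct in single-process runs only if every call site is updated to accept -1 as well, which is the tension the review comments are pointing at.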
This pull request removes the global variables. The exception is the mpu global variables, which are more complicated and potentially breaking because they are used in torch.autograd.Function implementations.
Training for some steps on a small config results in equal loss graphs:
Tests are adjusted to the new setup and pass:
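The shape of the refactor — configuration passed as an explicit argument instead of read from module-level globals — can be sketched roughly like this. The class and field names below are illustrative stand-ins, not the project's real `NeoXArgs` definition:

```python
from dataclasses import dataclass

@dataclass
class NeoXArgs:
    # Minimal stand-in for the real neox_args container; the fields here
    # are illustrative, not the project's full argument set.
    greedy: bool = False
    rank: int = 0

def print_rank_0(neox_args: NeoXArgs, *message):
    # Before the refactor this kind of helper read a module-level global;
    # afterwards the caller passes the args object explicitly.
    if neox_args.rank == 0:
        print(*message, flush=True)

def pick_token(neox_args: NeoXArgs, logits):
    # The sampling strategy comes from the explicit args object rather
    # than a global `args`; only greedy decoding is sketched here.
    if neox_args.greedy:
        return max(range(len(logits)), key=logits.__getitem__)
    raise NotImplementedError("only greedy decoding shown in this sketch")
```

Passing the args object explicitly makes each function's dependencies visible in its signature, so tests can construct a `NeoXArgs` directly instead of mutating process-wide state first.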