
Implement Generation / Eval with deepspeed model engine #58

Closed
sdtblck opened this issue Jan 13, 2021 · 6 comments
Labels: feature request (New feature or request)

@sdtblck (Contributor) commented Jan 13, 2021

Currently, Generation / Eval are happening with the PyTorch model, not the model engine. This is already causing memory problems and won't allow us to scale up; we'll need to implement this with the DeepSpeed model engine.
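For reference, a minimal sketch of what routing eval through the engine could look like, assuming the engine comes from deepspeed.initialize; model, ds_config, and val_loader are illustrative names, not identifiers from this repo:

```python
import torch
import deepspeed

# Illustrative sketch: `model`, `ds_config`, and `val_loader` are assumed
# to exist here; they are not identifiers taken from this repo.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config_params=ds_config,
)

model_engine.eval()
with torch.no_grad():
    val_data = next(val_loader).cuda()
    # forward passes go through the engine, not the raw PyTorch model
    loss = model_engine(val_data)
```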

StellaAthena added the feature request label on Jan 14, 2021
@StellaAthena (Member) commented
Where in the code does this happen?

@srulikbd (Contributor) commented Jan 24, 2021

At the end of train_enwik8.py, for example: the commented-out code.
I can try doing it, but I'm not an expert in DeepSpeed yet.
Should we use these examples?
https://github.com/microsoft/DeepSpeedExamples/blob/master/Megatron-LM/evaluate_gpt2.py
https://github.com/microsoft/DeepSpeedExamples/blob/master/Megatron-LM/generate_samples.py

@srulikbd (Contributor) commented Jan 24, 2021

But actually, I see that it's already changed in train.py:
```python
if params.get("validate_every") is not None:
    if is_main and i % params["validate_every"] == 0:
        model_engine.eval()
        with torch.no_grad():
            val_data = next(val_loader).cuda()
            loss = model_engine(val_data)
            pbar.write(f'Validation Loss: {loss.item()}')
```
but not in:

```python
if params.get("generate_every") is not None:
    if is_main and i % params["generate_every"] == 0:
        model.eval()
        val_data = next(val_loader).cuda()
        inp = random.choice(val_data)[:-1]
        prime = tokenizer.decode(inp)
        pbar.write(f"{prime} \n\n {'*' * 100}")
        sample = model.generate(inp.cuda(), params["generate_length"])
        output_str = tokenizer.decode(sample)
        pbar.write(output_str)
```

@StellaAthena (Member) commented

Huh, funny oversight. Yeah, push a patch to the generate function and we’ll close this issue.

StellaAthena added this to To do in 1T or BUST via automation on Jan 24, 2021
@srulikbd (Contributor) commented
Actually, it's not possible to just change model to model_engine for generation.
Is it implemented here,
https://github.com/microsoft/DeepSpeedExamples/blob/master/Megatron-LM/generate_samples.py
to generate using multiple GPUs?
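One plausible reason the direct swap fails is that generate() is a method of the underlying model, which the DeepSpeed engine wraps; the wrapped module is reachable as model_engine.module. A minimal sketch under that assumption (not the actual patch that landed), reusing the names from the snippet above:

```python
# Hedged sketch: assumes DeepSpeedEngine exposes the wrapped nn.Module as
# `model_engine.module`, so the model's own generate() can be called on it.
if params.get("generate_every") is not None:
    if is_main and i % params["generate_every"] == 0:
        model_engine.eval()
        with torch.no_grad():
            val_data = next(val_loader).cuda()
            inp = random.choice(val_data)[:-1]
            prime = tokenizer.decode(inp)
            pbar.write(f"{prime} \n\n {'*' * 100}")
            # generate() lives on the wrapped model, not on the engine itself
            sample = model_engine.module.generate(inp.cuda(), params["generate_length"])
            output_str = tokenizer.decode(sample)
            pbar.write(output_str)
```

Note this only covers the single-process case; generating across multiple model-parallel GPUs, as in the linked Megatron-LM example, would require coordinating the forward pass across ranks.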

@StellaAthena (Member) commented
Superseded by codebase refactoring.

1T or BUST automation moved this from To do to Done on Feb 15, 2021