Model sampling/text generation #160

Merged
merged 60 commits into from
Apr 22, 2021


joshlk
Member

@joshlk joshlk commented Mar 5, 2021

Text generation is working with no input (unconditional) and with an input file. By "working" I mean it loads a model, creates output and saves it to file. The output is currently utter gibberish, as I'm testing it on what is basically a randomly initialised model.

Steps to test:

Generate a model:

  1. First, change the save-interval to a small value such as 10 in the model config.
  2. Run a model (note that this now saves the model config in the checkpoint directory, so you can easily determine what the model data corresponds to):

./deepy.py pretrain_gpt2.py -d configs small.yml eleutherai_cluster.yml

  3. Wait for a checkpoint to save.
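
As a rough illustration of what "saving the model config in the checkpoint directory" can look like, here is a minimal sketch. The helper names, the `config.json` file name and the use of JSON are assumptions made for the example, not the code in this PR.

```python
import json
import tempfile
from pathlib import Path


def save_config_with_checkpoint(checkpoint_dir, config):
    """Write the run config next to the checkpoint files (hypothetical helper)."""
    checkpoint_dir = Path(checkpoint_dir)
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    config_path = checkpoint_dir / "config.json"  # assumed file name
    with config_path.open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, sort_keys=True)
    return config_path


def load_config_from_checkpoint(checkpoint_dir):
    """Read back the config that was saved alongside a checkpoint."""
    with (Path(checkpoint_dir) / "config.json").open(encoding="utf-8") as f:
        return json.load(f)


# A temporary directory stands in for the real checkpoint path.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_config_with_checkpoint(tmp, {"save-interval": 10})
    restored = load_config_from_checkpoint(tmp)
```

Keeping the config beside the weights means a later text generation job can check which settings a checkpoint was trained with before loading it.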

To test unconditional text generation:

  1. Make sure "text-gen-type": "unconditional" is set in the text_generation.yml config
  2. Run a text generation job loading the generated model:

./deepy.py text_gen_gpt2.py -d configs small.yml eleutherai_cluster.yml text_generation.yml

  3. Output will be found in /mnt/ssd-cluster/output/text_generation.txt
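
For anyone unfamiliar with the two modes, the toy sketch below shows the difference between unconditional and prompted sampling. It uses a random next-token table as a stand-in for a model; the function name, the start token and the table are all assumptions made for illustration and are unrelated to the sampling code in this PR.

```python
import random


def sample_tokens(next_token_weights, prompt, num_new_tokens, start_token=0, seed=0):
    """Autoregressively sample token ids from a toy next-token table.

    next_token_weights[t] holds unnormalised weights over the vocabulary
    for the token that follows token t (a stand-in for a real model).
    """
    rng = random.Random(seed)
    vocab = list(range(len(next_token_weights)))
    # Unconditional generation: no prompt, so begin from the start token alone.
    tokens = list(prompt) if prompt else [start_token]
    for _ in range(num_new_tokens):
        weights = next_token_weights[tokens[-1]]
        tokens.append(rng.choices(vocab, weights=weights, k=1)[0])
    return tokens


# A "randomly initialised" toy model: random positive weights over 5 tokens.
init_rng = random.Random(1234)
toy_model = [[init_rng.random() + 1e-6 for _ in range(5)] for _ in range(5)]

unconditional = sample_tokens(toy_model, prompt=[], num_new_tokens=8)
conditional = sample_tokens(toy_model, prompt=[3, 1], num_new_tokens=8)
```

With random weights the sampled ids carry no structure, which is the toy equivalent of the gibberish output mentioned above.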

To test input-file text generation:

  1. Make sure "text-gen-type": "input-file" is set in the text_generation.yml config
  2. Create a newline-delimited input file at /mnt/ssd-cluster/output/sample_input.txt
  3. Run a text generation job:

./deepy.py text_gen_gpt2.py -d configs small.yml eleutherai_cluster.yml text_generation.yml

  4. Output will be found in /mnt/ssd-cluster/output/sample_output.txt
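
A newline-delimited input file is just one prompt per line. The sketch below writes and reads such a file; the helper names are hypothetical, a temporary directory replaces the cluster path, and skipping blank lines is an assumption about how the file should be consumed, not confirmed behaviour of text_gen_gpt2.py.

```python
import tempfile
from pathlib import Path


def write_prompts(path, prompts):
    """Write one prompt per line (newline-delimited)."""
    for prompt in prompts:
        if "\n" in prompt:
            raise ValueError("a prompt must not contain a newline")
    Path(path).write_text("\n".join(prompts) + "\n", encoding="utf-8")


def read_prompts(path):
    """Read prompts back, skipping blank lines (assumed behaviour)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


# A temporary directory stands in for /mnt/ssd-cluster/output.
with tempfile.TemporaryDirectory() as tmp:
    input_path = Path(tmp) / "sample_input.txt"
    write_prompts(input_path, ["Once upon a time", "The quick brown fox"])
    prompts = read_prompts(input_path)
```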

Todo:

  • Model loading
  • Unconditional output
  • Output from file
  • Save model config to checkpoint folder
  • Create example script in examples folder

@StellaAthena StellaAthena linked an issue Mar 5, 2021 that may be closed by this pull request
joshlk added 10 commits March 6, 2021 18:06
# Conflicts:
#	Dockerfile
#	requirements.txt
commit 43be6ce
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 10:19:12 2021 +0000

    Remove debugging

commit 450dfb9
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 10:04:23 2021 +0000

    Test `input-file`

commit 3d0f562
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:52:57 2021 +0000

    Skip tokens that don't exist

commit 9384ace
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:48:52 2021 +0000

    Debug

commit 82871dd
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:47:49 2021 +0000

    Debug

commit 5d1d19e
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:44:50 2021 +0000

    Remove debugging

commit dc4cf73
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:41:33 2021 +0000

    Debug

commit a4d2bf3
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:33:29 2021 +0000

    Force activation checkpointing to be disabled

commit 901aa73
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:29:27 2021 +0000

    Debugging

commit c9706d1
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 09:19:41 2021 +0000

    Debugging

commit a4835b1
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 08:55:17 2021 +0000

    Debugging

commit 747b9a9
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 08:46:48 2021 +0000

    Debugging

commit 36137c1
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 08:28:54 2021 +0000

    Debugging

commit f58ea20
Author: Josh Levy-Kramer <[email protected]>
Date:   Tue Mar 9 08:22:35 2021 +0000

    Debugging

commit 4ff6300
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 18:47:04 2021 +0000

    Change port based on rank

commit be4b715
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 18:45:13 2021 +0000

    Change port based on rank

commit 63a1be0
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 18:43:52 2021 +0000

    Change port based on rank

commit 1b8f6b9
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 18:16:58 2021 +0000

    pycharm debugger

commit 1110bee
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 18:05:47 2021 +0000

    pycharm debugger

commit 90bb5d4
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 17:51:46 2021 +0000

    Test: try manhole

commit 6fbcc1f
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 17:50:08 2021 +0000

    Test: try manhole

commit a272a59
Author: Josh Levy-Kramer <[email protected]>
Date:   Mon Mar 8 17:46:50 2021 +0000

    Test: try manhole
@sdtblck
Contributor

sdtblck commented Apr 8, 2021

Ok, this is now working with pipeline parallel size > 1.

The merge conflicts will be quite intense (since rotary pos emb means the number of args passed between pipe stages can differ). I'll sort these soon and we can merge this.

@sdtblck
Contributor

sdtblck commented Apr 8, 2021

Ok, so sampling is now working with all pipeline parallel sizes. It's the most awful, hacky code ever, and you need to update your deeperspeed branch to the latest commit to get it to work. This should be ready to be merged now imo.

sdtblck
sdtblck previously approved these changes Apr 22, 2021
@sdtblck sdtblck dismissed StellaAthena’s stale review April 22, 2021 09:27

changes have been made

@sdtblck sdtblck dismissed stale reviews from ShivanshuPurohit and themself via 1f861e6 April 22, 2021 12:11
@sdtblck sdtblck merged commit e1f7fcb into main Apr 22, 2021
@sdtblck sdtblck deleted the model_sampling branch April 22, 2021 12:12

Successfully merging this pull request may close these issues.

Ensure Sampling works correctly
5 participants