Fix pp generation #505

sdtblck · 2022-02-06T18:48:28Z

This should fix generation + eval tasks when the pipeline parallel size is > 1 in conjunction with a model parallel size > 1.

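I haven't yet compared 20B generation outputs against the main branch, since I don't have access to any A100 pods right now. I'd appreciate it if someone could do that. Eval results are effectively unchanged: accuracy is identical, and perplexity differs by about 0.002, well within the reported stderr.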

main:

{
    "results": {
        "lambada": {
            "ppl": 3.677788098948207,
            "ppl_stderr": 0.07588756650708797,
            "acc": 0.715311469047157,
            "acc_stderr": 0.006287017261279481
        }
    },
    "versions": {
        "lambada": 0
    }
}

PR:

{
    "results": {
        "lambada": {
            "ppl": 3.6760727055431444,
            "ppl_stderr": 0.07584380001242345,
            "acc": 0.715311469047157,
            "acc_stderr": 0.006287017261279481
        }
    },
    "versions": {
        "lambada": 0
    }
}
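For anyone repeating this comparison, here is a minimal sketch of checking the two lambada results against each other. The helper names (`within_stderr`, `compare`) are hypothetical and not part of the eval harness; the numbers are copied from the two result blocks above. It treats a metric as unchanged when the difference is no larger than the larger of the two reported standard errors, which is only a rough sanity check, not a significance test.

```python
# Lambada metrics copied from the "main" and "PR" result blocks above.
main_results = {
    "ppl": 3.677788098948207,
    "ppl_stderr": 0.07588756650708797,
    "acc": 0.715311469047157,
    "acc_stderr": 0.006287017261279481,
}
pr_results = {
    "ppl": 3.6760727055431444,
    "ppl_stderr": 0.07584380001242345,
    "acc": 0.715311469047157,
    "acc_stderr": 0.006287017261279481,
}


def within_stderr(a, b, metric):
    """Return True if `metric` differs by no more than the larger reported stderr."""
    diff = abs(a[metric] - b[metric])
    tolerance = max(a[f"{metric}_stderr"], b[f"{metric}_stderr"])
    return diff <= tolerance


def compare(a, b, metrics=("ppl", "acc")):
    """Build a small per-metric report: absolute difference and the stderr check."""
    return {
        m: {"diff": abs(a[m] - b[m]), "within_stderr": within_stderr(a, b, m)}
        for m in metrics
    }


report = compare(main_results, pr_results)
for metric, row in report.items():
    print(metric, row)
```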

Changes:

Removes the layernorm test in test_fused_kernels.py (dead code).
Layer pasts / presents are no longer passed around the model; instead they're cached on the ParallelTransformerLayer class.
Renames get_key_value to use_cache, since I found the old name very confusing.
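
To illustrate the caching change, here is a toy sketch of a layer that keeps its own key/value cache instead of receiving and returning it. This is not the actual ParallelTransformerLayer code: the class `CachingLayer`, the attribute `layer_past`, and the method `clear_cache` are assumptions made for illustration, and plain Python lists stand in for tensors. Only the `use_cache` name comes from this PR.

```python
class CachingLayer:
    """Toy stand-in for a transformer layer that stores its own key/value cache."""

    def __init__(self):
        self.use_cache = False   # flag name taken from this PR
        self.layer_past = None   # hypothetical attribute: (keys, values) seen so far

    def clear_cache(self):
        """Drop cached keys/values, e.g. before starting a new prompt."""
        self.layer_past = None

    def forward(self, tokens):
        # A real layer would project hidden states into key/value tensors;
        # here each token simply becomes a tagged tuple.
        keys = [("k", t) for t in tokens]
        values = [("v", t) for t in tokens]

        if self.use_cache:
            if self.layer_past is not None:
                past_keys, past_values = self.layer_past
                keys = past_keys + keys
                values = past_values + values
            # The cache lives on the layer, so nothing extra has to be
            # returned to the caller or handed to the next stage.
            self.layer_past = (keys, values)

        # Return how many positions attention would cover on this step.
        return len(keys)


layer = CachingLayer()
layer.use_cache = True
prompt_positions = layer.forward([1, 2, 3])  # full prompt
next_positions = layer.forward([4])          # one new token, reuses the cache
```

The point of the design is that the forward signature stays the same with or without caching, so only activations need to move between pipeline stages.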


sdtblck · 2022-02-09T22:51:54Z

Just verified outputs are the same across branches:

Main (pp1):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 10.974509716033936
}

fix_pp_generation (pp1):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 16.59027099609375
}

fix_pp_generation (pp4):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 25.938984632492065
}
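
A small sketch of how such a check could be scripted: compare the generation records field by field while ignoring `duration_seconds`, which is expected to vary between runs. The function name `same_generation` is hypothetical, and the `text` value below is a placeholder, not the real output; the durations are the pp1 values reported above.

```python
def same_generation(a, b, ignore=("duration_seconds",)):
    """Return True if two generation records match on every field not in `ignore`."""
    def strip(record):
        return {k: v for k, v in record.items() if k not in ignore}
    return strip(a) == strip(b)


# Placeholder records: "text" is a stand-in for the generated string.
main_pp1 = {
    "context": "",
    "text": "<generated text>",
    "length": 64,
    "finished": False,
    "message": None,
    "duration_seconds": 10.974509716033936,
}
branch_pp1 = dict(main_pp1, duration_seconds=16.59027099609375)

outputs_match = same_generation(main_pp1, branch_pp1)
```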

sdtblck added 12 commits February 4, 2022 14:39

remove layernorm test in test_fused_kernels.py

d79f3f3

don't pass around layer_pasts / presents. Instead - cache them in the…

1bdae82

… class state

remove fused kernels layernorm test

ec36d2c

fix text generation

1e2b5c5

fix pipe + model forward

4977c3d

fix error if config files already exist

6d06fd6

remove assertion that pp <= 1, cleanup

851560b

wip push changes

7e89f24

fix generation for pp>1 & mp>1

11227fa

rename get_key_value to use_cache

9a2c0cb

cache -> use_cache

5fe389e

never pass in a full size attn mask

46d288c

sdtblck requested a review from a team as a code owner February 6, 2022 18:48

sdtblck requested review from StellaAthena and ShivanshuPurohit February 6, 2022 18:48

EricHallahan approved these changes Feb 9, 2022


sdtblck merged commit 7aed133 into main Feb 9, 2022

sdtblck deleted the fix_pp_generation branch February 9, 2022 23:00
