Fix pp generation #505

Merged
merged 12 commits into from
Feb 9, 2022

Conversation

sdtblck (Contributor) commented Feb 6, 2022

This should fix generation and eval tasks when a pipeline parallel size > 1 is used in conjunction with a model parallel size > 1.

I haven't yet compared 20B generation outputs against the main branch, since I don't have access to any A100 pods right now. I'd appreciate it if someone could do that. Eval results are essentially unchanged: accuracy is identical and perplexity differs only in the third decimal place.

main:

{
    "results": {
        "lambada": {
            "ppl": 3.677788098948207,
            "ppl_stderr": 0.07588756650708797,
            "acc": 0.715311469047157,
            "acc_stderr": 0.006287017261279481
        }
    },
    "versions": {
        "lambada": 0
    }
}

PR:

{
    "results": {
        "lambada": {
            "ppl": 3.6760727055431444,
            "ppl_stderr": 0.07584380001242345,
            "acc": 0.715311469047157,
            "acc_stderr": 0.006287017261279481
        }
    },
    "versions": {
        "lambada": 0
    }
}
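As a rough sanity check on the two result blocks above, the difference can be expressed in units of the reported standard error. This is only a sketch: `compare_results` is a hypothetical helper written for this comparison, not part of the repository or the eval harness, and the values are copied from the JSON above.

```python
# Results copied from the "main" and "PR" JSON blocks above.
MAIN = {"results": {"lambada": {
    "ppl": 3.677788098948207, "ppl_stderr": 0.07588756650708797,
    "acc": 0.715311469047157, "acc_stderr": 0.006287017261279481}}}
PR = {"results": {"lambada": {
    "ppl": 3.6760727055431444, "ppl_stderr": 0.07584380001242345,
    "acc": 0.715311469047157, "acc_stderr": 0.006287017261279481}}}


def compare_results(a, b):
    """Hypothetical helper: per task and metric, return
    (absolute difference, difference divided by the larger stderr)."""
    out = {}
    for task, metrics in a["results"].items():
        other = b["results"][task]
        out[task] = {}
        for name, value in metrics.items():
            if name.endswith("_stderr"):
                continue  # stderr entries are used as the scale, not compared
            diff = abs(value - other[name])
            stderr = max(metrics[name + "_stderr"], other[name + "_stderr"])
            out[task][name] = (diff, diff / stderr)
    return out


for task, metrics in compare_results(MAIN, PR).items():
    for name, (diff, in_stderr) in metrics.items():
        print(f"{task}/{name}: |diff|={diff:.6f} ({in_stderr:.3f} stderr)")
```

A difference that is a small fraction of one standard error, as with the perplexity here, is well within run-to-run noise for the task.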

Changes:

  • Removes the layernorm test in test_fused_kernels.py (dead code).
  • Layer pasts / presents are no longer passed around the model; instead, they are cached on ParallelTransformerLayer.
  • Renames get_key_value to use_cache, since I found the old name very confusing.
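The caching change can be illustrated with a toy example. This is a minimal sketch of the general pattern only, not the actual ParallelTransformerLayer code: `ToyCachedLayer`, `layer_past`, and `clear_cache` are hypothetical names, and uniform causal averaging stands in for real attention. The point is that the layer owns its cache, so `forward` takes and returns activations only, which is presumably what makes it easier to move between pipeline stages.

```python
class ToyCachedLayer:
    """Hypothetical stand-in for a transformer layer that owns its cache."""

    def __init__(self, use_cache=False):
        self.use_cache = use_cache
        self.layer_past = []  # values seen in earlier forward calls

    def clear_cache(self):
        # Call between independent generations so contexts do not mix.
        self.layer_past = []

    def forward(self, values):
        # Only activations go in and come out; the past stays on the layer.
        past = self.layer_past if self.use_cache else []
        full = past + list(values)
        outputs = []
        for i in range(len(past), len(full)):
            # Uniform causal "attention": mean over positions 0..i.
            outputs.append(sum(full[: i + 1]) / (i + 1))
        if self.use_cache:
            self.layer_past = full
        return outputs


tokens = [1.0, 2.0, 3.0, 4.0]

# One pass over the whole sequence, no caching.
full_pass = ToyCachedLayer(use_cache=False).forward(tokens)

# Token-by-token generation, reusing the cache held by the layer.
layer = ToyCachedLayer(use_cache=True)
incremental = [layer.forward([t])[0] for t in tokens]

print(full_pass == incremental)
```

With this layout the caller never handles pasts or presents, but it does have to reset the cache (here via `clear_cache`) before starting a new, unrelated sequence.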

sdtblck (Contributor, Author) commented Feb 9, 2022

Just verified that the generated outputs are identical across branches:

Main (pp1):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 10.974509716033936
}

fix_pp_generation (pp1):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 16.59027099609375
}

fix_pp_generation (pp4):

{
    "context": "",
    "text": "Q:\n\nHow to get the value of a variable in a function in a different file?\n\nI have a file called \"main.py\" and a file called \"functions.py\".\nIn \"main.py\" I have:\nimport functions\n\ndef main():\n    print(functions",
    "length": 64,
    "finished": false,
    "message": null,
    "duration_seconds": 25.938984632492065
}

@sdtblck sdtblck merged commit 7aed133 into main Feb 9, 2022
@sdtblck sdtblck deleted the fix_pp_generation branch February 9, 2022 23:00