EleutherAI / gpt-neox Public


Issues: EleutherAI/gpt-neox


42 Open 286 Closed


Issues list

Cannot convert neox model to HF

Labels: bug

#1231 opened May 28, 2024 by srivassid

How to set the ffn hidden size parameter in gpt neox

Labels: feature request

#1230 opened May 28, 2024 by IronMan-WangJinxi

The results of running eval show only 1 digit after decimal point for acc on all tested tasks

Labels: bug

#1227 opened May 22, 2024 by lernerjenny

My servers used for multi-node training do not have ssh. How can I launch multi-node training using the torchrun command?

Labels: feature request

#1203 opened Apr 23, 2024 by dingning97

PyTorch Lightning Fused optimizer step

Labels: feature request

#1160 opened Feb 29, 2024 by jahatef

Tests fail when run with pytest --forked

Labels: bug

#1132 opened Jan 25, 2024 by segyges

Integrate TransformerEngine

Labels: feature request

#1098 opened Dec 21, 2023 by Quentin-Anthony

Interoperability and GPT-NeoX

Labels: documentation, question

#1058 opened Oct 12, 2023 by StellaAthena

Support for Mosaic Models

Labels: feature request

#1057 opened Oct 6, 2023 by rajveer43

[BUG] Inconsistent loss between overlap_comm=true and overlap_comm=false

Labels: bug

#1004 opened Jul 27, 2023 by 0x6b64

Convert HF Llama Checkpoints to Neox Checkpoints

Labels: feature request

#994 opened Jul 10, 2023 by sxthunder

AssertionError: zero stage 1 requires an optimizer

Labels: bug, good first issue, help wanted

#987 opened Jul 4, 2023 by yonglianglan

How to preserve Pythia's sampling order but for different batch size.

Labels: bug

#984 opened Jul 3, 2023 by lintangsutawika

Why we need to average LayerNorm values over mp ranks when converting to HFformat checkpoint?

#983 opened Jun 26, 2023 by forceshorty

Bias weights are multi-added when using gpt_j_residual in model-parallel execution

Labels: bug, good first issue

#962 opened May 31, 2023 by cbcase

Can't finetune 20B model from slim weights with zero optimizer enabled

Labels: bug

#926 opened May 5, 2023 by coreystatendet

Problems on generating with llama model

#921 opened May 4, 2023 by wiio12

Fine-tuning gpt-neox on 8 A100s

Labels: feature request

#892 opened Apr 20, 2023 by rajhans

OOM error when training on a 220G Memory machine with 8 V100.

Labels: feature request

#867 opened Apr 2, 2023 by SefaZeng

Add support for pytorch 2.0 ?

Labels: deprioritized, feature request

#858 opened Mar 27, 2023 by guozhiyao

Finetuning loss explode when not loading deepspeed zero optimal states

Labels: bug

#843 opened Mar 19, 2023 by sxthunder

Implement Prefix-LM attention masking

Labels: feature request

#805 opened Mar 1, 2023 by TokyoExpress

Cannot load the checkpoint

Labels: bug

#782 opened Feb 6, 2023 by jmlongriver12

Unable to load model checkpoint with model parallelism

Labels: feature request

#773 opened Jan 20, 2023 by RaoNikitha

Multi-node training without shared memory

Labels: deprioritized, feature request

#765 opened Jan 6, 2023 by VHellendoorn
