
Pull requests: EleutherAI/gpt-neox


Pull requests list

Add Rotary Positional Embedding
#213 by sdtblck was merged Apr 7, 2021
Draft PR Adding mistral 0.1
#1131 by AIproj was merged Feb 23, 2024
integrated flash attention 2
#1035 by a663E-36z1120 was merged Sep 20, 2023
fused layernorm
#1105 by yang was merged Jan 26, 2024
Add MoE
#1129 by yang was merged Mar 7, 2024
Mamba + Tensor Parallel Support
#1184 by haileyschoelkopf was merged Mar 15, 2024
Add megablocks dropless MoE
#1192 by yang was merged May 4, 2024
Jaimemcc intel/ci composite cpu tests
#1205 by jaimemcc-intel was merged May 4, 2024
LR scheduler fix no longer breaks inference
#1060 by dashstander was merged Oct 17, 2023
Lion Optimizer
#1062 by andylolu2 was merged Oct 20, 2023
PR for Deepspeed Integration
#9 by trisongz was merged Dec 24, 2020
get rid of test file
#10 by sdtblck was merged Dec 26, 2020
make mask value smaller by factor of 2
#25 by lucidrains was merged Jan 4, 2021
test
#1 by lucidrains was merged Dec 22, 2020
Update base_model.json
#93 by srulikbd was merged Jan 26, 2021
Implement distributed training using Kubernetes
#77 by leogao2 was merged Jan 23, 2021
Batch size needs to be specified
#87 by joshlk was merged Jan 23, 2021
Add checkpoint saving / loading
#90 by sdtblck was merged Jan 28, 2021
Remove layer caching
#109 by joshlk was merged Feb 1, 2021