GPT-NeoX-20B Announcement
EricHallahan committed Feb 2, 2022
1 parent ba41f1f commit 6d2c67e
Showing 43 changed files with 354 additions and 131 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
 .vscode
-public*
+public*
+resources*
4 changes: 2 additions & 2 deletions config-blog.toml
@@ -15,11 +15,11 @@ googleAnalytics = "G-9KCLV8BK53"
 email="[email protected]"
 sameAs="https://github.com/EleutherAI"
 foundingDate="2020-07-02"
-logo="https://eleuther.ai/images/EAI_logo2.png"
+logo="https://eleuther.ai/images/promo.png"
 ShowAllPagesInArchive = true
 [params.label]
 icon = "/images/libre.svg"
-iconHeight = "45px"
+iconHeight = "48px"
 [params.assets]
 disableHLJS = true
 favicon="favicon.ico"
2 changes: 1 addition & 1 deletion config.toml
@@ -20,7 +20,7 @@ googleAnalytics = "G-9KCLV8BK53"
 email="[email protected]"
 sameAs=["https://github.com/EleutherAI","https://wandb.ai/eleutherai","https://huggingface.co/EleutherAI"]
 foundingDate="2020-07-02"
-logo="https://eleuther.ai/images/EAI_logo2.png"
+logo="https://eleuther.ai/images/promo.png"
 [params.homeInfoParams]
 Title="EleutherAI"
 Content="A grassroots collective of researchers working to open source AI research."
177 changes: 177 additions & 0 deletions content-blog/announcing-20B.md
@@ -0,0 +1,177 @@
---
title: "Announcing GPT-NeoX-20B"
date: 2022-02-02T11:00:00-05:00
draft: False
description: "Announcing GPT-NeoX-20B, a 20 billion parameter model trained in collaboration with CoreWeave."
author: ["Connor Leahy"]
contributors: ["EleutherAI"]
categories: ["Announcement"]
---

**GPT-NeoX-20B will be publicly downloadable from The Eye on the <date datetime="2022-02-09">9th of February</date>.**
In the meantime, you can already try out the model using CoreWeave's new inference service, <a href="https://goose.ai/" title="We're dead serious, that is actually what it is called.">GooseAI</a>!
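
GooseAI bills itself as a drop-in replacement for the OpenAI API. As a rough illustration only — the base URL, the engine ID `gpt-neo-20b`, and the request shape below are all assumptions, so check the GooseAI documentation before relying on them — a completion request could be assembled like this:

```python
import json
import urllib.request

# Assumed endpoint details, for illustration only; consult the GooseAI
# documentation for the real base URL and engine ID.
API_BASE = "https://api.goose.ai/v1"
ENGINE = "gpt-neo-20b"

def build_completion_request(api_key: str, prompt: str,
                             max_tokens: int = 40) -> urllib.request.Request:
    """Assemble (but do not send) an OpenAI-style completion request."""
    url = f"{API_BASE}/engines/{ENGINE}/completions"
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_completion_request("sk-...", "EleutherAI is")
print(req.full_url)
```

If the compatibility assumption holds, sending it with `urllib.request.urlopen(req)` should return the familiar OpenAI-shaped completion JSON.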

---

After a year-long odyssey through months of chip shortage-induced shipping delays, technical trials and tribulations, and aggressively boring debugging, we are happy to finally announce EleutherAI's latest open-source language model: GPT-NeoX-20B, a 20 billion parameter model trained using our [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) framework on GPUs generously provided by our friends at [CoreWeave](https://www.coreweave.com/).

GPT-NeoX-20B is, to our knowledge, the largest publicly accessible pretrained general-purpose autoregressive language model, and we expect it to perform well on many tasks.

We hope that the increased accessibility of models of this size will aid in [research towards the safe use of AI systems](https://blog.eleuther.ai/why-release-a-large-language-model/), and encourage anyone interested in working in this direction to reach out to us.

As a thank you to our generous compute donors, we are delaying the public downloadable release of the model by 7 days. On <date datetime="2022-02-09">February 9, 2022</date>, the full model weights will be downloadable for free under a permissive Apache 2.0 license from The Eye.

There will be a {{<discord/channel "#20b">}} channel set up in our Discord for discussions of this model. Please note that much like our other language models and codebases, GPT-NeoX and GPT-NeoX-20B are very much research artifacts and we *do not recommend deploying either in a production setting without careful consideration*. In particular, we strongly encourage those looking to use GPT-NeoX-20B to read the [paper](https://arxiv.org/abs/2101.00027) and [datasheet](https://arxiv.org/abs/2201.07311) on our training data. There are still bugs to be ironed out and many inefficiencies that could be addressed---but hey, we do this in our free time, give us a break lol

---

{{<figure caption="Accuracy on standard language modeling tasks.">}}

<table>
<thead>
<tr>
<th style="text-align: left;">Task</th>
<th style="text-align: left;">Category</th>
<th style="text-align: center;">Babbage</th>
<th style="text-align: center;">Curie</th>
<th style="text-align: center;">GPT-J-6B</th>
<th style="text-align: center;">GPT-NeoX-20B</th>
<th style="text-align: center;">DaVinci</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left;">LAMBADA</td>
<td style="text-align: left;">Sentence Completion</td>
<td style="text-align: right;">62.49%</td>
<td style="text-align: right;">69.51%</td>
<td style="text-align: right;">68.29%</td>
<td style="text-align: right;">71.98%</td>
<td style="text-align: right;">75.16%</td>
</tr>
<tr>
<td style="text-align: left;">ANLI R1</td>
<td style="text-align: left;">Natural Language Inference</td>
<td style="text-align: right;">32.40%</td>
<td style="text-align: right;">32.80%</td>
<td style="text-align: right;">32.40%</td>
<td style="text-align: right;">33.50%</td>
<td style="text-align: right;">36.30%</td>
</tr>
<tr>
<td style="text-align: left;">ANLI R2</td>
<td style="text-align: left;">Natural Language Inference</td>
<td style="text-align: right;">30.90%</td>
<td style="text-align: right;">33.50%</td>
<td style="text-align: right;">34.00%</td>
<td style="text-align: right;">34.40%</td>
<td style="text-align: right;">37.00%</td>
</tr>
<tr>
<td style="text-align: left;">ANLI R3</td>
<td style="text-align: left;">Natural Language Inference</td>
<td style="text-align: right;">33.75%</td>
<td style="text-align: right;">35.50%</td>
<td style="text-align: right;">35.50%</td>
<td style="text-align: right;">35.75%</td>
<td style="text-align: right;">36.83%</td>
</tr>
<tr>
<td style="text-align: left;">WSC</td>
<td style="text-align: left;">Coreference Resolution</td>
<td style="text-align: right;">40.38%</td>
<td style="text-align: right;">54.81%</td>
<td style="text-align: right;">36.53%</td>
<td style="text-align: right;">53.61%</td>
<td style="text-align: right;">63.46%</td>
</tr>
<tr>
<td style="text-align: left;">Winogrande</td>
<td style="text-align: left;">Coreference Resolution</td>
<td style="text-align: right;">59.51%</td>
<td style="text-align: right;">64.56%</td>
<td style="text-align: right;">64.01%</td>
<td style="text-align: right;">65.27%</td>
<td style="text-align: right;">69.93%</td>
</tr>
<tr>
<td style="text-align: left;">HellaSwag</td>
<td style="text-align: left;">Sentence Completion</td>
<td style="text-align: right;">54.54%</td>
<td style="text-align: right;">49.54%</td>
<td style="text-align: right;">49.54%</td>
<td style="text-align: right;">49.04%</td>
<td style="text-align: right;">59.18%</td>
</tr>
<tr>
<td style="text-align: left;">Total</td>
<td style="text-align: left;"></td>
<td style="text-align: right;">39.40%</td>
<td style="text-align: right;">42.57%</td>
<td style="text-align: right;">40.28%</td>
<td style="text-align: right;">43.31%</td>
<td style="text-align: right;">48.40%</td>
</tr>
</tbody>
</table>

{{</figure>}}
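
To make the comparison above easier to read at a glance, a small script (numbers transcribed directly from the table) can print per-task deltas — note that GPT-NeoX-20B improves on GPT-J-6B everywhere except HellaSwag, where it comes in slightly below:

```python
# Accuracy (%) transcribed from the table above:
# task: (GPT-J-6B, GPT-NeoX-20B, DaVinci)
scores = {
    "LAMBADA":    (68.29, 71.98, 75.16),
    "ANLI R1":    (32.40, 33.50, 36.30),
    "ANLI R2":    (34.00, 34.40, 37.00),
    "ANLI R3":    (35.50, 35.75, 36.83),
    "WSC":        (36.53, 53.61, 63.46),
    "Winogrande": (64.01, 65.27, 69.93),
    "HellaSwag":  (49.54, 49.04, 59.18),
}

for task, (gptj, neox, davinci) in scores.items():
    # Positive numbers mean GPT-NeoX-20B scores higher.
    print(f"{task:12s} {neox - gptj:+6.2f} vs GPT-J-6B, {neox - davinci:+6.2f} vs DaVinci")
```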

{{<figure caption="Accuracy of factual knowledge by subject group, as measured by the [HendrycksTest](https://arxiv.org/abs/2009.03300) evaluation.">}}

<table>
<thead>
<tr>
<th style="text-align: left;">Subject Group</th>
<th style="text-align: center;">Babbage</th>
<th style="text-align: center;">Curie</th>
<th style="text-align: center;">GPT-J-6B</th>
<th style="text-align: center;">GPT-NeoX-20B</th>
<th style="text-align: center;">DaVinci</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left;">Humanities</td>
<td style="text-align: right;">27.01%</td>
<td style="text-align: right;">26.48%</td>
<td style="text-align: right;">28.07%</td>
<td style="text-align: right;">28.70%</td>
<td style="text-align: right;">32.30%</td>
</tr>
<tr>
<td style="text-align: left;">Social Science</td>
<td style="text-align: right;">27.94%</td>
<td style="text-align: right;">29.24%</td>
<td style="text-align: right;">28.73%</td>
<td style="text-align: right;">31.63%</td>
<td style="text-align: right;">35.87%</td>
</tr>
<tr>
<td style="text-align: left;">STEM</td>
<td style="text-align: right;">25.83%</td>
<td style="text-align: right;">24.25%</td>
<td style="text-align: right;">25.71%</td>
<td style="text-align: right;">26.27%</td>
<td style="text-align: right;">28.60%</td>
</tr>
<tr>
<td style="text-align: left;">Other</td>
<td style="text-align: right;">26.86%</td>
<td style="text-align: right;">28.84%</td>
<td style="text-align: right;">27.95%</td>
<td style="text-align: right;">29.83%</td>
<td style="text-align: right;">36.85%</td>
</tr>
<tr>
<td style="text-align: left;">Total</td>
<td style="text-align: right;">26.78%</td>
<td style="text-align: right;">26.90%</td>
<td style="text-align: right;">27.38%</td>
<td style="text-align: right;">28.77%</td>
<td style="text-align: right;">32.86%</td>
</tr>
</tbody>
</table>

{{</figure>}}
8 changes: 2 additions & 6 deletions content/about.md
@@ -1,12 +1,8 @@
 ---
 title: "About Us"
 date: 2019-04-26T20:18:54+03:00
+lastMod: 2022-02-02T11:00:00-05:00
 layout: aboutpage
 hideMeta: True
 description:
 ---
 
-EleutherAI (/iˈluθər eɪ. aɪ/) is a decentralized grassroots collective of volunteer researchers, engineers, and developers focused on AI alignment, scaling, and open source AI research. Founded in July of 2020, our flagship project is the GPT-Neo family of models designed to replicate those developed by OpenAI as GPT-3. Our Discord server is open and welcomes contributors.
-
-
-
+EleutherAI (/iˈluθər eɪ. aɪ/) is a decentralized collective of volunteer researchers, engineers, and developers focused on AI alignment, scaling, and open source AI research. Founded in <date datetime="2020-07">July of 2020</date>, we are most well known for our ongoing efforts to build and open source large language models, but we also do open research in alignment, interpretability, BioML, ML art and many other fields. Our Discord server is open and welcomes contributors!
9 changes: 9 additions & 0 deletions content/announcements/2022-02-01.md
@@ -0,0 +1,9 @@
---
title: GPT-NeoX
date: 2022-02-02T11:00:00-05:00
link:
url: https://blog.eleuther.ai/announcing-20B/
text: Read the announcement
---

Announcing GPT-NeoX-20B, a 20 billion parameter model trained in collaboration with [CoreWeave](https://www.coreweave.com/).