Merge pull request EleutherAI#79 from EleutherAI/fm-cheatsheet
Add FM Dev Cheatsheet blogpost
Showing 1 changed file with 14 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
--- | ||
title: "The Foundation Model Development Cheatsheet" | ||
date: 2024-2-29 | ||
description: "Announcing a new resource, the FM Dev Cheatsheet." | ||
author: ["EleutherAI"] | ||
draft: false | ||
--- | ||
|
||
The pace of foundation model releases has continued to accelerate over the past few years, with many new models released by [organizations of all kinds worldwide](https://docs.google.com/spreadsheets/d/1gc6yse74XCwBx028HV_cvdxwXkmXejVjkO-Mz2uwE0k/edit?pli=1#gid=0). Beyond releasing the models themselves, it is also important to make the tools used to create them widely available, including [large-scale training libraries](https://github.com/EleutherAI/gpt-neox), [data processing and creation tooling](https://github.com/allenai/dolma), and more. In April 2023 we released the Pythia model suite, the first LLMs with a fully released and reproducible technical pipeline from start to finish. We are excited to see other organizations following suit: the [LLM360](https://www.llm360.ai/) project released Amber later that year, and AI2 has since released [OLMo](https://allenai.org/olmo), both as fully transparent artifact releases spanning the entire language model development process. Many other organizations have also released new tools for underserved aspects of the development pipeline. Without full-pipeline transparency, developers cannot be held accountable for undisclosed design decisions, and independent researchers and auditors are limited in their ability to draw robust conclusions or accurately assess harms. | ||
|
||
As a continuation of EleutherAI’s mission to lower [barriers to entry](https://arxiv.org/abs/2210.06413) to research and to provide mentorship and [educational](https://blog.eleuther.ai/transformer-math/) [resources](https://github.com/EleutherAI/cookbook) about large-scale AI model development, we have collaborated with researchers from MIT, AI2, Hugging Face, Stanford, Princeton, Masakhane, MLCommons, and more to release “The Foundation Model Development Cheatsheet”, a quick-start guide that familiarizes new developers with useful tools and resources for developing open models. The topics covered span the entire model development cycle, from data collection to licensing and release practices, and aim to provide a jumping-off point and high-level survey of the important steps in developing new models responsibly and successfully. We intend the Cheatsheet as a learning resource and reference that exposes newer developers not just to the technical aspects of model creation, which rightfully receive much attention already, but also to the crucially important good practices around responsible development and release management. | ||
|
||
We hope the Cheatsheet will be a useful entry point into responsible and well-documented model development, and that it will help raise awareness of these crucial issues. You can read the [paper](https://github.com/allenai/fm-cheatsheet/blob/main/app/resources/paper.pdf) for full details, or explore the collection of resources on the [interactive website](https://fmcheatsheet.org/). It is intended as a living resource: all are welcome to [submit new resources](https://github.com/allenai/fm-cheatsheet#add-to-cheatsheet) and be recognized for their contributions! | ||
|