# michailmelonas/gpt-2

A GPT-2 style decoder-only transformer language model. The implementation largely follows Andrej Karpathy's "Let's build GPT: from scratch, in code, spelled out" (https://www.youtube.com/watch?v=kCc8FmEb1nY&ab_channel=AndrejKarpathy), with a few tweaks.
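The defining feature of a decoder-only model like this is causal self-attention: each position may attend only to itself and earlier positions. As a rough illustration (a minimal single-head NumPy sketch, not the repository's actual code, which presumably uses PyTorch as in Karpathy's video):

```python
import numpy as np

def causal_self_attention(x, W_q, W_k, W_v):
    """Single-head causal self-attention over a (T, d_model) sequence.

    Each position attends only to itself and earlier positions,
    which is what makes the model a decoder (autoregressive).
    """
    T, _ = x.shape
    q, k, v = x @ W_q, x @ W_k, x @ W_v            # (T, d_head) each
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (T, T) scaled logits
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[future] = -np.inf                       # mask out future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v                             # (T, d_head)

rng = np.random.default_rng(0)
T, d_model, d_head = 4, 8, 8                       # toy sizes for illustration
x = rng.normal(size=(T, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = causal_self_attention(x, W_q, W_k, W_v)
print(out.shape)
```

Because of the mask, the output at position 0 depends only on the first input token; a full GPT-2 block would add multiple heads, an output projection, residual connections, layer norm, and an MLP around this core.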