Repo to go with my YouTube video - Matrix Magic: Understanding Transformers with Matrices, Math, and Code.

nadaataiyab/matrixmagic-transformer-talk

Matrix Magic: Understanding Transformers with Matrices, Math, and Code.

Hello! I made this video to help fellow data scientists and developers better understand the Transformer architecture that underlies the GPT models and most modern large language models. I was struggling to understand it myself until I created a spreadsheet with a toy example matrix and worked through each matrix transformation one step at a time. It was tough going, but once I was done the concepts finally "clicked" for me. If you are on the same journey of understanding, I hope this helps you too!

https://youtu.be/aXLIebCK0pE
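
For readers who want to try the "one matrix transformation at a time" approach in code, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not the spreadsheet from the video: the toy input, the identity weight matrices, and the helper names (`matmul`, `transpose`, `softmax_rows`, `self_attention`) are all assumptions made up for illustration. The causal mask follows the GPT-style decoder setup, where each token may only attend to itself and earlier tokens.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row max before exponentiating
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(x, w_q, w_k, w_v, causal=True):
    """Return (attention weights, output) for one attention head."""
    # Step 1: project the input into queries, keys, and values.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)

    # Step 2: score every query against every key, scaled by sqrt(d_k).
    d_k = len(k[0])
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]

    # Step 3 (optional): hide future tokens by setting their scores to -inf.
    if causal:
        scores = [
            [s if j <= i else float("-inf") for j, s in enumerate(row)]
            for i, row in enumerate(scores)
        ]

    # Step 4: turn scores into weights, then take a weighted sum of the values.
    weights = softmax_rows(scores)
    return weights, matmul(weights, v)


# Toy input (made-up numbers): 3 tokens, embedding size 2.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections keep the arithmetic easy to check by hand.
identity = [[1.0, 0.0], [0.0, 1.0]]

weights, output = self_attention(x, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, and with the causal mask the entries above the diagonal are 0, so the first token attends only to itself. Printing the intermediate `q`, `k`, `v`, and `scores` matrices inside `self_attention` is an easy way to follow each step by hand.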

Resources:

Andrej Karpathy’s YouTube video:
  • Let’s build GPT: from scratch, in code, spelled out.

Andrej Karpathy’s Colab Notebook:
  • Building a GPT

The famous paper with the Transformer Architecture:
  • Attention is All You Need

GPT Papers:
  • Language Models are Few-Shot Learners (GPT-3)
  • Language Models are Unsupervised Multitask Learners (GPT-2)

YouTube video explaining Self-Attention:
  • Intuition Behind Self Attention in Transformer Networks

Helpful blog post with matrix diagrams:
  • Step-by-Step Illustrated Explanations for Transformer

Stanford lecture on word vectors:
  • Stanford CS224N: NLP with Deep Learning | Winter 2021 | Lecture 1 - Intro & Word Vectors
