Hacker News
The Weirdness of LLM Tokenization (twitter.com/karpathy)
2 points by tosh 1 day ago | past | discuss
Jagged Intelligence (twitter.com/karpathy)
2 points by mellosouls 1 day ago | past | discuss
Andrej Karpathy: "LLM model size competition is intensifying backwards (twitter.com/karpathy)
13 points by bilsbie 8 days ago | past | discuss
I am starting an AI+Education company (twitter.com/karpathy)
915 points by bilsbie 10 days ago | past | 540 comments
The if-then-else monster (twitter.com/karpathy)
4 points by tosh 16 days ago | past
[flagged] Andrej Karpathy on X: "100% Software 2.0 computer. Just a single neural net (twitter.com/karpathy)
25 points by bilsbie 26 days ago | past | 23 comments
One built-in UI/UX feature of LLM interfaces I'd love is proof (twitter.com/karpathy)
1 point by bilsbie 35 days ago | past
These 94 lines of code are everything that is needed to train a neural network (twitter.com/karpathy)
3 points by r_singh 35 days ago | past
[Andrej Karpathy] Let's reproduce GPT-2, in PyTorch from scratch (nanoGPT) (twitter.com/karpathy)
7 points by _giorgio_ 46 days ago | past
Let's reproduce GPT-2 (124M) (twitter.com/karpathy)
5 points by Multiset 47 days ago | past
FineWeb-Edu: High quality LLM dataset (twitter.com/karpathy)
3 points by tosh 54 days ago | past
I had ~30 direct reports and didn't do 1on1s at Tesla and imo it was great (twitter.com/karpathy)
1 point by tosh 56 days ago | past | 2 comments
CUDA/C++ origins of Deep Learning (twitter.com/karpathy)
4 points by tosh 84 days ago | past
llm.c: multi-GPU, bfloat16, flash attention, ~7% faster than PyTorch (twitter.com/karpathy)
121 points by tosh 84 days ago | past | 10 comments
LLMs must one day run in Space (twitter.com/karpathy)
6 points by tosh 84 days ago | past | 1 comment
llm.c Update (twitter.com/karpathy)
31 points by ibobev 3 months ago | past
Karpathy on Llama 3 (twitter.com/karpathy)
12 points by tosh 3 months ago | past
Consider Being a Labeler for an LLM (twitter.com/karpathy)
2 points by tosh 3 months ago | past
Scheduling Workloads to Run on Humans (twitter.com/karpathy)
2 points by tosh 3 months ago | past
llm.c is now down to 26.2ms/iteration, matching PyTorch (twitter.com/karpathy)
46 points by tosh 3 months ago | past | 8 comments
llm.c is only 2X slower than PyTorch (fp32, forward pass) (twitter.com/karpathy)
2 points by tosh 3 months ago | past
Andrej Karpathy explaining llm.c in layman terms (twitter.com/karpathy)
1 point by ibobev 3 months ago | past
Karpathy: Explaining llm.c in layman terms [tweet] (twitter.com/karpathy)
9 points by mellosouls 3 months ago | past
LLM training in simple, pure C/CUDA (twitter.com/karpathy)
4 points by theaniketmaurya 3 months ago | past | 1 comment
Automating Software Engineering (twitter.com/karpathy)
4 points by quick_brown_fox 4 months ago | past
Andrej Karpathy on automating software engineering (twitter.com/karpathy)
5 points by hubraumhugo 4 months ago | past
Love Letter to Obsidian (twitter.com/karpathy)
69 points by tosh 5 months ago | past | 40 comments
A Deepdive into the Gemma Tokenizer (twitter.com/karpathy)
2 points by tosh 5 months ago | past
A lot of weird behaviors and problems of LLMs trace back to tokenization (twitter.com/karpathy)
30 points by stared 5 months ago | past | 8 comments
Karpathy on Tokenizers (twitter.com/karpathy)
2 points by drcwpl 5 months ago | past | 1 comment