Commit Graph

5 Commits

Author SHA1 Message Date
Andrej Karpathy 4ddc803797 fix adamw slight bug. this chunk was copy pasted originally from modded-nanogpt, which still seems to have the bug 2026-01-08 18:18:42 +00:00
Matěj Kripner bbc57da7d5 slightly nicer error message 2025-12-09 12:46:48 +01:00
Matěj Kripner f1bf69d562 feat: pad vocab size to 64 for DDP optimizers and efficiency 2025-12-09 12:38:18 +01:00
Sermet Pekin 49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer 2025-10-20 12:03:26 +03:00
karpathy 3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00