Commit Graph

4 Commits

Author SHA1 Message Date
Matěj Kripner bbc57da7d5 slightly nicer error message 2025-12-09 12:46:48 +01:00
Matěj Kripner f1bf69d562 feat: pad vocab size to 64 for DDP optimizers and efficiency 2025-12-09 12:38:18 +01:00
Sermet Pekin 49cd02f283 fix: remove unnecessary tensor allocation in DistAdamW optimizer 2025-10-20 12:03:26 +03:00
karpathy 3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00