37 Commits

Author SHA1 Message Date
Andrej Karpathy 96522798f1 docs docs docs 2026-02-05 20:27:07 +00:00
Andrej Karpathy 5fdd5cdb24 new leaderboard record via new auto-calculated optimal batch size. for d26 it is 1M, up from 0.5M that was default earlier 2026-02-05 20:11:32 +00:00
Sofie Van Landeghem 012da1a78b Typo fixes (#480)
* small typo

* few more small fixes

* small fixes in leaderboard.md
2026-02-05 19:12:50 +01:00
Andrej Karpathy 75b302f331 fix hash commit on leaderboard and a paragraph clarification 2026-02-05 16:14:28 +00:00
Andrej Karpathy fe55b092b8 minor cosmetics for the table 2026-02-03 21:05:28 +00:00
Andrej Karpathy a67eba35dc add feb2 new leaderboard record from upgrading to fp8 training, +4.3% speedup to time to GPT-2 2026-02-03 21:03:42 +00:00
Andrej Karpathy 0307997f9b merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both 2026-02-01 02:36:43 +00:00
Andrej Karpathy 1ddaad1c1c nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtraining is not yet fully properly erased across the board, but good enough for step 1 2026-01-31 19:12:25 +00:00
Andrei Panferov 4d8dbaf6e0 Fix escape character in README bibtex entry (#454) 2026-01-30 09:34:02 -08:00
Andrej Karpathy 02baa15405 i am feeling in a delete mood today. i need to delete a lot of code. there is too much code and surface area and complexity. ew 2026-01-30 17:08:53 +00:00
Andrej Karpathy 41bb2eac32 Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help 2026-01-29 00:52:08 +00:00
Andrej Karpathy 63bb5831e2 something i've wanted to do for a while - move all .sh runs to their own directory so they don't pollute root dir 2026-01-18 15:27:41 +00:00
Andrej Karpathy 6460dc6382 tweaks to readme a bit 2026-01-17 02:28:31 +00:00
Sofie Van Landeghem d4ea28d4e2 Fix args in readme (#438)
* fix commands in readme, using new arg format

* fix typo

* add required -i flag to chat_eval example runs
2026-01-15 16:26:38 -08:00
Andrej Karpathy 4cc605b940 quick pointer to miniseries post in readme for now 2026-01-07 22:14:21 +00:00
Andrej Karpathy eb7bbc1b66 delete the configurator in favor of argparse and clean up a lot of kwarg details to make them more consistent across all scripts 2026-01-04 19:14:23 +00:00
Andrej Karpathy da8b7ea4cb also delete the rustbpe test code, this now lives in rustbpe repo that is separate 2026-01-04 01:23:34 +00:00
Andrej Karpathy aa42f40e66 delete the inline rustbpe project. it was ugly to have a project within project and rustbpe is now nicely a separate repo on my github karpathy/rustbpe and it's on pypi etc., so we just add it as a dependency to uv. i think it is appropriate that this is a separate repo because 1) it doesn't have too many knobs, other than the ones that are exposed - the regex pattern and vocab size and 2) all of its complexity is not algorithmic (it's equivalent to minbpe), instead it is efficiency-related, so it is ok to hide relatively speaking 2026-01-03 23:55:28 +00:00
Hossein-Lakzaei 8c89661465 Update README to match current d34 demo (#314) (#381)
* Update README: switch hosted model description from d32 to d34 per discussion #314

* link to discussion thread

* parameter in quotes

---------

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2025-12-30 10:17:11 +01:00
Andrej 4763ce612a Small fixes to typos 2025-11-14 07:25:59 -08:00
svlandeg e5efb4b471 add test_engine.py to file structure 2025-11-14 11:13:42 +01:00
Andrej Karpathy 9a71d13688 typo oops 2025-11-13 16:08:30 +00:00
Andrej Karpathy 7b7fd0fe71 thank you Sophie for your help with nanochat 2025-11-13 16:07:54 +00:00
Andrej Karpathy f15732524a make deepwiki link better 2025-11-01 14:13:29 +00:00
svlandeg 0a3ce7b0ff typo fixes in readme 2025-10-28 20:11:00 +01:00
Andrej Karpathy 9415931f85 delete czar call for help, i'm working through the inbound on that now. add current LLM policy which just asks for disclosure atm 2025-10-28 15:17:43 +00:00
Andrej Karpathy c75fe54aa7 readme tweak, link to new discussion and add file structure 2025-10-25 19:39:16 +00:00
Andrej Karpathy 5eeb2b6ef9 experiment: looking to 'hire' a nanochat repo czar to help the repo, mentioning in readme 2025-10-22 16:55:54 +00:00
Andrej Karpathy 50bea28ef9 also add readme mention of the cpu mps changes 2025-10-21 17:24:48 +00:00
Andrej c9ea7a91e2 Add customization instructions to README
Added a section on customization for nanochat.
2025-10-21 08:57:10 -07:00
Andrej Karpathy d6d86cbf4c update readme with a link to the CPU|MPS branch 2025-10-16 22:03:39 +00:00
Andrej Karpathy ccfe7915ac mention the current d32 chat hosted on nanochat.karpathy.ai, as an example endpoint of the repo 2025-10-16 19:32:44 +00:00
Enes Poyraz 6a795baf27 Update README.md
fix typos
2025-10-13 18:40:12 +02:00
Andrej 626bd3e260 Add image of the WebUI to readme 2025-10-13 08:03:00 -07:00
karpathy da96b46565 update link to the new discussion 2025-10-13 07:42:09 -07:00
karpathy a53833d04f add nanochat logo png 2025-10-13 06:59:59 -07:00
karpathy 3a5e0bc50b initial commit 2025-10-13 06:49:24 -07:00