nanochat-omni/scripts at 43078c347efa2d8840272cb262d69a42a272379e

fam/nanochat-omni

Andrej Karpathy 0307997f9b merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both

2026-02-01 02:36:43 +00:00

base_eval.py

merge two files base_loss and base_eval into a single file, it's nicer this way, and unify the huggingface code associated with both

2026-02-01 02:36:43 +00:00

base_train.py

warmdown of 0.5 is slightly better:

2026-01-31 01:08:44 +00:00

chat_cli.py

nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtraining is not yet fully properly erased across the board, but good enough for step 1

2026-01-31 19:12:25 +00:00

chat_eval.py

nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtraining is not yet fully properly erased across the board, but good enough for step 1

2026-01-31 19:12:25 +00:00

chat_rl.py

Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help

2026-01-29 00:52:08 +00:00

chat_sft.py

nuke midtraining from orbit, it's not as needed now that we have a BOS-aligned dataloader. Also change the README a lot. midtraining is not yet fully properly erased across the board, but good enough for step 1

2026-01-31 19:12:25 +00:00

chat_web.py

feat: allow top_k=0 in web api to disable filtering (#458)

2026-01-30 09:21:41 -08:00

tok_eval.py

initial commit

2025-10-13 06:49:24 -07:00

tok_train.py

quick fix to not OOM main speedrun script

2026-01-26 22:31:42 +00:00