fam/nanochat-omni
a641b6ca966fdabe81d8c30f25b287f3de9039a3
nanochat-omni/scripts
Mathieu Lacage a641b6ca96 MMLU main split is named auxiliary_train, not train
2026-03-13 13:19:10 +01:00
File | Last commit | Date
base_eval.py | delete autocast, an unnecessary thorn in my side, manage dtypes directly | 2026-03-04 23:55:30 +00:00
base_train.py | All of these improvements were developed by Claude running autonomously over ~2 days using autoresearch. I didn't touch anything - incredible. All tuning was done on d12 but generalized easily to larger models (e.g. d24 in particular). This means we will also get a new "Time to GPT-2" Leaderboard entry, which I will push separately. | 2026-03-09 20:45:17 +00:00
chat_cli.py | delete autocast, an unnecessary thorn in my side, manage dtypes directly | 2026-03-04 23:55:30 +00:00
chat_eval.py | delete autocast, an unnecessary thorn in my side, manage dtypes directly | 2026-03-04 23:55:30 +00:00
chat_rl.py | delete autocast, an unnecessary thorn in my side, manage dtypes directly | 2026-03-04 23:55:30 +00:00
chat_sft.py | MMLU main split is named auxiliary_train, not train | 2026-03-13 13:19:10 +01:00
chat_web.py | delete autocast, an unnecessary thorn in my side, manage dtypes directly | 2026-03-04 23:55:30 +00:00
tok_eval.py | initial commit | 2025-10-13 06:49:24 -07:00
tok_train.py | quick fix to not OOM main speedrun script | 2026-01-26 22:31:42 +00:00