nanochat-omni

Files

Andrej Karpathy 43c29dd9d5 Big DataLoader refactor: BOS-aligned dataloaders with epoch tracking for pre/mid-training

The new DataLoader ensures that every token sequence in the train/val batches begins with a
BOS token, so no sequence starts abruptly in the middle of a document, which could confuse
the model. Note that this changes the loss scale, because the train/val batches now contain
fewer confusing tokens. The main downside is that about 35% of tokens are now wasted due to
cropping. This is ok because we have a lot of data. See the dev/LOG.md entry for this change
for a lot more information.
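
As a rough illustration of the idea described in this commit message, here is a minimal sketch of BOS-aligned packing with cropping. It is not the repository's actual implementation: the function name `pack_bos_aligned`, its parameters, and the simple greedy fill strategy are all assumptions made for illustration. Each document gets a BOS token prepended, rows are filled document by document, and whatever part of a document overflows the current row is discarded so that the next row can start on a fresh BOS.

```python
from typing import Iterable, List, Sequence, Tuple


def pack_bos_aligned(
    docs: Iterable[Sequence[int]],
    seq_len: int,
    bos_id: int,
) -> Tuple[List[List[int]], int]:
    """Hypothetical helper: pack tokenized documents into fixed-length rows.

    Every returned row starts with `bos_id`. Returns the rows and the number
    of tokens that were dropped (cropped overflow plus the unfinished last row).
    """
    rows: List[List[int]] = []
    row: List[int] = []
    total_tokens = 0

    for doc in docs:
        tokens = [bos_id] + list(doc)  # each document begins with BOS
        total_tokens += len(tokens)
        space = seq_len - len(row)
        row.extend(tokens[:space])  # crop whatever does not fit in this row
        if len(row) == seq_len:
            rows.append(row)
            row = []  # the next row starts with the next document's BOS

    # A trailing partial row is dropped rather than padded (an assumption).
    wasted = total_tokens - len(rows) * seq_len
    return rows, wasted


docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10], [11]]
rows, wasted = pack_bos_aligned(docs, seq_len=4, bos_id=0)
print(rows)    # [[0, 1, 2, 3], [0, 4, 5, 0]]
print(wasted)  # 7
```

In this toy run 7 of the 15 tokens are dropped. The waste fraction depends entirely on the document length distribution and the packing strategy, so this example says nothing about the roughly 35% figure quoted above.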

2026-01-13 20:05:47 +00:00

estimate_gpt3_core.ipynb

add notebook on deriving the CORE estimates for the GPT-3 miniseries.

2026-01-05 18:40:28 +00:00

gen_synthetic_data.py

sane secrets management

2026-01-04 19:29:22 +00:00

generate_logo.html

initial commit

2025-10-13 06:49:24 -07:00

LOG.md

Big DataLoader refactor: BOS-aligned dataloaders with epoch tracking for pre/mid-training