i am feeling in a delete mood today. i need to delete a lot of code. there is too much code and surface area and complexity. ew

Andrej Karpathy
2026-01-30 17:08:53 +00:00
parent d6c4f3b923
commit 02baa15405
2 changed files with 1 additions and 95 deletions
+1 -2
@@ -13,7 +13,7 @@ This repo is a full-stack implementation of an LLM like ChatGPT in a single, cle
## Talk to it
-To get a sense of the endpoint of this repo, you can currently find [nanochat d34](https://github.com/karpathy/nanochat/discussions/314) hosted on [nanochat.karpathy.ai](https://nanochat.karpathy.ai/). "d34" means that this model has 34 layers in the Transformer neural network. This model has 2.2 billion parameters, it was trained on 88 billion tokens by simply running the training script [run1000.sh](runs/run1000.sh) with `--target_param_data_ratio=40` (2x longer than Chinchilla-optimal), and the total cost of training was ~$2,500 (about 100 hours training time on 8XH100 GPU node). While today this is enough to outperform GPT-2 of 2019, it falls dramatically short of modern Large Language Models like GPT-5. When talking to these micro models, you'll see that they make a lot of mistakes, they are a little bit naive and silly and they hallucinate a ton, a bit like children. It's kind of amusing. But what makes nanochat unique is that it is fully yours - fully configurable, tweakable, hackable, and trained by you from start to end. To train and talk to your own, we turn to...
+To get a sense of the endpoint of this repo, you can currently find [nanochat d34](https://github.com/karpathy/nanochat/discussions/314) hosted on [nanochat.karpathy.ai](https://nanochat.karpathy.ai/). This model is now a few months old but it still gives a rough idea of the intelligence you can achieve for approximately $1000. While this model easily outperforms GPT-2 of 2019, it falls dramatically short of modern Large Language Models like GPT-5. When talking to these micro models, you'll see that they make a lot of mistakes, they are a little bit naive and silly and they hallucinate a ton, a bit like children. But what makes nanochat unique is that it is fully yours - fully configurable, tweakable, hackable, and trained by you from start to end. To train and talk to your own, we turn to...
## Quick start
@@ -152,7 +152,6 @@ python -m pytest tests/test_engine.py -v -s
├── pyproject.toml
├── runs
│ ├── miniseries.sh # Miniseries training script
-│ ├── run1000.sh # Train the ~$800 nanochat d32
│ ├── runcpu.sh # Small example of how to run on CPU/MPS
│ ├── scaling_laws.sh # Scaling laws experiments
│ └── speedrun.sh # Train the ~$100 nanochat d20
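
The README line removed in the first hunk quotes several figures for d34 that can be checked against each other. The sketch below only redoes that arithmetic in Python. The variable and function names are hypothetical, the Chinchilla ratio of 20 is an assumption inferred from the phrase "2x longer than Chinchilla-optimal", and the hourly rates are derived from the quoted totals rather than stated anywhere in the source.

```python
# Sanity check of the figures quoted in the removed README line for d34.
# All names here are illustrative; none come from the nanochat codebase.

PARAMS = 2.2e9                  # "2.2 billion parameters"
TARGET_PARAM_DATA_RATIO = 40    # value given for --target_param_data_ratio
ASSUMED_CHINCHILLA_RATIO = 20   # assumption: implied by "2x longer than Chinchilla-optimal"

TOTAL_COST_USD = 2500           # "~$2,500"
TRAINING_HOURS = 100            # "about 100 hours"
GPUS_PER_NODE = 8               # "8XH100 GPU node"


def training_tokens(params: float, ratio: float) -> float:
    """Tokens to train on, given a tokens-per-parameter ratio."""
    return params * ratio


def implied_hourly_rates(cost: float, hours: float, gpus: int) -> tuple[float, float]:
    """Return (node $/hour, per-GPU $/hour) implied by a total cost."""
    node_rate = cost / hours
    return node_rate, node_rate / gpus


tokens = training_tokens(PARAMS, TARGET_PARAM_DATA_RATIO)
node_rate, gpu_rate = implied_hourly_rates(TOTAL_COST_USD, TRAINING_HOURS, GPUS_PER_NODE)

print(f"tokens: {tokens / 1e9:.0f}B")
print(f"multiple of assumed Chinchilla ratio: {TARGET_PARAM_DATA_RATIO / ASSUMED_CHINCHILLA_RATIO:.0f}x")
print(f"implied node rate: ${node_rate:.2f}/hour, per GPU: ${gpu_rate:.3f}/hour")
```

Under these assumptions the quoted numbers are mutually consistent: 2.2 billion parameters at a ratio of 40 gives the 88 billion tokens stated in the text, and ~$2,500 over about 100 hours implies roughly $25 per node-hour.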