Luke Stanley
|
32571664b1
|
Fix Torch crash caused by pinning on CPU
|
2025-10-22 16:25:36 +00:00 |
|
ulanch
|
796f84527f
|
fix(ui): prevent iOS Safari toolbar from covering input on initial load
|
2025-10-21 17:34:40 -07:00 |
|
Andrej
|
2e938530ce
|
delete spurious torch.empty allocation in adamw
fix: remove unnecessary tensor allocation in DistAdamW optimizer
|
2025-10-21 11:35:17 -07:00 |
|
Andrej Karpathy
|
a088b7a6ec
|
use enable_gqa of pytorch sdpa, allows us to delete some code, didnt realize it's available
|
2025-10-21 18:07:33 +00:00 |
|
Andrej Karpathy
|
5bdc99abfb
|
merge and resolve conflict
|
2025-10-21 17:19:10 +00:00 |
|
Andrej Karpathy
|
dfcb1c16f1
|
Merge branch 'master' into cpu-mps-dev
|
2025-10-21 17:15:53 +00:00 |
|
Andrej Karpathy
|
bb71c64579
|
fix silly issue in dataloader, this version is much faster and more portable to mps too
|
2025-10-21 17:12:50 +00:00 |
|
karpathy
|
2e9669e03a
|
upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming
|
2025-10-20 10:15:17 -07:00 |
|
Sermet Pekin
|
49cd02f283
|
fix: remove unnecessary tensor allocation in DistAdamW optimizer
fix: remove unnecessary tensor allocation in DistAdamW optimizer
|
2025-10-20 12:03:26 +03:00 |
|
Andrej
|
cf2baf9933
|
fix typo
Co-authored-by: Tancrède Lepoint <tlepoint@users.noreply.github.com>
|
2025-10-17 08:35:41 -07:00 |
|
karpathy
|
df600b6ed5
|
many small tweaks. base, eval, core work now i think
|
2025-10-16 15:46:18 -07:00 |
|
karpathy
|
786119d593
|
add autodetect of device and related stuff. getting weird warnings/errors still, so wip
|
2025-10-16 10:26:19 -07:00 |
|
karpathy
|
306bc380ab
|
add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights
|
2025-10-16 10:04:43 -07:00 |
|
Andrej Karpathy
|
722da4f543
|
trying to add basic cpu support, will try mps too
|
2025-10-16 16:14:38 +00:00 |
|
Andrej Karpathy
|
4346536ab2
|
also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate
|
2025-10-16 01:28:37 +00:00 |
|
Andrej Karpathy
|
2846999b8f
|
allow user to click on their message to edit them. conversation after that point is wiped
|
2025-10-16 01:16:22 +00:00 |
|
Andrej Karpathy
|
92d52ecc92
|
add slash commands to webui
|
2025-10-16 01:09:53 +00:00 |
|
Andrej Karpathy
|
01fb290f53
|
allow multiple GPUs to do inference in a data parallel way
|
2025-10-15 19:12:19 +00:00 |
|
Mirza-Samad-Ahmed-Baig
|
afaa5b4c90
|
Fix: Handle missing d<number> model tags in find_largest_model
|
2025-10-14 00:24:07 +03:00 |
|
karpathy
|
3a5e0bc50b
|
initial commit
|
2025-10-13 06:49:24 -07:00 |
|