Andrej Karpathy
|
e569b59f92
|
delete torchao dependency, create our own exact API-matched version of Float8Linear, document it very well. for some poorly understood reason, the new version is not only ~identical in performance but actually runs 3% faster, despite being significantly simpler and much less code. i don't fully understand why/how atm
|
2026-02-10 18:46:39 +00:00 |
|