nanochat-omni/nanochat/fp8.py at d9678ff0f9c5d9967512adce23cb60ea0a5cd3f3


Alan d9678ff0f9 Save FP8 tensors in autograd ctx instead of full-precision inputs

Store quantized input/weight and their inverse scales in _Float8Matmul ctx to avoid re-quantization in backward and reduce saved-activation memory without changing numerics.

2026-02-15 14:31:54 +00:00

12 KiB
