nuonuo/doc/exp01_encoder_roundtrip.md
Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype
Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
2026-04-07 10:37:24 +01:00


# Experiment 1: Encoder Roundtrip Test
## Goal
Verify how much information survives the embedding → spike train → embedding roundtrip encoding.
## Key Findings
### Conclusion: roundtrip encoding is entirely feasible, with CosSim reaching 0.99
Best configuration: **768-dim, 2048 neurons, 64 steps → CosSim 0.9898, MSE 0.000111**
### Detailed Results (200 epochs, AdamW + CosineAnnealing)
| Dim | Neurons | Steps | MSE | CosSim | Notes |
|-----|---------|-------|-----|--------|------|
| 768 | 2048 | 64 | 0.000111 | **0.9898** | ⭐ Best |
| 768 | 4096 | 64 | 0.000057 | 0.9873 | Lowest MSE, but slightly lower CosSim |
| 768 | 8192 | 64 | 0.000094 | 0.9773 | Too wide actually hurts |
| 768 | 4096 | 128 | 0.000711 | 0.9640 | Too many steps actually hurts! |
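The training recipe behind this table (AdamW + CosineAnnealingLR, MSE reconstruction loss, CosSim as the reported metric) can be sketched as follows. A plain bottleneck MLP stands in for the SNN encoder/decoder pair here, and all sizes are scaled down, so this only illustrates the optimizer/scheduler setup, not the spiking model itself:

```python
import torch

torch.manual_seed(0)
dim, hidden, epochs = 64, 256, 100          # scaled down from 768 / 2048 / 200

# Stand-in autoencoder; the real model is a spiking encoder/decoder pair.
model = torch.nn.Sequential(
    torch.nn.Linear(dim, hidden),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden, dim),
)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)

X = torch.randn(128, dim)                   # stand-in embedding batch
losses = []
for _ in range(epochs):
    recon = model(X)
    loss = torch.nn.functional.mse_loss(recon, X)   # reconstruction objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()                            # cosine-annealed learning rate
    losses.append(loss.item())

cos = torch.nn.functional.cosine_similarity(model(X), X, dim=-1).mean()
print(f"final MSE {losses[-1]:.4f}, CosSim {cos:.3f}")
```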
### Key Observations
1. **"Dead neuron" phase transition**: for the first ~60 epochs the firing rate is 0, i.e. the network emits no spikes at all. Then it suddenly starts firing and CosSim shoots up. The membrane potentials have to learn the right scale before they can cross the threshold, reminiscent of synaptic maturation in biological neural networks.
2. **Wider is not better**: 2048 neurons beats both 4096 and 8192. The narrower bottleneck forces a more efficient code, consistent with the classic autoencoder result.
3. **More steps is actively harmful**: 128 steps is much worse than 64 (0.964 vs 0.990). The LIF membrane potential decays exponentially, so spikes near the end of a long sequence are only weakly correlated with the initial embedding.
4. **Firing rate converges naturally to ~6%**: the target was 10%, but it settles at 5-7%, suggesting sparse coding is optimal here.
5. **Convergence speed**: at 50 epochs the 768-dim model only reaches ~0.89, but 200 epochs gets to 0.99. The CosineAnnealing scheduler helps a lot.
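Observation 3 follows directly from the leak arithmetic: with a per-step decay factor `tau` (0.9 here, purely illustrative), the initial input's contribution to the membrane potential after `T` steps scales like `tau ** T`:

```python
tau = 0.9                  # illustrative per-step leak factor
print(f"{tau ** 64:.2e}")  # ~1.2e-03: after 64 steps a trace of the input remains
print(f"{tau ** 128:.2e}") # ~1.4e-06: after 128 steps it is effectively gone
```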
### Guidance for Follow-up Experiments
- Use **768-dim, 2048 neurons, 64 steps** as the default configuration
- Train for at least 200 epochs
- The actual memory module does not need perfect reconstruction: a CosSim of 0.95 is already enough for associative recall
- The key bottleneck is not the encoder, but whether the downstream STDP memory layer can preserve the integrity of the spike patterns
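The defaults above can be captured in one place. The dataclass name and field names are illustrative; the values come from the recommendations in this experiment:

```python
from dataclasses import dataclass

@dataclass
class EncoderConfig:
    """Default roundtrip-encoder settings from Experiment 1 (names illustrative)."""
    dim: int = 768                    # embedding dimension
    n_neurons: int = 2048             # bottleneck width; wider performed worse
    steps: int = 64                   # 128 steps degraded CosSim (0.964 vs 0.990)
    epochs: int = 200                 # 50 epochs only reached ~0.89 CosSim
    target_firing_rate: float = 0.10  # observed to settle around 0.05-0.07

cfg = EncoderConfig()
print(cfg.dim, cfg.n_neurons, cfg.steps)
```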