nuonuo/doc/exp04_real_embeddings.md
Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype
Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
2026-04-07 10:37:24 +01:00


# Experiment 4: End-to-End Test with Real Semantic Embeddings
## Model
sentence-transformers/all-MiniLM-L6-v2, embedding dim=384
## Key Results
### Embedding Space Analysis
- Cosine similarity between original cues and their paraphrases: mean=0.68, min=0.18, max=0.86
- Cosine similarity across unrelated pairs: mean=0.10
- Gap = 0.59, so the semantic space has reasonable separation
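The gap computation above can be sketched in numpy. This is a minimal illustration with synthetic unit vectors standing in for real MiniLM embeddings (the noise scale 0.055 is tuned so within-pair similarity lands near the observed ~0.68; with random vectors the cross-pair mean comes out near zero rather than 0.10, since synthetic data lacks the shared structure of real text):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 384, 20

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

cues = unit(rng.standard_normal((n_pairs, dim)))
# Paraphrases simulated as slightly perturbed copies of their cues.
paras = unit(cues + 0.055 * rng.standard_normal((n_pairs, dim)))

sims = cues @ paras.T                        # all unit vectors -> cosine sims
within = np.diag(sims)                       # cue_i vs paraphrase_i
cross = sims[~np.eye(n_pairs, dtype=bool)]   # cue_i vs paraphrase_j, i != j

gap = within.mean() - cross.mean()
print(f"within mean={within.mean():.2f}  cross mean={cross.mean():.2f}  gap={gap:.2f}")
```

The point of reporting the gap rather than absolute similarity is that retrieval only needs the within-pair score to clear the cross-pair score, which is why MiniLM-L6's discrimination gap matters more than its raw similarity values.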
### Exact-Cue Recall: 100% ✓
20 memory pairs, queried with their original cues; all were recalled correctly.
### Paraphrase Recall (20 pairs, no background memories)
| Config | Direct Recall | Coarse-to-Fine |
|--------|---------------|----------------|
| code=4096, k=20 | 85% | 90% |
| code=16384, k=50 | **95%** | 90% |
| code=16384, k=100 | 90% | 90% |
**k=50 is the best paraphrase configuration**, even outperforming coarse-to-fine.
### Multi-Hop: Perfect ✓✓✓
After fixing the unified projection, all 4 semantic chains × 3 hops recalled with CosSim=1.0.
Multiple chains sharing the same memory also recall perfectly.
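Both behaviors, chaining and memory sharing, can be sketched with outer-product Hebbian storage over random k-hot codes (an illustration only: dimensions are far smaller than the real configuration, and `wta_code` stands in for sep(embedding)):

```python
import numpy as np

rng = np.random.default_rng(1)
code_dim, k = 2048, 32   # illustrative; the real runs use 16384 / 50

def wta_code():
    """Random k-hot binary code standing in for sep(embedding)."""
    v = np.zeros(code_dim)
    v[rng.choice(code_dim, size=k, replace=False)] = 1.0
    return v

A, B, C, D = (wta_code() for _ in range(4))

# Hebbian outer-product storage: chains A->B->C and D->B share memory B.
W = np.outer(B, A) + np.outer(C, B) + np.outer(B, D)

def hop(code):
    """One associative hop: read out through W, then re-sparsify (WTA)."""
    out = np.zeros(code_dim)
    out[np.argsort(W @ code)[-k:]] = 1.0
    return out

cos = lambda x, y: float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos(hop(A), B))        # A -> B
print(cos(hop(hop(A)), C))   # A -> B -> C, chained through shared memory B
print(cos(hop(D), B))        # D -> B, second chain into the same memory
```

Because the k-hot codes are near-orthogonal, the signal term in each hop carries coefficient k while crosstalk terms carry only the small random code overlap, so the WTA readout recovers the next memory in the chain cleanly.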
### Paraphrase at Scale (the core problem)
| Background memories | Exact Recall | Paraphrase Recall |
|---------------------|-------------|-------------------|
| 0 | 5/5 | 5/5 |
| 100 | 3-4/5 | 1-2/5 |
| 500 | 1-3/5 | 0-1/5 |
| 1000 | 0-3/5 | 0-1/5 |
**Paraphrase recall degrades sharply as the number of stored memories grows.**
Root cause: Hebbian recall computes W @ sep(query) = Σ target_i · (sep(cue_i) · sep(query)).
With many stored memories, the query code partially overlaps many cue codes, so the readout becomes a noisy mixture.
This is not a capacity problem (exact recall is still 100% at 2000 memories) but a **signal-to-noise problem**.
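The failure mode can be reproduced in a small numpy sketch (illustrative sizes; `k_hot` stands in for sep(·)). An exact cue keeps the full k-unit signal coefficient, while a half-overlapping paraphrase code halves the signal but keeps all the background crosstalk:

```python
import numpy as np

rng = np.random.default_rng(2)
code_dim, k, n_mem, dim = 2048, 32, 500, 384   # illustrative sizes

def k_hot(idx):
    v = np.zeros(code_dim)
    v[idx] = 1.0
    return v

cue_idx = [rng.choice(code_dim, k, replace=False) for _ in range(n_mem)]
targets = rng.standard_normal((n_mem, dim))    # dense target embeddings

# Hebbian storage: W = sum_i outer(target_i, sep(cue_i))
W = np.zeros((dim, code_dim))
for idx, t in zip(cue_idx, targets):
    W[:, idx] += t[:, None]

cos = lambda x, y: float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

exact = k_hot(cue_idx[0])
# Paraphrase code: keeps half the cue's active units, rest are new.
para = k_hot(np.concatenate([cue_idx[0][:k // 2],
                             rng.choice(code_dim, k // 2, replace=False)]))

exact_sim = cos(W @ exact, targets[0])   # signal coefficient k
para_sim = cos(W @ para, targets[0])     # signal coefficient ~ k/2
print(f"exact: {exact_sim:.2f}  paraphrase: {para_sim:.2f}")
```

The crosstalk term grows with n_mem while the signal term is fixed at the code overlap, which is exactly why the table above shows paraphrase recall collapsing with background memories while exact recall survives.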
## Architecture Decision
### Final Recommended Architecture: Hybrid Memory
```
┌─────────────────────────────────────────────────┐
│ Query Embedding │
│ ↓ │
│ ┌───────────── Single-Hop ──────────────────┐ │
│ │ Key-Value Store (explicit cue→target) │ │
│ │ NN Lookup: cos_sim(query, stored_cues) │ │
│ │ → Top-K nearest cue embeddings │ │
│ │ → Return their associated targets │ │
│ └────────────────────────────────────────────┘ │
│ ↓ │
│ ┌───────────── Multi-Hop ───────────────────┐ │
│ │ Hebbian W matrix (unified projection) │ │
│ │ Start from NN-retrieved exact cue │ │
│ │ → Chain through W for 2+ hop associations │ │
│ └────────────────────────────────────────────┘ │
│ ↓ │
│ Retrieved memories │
└─────────────────────────────────────────────────┘
```
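A minimal end-to-end sketch of this hybrid follows, using a random projection plus top-K as a stand-in for the project's unified projection / WTA pattern separation (all names and sizes are illustrative, not the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, CODE_DIM, K = 384, 2048, 32   # illustrative sizes

PROJ = rng.standard_normal((CODE_DIM, DIM))

def sep(emb):
    """WTA pattern separation: random projection, keep the top-K units."""
    code = np.zeros(CODE_DIM)
    code[np.argsort(PROJ @ emb)[-K:]] = 1.0
    return code

class HybridMemory:
    def __init__(self):
        self.cues, self.targets = [], []          # explicit key-value store
        self.W = np.zeros((CODE_DIM, CODE_DIM))   # Hebbian association matrix

    def store(self, cue, target):
        self.cues.append(cue)
        self.targets.append(target)
        self.W += np.outer(sep(target), sep(cue))

    def recall(self, query, hops=1):
        # Stage 1: NN lookup over stored cues (tolerates noisy paraphrases).
        sims = [c @ query / (np.linalg.norm(c) * np.linalg.norm(query))
                for c in self.cues]
        code = sep(self.cues[int(np.argmax(sims))])
        # Stage 2: chain through W starting from the *exact* stored cue.
        for _ in range(hops):
            h = self.W @ code
            code = np.zeros(CODE_DIM)
            code[np.argsort(h)[-K:]] = 1.0
        # Decode: return the stored target whose code best overlaps the result.
        scores = [sep(t) @ code for t in self.targets]
        return self.targets[int(np.argmax(scores))]

# Chain A -> B -> C, with random vectors standing in for sentence embeddings.
A, B, C = (rng.standard_normal(DIM) for _ in range(3))
mem = HybridMemory()
mem.store(A, B)
mem.store(B, C)

noisy_A = A + 0.1 * rng.standard_normal(DIM)
one_hop = mem.recall(noisy_A, hops=1)   # paraphrased query -> B
two_hop = mem.recall(A, hops=2)         # chained recall    -> C
```

Note how the two failure modes are split: the noisy query only ever touches the cosine NN lookup, and the Hebbian chain always starts from a clean stored cue code, so W never sees query noise.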
### Why This Architecture Is Right
1. **Single-hop uses NN lookup**: noise-tolerant, so any paraphrase can still hit
2. **Multi-hop uses the Hebbian W**: the only mechanism here that supports A→B→C chained association
3. **No conflict**: once NN lookup finds the exact cue, that exact cue queries the W matrix, unaffected by query noise
4. **Where the SNN encoder fits**: optional; it would encode embeddings into spike trains as input to W
   - In the current experiments, WTA pattern separation directly in embedding space is sufficient
   - The SNN encoder's value lies in neuromorphic-hardware deployment
### Optimal Parameters
| Parameter | Recommended | Rationale |
|-----------|-------------|-----------|
| code_dim | 16384 | 20K+ capacity, ~1GB VRAM |
| k (WTA active units) | 50 | balances paraphrase tolerance against capacity |
| input_dim | 384-768 | depends on the embedding model |
| W precision | float32 | 1GB for 16384² |
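The VRAM figure in the last row can be sanity-checked directly: a dense 16384² float32 matrix is exactly 2³⁰ bytes.

```python
# W-matrix footprint implied by the table (float32, code_dim = 16384).
code_dim, bytes_per_float32 = 16384, 4
w_bytes = code_dim ** 2 * bytes_per_float32
print(w_bytes / 2**30, "GiB")   # 16384^2 * 4 bytes = 1.0 GiB exactly
```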