nuonuo/doc/exp04_real_embeddings.md
Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype
Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
2026-04-07 10:37:24 +01:00


# Experiment 4: End-to-End Test with Real Semantic Embeddings
## Model
sentence-transformers/all-MiniLM-L6-v2, embedding dim=384
## Key Results
### Embedding Space Analysis
- Cosine similarity between original cues and their paraphrases: mean=0.68, min=0.18, max=0.86
- Cosine similarity across unrelated pairs: mean=0.10
- Gap = 0.59, so the semantic space has reasonable separation
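The gap computation above can be sketched in numpy. This is a minimal illustration with synthetic unit vectors standing in for real MiniLM embeddings (the noise scale 0.055 is tuned so within-pair similarity lands near the observed ~0.68; with random vectors the cross-pair mean comes out near zero rather than 0.10, since synthetic data lacks the shared structure of real text):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 384, 20

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

cues = unit(rng.standard_normal((n_pairs, dim)))
# Paraphrases simulated as slightly perturbed copies of their cues.
paras = unit(cues + 0.055 * rng.standard_normal((n_pairs, dim)))

sims = cues @ paras.T                        # all unit vectors -> cosine sims
within = np.diag(sims)                       # cue_i vs paraphrase_i
cross = sims[~np.eye(n_pairs, dtype=bool)]   # cue_i vs paraphrase_j, i != j

gap = within.mean() - cross.mean()
print(f"within mean={within.mean():.2f}  cross mean={cross.mean():.2f}  gap={gap:.2f}")
```

The point of reporting the gap rather than absolute similarity is that retrieval only needs the within-pair score to clear the cross-pair score, which is why MiniLM-L6's discrimination gap matters more than its raw similarity values.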
### Exact-Cue Recall: 100% ✓
20 memory pairs, queried with their original cues; all were recalled correctly.
### Paraphrase Recall (20 pairs, no background memories)
| Config | Direct Recall | Coarse-to-Fine |
|--------|---------------|----------------|
| code=4096, k=20 | 85% | 90% |
| code=16384, k=50 | **95%** | 90% |
| code=16384, k=100 | 90% | 90% |
**k=50 is the best paraphrase configuration**, even outperforming coarse-to-fine.
### Multi-Hop: Perfect ✓✓✓
After fixing the unified projection, all 4 semantic chains × 3 hops recalled with CosSim=1.0.
Multiple chains sharing the same memory also recall perfectly.
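Both behaviors, chaining and memory sharing, can be sketched with outer-product Hebbian storage over random k-hot codes (an illustration only: dimensions are far smaller than the real configuration, and `wta_code` stands in for sep(embedding)):

```python
import numpy as np

rng = np.random.default_rng(1)
code_dim, k = 2048, 32   # illustrative; the real runs use 16384 / 50

def wta_code():
    """Random k-hot binary code standing in for sep(embedding)."""
    v = np.zeros(code_dim)
    v[rng.choice(code_dim, size=k, replace=False)] = 1.0
    return v

A, B, C, D = (wta_code() for _ in range(4))

# Hebbian outer-product storage: chains A->B->C and D->B share memory B.
W = np.outer(B, A) + np.outer(C, B) + np.outer(B, D)

def hop(code):
    """One associative hop: read out through W, then re-sparsify (WTA)."""
    out = np.zeros(code_dim)
    out[np.argsort(W @ code)[-k:]] = 1.0
    return out

cos = lambda x, y: float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos(hop(A), B))        # A -> B
print(cos(hop(hop(A)), C))   # A -> B -> C, chained through shared memory B
print(cos(hop(D), B))        # D -> B, second chain into the same memory
```

Because the k-hot codes are near-orthogonal, the signal term in each hop carries coefficient k while crosstalk terms carry only the small random code overlap, so the WTA readout recovers the next memory in the chain cleanly.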
### Paraphrase at Scale (the core problem)
| Background memories | Exact Recall | Paraphrase Recall |
|---------------------|-------------|-------------------|
| 0 | 5/5 | 5/5 |
| 100 | 3-4/5 | 1-2/5 |
| 500 | 1-3/5 | 0-1/5 |
| 1000 | 0-3/5 | 0-1/5 |
**Paraphrase recall degrades sharply as the number of stored memories grows.**
Root cause: Hebbian recall computes W @ sep(query) = Σ target_i · (sep(cue_i) · sep(query)).
With many stored memories, the query code partially overlaps many cue codes, so the readout becomes a noisy mixture.
This is not a capacity problem (exact recall is still 100% at 2000 memories) but a **signal-to-noise problem**.
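The failure mode can be reproduced in a small numpy sketch (illustrative sizes; `k_hot` stands in for sep(·)). An exact cue keeps the full k-unit signal coefficient, while a half-overlapping paraphrase code halves the signal but keeps all the background crosstalk:

```python
import numpy as np

rng = np.random.default_rng(2)
code_dim, k, n_mem, dim = 2048, 32, 500, 384   # illustrative sizes

def k_hot(idx):
    v = np.zeros(code_dim)
    v[idx] = 1.0
    return v

cue_idx = [rng.choice(code_dim, k, replace=False) for _ in range(n_mem)]
targets = rng.standard_normal((n_mem, dim))    # dense target embeddings

# Hebbian storage: W = sum_i outer(target_i, sep(cue_i))
W = np.zeros((dim, code_dim))
for idx, t in zip(cue_idx, targets):
    W[:, idx] += t[:, None]

cos = lambda x, y: float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

exact = k_hot(cue_idx[0])
# Paraphrase code: keeps half the cue's active units, rest are new.
para = k_hot(np.concatenate([cue_idx[0][:k // 2],
                             rng.choice(code_dim, k // 2, replace=False)]))

exact_sim = cos(W @ exact, targets[0])   # signal coefficient k
para_sim = cos(W @ para, targets[0])     # signal coefficient ~ k/2
print(f"exact: {exact_sim:.2f}  paraphrase: {para_sim:.2f}")
```

The crosstalk term grows with n_mem while the signal term is fixed at the code overlap, which is exactly why the table above shows paraphrase recall collapsing with background memories while exact recall survives.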
## Architecture Decision
### Final Recommended Architecture: Hybrid Memory
```
┌─────────────────────────────────────────────────┐
│ Query Embedding │
│ ↓ │
│ ┌───────────── Single-Hop ──────────────────┐ │
│ │ Key-Value Store (explicit cue→target) │ │
│ │ NN Lookup: cos_sim(query, stored_cues) │ │
│ │ → Top-K nearest cue embeddings │ │
│ │ → Return their associated targets │ │
│ └────────────────────────────────────────────┘ │
│ ↓ │
│ ┌───────────── Multi-Hop ───────────────────┐ │
│ │ Hebbian W matrix (unified projection) │ │
│ │ Start from NN-retrieved exact cue │ │
│ │ → Chain through W for 2+ hop associations │ │
│ └────────────────────────────────────────────┘ │
│ ↓ │
│ Retrieved memories │
└─────────────────────────────────────────────────┘
```
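A minimal end-to-end sketch of this hybrid follows, using a random projection plus top-K as a stand-in for the project's unified projection / WTA pattern separation (all names and sizes are illustrative, not the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, CODE_DIM, K = 384, 2048, 32   # illustrative sizes

PROJ = rng.standard_normal((CODE_DIM, DIM))

def sep(emb):
    """WTA pattern separation: random projection, keep the top-K units."""
    code = np.zeros(CODE_DIM)
    code[np.argsort(PROJ @ emb)[-K:]] = 1.0
    return code

class HybridMemory:
    def __init__(self):
        self.cues, self.targets = [], []          # explicit key-value store
        self.W = np.zeros((CODE_DIM, CODE_DIM))   # Hebbian association matrix

    def store(self, cue, target):
        self.cues.append(cue)
        self.targets.append(target)
        self.W += np.outer(sep(target), sep(cue))

    def recall(self, query, hops=1):
        # Stage 1: NN lookup over stored cues (tolerates noisy paraphrases).
        sims = [c @ query / (np.linalg.norm(c) * np.linalg.norm(query))
                for c in self.cues]
        code = sep(self.cues[int(np.argmax(sims))])
        # Stage 2: chain through W starting from the *exact* stored cue.
        for _ in range(hops):
            h = self.W @ code
            code = np.zeros(CODE_DIM)
            code[np.argsort(h)[-K:]] = 1.0
        # Decode: return the stored target whose code best overlaps the result.
        scores = [sep(t) @ code for t in self.targets]
        return self.targets[int(np.argmax(scores))]

# Chain A -> B -> C, with random vectors standing in for sentence embeddings.
A, B, C = (rng.standard_normal(DIM) for _ in range(3))
mem = HybridMemory()
mem.store(A, B)
mem.store(B, C)

noisy_A = A + 0.1 * rng.standard_normal(DIM)
one_hop = mem.recall(noisy_A, hops=1)   # paraphrased query -> B
two_hop = mem.recall(A, hops=2)         # chained recall    -> C
```

Note how the two failure modes are split: the noisy query only ever touches the cosine NN lookup, and the Hebbian chain always starts from a clean stored cue code, so W never sees query noise.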
### Why This Architecture Is Right
1. **Single-hop uses NN lookup**: noise-tolerant, so any paraphrase can still hit
2. **Multi-hop uses the Hebbian W**: the only mechanism here that supports A→B→C chained association
3. **No conflict**: once NN lookup finds the exact cue, that exact cue queries the W matrix, unaffected by query noise
4. **Where the SNN encoder fits**: optional; it would encode embeddings into spike trains as input to W
   - In the current experiments, WTA pattern separation directly in embedding space is sufficient
   - The SNN encoder's value lies in neuromorphic-hardware deployment
### Optimal Parameters
| Parameter | Recommended | Rationale |
|-----------|-------------|-----------|
| code_dim | 16384 | 20K+ capacity, ~1GB VRAM |
| k (WTA active units) | 50 | balances paraphrase tolerance against capacity |
| input_dim | 384-768 | depends on the embedding model |
| W precision | float32 | 1GB for 16384² |
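The VRAM figure in the last row can be sanity-checked directly: a dense 16384² float32 matrix is exactly 2³⁰ bytes.

```python
# W-matrix footprint implied by the table (float32, code_dim = 16384).
code_dim, bytes_per_float32 = 16384, 4
w_bytes = code_dim ** 2 * bytes_per_float32
print(w_bytes / 2**30, "GiB")   # 16384^2 * 4 bytes = 1.0 GiB exactly
```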