Hopfield + Hebbian hybrid memory system for LLMs. Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:

- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4 ms latency @ 20K memories, ~1 GB VRAM

Key findings:

- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0); RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
# P6: Multi-turn conversation validation

## Scenario

Three days of conversation (DB troubleshooting → deployment → monitoring): 12 memories plus heuristic paraphrase augmentation.

## Cross-session recall: 12/12 (100%)
| Query | Cross-day? | Result |
|------|-------|------|
| DB is slow again | Day 1 | ✓ "missing index on created_at" |
| How big is the users table? | Day 1 | ✓ "2.3 million rows" |
| Who can access the database? | Day 1 | ✓ "Alice, Bob, Charlie" |
| What Postgres version? | Day 1 | ✓ "PostgreSQL 15.2" |
| How to deploy? | Day 2 | ✓ "blue-green via GitHub Actions" |
| How to rollback? | Day 2 | ✓ "switch load balancer" |
| Who approves deploys? | Day 2 | ✓ "Alice or David" |
| Monitoring dashboard? | Day 3 | ✓ "grafana.internal" |
| What alerts? | Day 3 | ✓ "PagerDuty" |
| DB slow, what index? | Cross | ✓ "created_at" |
| Deploy logs? | Cross | ✓ "Loki" |
| Database monitoring exporter | Cross | ✓ "pg_exporter" |

All 12 queries retrieve the correct memory with similarity=1.0. At this small scale (12 memories), Hopfield + augmentation recalls perfectly.
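The single-hop path is summarized in this repo as a two-stage Hopfield lookup (NN top-20 → softmax settle). A minimal sketch of that idea follows, to show why a settled state can land at similarity ≈ 1.0 with a stored memory. The function name `two_stage_retrieve`, the inverse temperature `beta=16.0`, and `steps=3` are illustrative assumptions, not the project's actual code or parameters.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_stage_retrieve(query, memories, top_k=20, beta=16.0, steps=3):
    """Hypothetical sketch: NN prefilter, then softmax settling.

    beta and steps are assumed values chosen for illustration only.
    Returns (index of best memory, cosine similarity of settled state).
    """
    if not memories:
        raise ValueError("memories must not be empty")
    patterns = [normalize(m) for m in memories]
    state = normalize(query)

    # Stage 1: keep the top_k nearest memories by cosine similarity.
    ranked = sorted(range(len(patterns)),
                    key=lambda i: dot(state, patterns[i]),
                    reverse=True)
    candidates = ranked[:top_k]

    # Stage 2: repeated softmax-weighted averaging over the candidates.
    for _ in range(steps):
        scores = [beta * dot(state, patterns[i]) for i in candidates]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        mixed = [sum(w * patterns[i][d] for w, i in zip(weights, candidates))
                 for d in range(len(state))]
        state = normalize(mixed)

    best = max(candidates, key=lambda i: dot(state, patterns[i]))
    return best, dot(state, patterns[best])
```

With a sharp softmax, the weights collapse onto the closest stored pattern after a few steps, so a noisy query such as `[0.9, 0.3, 0.1]` against three orthogonal memories settles almost exactly onto the first one.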
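## Multi-hop

"database is slow" → hop1: "missing index" → hop2: "missing index" → hop3: "2.3 million rows"

hop2 loops back to itself: in the Hebbian W matrix, the strongest association of "missing index" is still "missing index", because its own outer product contributes the most. Multi-hop needs **deduplication**: a memory that has already been visited must not take part in the next hop.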
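The loop and the proposed fix can be reproduced in a toy model. The sketch below assumes one-hot patterns, a self-association weight of 1.0, and a link weight of 0.5; these values, and the helper names `outer_add`, `matvec`, and `multi_hop`, are made up for illustration and are not the project's implementation.

```python
def outer_add(W, target, source, weight=1.0):
    """Hebbian update: W += weight * outer(target, source)."""
    for r, t in enumerate(target):
        for c, s in enumerate(source):
            W[r][c] += weight * t * s


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def multi_hop(W, patterns, start, hops, dedup=True):
    """Follow the strongest association for up to `hops` steps.

    With dedup=True, memories already in the chain are excluded
    from the candidates of the next hop.
    """
    chain = [start]
    visited = {start}
    current = start
    for _ in range(hops):
        recalled = matvec(W, patterns[current])
        scores = [sum(r * p for r, p in zip(recalled, pat)) for pat in patterns]
        candidates = [i for i in range(len(patterns))
                      if not (dedup and i in visited)]
        if not candidates:
            break
        nxt = max(candidates, key=lambda i: scores[i])
        if scores[nxt] <= 0:
            break  # no remaining association to follow
        chain.append(nxt)
        visited.add(nxt)
        current = nxt
    return chain


# Toy data (assumed): 0 = "database is slow", 1 = "missing index",
# 2 = "2.3 million rows", encoded as one-hot patterns.
patterns = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
W = [[0.0] * 3 for _ in range(3)]
for p in patterns:
    outer_add(W, p, p, weight=1.0)                     # self outer product
outer_add(W, patterns[1], patterns[0], weight=0.5)     # slow -> missing index
outer_add(W, patterns[2], patterns[1], weight=0.5)     # missing index -> rows
```

Starting from memory 1 ("missing index"), `multi_hop(W, patterns, start=1, hops=2, dedup=False)` stays at `[1, 1, 1]` because the self-association wins every time, while the default `dedup=True` yields `[1, 2]` and then stops once no unvisited association remains.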
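## Memory conflicts

Two conflicting values for the PostgreSQL version were stored (15.2 and 16.1):

- Top-1: "Upgraded to 16.1" (sim=1.0) ← the newer version ranks first
- Top-2: "version 15.2" (sim=0.0) ← the old version is also returned

The current behavior is acceptable (both are returned, with the newer one first). Better options:

- Detect an update to the same cue → automatically replace the old memory
- Or mark the old memory as "superseded"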
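The "superseded" option could look roughly like the sketch below. It assumes that "same cue" means an exact string match, which is a simplification; a real system would more likely compare cue embeddings against a threshold. The `Memory` and `MemoryStore` classes are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    cue: str
    value: str
    superseded: bool = False


@dataclass
class MemoryStore:
    memories: list = field(default_factory=list)

    def add(self, cue, value):
        """Store a memory; flag any active memory with the same cue as superseded."""
        for m in self.memories:
            # Assumption: exact cue match stands in for update detection.
            if m.cue == cue and not m.superseded:
                m.superseded = True
        self.memories.append(Memory(cue, value))

    def recall(self, cue, include_superseded=False):
        """Return matching values, newest first."""
        hits = [m for m in self.memories if m.cue == cue]
        if not include_superseded:
            hits = [m for m in hits if not m.superseded]
        return [m.value for m in reversed(hits)]


store = MemoryStore()
store.add("postgres version", "PostgreSQL 15.2")
store.add("postgres version", "Upgraded to 16.1")
```

Marking instead of deleting keeps the history available: `store.recall("postgres version")` returns only the current value, and `include_superseded=True` returns both, newest first.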
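## To improve

1. **Multi-hop deduplication**: exclude already-visited memories from the next hop's candidates
2. **Memory update detection**: a new value for the same cue automatically overwrites the old value
3. **Large-scale validation**: 12 memories is a small test; cross-session tests with 100+ memories are needed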