nuonuo/doc/p0_llm_integration.md
Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype
Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
2026-04-07 10:37:24 +01:00


# P0: LLM Integration
## Status: basic pipeline working; LLM Gateway unreachable, needs follow-up verification
## Implementation
- `llm.py`: LLMClient plus extract/paraphrase/format functions
- Supports OpenAI-compatible APIs, with fallback to heuristics
- End-to-end pipeline: conversation → extraction → embed → store (with augmentation) → recall → context injection
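The fallback pattern described above can be sketched as follows. This is a hypothetical illustration, not the actual `llm.py`: the names `LLMClient.complete` and `paraphrase`, the endpoint path, and the heuristic template are assumptions based on the behavior this doc describes.

```python
# Hypothetical sketch of the LLM-with-heuristic-fallback pattern.
# Assumes an OpenAI-compatible /v1/chat/completions endpoint; when the
# gateway is unreachable (base_url=None), callers fall back to heuristics.
import json
import urllib.request


class LLMClient:
    def __init__(self, base_url=None, model="gpt-4o-mini", api_key=""):
        self.base_url = base_url  # None → gateway unavailable
        self.model = model
        self.api_key = api_key

    def complete(self, prompt):
        """Return the model's reply, or None when no gateway is configured."""
        if self.base_url is None:
            return None
        req = urllib.request.Request(
            f"{self.base_url}/v1/chat/completions",
            data=json.dumps({
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
            }).encode(),
            headers={"Content-Type": "application/json",
                     "Authorization": f"Bearer {self.api_key}"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]


def paraphrase(client, memory_text):
    """Generate cue variants for a memory; heuristic fallback without an LLM."""
    out = client.complete(f"Paraphrase as 3 short cues: {memory_text}")
    if out is not None:
        return [line.strip("- ") for line in out.splitlines() if line.strip()]
    # Heuristic fallback: mechanical template substitution ("issue with X"),
    # which is exactly the quality limitation noted in the open issues below.
    return [memory_text.lower(), f"issue with {memory_text.lower()}"]
```

Usage with the gateway down: `paraphrase(LLMClient(base_url=None), "Database missing indexes")` produces the two template cues rather than LLM-generated variants.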
## End-to-end test results
5 conversation rounds were stored as 7 memories and 24 cue entries (including paraphrase augmentation).
Query recall results (heuristic paraphrase):
| Query | Correct? | Notes |
|-------|----------|-------|
| DB performance terrible | ✅ | correctly recalled missing indexes |
| How to push a new release? | ✅ | correctly recalled blue-green deploy |
| Redis connection info? | ✅ | correctly recalled port 6379 |
| Login system has a problem | ❌ | pointed to database instead of auth |
| Database backup | ✅ | correctly recalled cron job |
| Deployment config? | ✅ | correctly recalled GitHub Actions |
5/6 correct. The failing case occurs because the heuristic paraphrase does not generate a "login" ↔ "auth" association; LLM paraphrase should cover it.
## Open issues
1. **LLM Gateway unreachable** — cannot verify the quality of LLM extraction and paraphrase
2. **Duplicate extraction** — the heuristic extracts 2 similar memories from the same conversation; deduplication is needed
3. **Poor heuristic paraphrase quality** — mechanical substitution ("issue with X") is inferior to LLM generation
4. **Semantic leaps like auth → login** — only LLM paraphrase or a stronger embedding model can resolve these
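Issue 2 (duplicate extraction) could be handled with embedding-level deduplication at store time, before augmentation. A minimal sketch; the function name and the 0.92 threshold are illustrative assumptions, not values from the codebase:

```python
# Hypothetical dedup sketch: skip storing a newly extracted memory whose
# embedding is near-identical to one already in the store.
import math


def cosine(a, b):
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def dedup_store(store, embedding, text, threshold=0.92):
    """Append (embedding, text) unless a near-duplicate already exists.

    Returns True if stored, False if dropped as a duplicate.
    The threshold would need tuning against real extraction pairs.
    """
    for emb, _ in store:
        if cosine(emb, embedding) >= threshold:
            return False  # near-duplicate of an existing memory, skip
    store.append((embedding, text))
    return True
```

This only catches embedding-level near-duplicates; the "login" ↔ "auth" gap (issue 4) sits below any reasonable threshold and still needs LLM paraphrase or a stronger embedding model.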