# NuoNuo: Hippocampal Memory Module — Architecture v2
## Project Goals

Add a hippocampus-like long-term memory module to an LLM (e.g. Gemma 4):

- No traditional RAG (vector database + retrieval)
- Memories are stored in network weights (Hebbian) and explicit patterns (Hopfield)
- Paraphrase-tolerant fuzzy retrieval
- Multi-hop associative reasoning (A→B→C), which RAG cannot do
- Nightly consolidation and forgetting
## Core Architecture

```
┌─────────────────────────────────────────────────────────┐
│ Query Embedding (from Sentence Transformer)             │
│   ↓                                                     │
│ ┌──── Stage 1: NN Pre-filter ────────────────────────┐  │
│ │ cosine(query, stored_cues) → top-20 candidates     │  │
│ │ O(N) brute force, O(log N) with FAISS              │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│ ┌──── Stage 2: Hopfield Settle ──────────────────────┐  │
│ │ softmax(β · query @ candidates^T) → attention      │  │
│ │ Iterate 3 steps → converge to nearest attractor    │  │
│ │ Aggregate attention by memory_id (cue variants)    │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│ ┌──── Optional: Multi-hop Hebbian Chain ─────────────┐  │
│ │ Settled cue → WTA code → W @ code → next target    │  │
│ │ Repeat for N hops (A → B → C → ...)                │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│              Retrieved memories                         │
└─────────────────────────────────────────────────────────┘
```
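The two retrieval stages can be sketched in a few lines of NumPy. This is an illustrative sketch, not the module's actual API: `hopfield_retrieve` is a hypothetical name, the per-`memory_id` attention aggregation is omitted, and the β / top-k / step values simply follow the recommended parameters below.

```python
import numpy as np

def hopfield_retrieve(query, cues, beta=16.0, top_k=20, steps=3):
    """Two-stage recall: cosine pre-filter, then modern-Hopfield settling.

    `cues` is an (N, dim) array of stored cue embeddings, rows L2-normalized.
    Returns the settled query, the candidate indices, and the final attention.
    """
    # Stage 1: NN pre-filter — cosine similarity, keep top-k candidates
    q = query / np.linalg.norm(query)
    sims = cues @ q
    cand_idx = np.argsort(sims)[-top_k:]
    cands = cues[cand_idx]                  # (top_k, dim)

    # Stage 2: iterate softmax attention → converge to nearest attractor
    xi = q
    for _ in range(steps):
        logits = beta * (cands @ xi)
        attn = np.exp(logits - logits.max())
        attn /= attn.sum()
        xi = attn @ cands                   # attention-weighted candidate mix
        xi /= np.linalg.norm(xi)            # stay on the unit sphere
    return xi, cand_idx, attn
```

With a moderately sharp β of 16, a noisy query settles onto its nearest stored cue within a few iterations, which is what makes the retrieval paraphrase-tolerant.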
## Biological Analogy

| Brain region | System component | Function |
|---|---|---|
| Entorhinal cortex (EC) | Sentence Transformer | Perceptual encoding |
| Dentate gyrus (DG) | WTA pattern separation | Sparsification / orthogonalization |
| CA3 | Hebbian W matrix | Associative storage + multi-hop |
| CA1 | Hopfield attention | Retrieval output |
| Sleep replay | W rebuild | Consolidation / forgetting |
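The DG and CA3 rows can be illustrated with a scaled-down NumPy sketch (dimensions reduced from the recommended 16384 / k=50; `wta`, `associate`, and `next_code` are hypothetical names, and the random projection is assumed fixed rather than learned): binary k-winners-take-all codes are linked by Hebbian outer products, and repeatedly applying W walks the chain A → B → C.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, CODE_DIM, K = 64, 1024, 16      # scaled down from 384 / 16384 / 50

# DG: fixed random projection used for pattern separation (an assumption)
proj = rng.normal(size=(EMBED_DIM, CODE_DIM))

def wta(emb):
    """Sparsify an embedding into a binary k-winners-take-all code."""
    act = emb @ proj
    code = np.zeros(CODE_DIM)
    code[np.argsort(act)[-K:]] = 1.0       # keep only the K most active units
    return code

# CA3: Hebbian weight matrix associating successive codes
W = np.zeros((CODE_DIM, CODE_DIM))

def associate(code_from, code_to):
    """Hebbian outer-product storage: co-active units wire together."""
    global W
    W = W + np.outer(code_to, code_from)

def next_code(code):
    """One associative hop: project through W, then re-sparsify."""
    act = W @ code
    out = np.zeros(CODE_DIM)
    out[np.argsort(act)[-K:]] = 1.0
    return out

# Store the chain A → B → C, then recall B and C starting from A alone
a, b, c = [wta(rng.normal(size=EMBED_DIM)) for _ in range(3)]
associate(a, b)
associate(b, c)
got_b = next_code(a)        # hop 1: A → B
got_c = next_code(got_b)    # hop 2: B → C
```

Because random K-sparse codes barely overlap, the correct successor's units dominate `W @ code` at each hop, which is why the chain survives many interleaved background associations.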
## Experimental Validation Summary

| Capability | Result | Experiment |
|---|---|---|
| Paraphrase recall (+ augmentation) | 95% | exp07e |
| Multi-hop (3 hops, 500 bg) | 100% (sim=1.0) | exp07b, 07c |
| Scale (20K memories) | 80% | exp07d |
| Exact cue recall | 100% | exp02c |
| Memory capacity | 20K+ | exp02d |
| Recall latency | 4 ms @ 20K | exp05, 07d |
| SNN encoder roundtrip | CosSim 0.99 | exp01b |
## Recommended Parameters

| Parameter | Value | Notes |
|---|---|---|
| embed_dim | 384-768 | Depends on the Sentence Transformer |
| code_dim | 16384 | Hebbian capacity 20K+ |
| k (WTA) | 50 | Balances noise tolerance vs. capacity |
| β (Hopfield) | 16.0 | Moderate sharpness |
| hopfield_top_k | 20 | Candidate set size; smaller is more stable |
| hopfield_steps | 3 | Convergence iterations |
| cue_variants | 3-5 per memory | LLM-generated paraphrases |
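One way to carry these defaults into code is a single config object; `HippocampusConfig` is a hypothetical name for illustration, not part of the module.

```python
from dataclasses import dataclass

@dataclass
class HippocampusConfig:
    embed_dim: int = 384        # or 768, depending on the Sentence Transformer
    code_dim: int = 16384       # Hebbian capacity 20K+
    k: int = 50                 # WTA winners: noise tolerance vs. capacity
    beta: float = 16.0          # Hopfield sharpness
    hopfield_top_k: int = 20    # candidate set size; smaller is more stable
    hopfield_steps: int = 3     # settling iterations
    cue_variants: int = 3       # LLM-generated paraphrases per memory (3-5)
```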
## VRAM Budget (RTX 4090, 24 GB)

| Component | Size |
|---|---|
| Hebbian W (16384²) | 1024 MB |
| WTA projection (384×16384) | 24 MB |
| Hopfield store (20K × 384 × 2) | ~60 MB |
| Sentence Transformer | ~90 MB |
| Gemma 4B (fp16) | ~8 GB |
| Total | ~9.2 GB |
| Headroom | ~14.8 GB |
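The memory-module entries follow directly from the tensor shapes, assuming fp32 (4 bytes) for W and the projection and fp32 cue + target vectors in the Hopfield store:

```python
# Sanity check of the table's memory-module rows (fp32 assumed throughout)
MB = 2**20
hebbian_w_mb = 16384**2 * 4 / MB           # → 1024.0 MB
wta_proj_mb  = 384 * 16384 * 4 / MB        # → 24.0 MB
hopfield_mb  = 20_000 * 384 * 2 * 4 / MB   # ≈ 58.6 MB, the table's "~60 MB"
```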
## Integration with Gemma

Recommended approach: Context Injection

```python
# 1. Embed the user input
query_emb = encoder.encode(user_input)

# 2. Recall memories (direct + multi-hop chain)
results = memory.recall(query_emb, top_k=3)
chain = memory.recall_chain(query_emb, hops=2)

# 3. Format and inject into the prompt
context = format_memories(results + chain)
prompt = f"[Recalled memories]\n{context}\n\n[User]\n{user_input}"

# 4. Generate the response
response = gemma.generate(prompt)

# 5. Store the new memory (with LLM-generated paraphrases as cue variants)
response_emb = encoder.encode(response)
paraphrases = gemma.generate(f"Generate 3 paraphrases of: {user_input}")
memory.store(query_emb, response_emb,
             cue_variants=[encoder.encode(p) for p in paraphrases])
```
## File Structure

```
src/nuonuo/
├── hippocampus.py       # Final module v2 (Hopfield + Hebbian hybrid)
├── encoder.py           # SNN spike encoder/decoder
├── memory.py            # STDP + Hebbian memory (historical)
├── consolidation.py     # Sleep consolidation (historical)
└── __init__.py

doc/
├── architecture.md      # This file
├── findings.md          # Core findings and counterintuitive conclusions
├── exp01_*.md           # SNN Encoder
├── exp02_*.md           # Associative Recall
├── exp03_*.md           # Consolidation
├── exp04_*.md           # Real Embeddings
├── exp05_*.md           # Benchmarks
├── exp06_*.md           # BioHash
└── exp07_*.md           # Hopfield (the breakthrough)
```