# NuoNuo: Hippocampal Memory Module — Architecture v2
## Project Goals

Add a hippocampus-like long-term memory module to an LLM (e.g. Gemma 4):

- No traditional RAG (vector database + retrieval)
- Memories are stored in network weights (Hebbian) and explicit patterns (Hopfield)
- Paraphrase-tolerant fuzzy retrieval
- Multi-hop associative reasoning (A→B→C), which RAG cannot do
- Nightly consolidation and forgetting
## Core Architecture

```
┌─────────────────────────────────────────────────────────┐
│ Query Embedding (from Sentence Transformer)             │
│   ↓                                                     │
│ ┌──── Stage 1: NN Pre-filter ────────────────────────┐  │
│ │ cosine(query, stored_cues) → top-20 candidates     │  │
│ │ O(N) brute force, O(log N) with FAISS              │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│ ┌──── Stage 2: Hopfield Settle ──────────────────────┐  │
│ │ softmax(β · query @ candidates^T) → attention      │  │
│ │ Iterate 3 steps → converge to nearest attractor    │  │
│ │ Aggregate attention by memory_id (cue variants)    │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│ ┌──── Optional: Multi-hop Hebbian Chain ─────────────┐  │
│ │ Settled cue → WTA code → W @ code → next target    │  │
│ │ Repeat for N hops (A → B → C → ...)                │  │
│ └─────────────────────┬──────────────────────────────┘  │
│                       ↓                                 │
│              Retrieved memories                         │
└─────────────────────────────────────────────────────────┘
```
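The two retrieval stages can be sketched in a few lines of NumPy. This is an illustrative sketch, not the module's actual API: `hopfield_retrieve` is a hypothetical name, the per-`memory_id` attention aggregation is omitted, and the β / top-k / step values simply follow the recommended parameters below.

```python
import numpy as np

def hopfield_retrieve(query, cues, beta=16.0, top_k=20, steps=3):
    """Two-stage recall: cosine pre-filter, then modern-Hopfield settling.

    `cues` is an (N, dim) array of stored cue embeddings, rows L2-normalized.
    Returns the settled query, the candidate indices, and the final attention.
    """
    # Stage 1: NN pre-filter — cosine similarity, keep top-k candidates
    q = query / np.linalg.norm(query)
    sims = cues @ q
    cand_idx = np.argsort(sims)[-top_k:]
    cands = cues[cand_idx]                  # (top_k, dim)

    # Stage 2: iterate softmax attention → converge to nearest attractor
    xi = q
    for _ in range(steps):
        logits = beta * (cands @ xi)
        attn = np.exp(logits - logits.max())
        attn /= attn.sum()
        xi = attn @ cands                   # attention-weighted candidate mix
        xi /= np.linalg.norm(xi)            # stay on the unit sphere
    return xi, cand_idx, attn
```

With a moderately sharp β of 16, a noisy query settles onto its nearest stored cue within a few iterations, which is what makes the retrieval paraphrase-tolerant.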
## Biological Analogy

| Brain region | System component | Function |
|---|---|---|
| Entorhinal cortex (EC) | Sentence Transformer | Perceptual encoding |
| Dentate gyrus (DG) | WTA pattern separation | Sparsification / orthogonalization |
| CA3 | Hebbian W matrix | Associative storage + multi-hop |
| CA1 | Hopfield attention | Retrieval output |
| Sleep replay | W rebuild | Consolidation / forgetting |
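The DG and CA3 rows can be illustrated with a scaled-down NumPy sketch (dimensions reduced from the recommended 16384 / k=50; `wta`, `associate`, and `next_code` are hypothetical names, and the random projection is assumed fixed rather than learned): binary k-winners-take-all codes are linked by Hebbian outer products, and repeatedly applying W walks the chain A → B → C.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, CODE_DIM, K = 64, 1024, 16      # scaled down from 384 / 16384 / 50

# DG: fixed random projection used for pattern separation (an assumption)
proj = rng.normal(size=(EMBED_DIM, CODE_DIM))

def wta(emb):
    """Sparsify an embedding into a binary k-winners-take-all code."""
    act = emb @ proj
    code = np.zeros(CODE_DIM)
    code[np.argsort(act)[-K:]] = 1.0       # keep only the K most active units
    return code

# CA3: Hebbian weight matrix associating successive codes
W = np.zeros((CODE_DIM, CODE_DIM))

def associate(code_from, code_to):
    """Hebbian outer-product storage: co-active units wire together."""
    global W
    W = W + np.outer(code_to, code_from)

def next_code(code):
    """One associative hop: project through W, then re-sparsify."""
    act = W @ code
    out = np.zeros(CODE_DIM)
    out[np.argsort(act)[-K:]] = 1.0
    return out

# Store the chain A → B → C, then recall B and C starting from A alone
a, b, c = [wta(rng.normal(size=EMBED_DIM)) for _ in range(3)]
associate(a, b)
associate(b, c)
got_b = next_code(a)        # hop 1: A → B
got_c = next_code(got_b)    # hop 2: B → C
```

Because random K-sparse codes barely overlap, the correct successor's units dominate `W @ code` at each hop, which is why the chain survives many interleaved background associations.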
## Experimental Validation Summary

| Capability | Result | Experiment |
|---|---|---|
| Paraphrase recall (+ augmentation) | 95% | exp07e |
| Multi-hop (3 hops, 500 bg) | 100% (sim=1.0) | exp07b, 07c |
| Scale (20K memories) | 80% | exp07d |
| Exact cue recall | 100% | exp02c |
| Memory capacity | 20K+ | exp02d |
| Recall latency | 4 ms @ 20K | exp05, 07d |
| SNN encoder roundtrip | CosSim 0.99 | exp01b |
## Recommended Parameters

| Parameter | Value | Notes |
|---|---|---|
| embed_dim | 384-768 | Depends on the Sentence Transformer |
| code_dim | 16384 | Hebbian capacity 20K+ |
| k (WTA) | 50 | Balances noise tolerance vs. capacity |
| β (Hopfield) | 16.0 | Moderate sharpness |
| hopfield_top_k | 20 | Candidate set size; smaller is more stable |
| hopfield_steps | 3 | Convergence iterations |
| cue_variants | 3-5 per memory | LLM-generated paraphrases |
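One way to carry these defaults into code is a single config object; `HippocampusConfig` is a hypothetical name for illustration, not part of the module.

```python
from dataclasses import dataclass

@dataclass
class HippocampusConfig:
    embed_dim: int = 384        # or 768, depending on the Sentence Transformer
    code_dim: int = 16384       # Hebbian capacity 20K+
    k: int = 50                 # WTA winners: noise tolerance vs. capacity
    beta: float = 16.0          # Hopfield sharpness
    hopfield_top_k: int = 20    # candidate set size; smaller is more stable
    hopfield_steps: int = 3     # settling iterations
    cue_variants: int = 3       # LLM-generated paraphrases per memory (3-5)
```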
## VRAM Budget (RTX 4090, 24 GB)

| Component | Size |
|---|---|
| Hebbian W (16384²) | 1024 MB |
| WTA projection (384×16384) | 24 MB |
| Hopfield store (20K × 384 × 2) | ~60 MB |
| Sentence Transformer | ~90 MB |
| Gemma 4B (fp16) | ~8 GB |
| Total | ~9.2 GB |
| Headroom | ~14.8 GB |
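The memory-module entries follow directly from the tensor shapes, assuming fp32 (4 bytes) for W and the projection and fp32 cue + target vectors in the Hopfield store:

```python
# Sanity check of the table's memory-module rows (fp32 assumed throughout)
MB = 2**20
hebbian_w_mb = 16384**2 * 4 / MB           # → 1024.0 MB
wta_proj_mb  = 384 * 16384 * 4 / MB        # → 24.0 MB
hopfield_mb  = 20_000 * 384 * 2 * 4 / MB   # ≈ 58.6 MB, the table's "~60 MB"
```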
## Integration with Gemma

Recommended approach: Context Injection

```python
# 1. Embed the user input
query_emb = encoder.encode(user_input)

# 2. Recall memories (direct + multi-hop chain)
results = memory.recall(query_emb, top_k=3)
chain = memory.recall_chain(query_emb, hops=2)

# 3. Format and inject into the prompt
context = format_memories(results + chain)
prompt = f"[Recalled memories]\n{context}\n\n[User]\n{user_input}"

# 4. Generate the response
response = gemma.generate(prompt)

# 5. Store the new memory (with LLM-generated paraphrases as cue variants)
response_emb = encoder.encode(response)
paraphrases = gemma.generate(f"Generate 3 paraphrases of: {user_input}")
memory.store(query_emb, response_emb,
             cue_variants=[encoder.encode(p) for p in paraphrases])
```
## File Structure

```
src/nuonuo/
├── hippocampus.py       # Final module v2 (Hopfield + Hebbian hybrid)
├── encoder.py           # SNN spike encoder/decoder
├── memory.py            # STDP + Hebbian memory (historical)
├── consolidation.py     # Sleep consolidation (historical)
└── __init__.py

doc/
├── architecture.md      # This file
├── findings.md          # Core findings and counterintuitive conclusions
├── exp01_*.md           # SNN Encoder
├── exp02_*.md           # Associative Recall
├── exp03_*.md           # Consolidation
├── exp04_*.md           # Real Embeddings
├── exp05_*.md           # Benchmarks
├── exp06_*.md           # BioHash
└── exp07_*.md           # Hopfield (the breakthrough)
```