tori/doc/kb.md
Fam Zheng d9d3bc340c Add global knowledge base with RAG search
- KB module: fastembed (AllMiniLML6V2) for CPU embedding, SQLite for
  vector storage with brute-force cosine similarity search
- Chunking by ## headings, embeddings stored as BLOB in kb_chunks table
- API: GET/PUT /api/kb for full-text read/write with auto re-indexing
- Agent tools: kb_search (top-5 semantic search) and kb_read (full text)
  available in both planning and execution phases
- Frontend: Settings menu in sidebar footer, KB editor as independent
  view with markdown textarea and save button
- Also: extract shared db_err/ApiResult to api/mod.rs, add context
  management design doc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 08:15:50 +00:00

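# Knowledge Base (KB / RAG)
## Overview
A global knowledge base shared by the agents of all projects. The user edits it as markdown in the frontend; on save, the text is automatically chunked and indexed. Agents query it through the `kb_search` and `kb_read` tools.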
## Data flow
```
User edits the markdown textarea
→ PUT /api/kb
→ raw text stored in SQLite (kb_content table, single row)
→ split into chunks by ## heading
→ fastembed (AllMiniLML6V2) generates embeddings
→ chunk + embedding stored in SQLite (kb_chunks table)
```
## Chunking strategy
Split on markdown `##` headings; each section becomes one chunk. Any text before the first heading forms its own chunk.
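
The rule above can be sketched in a few lines of standard-library Rust. This is an illustrative sketch, not the repository's code: the `Chunk` struct and the `chunk_by_h2` and `flush` names are hypothetical, and it assumes a heading is any line beginning with `## ` (so `###` subheadings stay inside their parent section).

```rust
/// One chunk of the knowledge base (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    /// Heading text without the leading `## `; `None` for the preamble.
    pub heading: Option<String>,
    /// Body text of the section, trimmed of surrounding whitespace.
    pub body: String,
}

/// Split markdown into chunks at every line that starts with `## `.
/// Text before the first `##` heading becomes a heading-less chunk.
pub fn chunk_by_h2(markdown: &str) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut heading: Option<String> = None;
    let mut body = String::new();

    for line in markdown.lines() {
        if let Some(title) = line.strip_prefix("## ") {
            // A new section starts: emit whatever was collected so far.
            flush(&mut chunks, heading.take(), &body);
            body.clear();
            heading = Some(title.trim().to_string());
        } else {
            body.push_str(line);
            body.push('\n');
        }
    }
    flush(&mut chunks, heading, &body);
    chunks
}

/// Push a chunk unless it is completely empty (no heading and no body).
fn flush(chunks: &mut Vec<Chunk>, heading: Option<String>, body: &str) {
    let body = body.trim();
    if heading.is_some() || !body.is_empty() {
        chunks.push(Chunk { heading, body: body.to_string() });
    }
}
```

One limitation of this sketch: a line starting with `## ` inside a fenced code block would also be treated as a heading.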
## Agent tools
- `kb_search(query: str)` → vector search; returns the top-5 most relevant chunks
- `kb_read()` → returns the full KB text
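
A brute-force top-k search of the kind `kb_search` relies on could look roughly like the following. This is a minimal sketch under stated assumptions: the function names `cosine_similarity` and `top_k` are hypothetical, embeddings are assumed to be already loaded into memory as `Vec<f32>`, and the query is assumed to have been embedded with the same model as the chunks.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every stored chunk against the query and return the `k` best
/// as `(chunk_index, score)` pairs, highest score first.
pub fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, emb)| (i, cosine_similarity(query, emb)))
        .collect();
    // Sort descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Calling `top_k(&query, &chunks, 5)` would give the top-5 behaviour described above. The scan is linear in the number of chunks, which is usually acceptable for a single hand-edited document.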
## API
- `GET /api/kb` → returns the full KB text `{ content: string }`
- `PUT /api/kb` → saves the full text and rebuilds the chunk index `{ content: string }`
## Technology choices
- **Vector store**: SQLite (embeddings stored as BLOBs, brute-force cosine search)
- **Embedding**: fastembed-rs (AllMiniLML6V2, 384 dim, CPU)
- **Raw text storage**: SQLite (kb_content table)
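
Storing an embedding as a BLOB amounts to flattening the `f32` values into bytes and reversing that on read. The sketch below shows one way to do it; the little-endian byte order and the function names `embedding_to_blob` and `blob_to_embedding` are assumptions for illustration, since this document does not specify the on-disk layout.

```rust
/// Serialize an embedding as little-endian f32 bytes (4 bytes per dimension).
/// The byte order is an assumption, not taken from the repository.
pub fn embedding_to_blob(embedding: &[f32]) -> Vec<u8> {
    embedding.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Decode a BLOB back into f32 values.
/// Returns `None` if the length is not a multiple of 4 bytes.
pub fn blob_to_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}
```

With this layout, a 384-dimensional embedding occupies 384 × 4 = 1536 bytes per chunk.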
## Frontend
- Settings popup menu at the bottom of the sidebar → Knowledge Base
- Clicking it switches to a standalone KB editor view (a sibling of the project view)
- Large textarea + Save button