Status (2026-05-02): Implemented with named-store
registry. See ARCHITECTURE.md for the current design. The planning
decisions below were adopted; the main evolution beyond this document is
multi-store support (RagStoreEntry registry,
/rag new NAME, /rag switch NAME,
/rag drop NAME) so different knowledge domains (golang,
writing, research, etc.) can coexist as separate SQLite files while only
the active one is held open in memory.
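
As a rough illustration of the multi-store idea, the sketch below keeps a registry of named stores and tracks a single active one. Only the name `RagStoreEntry` and the three commands come from this document; the fields, the `Registry` type, its methods, and the `rag_NAME.db` file naming are assumptions made for the example, and ARCHITECTURE.md remains the reference for the real design.

```go
package main

import (
	"errors"
	"fmt"
)

// RagStoreEntry is a hypothetical registry record: a store name and the
// SQLite file assumed to back it.
type RagStoreEntry struct {
	Name   string
	DBPath string
}

// Registry (hypothetical) knows every store but marks at most one as active.
type Registry struct {
	entries map[string]RagStoreEntry
	active  string // name of the active store, "" if none
}

func NewRegistry() *Registry {
	return &Registry{entries: make(map[string]RagStoreEntry)}
}

// New registers a store, as "/rag new NAME" might, without opening it.
func (r *Registry) New(name string) error {
	if _, ok := r.entries[name]; ok {
		return fmt.Errorf("store %q already exists", name)
	}
	// The file naming scheme here is an assumption for illustration.
	r.entries[name] = RagStoreEntry{Name: name, DBPath: "rag_" + name + ".db"}
	return nil
}

// Switch marks a store as active, as "/rag switch NAME" might. A real
// implementation would close the previous database handle here.
func (r *Registry) Switch(name string) error {
	if _, ok := r.entries[name]; !ok {
		return errors.New("unknown store: " + name)
	}
	r.active = name
	return nil
}

// Drop forgets a store, as "/rag drop NAME" might, and clears the active
// marker if it pointed at the dropped store.
func (r *Registry) Drop(name string) error {
	if _, ok := r.entries[name]; !ok {
		return errors.New("unknown store: " + name)
	}
	delete(r.entries, name)
	if r.active == name {
		r.active = ""
	}
	return nil
}

func main() {
	reg := NewRegistry()
	_ = reg.New("golang")
	_ = reg.New("writing")
	_ = reg.Switch("writing")
	fmt.Println("active store:", reg.active)
}
```

The sketch only tracks names and paths; opening and closing the SQLite files is left out on purpose.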
Add Retrieval-Augmented Generation (RAG) support to Harvey while:
Knowledge Base (raw data)
↓
Embedding Model (e.g. nomic-embed-text)
↓
RAG Index (SQLite per embedding model)
↓
Generation Model (granite4, llama3, etc.)
Instead of one database per generation model:
    granite4.db
    llama3.db
Use one database per embedding model:
    rag_nomic_v1.db
    rag_mxbai_v1.db
Example:
    type ModelConfig struct {
        GenerationModel string
        EmbeddingModel  string
        RagDBPath       string
    }

    var ModelRegistry = map[string]ModelConfig{
        "granite4": {
            GenerationModel: "granite4",
            EmbeddingModel:  "nomic-embed-text",
            RagDBPath:       "rag_nomic_v1.db",
        },
        "llama3": {
            GenerationModel: "llama3",
            EmbeddingModel:  "nomic-embed-text",
            RagDBPath:       "rag_nomic_v1.db",
        },
    }

    harvey ingest --embedding-model nomic-embed-text
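
To show how the registry ties a generation model to a shared index, here is a minimal sketch of a lookup. `ModelConfig` and `ModelRegistry` are copied from the example above; `lookupConfig` is a hypothetical helper name, not something this plan defines.

```go
package main

import "fmt"

type ModelConfig struct {
	GenerationModel string
	EmbeddingModel  string
	RagDBPath       string
}

var ModelRegistry = map[string]ModelConfig{
	"granite4": {
		GenerationModel: "granite4",
		EmbeddingModel:  "nomic-embed-text",
		RagDBPath:       "rag_nomic_v1.db",
	},
	"llama3": {
		GenerationModel: "llama3",
		EmbeddingModel:  "nomic-embed-text",
		RagDBPath:       "rag_nomic_v1.db",
	},
}

// lookupConfig (hypothetical helper) resolves a generation model name to its
// configuration, or reports that no RAG setup is registered for it.
func lookupConfig(generationModel string) (ModelConfig, error) {
	cfg, ok := ModelRegistry[generationModel]
	if !ok {
		return ModelConfig{}, fmt.Errorf("no RAG config for model %q", generationModel)
	}
	return cfg, nil
}

func main() {
	// Both generation models resolve to the same embedding model and file.
	for _, name := range []string{"granite4", "llama3"} {
		cfg, err := lookupConfig(name)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s uses %s in %s\n", name, cfg.EmbeddingModel, cfg.RagDBPath)
	}
}
```

Because both entries name the same embedding model, they can point at the same database file, so documents are embedded and stored once.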
Strict runtime check:
    if embedder.Name() != r.embeddingModel {
        return errors.New("embedding model mismatch")
    }

Prevents:
(package harvey)

    harvey/
        rag_support.go
        rag_support_test.go

rag_support.go (Design Overview)

    type Embedder interface {
        Embed(text string) ([]float64, error)
        Name() string
    }

Allows:
    type RagStore struct {
        db             *sql.DB
        embeddingModel string
    }

    type Chunk struct {
        ID      int64
        Content string
    }

    CREATE TABLE IF NOT EXISTS chunks (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        embedding BLOB NOT NULL
    );

Future extensions:
    source_id TEXT,
    chunk_index INTEGER,
    tags TEXT

    func NewRagStore(dbPath, embeddingModel string) (*RagStore, error)

Uses:

    import _ "github.com/glebarez/go-sqlite"

Driver:

    sql.Open("sqlite", dbPath)

    func (r *RagStore) Ingest(texts []string, embedder Embedder) error

Steps:

    func (r *RagStore) Query(query string, embedder Embedder, topK int) ([]Chunk, error)

Steps:

    func cosineSimilarity(a, b []float64) float64

Binary format:
[int32 length][float64...]
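
The layout above could be encoded along the following lines. This is a sketch under two assumptions the plan does not state: values are written little-endian, and the decoder returns an error for malformed blobs (the plan lists `deserialize` without an error result).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// serialize writes a 4-byte length prefix followed by 8 bytes per value.
// Little-endian byte order is an assumption of this sketch.
func serialize(v []float64) []byte {
	buf := make([]byte, 4+8*len(v))
	binary.LittleEndian.PutUint32(buf[:4], uint32(int32(len(v))))
	for i, f := range v {
		binary.LittleEndian.PutUint64(buf[4+8*i:], math.Float64bits(f))
	}
	return buf
}

// deserialize reverses serialize and rejects blobs whose size does not match
// the length prefix.
func deserialize(b []byte) ([]float64, error) {
	if len(b) < 4 {
		return nil, errors.New("embedding blob too short")
	}
	n := int(int32(binary.LittleEndian.Uint32(b[:4])))
	if n < 0 || len(b) != 4+8*n {
		return nil, errors.New("embedding blob length mismatch")
	}
	out := make([]float64, n)
	for i := range out {
		out[i] = math.Float64frombits(binary.LittleEndian.Uint64(b[4+8*i:]))
	}
	return out, nil
}

func main() {
	blob := serialize([]float64{0.25, -1.5, 3})
	vec, err := deserialize(blob)
	fmt.Println(len(blob), vec, err)
}
```

Storing the length up front lets the reader detect a truncated BLOB before decoding any values.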
Functions:

    serialize([]float64) []byte
    deserialize([]byte) []float64

rag_support_test.go

    type mockEmbedder struct {
        name string
    }

Ensures:
Ensures:
User selects model (granite4)
↓
Lookup ModelConfig
↓
Get embedding model (nomic-embed-text)
↓
Load corresponding RAG DB
↓
Embed query
↓
Retrieve top-K chunks
↓
Inject into prompt
↓
Call Ollama (granite4)
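
The retrieval part of this flow can be sketched in memory, without SQLite or Ollama. The `Embedder` interface and the `cosineSimilarity` signature follow this plan; `toyEmbedder`, `storedChunk`, `retrieve`, and `buildPrompt` are hypothetical names invented for the example, and the toy embedding (counts of the letters a, b, c) merely stands in for a real model.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
	"strings"
)

type Embedder interface {
	Embed(text string) ([]float64, error)
	Name() string
}

// toyEmbedder is a stand-in for a real model: it counts the letters a, b, c.
type toyEmbedder struct{ name string }

func (e toyEmbedder) Name() string { return e.name }

func (e toyEmbedder) Embed(text string) ([]float64, error) {
	v := make([]float64, 3)
	for _, r := range strings.ToLower(text) {
		if r >= 'a' && r <= 'c' {
			v[r-'a']++
		}
	}
	return v, nil
}

// storedChunk pairs chunk text with its precomputed embedding.
type storedChunk struct {
	Content   string
	Embedding []float64
}

// cosineSimilarity returns 0 for mismatched lengths or zero vectors.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// retrieve applies the strict model check, embeds the query, and returns the
// topK chunk texts ranked by brute-force cosine similarity.
func retrieve(query string, e Embedder, storeModel string, chunks []storedChunk, topK int) ([]string, error) {
	if e.Name() != storeModel {
		return nil, errors.New("embedding model mismatch")
	}
	q, err := e.Embed(query)
	if err != nil {
		return nil, err
	}
	ranked := make([]storedChunk, len(chunks))
	copy(ranked, chunks)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosineSimilarity(q, ranked[i].Embedding) > cosineSimilarity(q, ranked[j].Embedding)
	})
	if topK < 0 {
		topK = 0
	}
	if topK > len(ranked) {
		topK = len(ranked)
	}
	out := make([]string, 0, topK)
	for _, c := range ranked[:topK] {
		out = append(out, c.Content)
	}
	return out, nil
}

// buildPrompt injects the retrieved chunks ahead of the user's question.
func buildPrompt(question string, context []string) string {
	return "Context:\n" + strings.Join(context, "\n") + "\n\nQuestion: " + question
}

func main() {
	e := toyEmbedder{name: "nomic-embed-text"}
	var chunks []storedChunk
	for _, t := range []string{"aaa", "bbb", "ccc"} {
		v, _ := e.Embed(t)
		chunks = append(chunks, storedChunk{Content: t, Embedding: v})
	}
	top, err := retrieve("aab", e, "nomic-embed-text", chunks, 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(buildPrompt("aab", top))
}
```

The final step, sending the assembled prompt to Ollama, is omitted here; the sketch stops at the prompt string.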
These will influence your next steps:
100k → may need indexing soon