Harvey — Architecture & UX Decision Log

This file records significant architectural and UX decisions, their rationale, and known trade-offs. New decisions are added at the top. Each entry names the decision, the context that prompted it, the chosen approach, the rejected alternatives, and the consequences.


2026-06-19 — Tab completion: two-layer design with shared SelectFrom helper

Context. Harvey’s buildCompleter() only completes top-level command names, @route references, Ollama model names, and file paths. Users must remember subcommand names by heart and must know exact RAG store/model names to use the use and drop subcommands. Several commands already show numbered pickers when no name is given, but each reimplements the pattern differently. See tab-completion-design.md.

Decision. Extend completion in two layers: (1) second-token subcommand names using a new Subcommands []string field on Command; (2) third-token argument values using a new ArgCompletion map[string]func(*Agent) []string field that maps each subcommand to a candidate-list function. Additionally, introduce a shared SelectFrom / SelectItem / SelectFromStrings API in a new ui.go file. Commands whose first positional argument comes from a finite, enumerable list (/rag use, /memory show, /llamafile use, etc.) display the picker when no argument is given. ui.go lives in the harvey package; promotion to termlib is deferred until a clean generalisation is proven.
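A minimal sketch of how the two layers could resolve candidates. The Subcommands and ArgCompletion fields come from the decision above; the Agent.RagStores field, the complete function, and demoCommands are hypothetical names used only for illustration, not Harvey's actual API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Agent stands in for Harvey's agent type; RagStores is a hypothetical field.
type Agent struct {
	RagStores []string
}

// Command carries the two completion fields named in the decision.
type Command struct {
	Subcommands   []string
	ArgCompletion map[string]func(*Agent) []string
}

// complete returns the candidates for the token under the cursor.
// A trailing space in line means the cursor sits on a new, empty token.
func complete(cmds map[string]Command, a *Agent, line string) []string {
	tokens := strings.Fields(line)
	if line == "" || strings.HasSuffix(line, " ") {
		tokens = append(tokens, "")
	}
	prefix := tokens[len(tokens)-1]

	var pool []string
	switch len(tokens) {
	case 1: // first token: top-level command names
		for name := range cmds {
			pool = append(pool, name)
		}
	case 2: // layer 1: subcommand names
		pool = cmds[tokens[0]].Subcommands
	case 3: // layer 2: argument values for the chosen subcommand
		if fn := cmds[tokens[0]].ArgCompletion[tokens[1]]; fn != nil {
			pool = fn(a)
		}
	}

	var out []string
	for _, c := range pool {
		if strings.HasPrefix(c, prefix) {
			out = append(out, c)
		}
	}
	sort.Strings(out)
	return out
}

// demoCommands builds a small, made-up command table for the example.
func demoCommands() map[string]Command {
	stores := func(a *Agent) []string { return a.RagStores }
	return map[string]Command{
		"/rag": {
			Subcommands:   []string{"use", "drop", "ingest", "on"},
			ArgCompletion: map[string]func(*Agent) []string{"use": stores, "drop": stores},
		},
		"/help": {},
	}
}

func main() {
	a := &Agent{RagStores: []string{"go", "deno_typescript"}}
	cmds := demoCommands()
	fmt.Println(complete(cmds, a, "/rag "))
	fmt.Println(complete(cmds, a, "/rag use "))
}
```

Keeping the candidate functions behind func(*Agent) []string means the same function could feed both tab completion and a picker, which is presumably where a shared SelectFrom helper would fit.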

Rejected alternatives.

Consequences.


2026-06-18 — MinIO replaced with aws-sdk-go-v2 S3 client

Context. remote_s3.go uses github.com/minio/minio-go/v7 as the S3 protocol client. MinIO’s Go client has moved to a closed-source license, making it unsuitable for Harvey’s AGPL-3.0 codebase. The affected surface is small: Stat, Get, and List operations on S3-compatible stores (AWS S3, MinIO server, Cloudflare R2). See s3-replacement-design.md.

Decision. Replace the MinIO client with github.com/aws/aws-sdk-go-v2 (Apache-2.0 licensed). The AWS SDK v2 supports all S3-compatible endpoints via the BaseEndpoint override option. The call sites in remote_s3.go map cleanly: StatObject → HeadObject, GetObject → GetObject, ListObjects → ListObjectsV2. Credentials continue to come from environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and the SDK’s default credential chain.

Rejected alternatives.

Consequences.


2026-06-18 — Spinner gains dynamic status message channel

Context. Harvey’s spinner currently shows rotating Edward Lear quotes and a timer while waiting for the LLM. Users have no way to tell whether Harvey is embedding a query, calling a tool, waiting for Ollama, or doing something else. Claude Code and similar tools display live status messages that update as work progresses. See spinner-ux-design.md.

Decision. Add a StatusCh chan string field to the Spinner struct and an UpdateStatus(msg string) method. The spinner’s message line shows the most-recent status update instead of the next Lear quote whenever a message is pending; Lear quotes resume when no status is pending. The caller sends non-blocking updates via UpdateStatus; the spinner goroutine reads them on the fast tick. This preserves the existing Lear personality while surfacing actionable progress at key moments: tool call start/end, RAG embedding, context injection, model switching. Tab completion is out of scope for this work item; it is a separate, larger effort.
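A minimal sketch of the non-blocking update path. StatusCh and UpdateStatus are named in the decision; the one-slot buffer, the messageLine helper, and the convention that an empty string clears the status are assumptions made for this example.

```go
package main

import "fmt"

// Spinner shows only the fields relevant to status updates.
type Spinner struct {
	StatusCh chan string
	status   string // most recent status; empty means "show a Lear quote"
}

// NewSpinner uses a one-slot buffer so a sender never waits for the spinner.
func NewSpinner() *Spinner {
	return &Spinner{StatusCh: make(chan string, 1)}
}

// UpdateStatus never blocks. If an unread message is pending, it is
// discarded so that the newest message wins.
func (s *Spinner) UpdateStatus(msg string) {
	for {
		select {
		case s.StatusCh <- msg:
			return
		default:
			// Buffer full: drop the stale message, then retry the send.
			select {
			case <-s.StatusCh:
			default:
			}
		}
	}
}

// messageLine is what the spinner goroutine would call on each fast tick.
// It returns the current status if one is set, otherwise the Lear quote.
func (s *Spinner) messageLine(quote string) string {
	select {
	case msg := <-s.StatusCh:
		s.status = msg // assumption: "" clears the status
	default:
	}
	if s.status != "" {
		return s.status
	}
	return quote
}

func main() {
	s := NewSpinner()
	quote := "They went to sea in a Sieve"
	fmt.Println(s.messageLine(quote))
	s.UpdateStatus("embedding query")
	s.UpdateStatus("calling tool: read_file")
	fmt.Println(s.messageLine(quote))
	s.UpdateStatus("")
	fmt.Println(s.messageLine(quote))
}
```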

Rejected alternatives.

Consequences.


2026-06-18 — Assay evaluation output moves to workspace-level directory

Context. bin/assay writes evaluation results to testout/ inside the harvey/ source repository. This directory is gitignored, but the JSON and Markdown artifacts look like test output to language models that read the source tree, causing models to misinterpret stale evaluation results as current test failures. See assay-llamafile-design.md.

Decision. Change the default output directory for bin/assay from testout/ to $WORKSPACE/assay-results/<timestamp>/ where $WORKSPACE is resolved the same way Harvey resolves its workspace (walk up from cwd to the directory containing agents/harvey.yaml). If no workspace is found, fall back to an assay-results/ directory in the current working directory. The --output flag overrides the default as before.
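The workspace walk and the fallback can be sketched as follows. The function names, the injected exists callback, and the timestamp format are assumptions for illustration; the fallback path follows the wording above literally, without a timestamp subdirectory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findWorkspace walks up from start until it finds a directory that
// contains agents/harvey.yaml. exists is injected so the walk can be
// exercised without touching the disk.
func findWorkspace(start string, exists func(string) bool) (string, bool) {
	dir := filepath.Clean(start)
	for {
		if exists(filepath.Join(dir, "agents", "harvey.yaml")) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the filesystem root
			return "", false
		}
		dir = parent
	}
}

// defaultOutputDir returns <workspace>/assay-results/<stamp>, or
// <cwd>/assay-results when no workspace is found.
func defaultOutputDir(cwd, stamp string, exists func(string) bool) string {
	if ws, ok := findWorkspace(cwd, exists); ok {
		return filepath.Join(ws, "assay-results", stamp)
	}
	return filepath.Join(cwd, "assay-results")
}

// fileExists is the real-disk implementation of the exists callback.
func fileExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The timestamp format here is a placeholder, not Harvey's actual format.
	fmt.Println(defaultOutputDir(cwd, "20260618T120000", fileExists))
}
```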

Rejected alternatives.

Consequences.


2026-06-18 — Assay adds Llamafile backend via --llamafile flag

Context. bin/assay currently only supports Ollama as a model backend, but Harvey supports both Ollama and Llamafile. Users evaluating a Llamafile model must run it manually and point assay at it with a custom URL, which is error-prone and undocumented. See assay-llamafile-design.md.

Decision. Add a --llamafile PATH flag to bin/assay. When provided, assay starts the llamafile process on an ephemeral port (same startLlamafile logic as in llamafile_service.go), runs the evaluation suite against that endpoint, then terminates the process on exit. The --model flag is still respected (it sets the model name in the report) but --ollama is ignored when --llamafile is given. Embeddings continue to use the Ollama embedder unless --rag-db is also given and the store’s recorded embedding model differs, in which case the operation fails fast with a clear error.

Rejected alternatives.

Consequences.


2026-06-18 — Web developer template added to built-in profile set

Context. None of the five templates shipped in v1 (backend-developer, frontend-developer, dataset-developer, data-scientist, technical-writer) covers the full polyglot web development stack used in this workspace: Go backends, uv-managed Python scripts, SQL (SQLite3 and Postgres), Deno+TypeScript frontends, and vanilla JavaScript/CSS/HTML5. A backend developer using Deno or a frontend developer writing Go API clients currently reaches for an incomplete template. See web-developer-template-design.md.

Decision. Add a web-developer.spmd template to templates/profiles/. It covers: Go (net/http, database/sql), uv+Python (scripting, data processing), SQL (SQLite3 dialect and Postgres), Deno+TypeScript (runtime, standard library, no bundler by default), JavaScript (ES modules, no framework by default), CSS (custom properties, no utility framework by default), HTML5 (semantic markup). The template’s NOTE: recommends qwen2.5-coder:7b or granite3.3:2b and suggests ingesting both Go source and the deno.json/package.json for context.

Rejected alternatives.

Consequences.


2026-06-18 — /memory profile subcommand set expanded and naming standardized

Context. The current /memory profile command has three subcommands — show, update, use — but their semantics do not match Harvey’s established command vocabulary. show lists active profiles (like list does elsewhere) rather than showing the content of the active profile. use creates a new profile from a template (like new does elsewhere) rather than selecting an existing saved profile. update opens the current profile in $EDITOR. There is no way to rename a workspace. See memory-profile-ux-design.md.

Decision. Standardize the subcommand set:

Subcommand    New behaviour                                             Was
------------  --------------------------------------------------------  --------------
list          List all profiles (active + archived)                     (show partial)
show          Print the content of the current active profile           (missing)
edit          Open the active profile in $EDITOR (rename of update)     update
use [NAME]    Switch to a named template or picker                      unchanged
rename NAME   Rename the workspace display name in the active profile   (missing)

update is kept as a deprecated alias for edit with a one-line deprecation notice, to avoid breaking existing workflows. The /profile top-level alias continues to delegate to all subcommands. The help text for /memory is updated to list all five subcommands.

Rejected alternatives.

Consequences.


2026-06-18 — PDF capability disclosed in HARVEY.md system prompt

Context. Harvey’s read_file built-in tool description states that PDF files are extracted automatically via poppler. But when tools are disabled — or when a small model uses prose tool calls and does not consistently read all tool descriptions — the model has no knowledge of this capability and asks the user to manually convert PDFs to text. HARVEY.md is always injected as the system prompt, making it the correct place to disclose capabilities that should be known regardless of tool-call mode. See quick-fixes-design.md.

Decision. Add a File reading capabilities section to HARVEY.md that enumerates what Harvey can read without conversion: plain text, Markdown, Go/TypeScript/Python source, and PDF (extracted via poppler automatically). This mirrors the pattern of the existing “Tagged code blocks” section — documenting Harvey’s automatic behaviors so the model can confidently use them rather than guessing.

Rejected alternatives.

Consequences.


2026-06-18 — Llamafile model discovery includes Windows .exe extensions

Context. scanLlamafileModels() in llamafile.go uses strings.HasSuffix(e.Name(), ".llamafile") to identify llamafile binaries. On Windows, llamafile binaries end in .exe (plain) or .llamafile.exe (when distributed with the double extension). Users on Windows who place binaries in ~/Models see an empty picker even with valid models present. The same bug affects llamafileModelName, which only strips the .llamafile suffix and leaves .exe on Windows paths. See quick-fixes-design.md.

Decision. Extend scanLlamafileModels to match three patterns: .llamafile, .llamafile.exe, and (on Windows only) any .exe file in the models directory. llamafileModelName is updated to strip suffixes in the correct order: strip .exe first (if present), then .llamafile (if present). The llamafileDefaultModelsDir() platform function already returns the correct OS-appropriate path; no change needed there.
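A minimal sketch of the matching and suffix-stripping rules described above. The isLlamafile helper and its goos parameter are hypothetical; passing the OS name in, rather than reading runtime.GOOS inside the function, is a choice made here so both platforms can be checked from one machine.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// isLlamafile reports whether a directory entry name matches one of the
// three patterns: .llamafile, .llamafile.exe, or (Windows only) any .exe.
func isLlamafile(name, goos string) bool {
	switch {
	case strings.HasSuffix(name, ".llamafile"),
		strings.HasSuffix(name, ".llamafile.exe"):
		return true
	case goos == "windows" && strings.HasSuffix(name, ".exe"):
		return true
	}
	return false
}

// llamafileModelName strips .exe first, then .llamafile, so that
// "model.llamafile.exe", "model.exe", and "model.llamafile" all yield "model".
func llamafileModelName(name string) string {
	name = strings.TrimSuffix(name, ".exe")
	return strings.TrimSuffix(name, ".llamafile")
}

func main() {
	for _, n := range []string{"mistral.llamafile", "mistral.llamafile.exe", "mistral.exe", "notes.txt"} {
		fmt.Printf("%-24s match=%-5v name=%s\n", n, isLlamafile(n, runtime.GOOS), llamafileModelName(n))
	}
}
```

Stripping in the reverse order would leave ".llamafile" behind on "model.llamafile.exe", because the name does not end in ".llamafile" until ".exe" has been removed.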

Rejected alternatives.

Consequences.


2026-06-18 — --resume flag auto-selects the most recent session

Context. Harvey’s --continue PATH flag resumes from a specific session file. When the user simply wants to pick up where they left off (the most common case), they must find and type the session path, or navigate the interactive picker. Both are unnecessary friction when the intent is always “resume my last session.” See quick-fixes-design.md.

Decision. Add a --resume flag (no argument) that resolves to the most recently modified .spmd file in agents/sessions/ and sets cfg.ContinuePath to that path before Run. If no sessions exist, Harvey prints a one-line notice and starts fresh. The implementation delegates entirely to the existing ContinueFromFountain path — no new session-loading logic is needed.
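Selecting the most recently modified session file could look like the sketch below. The sessionFile type and both function names are hypothetical; only the agents/sessions/ directory, the .spmd extension, and cfg.ContinuePath come from the decision.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// sessionFile pairs a path with its modification time in nanoseconds
// since the Unix epoch.
type sessionFile struct {
	Path    string
	ModNano int64
}

// mostRecent returns the path with the latest modification time,
// or false when there are no session files.
func mostRecent(files []sessionFile) (string, bool) {
	if len(files) == 0 {
		return "", false
	}
	best := files[0]
	for _, f := range files[1:] {
		if f.ModNano > best.ModNano {
			best = f
		}
	}
	return best.Path, true
}

// scanSessions lists the .spmd files directly inside dir.
func scanSessions(dir string) ([]sessionFile, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var out []sessionFile
	for _, e := range entries {
		if e.IsDir() || !strings.HasSuffix(e.Name(), ".spmd") {
			continue
		}
		info, err := e.Info()
		if err != nil {
			continue // file vanished between ReadDir and Info
		}
		out = append(out, sessionFile{
			Path:    filepath.Join(dir, e.Name()),
			ModNano: info.ModTime().UnixNano(),
		})
	}
	return out, nil
}

func main() {
	// A missing directory is treated the same as "no sessions".
	files, _ := scanSessions(filepath.Join("agents", "sessions"))
	if path, ok := mostRecent(files); ok {
		fmt.Println("resuming", path) // Harvey would set cfg.ContinuePath here
	} else {
		fmt.Println("no previous session found; starting fresh")
	}
}
```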

Rejected alternatives.

Consequences.


2026-06-09 — Programming language support uses a central LanguageRegistry with pluggable handlers

Context. Harvey’s RAG system already supports ingesting 17 programming language file extensions (commands.go:4975-4979), but the looksLikePath function (commands.go:3463-3467) was missing extensions for C, C++, Pascal, Oberon, Lisp, and Basic. Additionally, all languages used generic paragraph-based chunking which breaks code structures (functions, procedures) across chunk boundaries, reducing RAG retrieval quality for programming queries. Users working with source code need language-aware features: code-aware chunking, documentation extraction, syntax highlighting, and auto-formatting.

Decision. Create a comprehensive language support system with the following architecture:

  1. Central LanguageRegistry (language_registry.go) — Maps language identifiers to handlers (detectors, chunkers, extractors, formatters, highlighters). Each language has a LanguageInfo struct with metadata (name, extensions, comment markers, block delimiters, capabilities).

  2. Pluggable Interfaces — Define Go interfaces for each capability:

    • LanguageDetector — Identifies language from file path and/or content
    • CodeChunker — Splits source into meaningful units (functions, classes, procedures)
    • DocExtractor — Extracts comments, docstrings, and symbol documentation
    • CodeFormatter — Formats source code according to language conventions
    • SyntaxHighlighter — Adds ANSI color to code blocks for terminal display
  3. Code-Aware Chunking — Language-specific chunkers that respect code structure:

    • C/C++: Split at function boundaries, preserve preprocessor directives and structs
    • Pascal: Split at PROCEDURE/FUNCTION boundaries, preserve TYPE/RECORD definitions
    • Oberon: Split at MODULE/PROCEDURE boundaries
    • Lisp: Split at top-level forms (balanced parentheses), keep DEFUN/DEFMACRO together
    • Basic: Split at SUB/FUNCTION boundaries
  4. Progressive Enhancement — All features are opt-in. Basic file I/O works for all languages. If a language-specific handler fails, fall back to generic behavior.

  5. Immediate Fix — Add missing extensions (.c, .cpp, .h, .hpp, .pas, .Mod, .obn, .lisp, .bas) to looksLikePath function for tagged code block detection.
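The registry and fallback behaviour in items 1 to 4 can be sketched as follows. Only the CodeChunker capability is shown. The interface's method set, the keywordChunker type, and ChunkFile are assumptions for illustration, and the keyword-based splitter is a deliberately naive stand-in for a real Pascal chunker.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// CodeChunker is one of the pluggable capability interfaces.
type CodeChunker interface {
	Chunk(src string) ([]string, error)
}

// LanguageInfo holds metadata plus optional handlers; nil means unsupported.
type LanguageInfo struct {
	Name       string
	Extensions []string
	Chunker    CodeChunker
}

// LanguageRegistry maps file extensions to language entries.
type LanguageRegistry struct {
	byExt map[string]*LanguageInfo
}

func NewLanguageRegistry() *LanguageRegistry {
	return &LanguageRegistry{byExt: make(map[string]*LanguageInfo)}
}

func (r *LanguageRegistry) Register(info *LanguageInfo) {
	for _, ext := range info.Extensions {
		r.byExt[ext] = info // case-sensitive, so ".Mod" stays distinct
	}
}

// paragraphChunks is the generic fallback: split on blank lines.
func paragraphChunks(src string) []string {
	var out []string
	for _, p := range strings.Split(src, "\n\n") {
		if strings.TrimSpace(p) != "" {
			out = append(out, p)
		}
	}
	return out
}

// ChunkFile prefers the language-specific chunker and falls back to
// paragraph chunking when none is registered or the chunker fails.
func (r *LanguageRegistry) ChunkFile(path, src string) []string {
	if info, ok := r.byExt[filepath.Ext(path)]; ok && info.Chunker != nil {
		if chunks, err := info.Chunker.Chunk(src); err == nil {
			return chunks
		}
	}
	return paragraphChunks(src)
}

// keywordChunker starts a new chunk at each line that begins with a keyword.
type keywordChunker struct{ keywords []string }

func (k keywordChunker) Chunk(src string) ([]string, error) {
	var chunks, cur []string
	found := false
	flush := func() {
		if len(cur) > 0 {
			chunks = append(chunks, strings.Join(cur, "\n"))
			cur = nil
		}
	}
	for _, line := range strings.Split(src, "\n") {
		trimmed := strings.TrimSpace(line)
		for _, kw := range k.keywords {
			if strings.HasPrefix(trimmed, kw+" ") {
				found = true
				flush()
				break
			}
		}
		cur = append(cur, line)
	}
	flush()
	if !found {
		return nil, errors.New("no structural boundaries found")
	}
	return chunks, nil
}

func main() {
	r := NewLanguageRegistry()
	r.Register(&LanguageInfo{Name: "Pascal", Extensions: []string{".pas"},
		Chunker: keywordChunker{keywords: []string{"PROCEDURE", "FUNCTION"}}})
	src := "PROCEDURE A;\nBEGIN\n\nEND;\nFUNCTION B: INTEGER;\nBEGIN\nEND;"
	fmt.Printf("%q\n", r.ChunkFile("demo.pas", src))
	fmt.Printf("%q\n", r.ChunkFile("notes.txt", "first\n\nsecond"))
}
```

In the Pascal example the blank line inside procedure A does not split the chunk, which is the failure mode of paragraph-based chunking that the context describes.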

Rejected alternatives.

Consequences.


2026-06-09 — Code block path detection (looksLikePath) extended to support all RAG-ingestible languages

Context. The looksLikePath function in commands.go (lines 3463-3467) determines whether a string looks like a file path rather than a language identifier. This is used by fencePathToken when parsing tagged code blocks (e.g., ```c:program.c). The function had a hardcoded list of known extensions that was missing: .c, .cpp, .h, .hpp, .pas, .Mod, .obn, .lisp, .bas. This meant that tagged code blocks for these languages were not recognized as file paths, preventing the auto-write feature from working.

Decision. Extend the knownExts slice in looksLikePath to include all extensions supported by RAG ingestion (from ragIngestableExts in commands.go:4975-4979). Additionally, add a comment noting that these are programming languages supported by RAG ingestion for future maintainability.
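A rough sketch of what an extension-based check of this kind might look like. This is not the code from commands.go: the path-separator shortcut is an assumption, and .go, .py, and .ts merely stand in for the entries already present, since the full list is not reproduced here. The nine added extensions are the ones named above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// knownExts lists extensions that mark a token as a file path.
// These are programming languages supported by RAG ingestion.
var knownExts = []string{
	".go", ".py", ".ts", // illustrative stand-ins for existing entries
	".c", ".cpp", ".h", ".hpp", ".pas", ".Mod", ".obn", ".lisp", ".bas",
}

// looksLikePath reports whether s looks like a file path rather than a
// bare language identifier such as "c" or "go".
func looksLikePath(s string) bool {
	if strings.ContainsAny(s, `/\`) {
		return true // assumption: any separator implies a path
	}
	ext := filepath.Ext(s)
	for _, k := range knownExts {
		if ext == k {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"c", "program.c", "Module.Mod", "go", "src/main.go"} {
		fmt.Printf("%-12s %v\n", s, looksLikePath(s))
	}
}
```

The comparison is case-sensitive, which matters for Oberon's .Mod extension.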

Rejected alternatives.

Consequences.


2026-06-08 — /loop chat iterations use a shared runChatTurn helper that skips skill auto-trigger and autoExecuteReply

Context. The REPL’s plain-chat path does more than call the model: it checks whether the input matches a skill trigger pattern (auto-dispatching to a different flow entirely), and after the reply, offers to write fenced code blocks to disk via an interactive Y/n prompt (autoExecuteReply). Both make sense for a human typing one message at a time; both are problematic when the same prompt is sent N times unattended — a skill could fire on iteration 3 but not iteration 1, and a Y/n prompt would block forever waiting on stdin that nothing will type.

Decision. Factor the REPL’s inline chat block (terminal.go, roughly lines 635-820) into a shared (a *Agent) runChatTurn(ctx, input, out) (reply string, stats ChatStats, err error). It keeps everything that defines “how Harvey answers a prompt” — RAG augmentation, the tool-loop-or-plain-chat branch, token/context warnings, stats, Fountain recording — and excludes skill auto-trigger matching and autoExecuteReply, both of which belong to “how the REPL reacts to a typed line.” /loop calls this helper directly for its chat-mode iterations; the REPL becomes a thin wrapper around the same helper plus its own skill-trigger/autoExecuteReply handling.

Rejected alternatives.

Consequences.


2026-06-08 — /loop caps iterations at 100 and defaults to 10

Context. /loop is the first Harvey command that can run LLM calls — and, with tools enabled, write files or execute shell commands — repeatedly and unattended. Harvey’s existing security posture (safe mode, permission system, audit log) is built around bounding and surfacing risky actions rather than trusting the user to always type the right thing.

Decision. /loop takes an optional --count N (following the --depth N convention already established by /read-dir), defaulting to 10 and capped at 100. There is no “run forever” option.
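A minimal sketch of the default and cap. The parseCount name is hypothetical, and silently clamping values above 100 is one reading of "capped"; rejecting them with an error would satisfy the decision equally well.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	defaultLoopCount = 10
	maxLoopCount     = 100
)

// parseCount extracts "--count N" from args and returns the count along
// with the remaining arguments. Values above maxLoopCount are clamped.
func parseCount(args []string) (count int, rest []string, err error) {
	count = defaultLoopCount
	for i := 0; i < len(args); i++ {
		if args[i] != "--count" {
			rest = append(rest, args[i])
			continue
		}
		if i+1 >= len(args) {
			return 0, nil, fmt.Errorf("--count requires a value")
		}
		n, convErr := strconv.Atoi(args[i+1])
		if convErr != nil || n < 1 {
			return 0, nil, fmt.Errorf("--count must be a positive integer, got %q", args[i+1])
		}
		count = n
		i++ // skip the value just consumed
	}
	if count > maxLoopCount {
		count = maxLoopCount
	}
	return count, rest, nil
}

func main() {
	for _, args := range [][]string{
		{"30s", "check", "status"},
		{"30s", "--count", "250", "ping"},
		{"--count", "zero"},
	} {
		count, rest, err := parseCount(args)
		fmt.Println(count, rest, err)
	}
}
```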

Rejected alternatives.

Consequences.


2026-06-08 — /loop requires an explicit interval; no self-pacing mode

Context. Claude Code’s /loop can omit the interval and let the agent self-pace via a wake-scheduling primitive. Harvey has no equivalent — it is a synchronous CLI process with no persistent scheduler or “wake me up later” mechanism.

Decision. INTERVAL is a required first argument to /loop, parsed with the existing parseDurationString helper (config.go:650, already used for run_timeout/ollama_timeout in harvey.yaml). There is no self-pacing mode.

Rejected alternatives.

Consequences.


2026-06-08 — /loop runs as a blocking foreground command, not a background goroutine

Context. Harvey’s REPL (terminal.go:Run) is a single-threaded loop that blocks on each turn, mutating a.History, a.Recorder, and the shared output writer with no locking — because nothing has ever run concurrently with it. Adding a command that repeats a prompt on an interval raises the question of whether it should run in the background while the user keeps typing, or take over the REPL until it finishes.

Decision. /loop runs in the foreground inside its own command handler, reusing the SIGINT-cancellation pattern already used three times in terminal.go (chat, ! commands, @mention dispatch): one cancellable context for the whole run, a goroutine watching os.Signal, and a wasCancelled check. Any Ctrl+C — mid-iteration or during the inter-iteration sleep — stops the whole loop and returns to the prompt.
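The cancellation pattern described above can be sketched like this. runLoop, demo, and their signatures are hypothetical; the real handler lives in Harvey's command code and is not reproduced here. The signal channel is passed in so the behaviour can be exercised without pressing Ctrl+C.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"time"
)

// runLoop calls turn up to count times, sleeping interval between
// iterations. One cancellable context covers the whole run; a value on
// sigs cancels it whether it arrives mid-iteration or during the sleep.
// ran counts the iterations that were started.
func runLoop(count int, interval time.Duration, sigs <-chan os.Signal,
	turn func(ctx context.Context, i int)) (ran int, wasCancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	go func() {
		select {
		case <-sigs:
			cancel()
		case <-ctx.Done(): // loop finished normally; let the watcher exit
		}
	}()

	for i := 1; i <= count; i++ {
		turn(ctx, i)
		ran++
		if ctx.Err() != nil {
			return ran, true
		}
		if i == count {
			break // no sleep after the final iteration
		}
		select {
		case <-ctx.Done():
			return ran, true
		case <-time.After(interval):
		}
	}
	return ran, false
}

// demo drives runLoop with a fake signal channel.
func demo(interrupt bool) (ran int, wasCancelled bool) {
	sigs := make(chan os.Signal, 1)
	interval := time.Millisecond
	if interrupt {
		interval = time.Hour // the run can only end via cancellation
	}
	return runLoop(3, interval, sigs, func(ctx context.Context, i int) {
		if interrupt && i == 1 {
			sigs <- os.Interrupt // simulated Ctrl+C during iteration 1
		}
	})
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, os.Interrupt)
	defer signal.Stop(sigs)

	ran, cancelled := runLoop(3, 10*time.Millisecond, sigs, func(ctx context.Context, i int) {
		fmt.Println("iteration", i) // a real turn would pass ctx to the LLM call
	})
	fmt.Println("ran:", ran, "cancelled:", cancelled)
}
```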

Rejected alternatives.

Consequences.


2026-06-05 — Profile templates and help guides ship embedded in the binary

Context. Harvey installs by copying a single executable to $HOME/bin. Users on three OS / two CPU architectures should not need to install a separate asset package. Templates and help guides must therefore travel with the binary.

Decision. Use Go’s //go:embed directive (standard library since Go 1.16) to compile a templates/ directory tree into the binary at build time. A single EmbeddedTemplates embed.FS variable in templates.go gives the rest of Harvey read access to template and help guide content at runtime. Workspace-local templates in agents/templates/profiles/ are checked at runtime and merged with the built-in list, allowing organisations to add shared templates without patching Harvey.

Rejected alternatives.

Consequences.


2026-06-05 — Initial developer/writer template set; library templates deferred

Context. Harvey needs a useful starting set of profile templates but the full range of library staff roles requires domain expertise and UX review that is not yet available.

Decision. Ship five developer/writer templates for v1:

Template            Role
------------------  ---------------------------------------------------------
backend-developer   Go, Python, TypeScript+Deno, SQL for application work
frontend-developer  HTML, CSS, TypeScript/JavaScript, Deno bundling
dataset-developer   Front end plus SQL, dataset CLI, datasetd web service
data-scientist      Data analysis, SQL for exploration, Python data tooling
technical-writer    Documentation, man pages, tutorials, Markdown and Fountain

Library role templates (subject specialist, systems/digital, instruction/data literacy, support staff) are deferred until library staff and a UX colleague can define the categories and content correctly. Placeholder files are named in the plan but not authored.

Rejected alternatives.

Consequences.


2026-06-05 — /profile use verb; /profile top-level alias

Context. The profile switching command needed a name consistent with Harvey’s existing command vocabulary. Two candidates were considered: switch and use.

Decision. Use use as the subcommand verb because it matches the established pattern in Harvey: /ollama use, /rag use, and /kb use all select the active item from a list. Register /profile as a top-level alias delegating to /memory profile, following the same one-line handler pattern as /recall → /memory recall.

Rejected alternatives.

Consequences.


2026-06-05 — Profile switching writes a Fountain handoff document

Context. When a user switches profiles mid-session with /profile use, the in-progress conversation context would be lost after ClearHistory(). The user may need to resume the previous context in a future session.

Decision. Before clearing history, /profile use writes a .spmd summary file to agents/hand-off/<timestamp>.spmd. The handoff captures the last N assistant messages as bullet points and lists file paths and open questions from recent turns. No LLM call is required — the handoff is structural, not summarised. Because it is a .spmd file, the memory miner can extract facts from it in a later session, migrating context from the old role into the new session’s experience memories over time.

The previous workspace_profile document is archived (status set to archived) rather than deleted, preserving the history of the roles this workspace has been used for.

Rejected alternatives.

Consequences.


2026-06-05 — Help guides for Ollama and PDF tools embedded in binary

Context. New users frequently fail to install Ollama or PDF extraction tools before running Harvey. The error messages Harvey currently produces do not explain what is missing or how to fix it. Users on three operating systems need platform-specific install instructions.

Decision. Embed short Markdown help guides (templates/help/ollama.md, templates/help/pdf-tools.md) in the binary using the same //go:embed infrastructure as profile templates. Surface them via /help ollama and /help pdf-tools. Print a one-line pointer to the relevant guide when a detection failure occurs at startup (Ollama unreachable) or during a command (PDF extraction fails). Guides are deliberately short: what it is, how to install on each platform, one troubleshooting line.

Rejected alternatives.

Consequences.


2026-06-02 — Persistent command history across sessions

Context. Harvey’s termlib.LineEditor supports Up/Down arrow history navigation within a session, but the history is in-memory only and lost on exit. Users must retype slash commands, ! shell commands, and prompts from prior sessions, which breaks flow — especially for repeated workflows like /rag ingest, /memory mine, or iterating on a prompt.

Decision. Persist the input history to agents/harvey_history inside the workspace (one entry per line, plain text). On startup Harvey loads this file and seeds the LineEditor before entering the REPL. On clean exit the in-memory history is written back, capped at 1000 entries (most recent kept). Consecutive duplicate suppression is already handled by AppendHistory; no further deduplication is applied at write time.

The implementation requires two changes:

  1. termlib (lineeditor.go) — add two methods to LineEditor:
    • SetHistory(lines []string) — replaces the in-memory history slice wholesale (used at startup).
    • History() []string — returns a copy of the current history slice (used at exit to write back).
  2. Harvey (terminal.go) — add loadCmdHistory(ws, le) called after le is created (line ~225), and saveCmdHistory(ws, le) called in the REPL exit path. Both functions resolve the path as ws.AbsPath("agents/harvey_history"). saveCmdHistory truncates to the last 1000 entries before writing.

The history file path is not configurable in this iteration; agents/ is Harvey’s conventional home for all runtime state (harvey.yaml, sessions/, memories/, rag/, knowledge.db).
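A minimal sketch of the load, cap, and save steps. The function names here are hypothetical and differ from loadCmdHistory/saveCmdHistory, which take a workspace and a LineEditor. The file mode, the enlarged scanner buffer, and the assumption that entries contain no embedded newlines are choices made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const maxHistory = 1000

// capHistory keeps the most recent max entries (the tail of the slice).
func capHistory(lines []string, max int) []string {
	if len(lines) > max {
		lines = lines[len(lines)-max:]
	}
	return lines
}

// loadHistory reads one entry per line; a missing file yields no entries.
func loadHistory(path string) ([]string, error) {
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var lines []string
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long prompts
	for sc.Scan() {
		if sc.Text() != "" {
			lines = append(lines, sc.Text())
		}
	}
	return lines, sc.Err()
}

// saveHistory writes the capped history back, one entry per line.
func saveHistory(path string, lines []string) error {
	data := strings.Join(capHistory(lines, maxHistory), "\n")
	if data != "" {
		data += "\n"
	}
	return os.WriteFile(path, []byte(data), 0o600)
}

// roundTrip saves and reloads lines through a temporary directory.
func roundTrip(lines []string) ([]string, error) {
	dir, err := os.MkdirTemp("", "harvey-history")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "harvey_history")
	if err := saveHistory(path, lines); err != nil {
		return nil, err
	}
	return loadHistory(path)
}

func main() {
	got, err := roundTrip([]string{"/rag ingest docs", "/memory mine", "!ls"})
	fmt.Println(got, err)
}
```

In Harvey the loaded slice would be handed to SetHistory at startup and the slice returned by History() passed to the save step on exit.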

Rejected alternatives.

Consequences.


2026-06-02 — UX nudge system for memory discoverability

Context. Users who understand the three storage silos (RAG / Memory / Knowledge Base) can get significantly better results, but the ingestion decision (“where does this go?”) breaks flow. No built-in mechanism surfaced actionable hints about pending mining, empty RAG stores, or RAG being disabled.

Decision. Implement a five-part nudge system:

  1. Session-start digest — a sessionMemoryDigest() function called after the ready line that prints dim hints only when a condition is actionable:

    • Unmined sessions pending → suggest /memory mine
    • Active RAG store is empty → suggest /rag ingest
    • RAG off but chunks exist → suggest /rag on

    No output is printed when everything looks healthy.
  2. Enhanced /status — extend cmdStatus with a Memory/RAG summary block (active memories, unmined sessions, active store, chunk count, RAG on/off). Keeps the one-stop status view complete.

  3. New /hint command — on-demand improvement suggestions that aggregate all three silos and explain the decision rule. Verbose version of the session digest with context about why each suggestion matters.

  4. /help learn topic — a unified “How Harvey learns” help page with a three-column table (what to ingest → which command → where it goes) and the single decision rule:

    • Have a text file or document? → /rag ingest
    • Something useful happened in a session? → /memory mine
    • Making an observation about an experiment? → /kb observe
  5. /recall alias — routes to /memory recall to make the unified retrieval interface the obvious entry point.
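The three conditions of the session-start digest (item 1) can be sketched as a pure function. The memoryState struct, its field names, the digestHints name, and the hint wording are all hypothetical; only the three conditions and their suggested commands come from the list above.

```go
package main

import "fmt"

// memoryState is a hypothetical snapshot of what the digest inspects.
type memoryState struct {
	UnminedSessions int
	ActiveStore     string // name of the active RAG store, "" if none
	ChunkCount      int    // chunks in the active store
	RagOn           bool
}

// digestHints returns one hint per actionable condition and nothing
// when everything looks healthy.
func digestHints(s memoryState) []string {
	var hints []string
	if s.UnminedSessions > 0 {
		hints = append(hints, fmt.Sprintf("%d unmined session(s): try /memory mine", s.UnminedSessions))
	}
	if s.ActiveStore != "" && s.ChunkCount == 0 {
		hints = append(hints, fmt.Sprintf("RAG store %q is empty: try /rag ingest", s.ActiveStore))
	}
	if !s.RagOn && s.ChunkCount > 0 {
		hints = append(hints, "RAG is off but chunks exist: try /rag on")
	}
	return hints
}

func main() {
	state := memoryState{UnminedSessions: 2, ActiveStore: "go", ChunkCount: 0, RagOn: true}
	for _, h := range digestHints(state) {
		fmt.Printf("\x1b[2m%s\x1b[0m\n", h) // ANSI dim, then reset
	}
}
```

Returning the hints instead of printing them would let the same logic back both the quiet session-start digest and a more verbose /hint command.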

Rejected alternatives.

  • Single storage silo: would reduce configuration but lose retrieval precision for small models. Topic-scoped RAG stores (e.g., deno_typescript, go) give better recall than one large mixed store.
  • Always-on verbose status: printing all memory info on every startup is too noisy. Only surface hints when actionable.
  • Merging /rag on + /memory recall into a single toggle: the per-prompt RAG injection (ragAugment) and session-start injection (UnifiedMemory.Recall) are different channels. A single toggle would require auditing whether UnifiedMemory already includes RAG chunks. Deferred to a future audit.

Consequences.

  • terminal.go gains a sessionMemoryDigest() call after the ready line.
  • commands.go gains cmdHint, enhanced cmdStatus, and a /recall registration.
  • helptext.go gains LearnHelpText.
  • cmdHelp dispatches "learn" and "memory-overview" to LearnHelpText.
  • The help topic list is updated to include learn.


2026-06-02 — model_map in RAG stores (deferred simplification)

Context. Each RAG store entry in harvey.yaml has a model_map field that maps generation models to embedding models. In practice every store uses nomic-embed-text for all generation models, making the map redundant.

Decision. Deferred. Do not remove model_map now. The code is already correct and operational. Remove it when there is a concrete reason to simplify the config schema (e.g., adding a new embedder type that makes the override meaningful).

Consequences. model_map remains in the config and ragAugment continues to honour it. No user-visible change.


2026-06-02 — Dual RAG injection audit (deferred)

Context. Harvey has two RAG injection paths that run independently:

  1. Per-prompt via ragAugment() in terminal.go (when a.RagOn)
  2. Session-start via UnifiedMemory.Recall(), which also queries the RAG store

A user with both memory.enabled and rag.enabled may receive RAG content twice per turn — once in the system prompt injection and once prepended to each prompt. This wastes context tokens and may confuse small models.

Decision. Deferred. Audit and fix when a user observes noticeably degraded context efficiency. The fix would be to either: (a) skip RAG chunks in UnifiedMemory.Recall() when a.RagOn is true, or (b) make ragAugment a no-op when UnifiedMemory already injected from the same store.

Consequences. Known overlap. No immediate action required.


2026-05-31 — prose tool call correction injection

Context. Small models emit tool calls as JSON fenced blocks rather than structured API responses. The original tryExecuteProseToolCalls returned bool and could not distinguish “dispatched successfully” from “dispatched but every call errored”. When models hallucinated tool names the warning was suppressed because len(results) > 0 was always true.

Decision. Change tryExecuteProseToolCalls to return (dispatched bool, unknownNames []string). Track a succeeded counter internally; set dispatched = true only when ≥1 call succeeded. When unknownNames is non-empty, inject a correction message into history after a.AddMessage("assistant", ...) so history ordering is: user → assistant → correction-user. This gives the model a chance to retry with the correct tool names.
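A minimal sketch of the new return shape and the resulting history order. The toolCall and toolFunc types, the function names, the demo tools, and the wording of the correction message are hypothetical; only the (dispatched, unknownNames) contract and the user → assistant → correction-user ordering come from the decision.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

type toolCall struct {
	Name string
	Args string
}

type toolFunc func(args string) (string, error)

// dispatchProseToolCalls reports dispatched only when at least one call
// succeeded, and collects the names that match no registered tool.
func dispatchProseToolCalls(tools map[string]toolFunc, calls []toolCall) (dispatched bool, unknownNames []string) {
	succeeded := 0
	for _, c := range calls {
		fn, ok := tools[c.Name]
		if !ok {
			unknownNames = append(unknownNames, c.Name)
			continue
		}
		if _, err := fn(c.Args); err == nil {
			succeeded++
		}
	}
	return succeeded > 0, unknownNames
}

// correctionMessage builds the text injected as a user message.
func correctionMessage(unknown []string, tools map[string]toolFunc) string {
	names := make([]string, 0, len(tools))
	for n := range tools {
		names = append(names, n)
	}
	sort.Strings(names)
	return fmt.Sprintf("Unknown tool(s): %s. Available tools: %s.",
		strings.Join(unknown, ", "), strings.Join(names, ", "))
}

// demoTools returns one tool that succeeds and one that always fails.
func demoTools() map[string]toolFunc {
	return map[string]toolFunc{
		"read_file": func(args string) (string, error) { return "contents of " + args, nil },
		"run_shell": func(args string) (string, error) { return "", errors.New("denied by safe mode") },
	}
}

func main() {
	tools := demoTools()
	history := []string{"user: show me main.go"}

	reply := `{"tool": "cat_file", "path": "main.go"}`
	_, unknown := dispatchProseToolCalls(tools, []toolCall{{Name: "cat_file", Args: "main.go"}})

	history = append(history, "assistant: "+reply) // assistant message first
	if len(unknown) > 0 {
		history = append(history, "user: "+correctionMessage(unknown, tools))
	}
	fmt.Println(strings.Join(history, "\n"))
}
```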

Consequences. The noToolCalls guard also gates autoExecuteReply to prevent directory-tree code blocks from being offered as files to write after successful tool-call turns.


2026-05-31 — histLenBeforeChat pattern for noToolCalls guard

Context. Harvey needs to know whether a chat turn resulted in structured tool calls (via RunToolLoop) so it can skip autoExecuteReply when tool calls already handled file writing. The check len(a.History) == histLenBeforeChat correctly detects no tool calls only when captured before a.AddMessage.

Decision. Capture histLenBeforeChat := len(a.History) before the Chat/RunToolLoop call. Compute noToolCalls := len(a.History) == histLenBeforeChat before a.AddMessage. This invariant must be preserved: any refactor that moves a.AddMessage before the noToolCalls check will silently break the guard.
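The ordering constraint can be sketched as follows. The Agent and message types here are stripped-down stand-ins, runToolLoop is a fake that appends tool messages to History (which the length check implies the real loop does), and adding the user message before the capture is an assumption.

```go
package main

import "fmt"

type message struct{ Role, Content string }

// Agent is a stand-in with only the history needed for the example.
type Agent struct{ History []message }

func (a *Agent) AddMessage(role, content string) {
	a.History = append(a.History, message{role, content})
}

// runToolLoop fakes Chat/RunToolLoop: each tool call appends a message
// to History before the final reply is returned.
func (a *Agent) runToolLoop(prompt string, toolCalls int) string {
	for i := 0; i < toolCalls; i++ {
		a.AddMessage("tool", fmt.Sprintf("result %d", i+1))
	}
	return "reply to " + prompt
}

// chatTurn computes noToolCalls in the order the invariant requires.
func (a *Agent) chatTurn(prompt string, toolCalls int) (reply string, noToolCalls bool) {
	a.AddMessage("user", prompt)

	histLenBeforeChat := len(a.History) // capture BEFORE the chat call
	reply = a.runToolLoop(prompt, toolCalls)

	// Compare BEFORE adding the assistant message. After AddMessage the
	// length always differs and the guard would never report true.
	noToolCalls = len(a.History) == histLenBeforeChat

	a.AddMessage("assistant", reply)
	return reply, noToolCalls
}

func main() {
	a := &Agent{}
	_, plain := a.chatTurn("explain this function", 0)
	_, tooled := a.chatTurn("read main.go", 2)
	fmt.Println(plain, tooled)
}
```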

Consequences. Documented as a key invariant in CLAUDE.md.


2026-05-28 — Three-silo memory architecture

Context. Harvey needs to accumulate knowledge across sessions without polluting the LLM context window on every turn. Three distinct content types require different ingestion and retrieval strategies: (1) external documents, (2) session experience, (3) research observations.

Decision. Three independent silos unified at retrieval time by UnifiedMemory.Recall():

Silo            Ingestion                          Retrieval
--------------  ---------------------------------  ------------------------------
RAG store       /rag ingest (explicit)             Per-prompt via ragAugment()
Memory store    /memory mine or auto-mine on exit  Session-start via UnifiedMemory
Knowledge base  /kb observe (explicit)             On-demand via UnifiedMemory

Consequences. Each silo has its own command namespace (/rag, /memory, /kb). The unified retrieval via /memory recall is the recommended entry point. All three silos share a token budget enforced at injection time.