This file records significant architectural and UX decisions, their rationale, and known trade-offs. New decisions are added at the top. Each entry names the decision, the context that prompted it, the chosen approach, the rejected alternatives, and the consequences.
Context. Harvey’s buildCompleter() only
completes top-level command names, @route references,
Ollama model names, and file paths. Users must remember subcommand names
by heart and must know the exact RAG store/model names to pass to the
use and drop subcommands. Several commands
already show numbered pickers when no name is given, but each
reimplements the pattern differently. See tab-completion-design.md.
Decision. Extend completion in two layers: (1)
second-token subcommand names using a new
Subcommands []string field on Command; (2)
third-token argument values using a new
ArgCompletion map[string]func(*Agent) []string field that
maps each subcommand to a candidate-list function. Additionally,
introduce a shared SelectFrom / SelectItem /
SelectFromStrings API in a new ui.go file.
Commands whose first positional argument comes from a finite, enumerable
list (/rag use, /memory show,
/llamafile use, etc.) display the picker when no argument
is given. ui.go lives in the harvey package;
promotion to termlib is deferred until a clean
generalisation is proven.
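
The two completion layers can be sketched as follows. This is a minimal illustration, not Harvey's actual buildCompleter(): the Agent stand-in, its RAGStores field, and the completeArgs helper are hypothetical names introduced here; only the Subcommands and ArgCompletion fields come from the decision above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Agent is a stand-in for Harvey's agent type (hypothetical field).
type Agent struct {
	RAGStores []string
}

// Command shows only the two new fields named in the decision.
type Command struct {
	Subcommands   []string
	ArgCompletion map[string]func(*Agent) []string
}

// completeArgs returns candidates for the token being typed.
// words holds the finished tokens before the cursor; prefix is the partial token.
func completeArgs(cmds map[string]Command, a *Agent, words []string, prefix string) []string {
	if len(words) == 0 {
		return nil
	}
	cmd, ok := cmds[words[0]]
	if !ok {
		return nil
	}
	var candidates []string
	switch len(words) {
	case 1: // layer 1: the second token is a subcommand name
		candidates = cmd.Subcommands
	case 2: // layer 2: the third token is an argument value
		if fn, ok := cmd.ArgCompletion[words[1]]; ok {
			candidates = fn(a)
		}
	}
	var out []string
	for _, c := range candidates {
		if strings.HasPrefix(c, prefix) {
			out = append(out, c)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	a := &Agent{RAGStores: []string{"go", "deno_typescript"}}
	stores := func(a *Agent) []string { return a.RAGStores }
	cmds := map[string]Command{
		"/rag": {
			Subcommands:   []string{"use", "drop", "ingest"},
			ArgCompletion: map[string]func(*Agent) []string{"use": stores, "drop": stores},
		},
	}
	fmt.Println(completeArgs(cmds, a, []string{"/rag"}, "u"))
	fmt.Println(completeArgs(cmds, a, []string{"/rag", "use"}, ""))
}
```

Keeping the candidate functions in a map keyed by subcommand means a command with no enumerable arguments simply leaves the map empty.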
Rejected alternatives.
- Subcommands field is
explicit and refactoring-safe.
- ArgCompletion map[string]func approach keeps candidate
production separate from completion mechanics.

Consequences.
- Command struct gains Subcommands []string
and ArgCompletion map[string]func(*Agent) []string. The doc
comment is updated. No existing registration is broken (new fields are
optional).
- buildCompleter() gains two new blocks before the
existing file-path switch. Existing file-path and model-name completion
is unchanged.
- ui.go is a new file; ui_test.go covers all
exported symbols.
- llamafile.go and
commands.go are refactored to call SelectFrom
in Phase E. Behaviour is identical; code shrinks.

Context. remote_s3.go uses
github.com/minio/minio-go/v7 as the S3 protocol client.
MinIO’s Go client has moved to a closed-source license, making it
unsuitable for Harvey’s AGPL-3.0 codebase. The affected surface is
small: Stat, Get, and List
operations on S3-compatible stores (AWS S3, MinIO server, Cloudflare
R2). See s3-replacement-design.md.
Decision. Replace the MinIO client with
github.com/aws/aws-sdk-go-v2 (Apache-2.0 licensed). The AWS
SDK v2 supports all S3-compatible endpoints via the
BaseEndpoint override option. The call sites in
remote_s3.go map cleanly: StatObject →
HeadObject, GetObject →
GetObject, ListObjects →
ListObjectsV2. Credentials continue to come from
environment variables (AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY) and the SDK’s default credential
chain.
Rejected alternatives.
Consequences.
- go.mod removes
github.com/minio/minio-go/v7, adds three
aws-sdk-go-v2 modules (config,
service/s3, credentials).
- remote_s3.go is rewritten; public interface
(RemoteReader implementation) is unchanged.
- harvey.yaml config fields are
unaffected.
- ~/.aws/credentials, IAM
roles) works automatically.

Context. Harvey’s spinner currently shows rotating Edward Lear quotes and a timer while waiting for the LLM. Users have no way to tell whether Harvey is embedding a query, calling a tool, waiting for Ollama, or doing something else. Claude Code and similar tools display live status messages that update as work progresses. See spinner-ux-design.md.
Decision. Add a StatusCh chan string
field to the Spinner struct and an
UpdateStatus(msg string) method. The spinner’s message line
shows the most-recent status update instead of the next Lear quote
whenever a message is pending; Lear quotes resume when no status is
pending. The caller sends non-blocking updates via
UpdateStatus; the spinner goroutine reads them on the fast
tick. This preserves the existing Lear personality while surfacing
actionable progress at key moments: tool call start/end, RAG embedding,
context injection, model switching. Tab completion is out of scope for
this work item; it is a separate, larger effort.
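
A minimal sketch of the non-blocking update path follows. The channel buffer size of 1, the tickMessage helper, and the convention that an empty message clears the status are all assumptions made for illustration; the decision only names StatusCh, UpdateStatus, and lastStatus.

```go
package main

import "fmt"

// Spinner shows only the status-related fields named in the decision.
type Spinner struct {
	StatusCh   chan string
	lastStatus string
}

// NewSpinner creates the channel with a buffer of one (an assumption).
func NewSpinner() *Spinner {
	return &Spinner{StatusCh: make(chan string, 1)}
}

// UpdateStatus never blocks the caller: it discards an unread stale
// message if there is one, then attempts a single non-blocking send.
func (s *Spinner) UpdateStatus(msg string) {
	select {
	case <-s.StatusCh: // drop a stale, unread message
	default:
	}
	select {
	case s.StatusCh <- msg:
	default: // another sender filled the buffer first; give up rather than block
	}
}

// tickMessage is what the spinner goroutine might call on each fast tick.
// It shows the latest status while one is set and the quote otherwise.
// Treating "" as "clear the status" is an assumption of this sketch.
func (s *Spinner) tickMessage(quote string) string {
	select {
	case msg := <-s.StatusCh:
		s.lastStatus = msg
	default:
	}
	if s.lastStatus != "" {
		return s.lastStatus
	}
	return quote
}

func main() {
	s := NewSpinner()
	fmt.Println(s.tickMessage("a Lear quote"))
	s.UpdateStatus("embedding query")
	s.UpdateStatus("calling tool read_file")
	fmt.Println(s.tickMessage("a Lear quote"))
	s.UpdateStatus("")
	fmt.Println(s.tickMessage("a Lear quote"))
}
```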
Rejected alternatives.
Consequences.
- spinner.go adds StatusCh chan string,
UpdateStatus(string), and lastStatus string to
the Spinner type.
- terminal.go calls UpdateStatus at: RAG
embedding start, tool call start, tool call complete, context
injection.

Context. bin/assay writes evaluation
results to testout/ inside the harvey/ source
repository. This directory is gitignored, but the JSON and Markdown
artifacts look like test output to language models that read the source
tree, causing models to misinterpret stale evaluation results as current
test failures. See assay-llamafile-design.md.
Decision. Change the default output directory for
bin/assay from testout/ to
$WORKSPACE/assay-results/<timestamp>/ where
$WORKSPACE is resolved the same way Harvey resolves its
workspace (walk up from cwd to the directory containing
agents/harvey.yaml). If no workspace is found, fall back to
an assay-results/ directory in the current working
directory. The --output flag overrides the default as
before.
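
The resolution order above might look like the sketch below. The function names, the injected exists callback, and the timestamp layout are hypothetical. Whether the cwd fallback also gets a timestamp subdirectory is not stated, so this sketch omits it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// findWorkspace walks up from start until it finds a directory that
// contains agents/harvey.yaml. exists is injected so the walk can be
// exercised without touching the disk.
func findWorkspace(start string, exists func(string) bool) (string, bool) {
	dir := filepath.Clean(start)
	for {
		if exists(filepath.Join(dir, "agents", "harvey.yaml")) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the filesystem root
			return "", false
		}
		dir = parent
	}
}

// defaultOutputDir applies the precedence from the decision:
// an explicit --output value, then the workspace, then the cwd fallback.
func defaultOutputDir(outputFlag, cwd, stamp string, exists func(string) bool) string {
	if outputFlag != "" {
		return outputFlag
	}
	if ws, ok := findWorkspace(cwd, exists); ok {
		return filepath.Join(ws, "assay-results", stamp)
	}
	return filepath.Join(cwd, "assay-results") // fallback shape is an assumption
}

func fileExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	stamp := time.Now().Format("20060102-150405") // layout is an assumption
	fmt.Println(defaultOutputDir("", cwd, stamp, fileExists))
}
```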
Rejected alternatives.
- testout/ but add a note file:
models still read and misinterpret the directory.
- --output flag: breaks existing
workflows that rely on the default.
- $XDG_DATA_HOME/harvey/assay-results/:
correct in principle but separates results from the workspace they were
generated against, making correlation harder.

Consequences.
- cmd/assay/main.go gains workspace discovery logic (same
heuristic as Harvey’s NewWorkspace).
- --help output.
- testout/ in the harvey repo is no longer populated by
bin/assay in normal use.

--llamafile flag
Context. bin/assay currently only
supports Ollama as a model backend, but Harvey supports both Ollama and
Llamafile. Users evaluating a Llamafile model must run it manually and
point assay at it with a custom URL, which is error-prone and
undocumented. See assay-llamafile-design.md.
Decision. Add a --llamafile PATH flag
to bin/assay. When provided, assay starts the llamafile
process on an ephemeral port (same startLlamafile logic as
in llamafile_service.go), runs the evaluation suite against
that endpoint, then terminates the process on exit. The
--model flag is still respected (it sets the model name in
the report) but --ollama is ignored when
--llamafile is given. Embeddings continue to use the Ollama
embedder unless --rag-db is also given and the store’s
recorded embedding model differs, in which case the operation fails fast
with a clear error.
Rejected alternatives.
- assay-llamafile binary: duplicates
95% of the evaluation harness; not maintainable.
- --llamafile flag makes the common case ergonomic.

Consequences.
- cmd/assay/main.go imports
llamafile_service.go functions already in the package; no
new files needed.

Context. The five templates shipped in v1
(backend-developer, frontend-developer,
dataset-developer, data-scientist,
technical-writer) do not include one that covers the
full polyglot web development stack used in this workspace: Go backends,
uv-managed Python scripts, SQL (SQLite3 and Postgres), Deno+TypeScript
frontends, and vanilla JavaScript/CSS/HTML5. A backend developer using
Deno or a frontend developer writing Go API clients currently reaches
for an incomplete template. See web-developer-template-design.md.
Decision. Add a web-developer.spmd
template to templates/profiles/. It covers: Go (net/http,
database/sql), uv+Python (scripting, data processing), SQL (SQLite3
dialect and Postgres), Deno+TypeScript (runtime, standard library, no
bundler by default), JavaScript (ES modules, no framework by default),
CSS (custom properties, no utility framework by default), HTML5
(semantic markup). The template’s NOTE: recommends
qwen2.5-coder:7b or granite3.3:2b and suggests
ingesting both Go source and the
deno.json/package.json for context.
Rejected alternatives.
- backend-developer template:
the existing template is already a good fit for pure Go/Python/SQL
work; adding Deno and CSS would make it too broad and undermine the
template picker’s value as a role-specific starting point.
- go-web and deno-web
templates: two templates for what is effectively one stack in this
workspace is unnecessarily granular.

Consequences.
- templates/profiles/web-developer.spmd is added to the
embedded binary.
- ListTemplates() discovers it
automatically.

/memory profile subcommand set expanded and naming
standardized
Context. The current /memory profile
command has three subcommands — show, update,
use — but their semantics do not match Harvey’s established
command vocabulary. show lists active profiles (like
list does elsewhere) rather than showing the
content of the active profile. use creates a new
profile from a template (like new does elsewhere) rather
than selecting an existing saved profile. update opens the
current profile in $EDITOR. There is no way to rename a
workspace. See memory-profile-ux-design.md.
Decision. Standardize the subcommand set:
| Subcommand | New behaviour | Was |
|---|---|---|
| list | List all profiles (active + archived) | (show partial) |
| show | Print the content of the current active profile | (missing) |
| edit | Open the active profile in $EDITOR (rename of update) | update |
| use [NAME] | Switch to a named template or picker | unchanged |
| rename NAME | Rename the workspace display name in the active profile | (missing) |

update is kept as a deprecated alias for
edit with a one-line deprecation notice, to avoid breaking
existing workflows. The /profile top-level alias continues
to delegate to all subcommands. The help text for /memory
is updated to list all five subcommands.
Rejected alternatives.
- use to new to match the
new/list/use pattern elsewhere:
/profile use is already shipped, documented, and matches
/ollama use, /rag use. Breaking the alias
would confuse users more than the current inconsistency.
- show with list semantics: defeats
discoverability; users type /memory profile show expecting
to see what their profile says, not a list of IDs.

Consequences.
- commands.go: cmdMemoryProfile gains
list, rename, and show
(content-display) cases. show (old list behavior) becomes
list. update remains as alias for
edit.
- helptext.go: memory and profile help text updated.
- harvey-memory.7.md: man page updated to document all
five subcommands.

Context. Harvey’s read_file built-in
tool description states that PDF files are extracted automatically via
poppler. But when tools are disabled — or when a small model uses prose
tool calls and does not consistently read all tool descriptions — the
model has no knowledge of this capability and asks the user to manually
convert PDFs to text. HARVEY.md is always injected as the system prompt,
making it the correct place to disclose capabilities that should be
known regardless of tool-call mode. See quick-fixes-design.md.
Decision. Add a File reading
capabilities section to HARVEY.md that enumerates
what Harvey can read without conversion: plain text, Markdown,
Go/TypeScript/Python source, and PDF (extracted via poppler
automatically). This mirrors the pattern of the existing “Tagged code
blocks” section — documenting Harvey’s automatic behaviors so the model
can confidently use them rather than guessing.
Rejected alternatives.
- read_file tool description:
already done; the problem is the model doesn’t see tool descriptions
when tools are disabled.

Consequences.
- HARVEY.md gains a short “File reading” section (4-6
bullet points).
- HARVEY.md is loaded by
LoadHarveyMD() at startup.
- read_file directly.

Context. scanLlamafileModels() in
llamafile.go uses
strings.HasSuffix(e.Name(), ".llamafile") to identify
llamafile binaries. On Windows, llamafile binaries end in
.exe (plain) or .llamafile.exe (when
distributed with the double extension). Users on Windows who place
binaries in ~/Models see an empty picker even with valid
models present. The same bug affects llamafileModelName,
which only strips the .llamafile suffix and leaves
.exe on Windows paths. See quick-fixes-design.md.
Decision. Extend scanLlamafileModels to
match three patterns: .llamafile,
.llamafile.exe, and (on Windows only) any .exe
file in the models directory. llamafileModelName is updated
to strip suffixes in the correct order: strip .exe first
(if present), then .llamafile (if present). The
llamafileDefaultModelsDir() platform function already
returns the correct OS-appropriate path; no change needed there.
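
The matching and suffix-stripping rules can be sketched as two small string functions. isLlamafileName is a hypothetical helper name, and taking the platform as a parameter (rather than reading runtime.GOOS inside) is a choice made here so both branches can be exercised. The real llamafileModelName may take a full path rather than a bare file name.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// isLlamafileName reports whether a directory entry should appear in the
// picker: .llamafile and .llamafile.exe always, plain .exe only on Windows.
func isLlamafileName(name string, onWindows bool) bool {
	if strings.HasSuffix(name, ".llamafile") || strings.HasSuffix(name, ".llamafile.exe") {
		return true
	}
	return onWindows && strings.HasSuffix(name, ".exe")
}

// llamafileModelName strips .exe first, then .llamafile, so that
// "x.llamafile.exe", "x.llamafile", and "x.exe" all yield "x".
func llamafileModelName(name string) string {
	name = strings.TrimSuffix(name, ".exe")
	return strings.TrimSuffix(name, ".llamafile")
}

func main() {
	onWindows := runtime.GOOS == "windows"
	for _, n := range []string{"mistral.llamafile", "mistral.llamafile.exe", "mistral.exe", "notes.txt"} {
		fmt.Printf("%-24s match=%-5v model=%s\n", n, isLlamafileName(n, onWindows), llamafileModelName(n))
	}
}
```

Stripping in the opposite order would leave ".llamafile" behind for the double-extension case, which is why the order matters.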
Rejected alternatives.
- .exe files and users should not
need to rename them.
- .exe files unconditionally: would
pick up non-llamafile executables; restrict to .exe only
when the scan finds no .llamafile or
.llamafile.exe files, or only match .exe files
that also carry the llamafile magic bytes (deferred to a future
improvement).

Consequences.
- llamafile.go: scanLlamafileModels matches
.llamafile, .llamafile.exe, and
.exe (Windows-only guard); llamafileModelName
strips suffixes in the correct order.
- ~/Models now see them in
the picker.

--resume flag auto-selects the most recent session
Context. Harvey’s --continue PATH flag
resumes from a specific session file. When the user simply wants to pick
up where they left off (the most common case), they must find and type
the session path, or navigate the interactive picker. Both are
unnecessary friction when the intent is always “resume my last session.”
See quick-fixes-design.md.
Decision. Add a --resume flag (no
argument) that resolves to the most recently modified .spmd
file in agents/sessions/ and sets
cfg.ContinuePath to that path before Run. If
no sessions exist, Harvey prints a one-line notice and starts fresh. The
implementation delegates entirely to the existing
ContinueFromFountain path — no new session-loading logic is
needed.
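
One possible shape for the helper is sketched below. The sessionFile type and newestSession function are hypothetical, introduced so the selection logic is separate from directory reading. The sketch lists the directory non-recursively, which is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// sessionFile is the minimal information needed to pick a session.
type sessionFile struct {
	Name    string
	ModTime time.Time
}

// newestSession returns the name of the most recently modified .spmd
// entry, or "" when there is none.
func newestSession(files []sessionFile) string {
	best := -1
	for i, f := range files {
		if !strings.HasSuffix(f.Name, ".spmd") {
			continue
		}
		if best == -1 || f.ModTime.After(files[best].ModTime) {
			best = i
		}
	}
	if best == -1 {
		return ""
	}
	return files[best].Name
}

// mostRecentSession returns the path of the newest .spmd file in sessDir,
// or "" if the directory is missing or holds no sessions.
func mostRecentSession(sessDir string) string {
	entries, err := os.ReadDir(sessDir)
	if err != nil {
		return ""
	}
	var files []sessionFile
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		info, err := e.Info()
		if err != nil {
			continue // entry vanished between listing and stat
		}
		files = append(files, sessionFile{Name: e.Name(), ModTime: info.ModTime()})
	}
	name := newestSession(files)
	if name == "" {
		return ""
	}
	return filepath.Join(sessDir, name)
}

func main() {
	path := mostRecentSession(filepath.Join("agents", "sessions"))
	if path == "" {
		fmt.Println("no previous session found; starting fresh")
		return
	}
	fmt.Println("resuming", path) // the real flag would set cfg.ContinuePath here
}
```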
Rejected alternatives.
- --continue with no argument mean “most
recent”: changes the semantics of an existing flag; would break
scripts that pass --continue expecting a required
argument.
- --resume as an alias for opening the
interactive picker: the picker is useful for choosing among
multiple sessions; --resume should be zero-friction and not
prompt.

Consequences.
- cmd/harvey/main.go gains a --resume case
that calls a new mostRecentSession(sessDir string) string
helper.
- harvey.go or sessions_files.go gains
mostRecentSession (walks agents/sessions/,
returns path of newest .spmd by ModTime).
- --continue semantics.
- --record, the resumed session is not
re-recorded (existing guard in terminal.go:333-338 already
handles this).

Context. Harvey’s RAG system already supports
ingesting 17 programming language file extensions
(commands.go:4975-4979), but the looksLikePath function
(commands.go:3463-3467) was missing extensions for C, C++, Pascal,
Oberon, Lisp, and Basic. Additionally, all languages used generic
paragraph-based chunking which breaks code structures (functions,
procedures) across chunk boundaries, reducing RAG retrieval quality for
programming queries. Users working with source code need language-aware
features: code-aware chunking, documentation extraction, syntax
highlighting, and auto-formatting.
Decision. Create a comprehensive language support system with the following architecture:
- Central LanguageRegistry
(language_registry.go): maps language identifiers to
handlers (detectors, chunkers, extractors, formatters, highlighters).
Each language has a LanguageInfo struct with metadata
(name, extensions, comment markers, block delimiters,
capabilities).
- Pluggable Interfaces: define Go interfaces for each capability:
  - LanguageDetector: identifies language from file path and/or content
  - CodeChunker: splits source into meaningful units (functions, classes, procedures)
  - DocExtractor: extracts comments, docstrings, and symbol documentation
  - CodeFormatter: formats source code according to language conventions
  - SyntaxHighlighter: adds ANSI color to code blocks for terminal display
- Code-Aware Chunking: language-specific chunkers that respect code structure:
- Progressive Enhancement: all features are opt-in. Basic file I/O works for all languages. If a language-specific handler fails, fall back to generic behavior.
- Immediate Fix: add missing extensions
(.c, .cpp, .h, .hpp,
.pas, .Mod, .obn,
.lisp, .bas) to the looksLikePath
function for tagged code block detection.
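
To illustrate the registry-plus-fallback idea, here is a minimal sketch covering only the CodeChunker capability. The method signature, the Register and ChunkFile methods, the byExt map, and the toy funcChunker are assumptions made for illustration; the decision names the types but not their shape.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// CodeChunker splits source into meaningful units (signature assumed).
type CodeChunker interface {
	Chunk(src string) ([]string, error)
}

// LanguageInfo carries a subset of the metadata listed in the decision.
type LanguageInfo struct {
	Name       string
	Extensions []string
	Chunker    CodeChunker // nil when no language-specific chunker exists
}

// LanguageRegistry maps file extensions to language handlers.
type LanguageRegistry struct {
	byExt map[string]*LanguageInfo
}

func NewLanguageRegistry() *LanguageRegistry {
	return &LanguageRegistry{byExt: map[string]*LanguageInfo{}}
}

func (r *LanguageRegistry) Register(info *LanguageInfo) {
	for _, ext := range info.Extensions {
		r.byExt[ext] = info // case-sensitive, so ".Mod" stays distinct
	}
}

// genericChunks is the paragraph-based fallback: split on blank lines.
func genericChunks(src string) []string {
	var out []string
	for _, p := range strings.Split(src, "\n\n") {
		if s := strings.TrimSpace(p); s != "" {
			out = append(out, s)
		}
	}
	return out
}

// ChunkFile prefers the language-specific chunker and falls back to
// generic chunking when none is registered or the handler fails.
func (r *LanguageRegistry) ChunkFile(path, src string) []string {
	if info, ok := r.byExt[filepath.Ext(path)]; ok && info.Chunker != nil {
		if chunks, err := info.Chunker.Chunk(src); err == nil && len(chunks) > 0 {
			return chunks
		}
	}
	return genericChunks(src)
}

// funcChunker is a toy chunker: a new chunk starts at each "func " line.
type funcChunker struct{}

func (funcChunker) Chunk(src string) ([]string, error) {
	var chunks, cur []string
	flush := func() {
		if s := strings.TrimSpace(strings.Join(cur, "\n")); s != "" {
			chunks = append(chunks, s)
		}
		cur = nil
	}
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(line, "func ") {
			flush()
		}
		cur = append(cur, line)
	}
	flush()
	return chunks, nil
}

func main() {
	r := NewLanguageRegistry()
	r.Register(&LanguageInfo{Name: "Go", Extensions: []string{".go"}, Chunker: funcChunker{}})
	src := "func a() {\n\n\treturn\n}\nfunc b() {}\n"
	fmt.Printf("%q\n", r.ChunkFile("x.go", src))  // language-aware chunks
	fmt.Printf("%q\n", r.ChunkFile("x.txt", src)) // generic fallback
}
```

In the example input, the blank line inside a() causes the generic splitter to cut the function in two, while the toy chunker keeps it whole.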
Rejected alternatives.
Use Tree-sitter for all parsing — Tree-sitter provides excellent AST-based parsing but adds ~5MB per language grammar, significant build complexity, and external dependencies. Rejected in favor of simpler regex-based and state-machine approaches for initial implementation, with Tree-sitter as a future enhancement.
Single monolithic chunker — One chunker handling all languages with conditional logic. Rejected for being hard to maintain, test, and extend. The interface-based approach allows independent development and testing of each language’s chunker.
Cloud-based language services — Use external APIs for formatting, analysis, etc. Rejected for violating Harvey’s local-first philosophy and introducing privacy/security concerns (sending user code to external services).
Mandatory formatting — Always format code on write without user control. Rejected for being too opinionated and potentially breaking user workflows. Auto-formatting must be opt-in and configurable.
Consequences.
File Changes: New files
language_registry.go, code_chunkers.go,
doc_extractors.go, syntax_highlighters.go,
code_formatters.go with corresponding test files. Modified
commands.go, config.go,
builtin_tools.go, terminal.go.
Backward Compatibility: Existing RAG stores continue to work. Generic chunking remains as fallback. No breaking changes to SQLite schema or session format.
Performance: Language registry initialization at startup adds < 10ms. Chunking with language-specific handlers adds ~10-20% overhead vs. generic chunking. Formatters only invoked when auto-format is enabled.
Extensibility: New languages can be added by implementing the interfaces and registering them, without modifying core code.
Improved RAG Quality: Code-aware chunking preserves function/procedure boundaries, improving retrieval quality for code-related queries by an estimated 20%+ over generic chunking.
Better UX: Syntax highlighting in terminal output and auto-formatting on file write improve the user experience when working with source code.

looksLikePath) extended to
support all RAG-ingestible languages
Context. The looksLikePath function in
commands.go (lines 3463-3467) determines whether a string
looks like a file path rather than a language identifier. This is used
by fencePathToken when parsing tagged code blocks (e.g.,
```c:program.c). The function had a hardcoded list of known
extensions that was missing: .c, .cpp,
.h, .hpp, .pas,
.Mod, .obn, .lisp,
.bas. This meant that tagged code blocks for these
languages were not recognized as file paths, preventing the auto-write
feature from working.
Decision. Extend the knownExts slice in
looksLikePath to include all extensions supported by RAG
ingestion (from ragIngestableExts in
commands.go:4975-4979). Additionally, add a comment noting that these
are programming languages supported by RAG ingestion for future
maintainability.
Rejected alternatives.
Refactor to use the language registry — While this would
be more maintainable long-term, it would introduce a circular dependency
(the registry isn’t initialized when looksLikePath is first
used during startup). Deferred to a future cleanup.
Create a separate list — Maintain a separate, parallel list of extensions. Rejected for creating a maintenance burden and potential for divergence.
Make it dynamic — Load extensions from configuration. Rejected as over-engineering for a static list that rarely changes.
Consequences.
Tagged code blocks for all RAG-supported languages now work
correctly, e.g.,
```c:src/main.c or ```pascal:module.pas.
The hardcoded list remains a maintenance point but now includes all 17 supported languages.
Future additions to RAG ingestion must remember to also update
looksLikePath. This is documented in the code
comments.

/loop chat iterations use a shared
runChatTurn helper that skips skill auto-trigger and
autoExecuteReply
Context. The REPL’s plain-chat path does more than
call the model: it checks whether the input matches a skill trigger
pattern (auto-dispatching to a different flow entirely), and after the
reply, offers to write fenced code blocks to disk via an interactive Y/n
prompt (autoExecuteReply). Both make sense for a human
typing one message at a time; both are problematic when the same prompt
is sent N times unattended — a skill could fire on iteration 3 but not
iteration 1, and a Y/n prompt would block forever waiting on stdin that
nothing will type.
Decision. Factor the REPL’s inline chat block
(terminal.go, roughly lines 635-820) into a shared
(a *Agent) runChatTurn(ctx, input, out) (reply string, stats ChatStats, err error).
It keeps everything that defines “how Harvey answers a prompt” — RAG
augmentation, the tool-loop-or-plain-chat branch, token/context
warnings, stats, Fountain recording — and excludes skill auto-trigger
matching and autoExecuteReply, both of which belong to “how
the REPL reacts to a typed line.” /loop calls this helper
directly for its chat-mode iterations; the REPL becomes a thin wrapper
around the same helper plus its own
skill-trigger/autoExecuteReply handling.
Rejected alternatives.
- cmdLoop:
roughly 150 lines of copy-paste that would drift from the REPL’s version
on the next change to the chat path.

Consequences.
- terminal.go’s plain-chat branch is refactored but
behaviourally unchanged for normal typed input, verified with
go test -race after extraction.
- /loop behaves predictably: the same prompt produces the
same kind of exchange every time, with no surprise skill redirects or
stalled prompts.
- a.Config.ToolsEnabled, looped prompts can still
cause the model to write files or run commands via the normal tool loop;
/loop does not suppress this, since doing so would make
looped chat behave differently from normal chat (see
loop-design.md, “Safety Considerations”).

/loop caps iterations at 100 and defaults to 10
Context. /loop is the first Harvey
command that can run LLM calls — and, with tools enabled, write files or
execute shell commands — repeatedly and unattended. Harvey’s existing
security posture (safe mode, permission system, audit log) is built
around bounding and surfacing risky actions rather than trusting the
user to always type the right thing.
Decision. /loop takes an optional
--count N (following the --depth N convention
already established by /read-dir), defaulting to 10 and
capped at 100. There is no “run forever” option.
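
The default and cap could be applied as in the sketch below. resolveLoopCount and the constant names are hypothetical. Clamping an oversized value to 100 (rather than rejecting it with an error) is an assumption, since the decision says only that the count is capped.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	defaultLoopCount = 10
	maxLoopCount     = 100
)

// resolveLoopCount turns the raw --count value ("" when the flag is
// absent) into an iteration count between 1 and maxLoopCount.
func resolveLoopCount(raw string) (int, error) {
	if raw == "" {
		return defaultLoopCount, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return 0, fmt.Errorf("--count must be a positive integer, got %q", raw)
	}
	if n > maxLoopCount {
		n = maxLoopCount // clamp rather than reject (assumption)
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"", "25", "500", "zero"} {
		n, err := resolveLoopCount(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("--count %q -> %d iterations\n", raw, n)
	}
}
```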
Rejected alternatives.
- /loop 1s tell me a joke) into thousands of unattended
LLM calls before the user notices.
- Looping every 5m, up to 10 times: ...) gives the same
“last chance to Ctrl+C” moment without an extra interaction step,
consistent with how /pipeline announces its plan before
running.

Consequences.
- /loop run is bounded to at most 100
iterations (e.g. roughly 8 hours at a 5-minute interval), which still
covers realistic “check on this periodically” use cases.
- /loop, a deliberate
speed bump rather than an oversight.

/loop requires an explicit interval; no self-pacing
mode
Context. Claude Code’s /loop can omit
the interval and let the agent self-pace via a wake-scheduling
primitive. Harvey has no equivalent — it is a synchronous CLI process
with no persistent scheduler or “wake me up later” mechanism.
Decision. INTERVAL is a required first
argument to /loop, parsed with the existing
parseDurationString helper (config.go:650,
already used for run_timeout/ollama_timeout in
harvey.yaml). There is no self-pacing mode.
Rejected alternatives.
Consequences.
- /loop’s usage string and help text always show
INTERVAL as required.
- /loop will notice the
difference; the help text explains why (no async scheduler in
Harvey).

/loop runs as a blocking foreground command, not a
background goroutine
Context. Harvey’s REPL
(terminal.go:Run) is a single-threaded loop that blocks on
each turn, mutating a.History, a.Recorder, and
the shared output writer with no locking — because nothing has ever run
concurrently with it. Adding a command that repeats a prompt on an
interval raises the question of whether it should run in the background
while the user keeps typing, or take over the REPL until it
finishes.
Decision. /loop runs in the foreground
inside its own command handler, reusing the SIGINT-cancellation pattern
already used three times in terminal.go (chat,
! commands, @mention dispatch): one
cancellable context for the whole run, a goroutine watching
os.Signal, and a wasCancelled check. Any
Ctrl+C — mid-iteration or during the inter-iteration sleep — stops the
whole loop and returns to the prompt.
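
The cancellation pattern described above (one cancellable context, a goroutine watching os.Signal, and a cancelled flag) can be sketched as follows. interruptContext and runLoop are hypothetical names, and stopping the loop when an iteration returns an error is an assumption. The step closure is expected to capture the context itself if it needs to abort mid-iteration.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"time"
)

// interruptContext returns a context that is cancelled on the first
// Ctrl+C (SIGINT) or when the returned cancel function is called.
func interruptContext() (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, os.Interrupt)
	go func() {
		defer signal.Stop(sigCh)
		select {
		case <-sigCh:
			cancel()
		case <-ctx.Done():
		}
	}()
	return ctx, cancel
}

// runLoop calls step up to count times, sleeping interval between
// iterations. Cancellation is honoured both before each iteration and
// during the sleep. It reports completed iterations and whether the
// run was cancelled.
func runLoop(ctx context.Context, count int, interval time.Duration, step func(i int) error) (done int, wasCancelled bool) {
	for i := 1; i <= count; i++ {
		if ctx.Err() != nil {
			return done, true
		}
		if err := step(i); err != nil {
			return done, ctx.Err() != nil
		}
		done++
		if i == count {
			break // no sleep after the final iteration
		}
		timer := time.NewTimer(interval)
		select {
		case <-ctx.Done():
			timer.Stop()
			return done, true
		case <-timer.C:
		}
	}
	return done, false
}

func main() {
	ctx, cancel := interruptContext()
	defer cancel()
	fmt.Println("Looping every 200ms, up to 3 times (Ctrl+C to stop)")
	done, wasCancelled := runLoop(ctx, 3, 200*time.Millisecond, func(i int) error {
		fmt.Println("iteration", i)
		return nil
	})
	fmt.Println("completed:", done, "cancelled:", wasCancelled)
}
```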
Rejected alternatives.
- a.History, a.Recorder, and
out, none of which exist today. The concurrency-safety
surface this opens is large relative to the value of letting the user
type while the loop runs.

Consequences.
- /loop blocks the REPL for its duration, communicated
up front via a printed plan summary before the first iteration
runs.
- /loop status / /loop stop subcommands
aren’t meaningful (the REPL can’t read them while blocked) and are not
implemented.

Context. Harvey installs by copying a single
executable to $HOME/bin. Users on three OS / two CPU
architectures should not need to install a separate asset package.
Templates and help guides must therefore travel with the binary.
Decision. Use Go’s //go:embed directive
(standard library since Go 1.16) to compile a templates/
directory tree into the binary at build time. A single
EmbeddedTemplates embed.FS variable in
templates.go gives the rest of Harvey read access to
template and help guide content at runtime. Workspace-local templates in
agents/templates/profiles/ are checked at runtime and
merged with the built-in list, allowing organisations to add shared
templates without patching Harvey.
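
The merge of built-in and workspace-local templates might be sketched like this. Because //go:embed needs real files at build time, the sketch uses an in-memory fs.FS (via the memFS helper, built on testing/fstest) as a stand-in for both EmbeddedTemplates and the workspace directory. templateNames, mergeTemplates, and memFS are hypothetical names, and treating the merge as a de-duplicated, sorted union is an assumption.

```go
package main

import (
	"fmt"
	"io/fs"
	"sort"
	"strings"
	"testing/fstest"
)

// In Harvey the built-in tree would come from something like:
//
//	//go:embed templates
//	var EmbeddedTemplates embed.FS
//
// embed.FS implements fs.FS, so the functions below accept either.

// templateNames lists the *.spmd files in dir, without the extension.
// A missing directory yields an empty list rather than an error.
func templateNames(fsys fs.FS, dir string) []string {
	entries, err := fs.ReadDir(fsys, dir)
	if err != nil {
		return nil
	}
	var names []string
	for _, e := range entries {
		if !e.IsDir() && strings.HasSuffix(e.Name(), ".spmd") {
			names = append(names, strings.TrimSuffix(e.Name(), ".spmd"))
		}
	}
	return names
}

// mergeTemplates returns the sorted, de-duplicated union of both lists.
func mergeTemplates(builtin, local []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, n := range append(append([]string{}, builtin...), local...) {
		if !seen[n] {
			seen[n] = true
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

// memFS builds an in-memory file system for this sketch only.
func memFS(paths ...string) fs.FS {
	m := fstest.MapFS{}
	for _, p := range paths {
		m[p] = &fstest.MapFile{Data: []byte("template body")}
	}
	return m
}

func main() {
	builtin := memFS("templates/profiles/backend-developer.spmd", "templates/profiles/blank.spmd")
	local := memFS("agents/templates/profiles/cataloguer.spmd")
	fmt.Println(mergeTemplates(
		templateNames(builtin, "templates/profiles"),
		templateNames(local, "agents/templates/profiles"),
	))
}
```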
Rejected alternatives.
- harvey.yaml: templates are
multi-line prose documents; embedding them in YAML is unreadable and
fragile to edit.

Consequences.
- templates/ directory added to the Harvey source tree;
must be maintained alongside code.
- .spmd files and
three Markdown guides are small).
- templates.go is the single registration point for all
embedded assets.

Context. Harvey needs a useful starting set of profile templates but the full range of library staff roles requires domain expertise and UX review that is not yet available.
Decision. Ship five developer/writer templates for v1:
| Template | Role |
|---|---|
| backend-developer | Go, Python, TypeScript+Deno, SQL for application work |
| frontend-developer | HTML, CSS, TypeScript/JavaScript, Deno bundling |
| dataset-developer | Front end plus SQL, dataset CLI, datasetd web service |
| data-scientist | Data analysis, SQL for exploration, Python data tooling |
| technical-writer | Documentation, man pages, tutorials, Markdown and Fountain |

Library role templates (subject specialist, systems/digital, instruction/data literacy, support staff) are deferred until library staff and a UX colleague can define the categories and content correctly. Placeholder files are named in the plan but not authored.
Rejected alternatives.
Consequences.
- blank.spmd or one of the developer templates as a
starting point. Acceptable short-term.

/profile use verb; /profile top-level
alias
Context. The profile switching command needed a name
consistent with Harvey’s existing command vocabulary. Two candidates
were considered: switch and use.
Decision. Use use as the subcommand
verb because it matches the established pattern in Harvey:
/ollama use, /rag use, and
/kb use all select the active item from a list. Register
/profile as a top-level alias delegating to
/memory profile, following the same one-line handler
pattern as /recall → /memory recall.
Rejected alternatives.
- /profile switch: switch does not
appear elsewhere in Harvey’s command vocabulary. use is
already the selection verb.
- /switch-profile or
/change-profile: hyphenated commands are not the
Harvey convention.

Consequences.
- commands.go gains a "profile" entry in the
top-level command table (identical in structure to
"recall").
- cmdMemoryProfile gains a "use" dispatch
case.
- /memory profile use, /profile use, and
/profile (showing subcommand help) all work.

Context. When a user switches profiles mid-session
with /profile use, the in-progress conversation context
would be lost after ClearHistory(). The user may need to
resume the previous context in a future session.
Decision. Before clearing history,
/profile use writes a .spmd summary file to
agents/hand-off/<timestamp>.spmd. The handoff
captures the last N assistant messages as bullet points and lists file
paths and open questions from recent turns. No LLM call is required —
the handoff is structural, not summarised. Because it is a
.spmd file, the memory miner can extract facts from it in a
later session, migrating context from the old role into the new
session’s experience memories over time.
The previous workspace_profile document is archived
(status set to archived) rather than deleted, preserving
the history of who this workspace has been used as.
Rejected alternatives.
Consequences.
- agents/hand-off/ directory is created at workspace init
alongside agents/sessions/.
- writeHandoff() function added to
harvey.go.
- agents/hand-off/ as well as
agents/sessions/.

Context. New users frequently fail to install Ollama or PDF extraction tools before running Harvey. The error messages Harvey currently produces do not explain what is missing or how to fix it. Users on three operating systems need platform-specific install instructions.
Decision. Embed short Markdown help guides
(templates/help/ollama.md,
templates/help/pdf-tools.md) in the binary using the same
//go:embed infrastructure as profile templates. Surface
them via /help ollama and /help pdf-tools.
Print a one-line pointer to the relevant guide when a detection failure
occurs at startup (Ollama unreachable) or during a command (PDF
extraction fails). Guides are deliberately short: what it is, how to
install on each platform, one troubleshooting line.
Rejected alternatives.
Consequences.
- templates/help/ directory contains three Markdown files
maintained alongside the code.
- helptext.go gains OllamaHelpText and
PDFToolsHelpText helpers.
- terminal.go and pdf_extract.go each gain
one conditional pointer line.

Context. Harvey’s termlib.LineEditor
supports Up/Down arrow history navigation within a session, but the
history is in-memory only and lost on exit. Users must retype slash
commands, ! shell commands, and prompts from prior
sessions, which breaks flow — especially for repeated workflows like
/rag ingest, /memory mine, or iterating on a
prompt.
Decision. Persist the input history to
agents/harvey_history inside the workspace (one entry per
line, plain text). On startup Harvey loads this file and seeds the
LineEditor before entering the REPL. On clean exit the
in-memory history is written back, capped at 1000
entries (most recent kept). Consecutive duplicate suppression
is already handled by AppendHistory; no further
deduplication is applied at write time.
The implementation requires two changes:
- termlib (lineeditor.go): add two methods to LineEditor:
  - SetHistory(lines []string): replaces the in-memory history slice wholesale (used at startup).
  - History() []string: returns a copy of the current history slice (used at exit to write back).
- terminal.go): add
loadCmdHistory(ws, le) called after le is
created (line ~225), and saveCmdHistory(ws, le) called in
the REPL exit path. Both functions resolve the path as
ws.AbsPath("agents/harvey_history").
saveCmdHistory truncates to the last 1000 entries before
writing.

The history file path is not configurable in this iteration;
agents/ is Harvey’s conventional home for all runtime state
(harvey.yaml, sessions/,
memories/, rag/,
knowledge.db).
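
The load/save cycle with the 1000-entry cap can be sketched as below. The function names here (capHistory, encodeHistory, decodeHistory, saveHistory, loadHistory) are hypothetical and take a plain path instead of the workspace and LineEditor arguments described above. The 0600 file mode and the skipping of blank lines on load are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const maxHistoryEntries = 1000

// capHistory keeps only the most recent max entries.
func capHistory(lines []string, max int) []string {
	if len(lines) > max {
		lines = lines[len(lines)-max:]
	}
	return lines
}

// encodeHistory writes one entry per line, plain text.
func encodeHistory(lines []string) string {
	if len(lines) == 0 {
		return ""
	}
	return strings.Join(lines, "\n") + "\n"
}

// decodeHistory is the inverse of encodeHistory; blank lines are skipped.
func decodeHistory(data string) []string {
	var out []string
	for _, line := range strings.Split(data, "\n") {
		if line != "" {
			out = append(out, line)
		}
	}
	return out
}

// saveHistory truncates to the cap and writes the file.
func saveHistory(path string, lines []string) error {
	data := encodeHistory(capHistory(lines, maxHistoryEntries))
	return os.WriteFile(path, []byte(data), 0o600)
}

// loadHistory returns nil when the file is missing or unreadable,
// which the caller can treat as "no prior history".
func loadHistory(path string) []string {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil
	}
	return decodeHistory(string(data))
}

func main() {
	dir, err := os.MkdirTemp("", "harvey-history")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "harvey_history")
	if err := saveHistory(path, []string{"/rag ingest docs", "/memory mine", "! ls"}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(loadHistory(path))
}
```

A one-entry-per-line format cannot round-trip an entry that itself contains a newline; how such entries are handled is not covered by this sketch.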
Rejected alternatives.
- ~/.harvey_history: shares history
across workspaces, which leaks commands and paths between projects.
Harvey’s workspace-boundary model makes per-workspace the correct
scope.
- agents/harvey.yaml: would
pollute the config file with ephemeral runtime data and complicate
config schema evolution.
- .spmd session files for history:
session recordings are conversation transcripts, not command logs;
extraction would be fragile and slow.

Consequences.
- termlib/lineeditor.go gains SetHistory and
History methods.
- harvey/terminal.go gains loadCmdHistory
and saveCmdHistory helper functions wired into the REPL
startup and exit.
- harvey.yaml schema, Config,
or any other subsystem.
- HISTFILE locking.

Context. Users who understand the three storage silos (RAG / Memory / Knowledge Base) can get significantly better results, but the ingestion decision (“where does this go?”) breaks flow. No built-in mechanism surfaced actionable hints about pending mining, empty RAG stores, or RAG being disabled.
Decision. Implement a five-part nudge system:
1. Session-start digest: a sessionMemoryDigest() function called after the ready line that prints dim hints only when a condition is actionable:
   - /memory mine
   - /rag ingest
   - /rag on
   No output is printed when everything looks healthy.
2. Enhanced /status: extend cmdStatus with a Memory/RAG summary block (active memories, unmined sessions, active store, chunk count, RAG on/off). This keeps the one-stop status view complete.
3. New /hint command: on-demand improvement suggestions that aggregate all three silos and explain the decision rule. It is a verbose version of the session digest, with context about why each suggestion matters.
4. /help learn topic: a unified “How Harvey learns” help page with a three-column table (what to ingest → which command → where it goes) and the single decision rule:
   - /rag ingest
   - /memory mine
   - /kb observe
5. /recall alias: routes to /memory recall to make the unified retrieval interface the obvious entry point.
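The session-start digest can be sketched as a pure function from a state snapshot to a list of hints. Everything here is illustrative: the memoryState struct, its field names, and the pairing of each condition with a command are assumptions inferred from the context paragraph, and the real sessionMemoryDigest() may take no arguments and print directly.

```go
package main

import "fmt"

// memoryState is a hypothetical snapshot of the conditions the digest checks.
type memoryState struct {
	UnminedSessions int  // sessions not yet processed by /memory mine
	RAGChunks       int  // chunks in the active RAG store
	RAGOn           bool // whether per-prompt RAG injection is enabled
}

// sessionMemoryDigest returns one hint per actionable condition and an empty
// slice when everything looks healthy.
func sessionMemoryDigest(s memoryState) []string {
	var hints []string
	if s.UnminedSessions > 0 {
		hints = append(hints, fmt.Sprintf("%d unmined session(s): run /memory mine", s.UnminedSessions))
	}
	if s.RAGChunks == 0 {
		hints = append(hints, "active RAG store is empty: run /rag ingest")
	}
	// Assumption: suggesting /rag on is only useful once the store has content.
	if s.RAGChunks > 0 && !s.RAGOn {
		hints = append(hints, "RAG is off: run /rag on")
	}
	return hints
}

func main() {
	state := memoryState{UnminedSessions: 2, RAGChunks: 140, RAGOn: false}
	for _, h := range sessionMemoryDigest(state) {
		fmt.Println("hint:", h)
	}
}
```

Returning the hints rather than printing them would let the same function back both the quiet startup digest and the more verbose /hint command.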
Rejected alternatives.
- Single storage silo: would reduce configuration but lose retrieval precision for small models. Topic-scoped RAG stores (e.g., deno_typescript, go) give better recall than one large mixed store.
- Always-on verbose status: printing all memory info on every startup is too noisy. Only surface hints when actionable.
- Merging /rag on + /memory recall into a single toggle: the per-prompt RAG injection (ragAugment) and session-start injection (UnifiedMemory.Recall) are different channels. A single toggle would require auditing whether UnifiedMemory already includes RAG chunks. Deferred to a future audit.
Consequences.
- terminal.go gains a sessionMemoryDigest() call after the ready line.
- commands.go gains cmdHint, enhanced cmdStatus, and a /recall registration.
- helptext.go gains LearnHelpText.
- cmdHelp dispatches "learn" and "memory-overview" to LearnHelpText.
- help topic list is updated to include learn.
Context. Each RAG store entry in
harvey.yaml has a model_map field that maps
generation models to embedding models. In practice every store uses
nomic-embed-text for all generation models, making the map
redundant.
Decision. Deferred. Do not remove
model_map now. The code is already correct and operational.
Remove it when there is a concrete reason to simplify the config schema
(e.g., adding a new embedder type that makes the override
meaningful).
Consequences. model_map remains in the
config and ragAugment continues to honour it. No
user-visible change.
Context. Harvey has two RAG injection paths that run independently:
1. Per-prompt via ragAugment() in terminal.go (when a.RagOn)
2. Session-start via UnifiedMemory.Recall(), which also queries the RAG store
A user with both memory.enabled and
rag.enabled may receive RAG content twice per turn — once
in the system prompt injection and once prepended to each prompt. This
wastes context tokens and may confuse small models.
Decision. Deferred. Audit and fix when a user
observes noticeably degraded context efficiency. The fix would be to
either: (a) skip RAG chunks in UnifiedMemory.Recall() when
a.RagOn is true, or (b) make ragAugment a
no-op when UnifiedMemory already injected from the same
store.
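Option (a) could look roughly like the sketch below. It is only an illustration of the idea: the chunk type, its Silo labels, and the filterRecall helper are hypothetical, and the real UnifiedMemory.Recall() may not expose results in this shape.

```go
package main

import "fmt"

// chunk is a hypothetical recalled item tagged with the silo it came from.
type chunk struct {
	Silo string // "rag", "memory", or "kb" (assumed labels)
	Text string
}

// filterRecall sketches option (a): when per-prompt augmentation is on, drop
// RAG chunks from the session-start recall so they are not injected twice.
func filterRecall(recalled []chunk, ragOn bool) []chunk {
	if !ragOn {
		return recalled
	}
	kept := make([]chunk, 0, len(recalled))
	for _, c := range recalled {
		if c.Silo != "rag" {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	recalled := []chunk{
		{Silo: "rag", Text: "Deno permissions overview"},
		{Silo: "memory", Text: "User prefers table-driven tests"},
		{Silo: "kb", Text: "Observation about chunk sizes"},
	}
	for _, c := range filterRecall(recalled, true) {
		fmt.Println(c.Silo+":", c.Text)
	}
}
```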
Consequences. Known overlap. No immediate action required.
Context. Small models emit tool calls as JSON fenced
blocks rather than structured API responses. The original
tryExecuteProseToolCalls returned bool and
could not distinguish “dispatched successfully” from “dispatched but
every call errored”. When models hallucinated tool names the warning was
suppressed because len(results) > 0 was always true.
Decision. Change
tryExecuteProseToolCalls to return
(dispatched bool, unknownNames []string). Track a
succeeded counter internally; set
dispatched = true only when ≥1 call succeeded. When
unknownNames is non-empty, inject a correction message into
history after a.AddMessage("assistant", ...) so
history ordering is: user → assistant → correction-user. This gives the
model a chance to retry with the correct tool names.
Consequences. The noToolCalls guard
also gates autoExecuteReply to prevent directory-tree code
blocks from being offered as files to write after successful tool-call
turns.
Context. Harvey needs to know whether a chat turn
resulted in structured tool calls (via RunToolLoop) so it
can skip autoExecuteReply when tool calls already handled
file writing. The check len(a.History) == histLenBeforeChat
correctly detects the absence of tool calls only when it is evaluated
before a.AddMessage appends the assistant reply to the history.
Decision. Capture
histLenBeforeChat := len(a.History) before the
Chat/RunToolLoop call. Compute
noToolCalls := len(a.History) == histLenBeforeChat
before a.AddMessage. This invariant must be
preserved: any refactor that moves a.AddMessage before the
noToolCalls check will silently break the guard.
Consequences. Documented as a key invariant in
CLAUDE.md.
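The ordering constraint can be illustrated with a small sketch. The agent and message types and the chat helper are hypothetical stand-ins; the sketch assumes that the tool loop appends its tool messages to the history, which is what makes the length comparison meaningful.

```go
package main

import "fmt"

// message and agent are minimal, hypothetical stand-ins for Harvey's types.
type message struct{ Role, Content string }

type agent struct{ History []message }

// AddMessage appends one message to the history.
func (a *agent) AddMessage(role, content string) {
	a.History = append(a.History, message{Role: role, Content: content})
}

// chat stands in for the Chat/RunToolLoop call. Assumption: each structured
// tool call appends a tool message to the history before the reply returns.
func (a *agent) chat(prompt string, toolResults []string) string {
	for _, r := range toolResults {
		a.AddMessage("tool", r)
	}
	return "reply to: " + prompt
}

// turn runs one chat turn and reports whether it involved no tool calls.
func (a *agent) turn(prompt string, toolResults []string) (noToolCalls bool) {
	a.AddMessage("user", prompt)
	histLenBeforeChat := len(a.History)
	reply := a.chat(prompt, toolResults)
	// Must be computed before AddMessage below; otherwise the assistant
	// message itself changes the length and the guard is always false.
	noToolCalls = len(a.History) == histLenBeforeChat
	a.AddMessage("assistant", reply)
	return noToolCalls
}

func main() {
	a := &agent{}
	fmt.Println("plain turn, noToolCalls:", a.turn("explain this file", nil))
	fmt.Println("tool turn, noToolCalls:", a.turn("write main.go", []string{"wrote main.go"}))
}
```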
Context. Harvey needs to accumulate knowledge across sessions without polluting the LLM context window on every turn. Three distinct content types require different ingestion and retrieval strategies: (1) external documents, (2) session experience, (3) research observations.
Decision. Three independent silos unified at
retrieval time by UnifiedMemory.Recall():
| Silo | Ingestion | Retrieval |
|---|---|---|
| RAG store | /rag ingest (explicit) | Per-prompt via ragAugment() |
| Memory store | /memory mine or auto-mine on exit | Session-start via UnifiedMemory |
| Knowledge base | /kb observe (explicit) | On-demand via UnifiedMemory |
Consequences. Each silo has its own command
namespace (/rag, /memory, /kb).
The unified retrieval via /memory recall is the recommended
entry point. All three silos share a token budget enforced at injection
time.