Harvey Security Review - Pre-Release Assessment

Overview

This document provides a comprehensive security review of Harvey (v0.0.2+) in preparation for release. It identifies strengths, vulnerabilities, and recommendations for hardening the codebase.

Review Date: 2026-05-19 (Updated)
Reviewer: Mistral Vibe
Target Version: 0.0.2+ (post-fixes)

Executive Summary

Harvey has significantly improved its security posture since the initial review. All critical issues identified previously have been addressed:

✅ /run command now uses parseCommandLine() validation
✅ Sensitive environment variable list expanded (Cohere, Groq, Perplexity)
✅ Sensitive file deny list expanded (SSH keys, authorized_keys)

Current Security Score: 8.5/10 - Strong foundation, suitable for personal and experimental use. See SECURITY.md for known limitations before production deployment.

Strengths (Well Implemented)

Security Feature	Implementation	Status
API Key Filtering	`filterSkillEnvironment()` & `filterCommandEnvironment()` block 11 LLM provider API keys	✅ Excellent
Shell Injection Prevention	`exec.CommandContext()` used directly (no shell), `parseCommandLine()` parses without shell interpretation	✅ Good
Path Traversal Protection	`AbsPath()` validates workspace containment, `resolveWorkspacePath()` checks symlinks	✅ Good
Sensitive File Blocking	Tools reject `.env`, `.pem`, `.key`, `.p12`, `.pfx`, `harvey.yaml`, SSH keys	✅ Excellent
Workspace Sandboxing	All file ops constrained to workspace root via `Workspace.AbsPath()`	✅ Good
Safe Mode	Command allowlist with `/safemode` management	✅ Good
Permission System	Path-prefix based `read`, `write`, `exec`, `delete` controls	✅ Good
Audit Logging	In-memory ring buffer tracks command execution, file ops, skill runs	✅ Good
Editor Validation	`validateEditorPath()` rejects shell metacharacters and `..` traversal	✅ Good
Resource Cleanup	Proper `defer` usage for file handles, HTTP responses, contexts	✅ Good
Timeout Protection	Configurable `RunTimeout` for shell commands via `exec.CommandContext()`	✅ Good

Fixed Issues (Previously Reported, Now Resolved)

Issue	Location	Status	Fix Applied
`/run` command lacked shell metacharacter validation	`commands.go:2190`	✅ FIXED	Now uses `parseCommandLine(cmdLine)` at line 2190
Incomplete sensitive env var list	`skill_dispatch.go`, `commands.go`	✅ FIXED	Added COHERE_API_KEY, GROQ_API_KEY, PERPLEXITY_API_KEY
Missing SSH key file blocking	`tools.go`	✅ FIXED	Added `id_rsa`, `id_ed25519`, `authorized_keys` to deny list

Remaining Medium Priority Issues

Issue	Location	Risk	Recommendation
No TLS verification for remote routes	`routing.go`, `anyllm_client.go`	API key leakage over plain HTTP	Add TLS verification option, warn on HTTP URLs for cloud providers
Install script uses pipe-to-shell	`installer.sh`, `INSTALL.md`	Supply chain risk	Document SHA256 verification step before piping to shell
No rate limiting on tool calls	`builtin_tools.go`	DoS via rapid tool invocations	Add per-turn and per-session tool call limits (configurable)
Symlink race condition window	`workspace.go:AbsPath()`	TOCTOU between path resolution and file access	Consider using O_PATH file descriptors for atomic resolution
Agents dir check is string-based	`tools.go:isAgentsDir()`	Could be bypassed by symlink	Use `filepath.EvalSymlinks` in `isAgentsDir()` for defense in depth
No validation of route URLs	`routing.go:clientForEndpoint()`	SSRF potential	Validate URL format, restrict to known schemes (http/https)
No input length validation on tool args	`builtin_tools.go`	Memory exhaustion via large inputs	Add max length checks on all string inputs (e.g., 64KB)
PowerShell execution on Windows	`skill_dispatch.go:287`	Full script execution with bypass	Consider same path restrictions as bash; restrict to workspace

Clarifications (Documentation Updates Needed)

parseCommandLine() Behavior

Documentation (corrected): SECURITY.md originally stated that parseCommandLine() “rejects shell metacharacters.” This has been corrected — SECURITY.md now accurately describes that security comes from exec.Command() not invoking a shell.

Actual Behavior: The function parses command lines with quote handling but does NOT reject shell metacharacters like |, >, <, &, ;.

Why This Is Still Secure: Security comes from using exec.CommandContext(program, args...) which does NOT invoke a shell. Metacharacters are passed as literal string arguments to the program, not interpreted by a shell. This is actually a strength - it avoids shell injection entirely.

Recommendation: Update documentation to clarify that: - Shell metacharacters are NOT filtered/rejected by parseCommandLine() - Security is achieved by NOT using a shell (direct exec) - Metacharacters are passed as literal arguments

Code Quality & Hardening

Item	Status	Notes
Go version	1.26.2	Current, good
Dependencies	Reviewed	Uses `mozilla-ai/any-llm-go`, `go-sqlite` - no known CVEs in current versions
Error handling	Good	Errors are checked and wrapped
Resource limits	Partial	Timeouts exist, but no memory limits on skill execution
Logging	In-memory only	Audit buffer doesn’t persist - good for privacy but consider optional file logging

Security Scorecard

Category	Score	Notes
Command Injection	9/10	Strong protections via direct exec + parseCommandLine. PowerShell on Windows needs review.
Path Traversal	9/10	Well protected, minor symlink TOCTOU remains
Secrets Management	9/10	API keys filtered comprehensively, but pattern-based filtering would catch edge cases
Input Validation	7/10	Tool args need length limits; URL validation needed for routes
Network Security	6/10	No TLS enforcement for remote routes; SSRF possible
Sandboxing	8/10	Workspace containment strong, but pure Go has inherent limits

Overall: 8.5/10 - Production-ready with minor hardening recommended

Recommendations for Next Release

High Priority (Address Before Production Deployment)

Network Security Hardening
- Add TLS certificate verification for all remote LLM endpoints
- Warn or block HTTP URLs for cloud providers (Anthropic, OpenAI, Mistral, etc.)
- Validate URL schemes in clientForEndpoint() to prevent SSRF
Windows PowerShell Security
- Restrict PowerShell script execution to workspace directory
- Consider using -NoProfile flag to prevent profile-based attacks
- Add execution policy restrictions

Medium Priority (Address in Next Minor Release)

Input Validation
- Add maximum length limits on tool arguments (e.g., 64KB per string argument)
- Validate route URL formats before connection
- Consider pattern-based filtering for environment variables (e.g., *_API_KEY, *_SECRET)
Rate Limiting
- Add configurable rate limits on tool calls per turn and per session
- Consider resource quotas (CPU, memory) for skill execution
Symlink Hardening
- Review TOCTOU windows in workspace path resolution
- Consider using O_PATH file descriptors for atomic file access

Low Priority (Future Enhancements)

Audit Persistence
- Add optional file-based audit logging (disabled by default for privacy)
- Consider remote audit log shipping for enterprise deployments
Sandboxing Improvements
- Consider using Linux namespaces/seccomp for stronger isolation
- Evaluate gVisor or similar for untrusted code execution

Detailed Analysis

1. Environment Variable Filtering

Current State: Harvey filters known LLM API keys from child processes executed via skills and /run commands.

Strengths: - Comprehensive filtering in filterSkillEnvironment() and filterCommandEnvironment() - Includes all major providers: Anthropic, OpenAI, Mistral, DeepSeek, Google/Gemini, Cohere, Groq, Perplexity - Always ensures PATH is set - Both functions maintain synchronized lists

Gaps: - No pattern-based filtering (e.g., *_API_KEY, *_SECRET, *_TOKEN) - Could miss custom/named API key variables

Recommendation: Consider adding pattern-based filtering as a fallback, but be cautious of false positives.

2. Command Execution

Current State: Harvey uses exec.CommandContext() directly without shell interpretation.

Strengths: - No shell injection via sh -c pattern - parseCommandLine() handles quoted arguments properly - Timeout support via context - Environment filtering applied to all child processes - Safe mode restricts executable commands

Implementation Details: - /run command: Uses parseCommandLine() validation (line 2190) - run_command tool: Uses parseCommandLine() validation (line 268) - git_command tool: Uses parseCommandLine() for args (line 342) - Skill execution: Direct bash/powershell invocation with filtered environment

Note: The unused cmdRunCtx function (line 3469) does NOT use parseCommandLine(), but since it’s not registered as a handler, this is not a security issue. However, it should either be removed or updated for consistency.

3. Path Traversal Protection

Current State: Workspace paths are validated to stay within the workspace root.

Strengths: - Workspace.AbsPath() uses filepath.Clean() and prefix checking - resolveWorkspacePath() additionally checks symlinks via EvalSymlinks - Sensitive files blocked by extension/name pattern - Agents directory explicitly blocked

Implementation: - AbsPath(): String-based prefix comparison (potential TOCTOU) - resolveWorkspacePath(): Uses EvalSymlinks for symlink resolution - sensitiveFileDenied(): Checks against deny list including SSH keys

Gaps: - Small TOCTOU window between path resolution and actual file access - String-based comparison in isAgentsDir() could theoretically be bypassed

4. Sensitive File Access

Current State: Tools reject access to known sensitive file patterns.

Deny List (sensitiveDenyPatterns in tools.go): - .env, .pem, .key, .p12, .pfx (extension matching) - authorized_keys, harvey.yaml, id_ed25519, id_rsa (exact name matching) - Files starting with .env (prefix matching)

Strengths: - Comprehensive coverage of common sensitive files - Covers SSH keys, certificates, config files

Gaps: - No size limits on files that can be read - Could miss custom sensitive filenames

5. Network Security

Current State: Remote routes can use HTTP or HTTPS.

Strengths: - Cloud providers use environment variables (which are filtered from child processes) - Routing can be disabled via configuration

Gaps: - No TLS certificate verification enforced - HTTP URLs accepted for cloud providers (potential credential leakage) - No validation of endpoint URLs (SSRF potential) - No hostname validation

URL Schemes Supported: - ollama://, llamafile://, llamacpp:// → converted to HTTP - Direct http:// and https:// URLs also accepted

Conclusion

Harvey has matured into a capable tool with a strong security foundation, suitable for personal and experimental use. All previously identified critical issues have been addressed. Review SECURITY.md for known limitations before deploying in a sensitive or multi-user environment.

Key Achievements: 1. ✅ Command injection protection via direct exec (no shell) 2. ✅ Comprehensive API key filtering 3. ✅ Workspace sandboxing with path traversal protection 4. ✅ Sensitive file blocking 5. ✅ Safe mode for command restrictions 6. ✅ Audit logging

Remaining Work: - Network security hardening (TLS, SSRF prevention) - Windows PowerShell restrictions - Input validation improvements

The codebase demonstrates security-conscious design throughout, with proper use of Go’s security primitives and defense-in-depth approaches.

References

This review was conducted by analyzing the Go source code in the harvey/ directory, including: commands.go, skill_dispatch.go, tools.go, routing.go, anyllm_client.go, workspace.go, permissions.go, audit.go, terminal.go, builtin_tools.go, config.go.