Run your agent stack at expert level
This is the working manual for the always-on AI stack you own: Rook (your primary agent) running on OpenClaw, which drives the Claude Code CLI as its runtime, plus Bolo on its own Hermes gateway. It decodes exactly how the machine is wired today, what you gained and silently lost in the move to the CLI runtime, and the concrete levers to pull next. No marketing — just how to operate it.
The decision
You are committing to the claude-cli runtime. Rook's main loop
is the claude binary (v2.1.156) spawned by OpenClaw, billed through your
Claude Max subscription via a local proxy at 127.0.0.1:18801 instead
of API rates. The live config is ~/.openclaw/openclaw.json
(config.json is dead legacy). Primary model just switched to
claude-opus-4-8.
What you own now
The full Claude Code harness (skills, MCP, hooks, subagents, plan mode) on top of OpenClaw's gateway, cron fleet, and memory core — minus OpenClaw's pi-era context management, which no longer runs.
Today's lesson, in two lines
1. Context blew to "999% / 0 compactions" and crashed because
OpenClaw's safeguard compaction is a pi-runtime feature that does not run
under claude-cli, and the CLI's own auto-compact does not fire in headless
-p mode.
2. Therefore transcripts grow unbounded — context hygiene is now your job via
hooks, session rotation, and fresh-session discipline (see §04).
Architecture Map
Two independent agents on two gateways. Rook bills Anthropic Max; Bolo bills OpenAI Codex. Drawn from the live config — no external libraries.
Bolo is a distinct operator agent — "not Rook in another shell." Coordination with Rook happens through the wiki three-surface model, not a shared runtime.
The Runtime Truth — claude-cli vs pi
Moving Rook's main loop from OpenClaw's embedded pi runtime to the claude-cli runtime was a real trade. You gained the Claude Code harness and Max billing; you lost OpenClaw's native context engineering and accounting. Several config blocks that still look active are now silently inert.
| Dimension | Gained on claude-cli | Lost from pi runtime |
|---|---|---|
| Billing | Claude Max via local proxy :18801 — flat-rate, no API metering | Per-call cost telemetry OpenClaw used to compute |
| Harness | Full Claude Code: skills, MCP, hooks, subagents, plan mode, output styles | — |
| Context mgmt | CLI's own 200k window | Safeguard compaction + memoryFlush + contextPruning (pi hooks) — inert |
| Accounting | — | Accurate cost / token-window math (opus-4-8 isn't even in the provider catalog) |
| Failover | — | Multi-provider failover at the OpenClaw layer (CLI bills one entitlement) |
| Tool governance | CLI hooks (unused today) | OpenClaw's per-call tool policy — bypassed by bypassPermissions |
These keys are still in openclaw.json but do nothing for Rook's main
loop — they were pi-runtime behaviors:
- agents.defaults.compaction.* (mode: safeguard, keepRecentTokens, reserveTokensFloor, recentTurnsPreserve, postCompactionSections): inert
- compaction.memoryFlush.enabled: the "flush to memory before trim" hook never fires (inert)
- agents.defaults.contextPruning.mode = "cache-ttl": pi context-engine behavior (inert)
- plugins.slots.contextEngine = "legacy" + Cortex engine disabled: no context engine is running
Treat context hygiene as a CLI-side + skill-side concern, not an OpenClaw-compaction concern. The watchdog (kill/retry on 8-min silence) is the only reliability mechanism that does still run.
Claude Code CLI Mastery
Everything starts with the exact invocation OpenClaw assembles. Master these flags and you understand the whole runtime.
claude \
  -p \                                          # headless / print mode, no TUI
  --include-partial-messages \                  # token-level streaming to Telegram
  --verbose \                                   # full event stream on stdout
  --setting-sources user \                      # ONLY ~/.claude/settings.json loaded
  --allowedTools mcp__openclaw__* \             # baseline allowlist (moot under YOLO)
  --resume <session-uuid> \                     # rehydrate transcript .jsonl
  --permission-mode bypassPermissions \         # YOLO: no prompts, all tools
  --strict-mcp-config \                         # ignore all other MCP configs
  --mcp-config /tmp/openclaw-cli-mcp-*/mcp.json \
  --plugin-dir /tmp/openclaw/.../openclaw-skills \
  --effort high \                               # from thinkingDefault: high
  --model opus \                                # ALIAS; real model pinned at proxy
  --input-format stream-json \
  --output-format stream-json \
  --permission-prompt-tool stdio \
  --replay-user-messages
Flag-by-flag decode (the load-bearing ones)
| Flag | What it means here |
|---|---|
| -p | Headless. The fundamental constraint: Rook is non-interactive. Required for stream-json output. Means the CLI's interactive auto-compact loop never runs. |
| --setting-sources user | Only ~/.claude/settings.json is loaded. Project .claude/settings.json and .local are ignored, and OpenClaw rewrites this to user even if told otherwise. Per-repo settings have no effect on Rook. |
| --permission-mode bypassPermissions | YOLO. No prompts, all tools run. Injected because tools.exec is security:full, ask:off. This is why Rook runs arbitrary Bash with no human in the loop. |
| --model opus | Alias, not a pinned version. Every opus-* collapses to bare opus; the real model (claude-opus-4-8) is selected at the proxy. Changing the model config does not change this flag. |
| --effort high | Reasoning budget from thinkingDefault. Levels: low / medium / high / xhigh / max. xhigh/max available but unused. |
| --resume <uuid> | Rehydrates ~/.claude/projects/-home-botbox--openclaw-workspace/<uuid>.jsonl. Re-sends the whole transcript every turn, which is the root of the unbounded-growth problem. |
| --strict-mcp-config | Ignore every other MCP config; use only the ephemeral /tmp file. Keeps Rook's tool surface deterministic. |
Notable absences: no --append-system-prompt-file on resume (persona is injected as the first user message instead), no --add-dir,
--agents, --exclude-dynamic-system-prompt-sections,
--max-budget-usd, or --fallback-model. Env is scrubbed of all
ANTHROPIC_* / CLAUDE_CODE_OAUTH_* before spawn — auth comes solely
from ~/.claude/.credentials.json (Max OAuth) which the proxy intercepts.
Capability inventory — leverage status
| Capability | Status | Notes |
|---|---|---|
| Skills (/skill-name, plugin-dir) | leveraged | 49 OpenClaw skills, the primary capability surface Rook uses. |
| MCP servers (openclaw + wiki) | leveraged | Sessions, memory, web, cron, subagents, wiki tools. |
| Permission mode / YOLO | leveraged | bypassPermissions — full autonomy. |
| Thinking / effort | leveraged | high; xhigh/max available. |
| Stream-json I/O · resume | leveraged | Core transport + session persistence. |
| CLAUDE.md / memory auto-discovery | untapped | No CLAUDE.md exists. Discovery is enabled but finds nothing. |
| Hooks (settings.json) | untapped | No hooks key. Biggest untapped surface — see below. |
| Native subagents (--agents) | untapped | CLI-level fan-out unused; Rook fans out via OpenClaw subagents instead. |
| Plan mode | untapped | --permission-mode plan never used. |
| Custom slash commands | untapped | No ~/.claude/commands/. |
| Output styles · --json-schema · --name | untapped | Structured output / session naming unused. |
| --exclude-dynamic-system-prompt-sections | recommended | Big prompt-cache win on resume (see below). |
Top untapped levers — concrete how-to
Constraint: only ~/.claude/settings.json is loaded
(--setting-sources user). Everything below goes in the user file or
OpenClaw's arg template.
1. Lifecycle hooks in ~/.claude/settings.json recommended
The single highest-leverage surface. Hooks run deterministic shell on
PreToolUse / PostToolUse / Stop /
SessionStart / PreCompact — the harness runs them, so the
model can't forget. Use the update-config skill to edit safely.
{
  "hooks": {
    "SessionStart": [
      { "hooks": [ { "type": "command",
          "command": "cat ~/.openclaw/workspace/memory/handoffs/latest.md" } ] }
    ],
    "PreCompact": [
      { "hooks": [ { "type": "command",
          "command": "~/.openclaw/workspace/scripts/flush-working-surface.sh" } ] }
    ],
    "PreToolUse": [
      { "matcher": "Bash", "hooks": [ { "type": "command",
          "command": "~/.openclaw/workspace/scripts/bash-denylist.sh" } ] }
    ]
  }
}
2. A workspace CLAUDE.md recommended
Add ~/.openclaw/workspace/CLAUDE.md. It is auto-discovered
(--bare is not set), giving stable, cache-friendly project memory
instead of stuffing everything into the turn-1 user blob. Keep it lean — links to
MEMORY.md, the lane index, and a one-skill-per-task routing map.
3. Native subagents via --agents / ~/.claude/agents/ recommended
Define focused in-session sub-agents (e.g. reviewer,
researcher) with scoped prompts and tools. The CLI's own Task tool can
then parallelize within a session — cheaper than spawning whole new
claude processes for in-session fan-out, and complementary to OpenClaw's
cross-session subagents.
4. --exclude-dynamic-system-prompt-sections for cache reuse recommended
Moves per-machine sections (cwd, env, git status, memory paths) out of the system prompt into the first user message → far better cross-turn prompt-cache reuse on resume. A direct, recurring cost win for a long-lived single-user agent. Requires adding the flag to OpenClaw's arg template.
5. A PreToolUse Bash denylist to guard YOLO mode recommended
bypassPermissions + full host access is powerful but unguarded. A
PreToolUse matcher on Bash can block destructive patterns even
under YOLO: a hook that exits with code 2 blocks the call (other non-zero exit codes are treated as non-blocking errors).
#!/usr/bin/env bash
# stdin = JSON tool input; exit 2 blocks the tool call.
cmd=$(jq -r '.tool_input.command // ""')
case "$cmd" in
  *"rm -rf /"*|*"git push"*"--force"*"main"*|*":(){ :|:&"*)
    echo "blocked by denylist: $cmd" >&2; exit 2;;
esac
exit 0
Optionally pair risky lanes with --bare + a sandbox. The
CLI offers the hook mechanism; nothing uses it today.
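Because a denylist only helps if its patterns match what you expect, it can be worth exercising them outside the hook. The sketch below copies the three patterns from the script above into a function that needs no jq and no stdin. The function name is_denied is hypothetical and is not part of OpenClaw or the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical, jq-free copy of the denylist patterns so they can be
# tested in isolation. Returns 0 when the command would be blocked.
is_denied() {
  case "$1" in
    *"rm -rf /"*|*"git push"*"--force"*"main"*|*":(){ :|:&"*)
      return 0 ;;
  esac
  return 1
}

# Example checks (print "blocked" or "allowed" for a sample command):
#   is_denied "git push --force origin main" && echo blocked || echo allowed
#   is_denied "git push origin main --force" && echo blocked || echo allowed
```

Running the two example checks shows a gap in the patterns as written: the first form is blocked, but the second is allowed because the glob requires --force to appear before main, and the short -f flag is not covered at all. The first pattern is also broad, since it matches any rm -rf on an absolute path such as rm -rf /tmp/x. Treat the list as a starting point to extend, not as a complete guard.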
Context Hygiene Playbook
Context hit "999% / 0 compactions" and crashed every turn. Root cause:
- OpenClaw's safeguard compaction is a pi-runtime hook: it does not run under claude-cli (OpenClaw reports Compactions: 0 no matter how big context grows).
- The CLI's own auto-compact does not fire in headless -p mode; there is no interactive compaction loop.
- --resume re-sends the entire transcript every turn, and contextPruning: cache-ttl deliberately keeps the full history for cache hits. Nothing trims by token count.
- Result: transcripts grow unbounded (some .jsonl files are 7–17 MB) until they exceed the window.
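Since nothing trims transcripts automatically, a quick way to see which sessions are at risk is to list the .jsonl files above a size threshold. This is a minimal sketch, not an OpenClaw feature: the function name list_large_transcripts and the 5 MB threshold in the usage comment are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<bytes> <path>" for every .jsonl file in a
# directory whose size is at or above min_bytes, largest first.
list_large_transcripts() {
  local dir="$1" min_bytes="$2" f size
  for f in "$dir"/*.jsonl; do
    [ -f "$f" ] || continue          # glob matched nothing, or not a regular file
    size=$(wc -c < "$f")
    size=$((size))                   # normalize whitespace some wc builds emit
    if [ "$size" -ge "$min_bytes" ]; then
      printf '%s %s\n' "$size" "$f"
    fi
  done | sort -rn
}

# Usage (threshold of 5 MB is illustrative):
#   list_large_transcripts ~/.claude/projects/-home-botbox--openclaw-workspace $((5 * 1024 * 1024))
```

File size is only a rough proxy for token count, but it is cheap to check from a cron or before resuming a session.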
Fix patterns
PreCompact + SessionStart hooks
PreCompact: flush the live working-surface to
memory/handoffs/latest.md (the memoryFlush analog).
SessionStart: re-inject the "Every Session / Memory / Safety" blocks as
additionalContext — recreates the pi post-compaction re-prime.
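One way to approximate the re-prime is a SessionStart hook script that prints the relevant files, relying on the CLI adding a SessionStart hook's stdout to the session context. The sketch below assumes that behavior and uses a hypothetical function name, emit_session_context. The only path taken from this manual is the handoff file; where the "Every Session / Memory / Safety" blocks live is left for you to fill in.

```shell
#!/usr/bin/env bash
# Hypothetical SessionStart helper: print each readable file under a
# heading so the output can be injected as session context.
emit_session_context() {
  local f
  for f in "$@"; do
    [ -r "$f" ] || continue          # skip missing files instead of failing the hook
    printf '## %s\n' "$(basename "$f")"
    cat "$f"
    printf '\n'
  done
}

# Usage inside the hook script (add further paths as needed):
#   emit_session_context ~/.openclaw/workspace/memory/handoffs/latest.md
```

Skipping unreadable files keeps a stale or missing handoff from breaking session start. The trade-off is that a missing file goes unnoticed unless something else checks for it.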
Session rotation
The cheapest reset is a fresh session: drop --resume,
get a new UUID. Use OpenClaw session rotation (not CLI /clear, which doesn't
apply headless). Tighten the 365-day group idle-reset for noisy groups.
Bootstrap trim
The fixed turn-1 injection (MEMORY.md + context-tree + DREAMS + handoffs + SOUL/IDENTITY) is large and pushes context up fast on the fixed 200k window. Trim to load-bearing lines; move the rest to read-on-demand docs and a lean CLAUDE.md.
Fresh-session discipline
Start a new session at thread forks and at high context pressure
(SESSION-RITUAL's >70% trigger). Run memory-close / Cortex distill
before rotating so nothing salient is lost.
Do not trust these on-screen numbers — they reflect the pi context engine that isn't running:
- Compaction count: always 0; compaction isn't happening.
- Context % / utilization: can read absurd values ("999%") because OpenClaw's 100k working-context budget doesn't match the CLI's real 200k window.
- OpenClaw cost accounting: claude-opus-4-8 isn't in the provider catalog, so cost/token-window math for the model Rook actually runs is missing.
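The budget mismatch is easy to see with a little arithmetic: the same token count yields a different percentage depending on which denominator is used. The sketch below is illustrative only. The function name context_pct and the 150,000-token example are assumptions, and it does not reproduce how OpenClaw actually computes its on-screen figure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: integer percentage of a token budget in use.
context_pct() {
  local tokens="$1" budget="$2"
  echo $(( tokens * 100 / budget ))
}

# The same 150,000-token transcript, measured two ways:
#   context_pct 150000 100000   # against the 100k working-context budget
#   context_pct 150000 200000   # against the CLI's 200k window
```

Against the 100k budget the transcript reads as 150% full, while against the 200k window it is at 75%. That already exceeds SESSION-RITUAL's 70% rotation trigger, which should be judged against the real window.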
OpenClaw Control Surface
The knobs that matter in ~/.openclaw/openclaw.json (live, 20.5 KB).
Scannable, with status badges.
| Knob | Value | Status |
|---|---|---|
| model.primary | anthropic/claude-opus-4-8 | leveraged |
| model.fallbacks | opus-4-7, sonnet-4-6 | leveraged |
| agents.list[main].model | claude-opus-4-7 ⚠ overrides default to 4-7 | fix |
| thinkingDefault | high | leveraged |
| maxConcurrent / timeoutSeconds | 5 / 900s | leveraged |
| subagents.model | claude-opus-4-8 (most expensive) | cost |
| watchdog noOutputTimeoutMs | 480000 (8 min) kill/retry | leveraged |
| contextPruning / compaction.* | cache-ttl / safeguard | inert |
| gateway port:18789, bind:loopback | localhost only | leveraged |
| tailscale.mode | off | untapped |
| heartbeat | {} (off) | untapped |
| memory-core plugin · dreaming | enabled | leveraged |
| cortex-engine plugin | enabled:false, slot = legacy | disabled |
| session.resetByType.group | idle 525600 min ≈ 365 days | tighten |
- Subagents run on Opus-4-8 (input/output ~15/75 vs sonnet ~3/15). Pin subagents.model = anthropic/claude-sonnet-4-6 for fan-out / grunt work: roughly 5× cheaper. Single biggest multi-agent cost lever.
- 5 wiki crons run as Rook on Opus (Topology Steward 4h, Intake Curator 2h, Governance Sentinel daily, Decision Synthesizer daily). Each firing is a full Opus turn; route the routine ones to sonnet-4-6.
- Disable the temporary Voice-Note Backfill cron (every 30 min) once the Mar 19–Apr 16 batch is done; it's a backlog task, not an ongoing need.
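The "roughly 5×" figure follows directly from the rates quoted above. The worked example below assumes those rates are USD per million tokens, a unit this manual does not state, and uses a made-up turn of 50,000 input and 2,000 output tokens. The function name turn_cost_microusd is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: cost of one turn in micro-dollars.
# With rates in USD per million tokens, tokens * rate is already micro-USD.
turn_cost_microusd() {
  local in_tok="$1" out_tok="$2" in_rate="$3" out_rate="$4"
  echo $(( in_tok * in_rate + out_tok * out_rate ))
}

# Illustrative turn: 50,000 input tokens, 2,000 output tokens.
#   turn_cost_microusd 50000 2000 15 75   # Opus rates
#   turn_cost_microusd 50000 2000 3 15    # Sonnet rates
```

Under these assumptions the Opus turn comes to 900,000 micro-dollars (about $0.90) and the Sonnet turn to 180,000 (about $0.18). The ratio is exactly 5 for any token mix, because both the input and output rates differ by the same factor.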
Cortex context-engine: installed but disabled
The Cortex context engine is path-loaded but switched off
(plugins.entries.cortex-engine.enabled:false), and the
contextEngine slot is pinned to legacy. Cortex injects context
via prompt injection rather than relying on pi compaction — so it could plausibly
run alongside the claude-cli runtime and is the one piece that might actually
improve Rook's context under the CLI. Worth a dedicated experiment.
How We Work, Leveled Up
The system is architecturally rich and operationally drifting: the design (rituals, two-agent split, five memory layers, intelligence-loop crons) is excellent, but it relies on the operator remembering to keep it fresh, and durable state has fallen weeks behind the live system.
Current rituals
- Session open/close (docs/SESSION-RITUAL.md): name session → read MEMORY → today's note → latest.md → inbox → checkpoint; close with a Landed/Open/Next handoff. Inconsistently applied: handoffs have a 10-day gap and latest.md is stale.
- Five memory layers: Cortex (recall), ByteRover (context-tree), LCM, Memory Wiki, Dreaming. Overlapping; MEMORY.md is ~6 weeks stale.
- Rook/Bolo split: Rook = synthesis/architecture; Bolo = last-mile implementation. Coordinated via the wiki three-surface model.
Upgrades
- Hook-enforce the ritual. SessionStart auto-reads MEMORY/today/latest and prints the working surface; Stop/PreCompact verify a fresh close-note exists or draft one. Turns the most-violated discipline into a default.
- Auto-fresh MEMORY.md. A daily cron regenerates the volatile half (crons, services, model config, active-project manifest) from live queries, which kills freshness debt at the root.
- One canonical skill per task. Map each recurring task (close session, ship PDF, sweep intel, build a phase, health-check) to exactly one skill; demote the ~80-skill sprawl.
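The "verify a fresh close-note exists" check from the first upgrade can be as small as a file-age test. This is a sketch under assumptions: the function name close_note_is_fresh is hypothetical, the 60-minute window is illustrative, and what the hook does when the check fails (draft a note, warn, or block) is left open.

```shell
#!/usr/bin/env bash
# Hypothetical check for a Stop or PreCompact hook: succeed only if the
# note exists and was modified less than max_age_min minutes ago.
close_note_is_fresh() {
  local note="$1" max_age_min="$2"
  [ -f "$note" ] || return 1
  # find prints the path only when the modification time is recent enough
  [ -n "$(find "$note" -mmin "-$max_age_min")" ]
}

# Usage (60-minute window is illustrative):
#   close_note_is_fresh ~/.openclaw/workspace/memory/handoffs/latest.md 60 \
#     || echo "close note is missing or stale" >&2
```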
Net-new, now possible
- Subagent fan-out: one Rook session spawns N parallel agents (audit backend / audit frontend / write deploy doc concurrently) and aggregates. The real implementation of the orchestrator vision.
- Plan mode: produce a reviewable plan before touching the Tend VPS or proxy (the "deploy, not just build" lesson).
- git-worktree parallel work (EnterWorktree): advance Tend-solo and Tend-team in isolated worktrees with no branch collisions.
- MCP-native recall: unify the five-system recall surface behind mcp__openclaw__memory_search + mcp__wiki__* instead of shelling out.
Ported Pi-Era Goodies
Migration to claude-cli happened ~2026-05-18 → 05-22. Some good habits survived; some died at the boundary. Here's what to bring forward.
| Pi-era thing | Verdict | Port path |
|---|---|---|
| Compaction safeguard + memory-flush | lost | PreCompact + SessionStart hooks |
| Dream Diary (DREAMS.md) | still-works (diary degraded) | Nightly cron to render from memory/dreaming/light/ |
| Continuity reload ritual | still-works | None — it's the template (file-driven) |
| Calibration / reflex loop | lost | CLAUDE.md line + Stop hook to append corrections |
| Heartbeat | lost (by design) | Use crons / loop skill, not heartbeat |
| Ritual crons (Distill/Close/Brief/Patrol) | lost (disabled) | Flip enabled:true; crons still work |
| Mind state (decisions/threads JSON) | lost (frozen) | Don't revive server; fold into handoff note |
| Personality / SOUL | still-works | Trim for context budget only |
Top 3 ports (highest value, lowest effort)
- Re-enable the ritual crons (portable): Daily Log Close (02:00), Cortex Distill (01:00), Morning Brief (06:30), Patrol (4h). They were switched off, not broken; crons demonstrably still work on claude-cli. Verify each fires once before enabling the next; watch token cost on Opus-routed jobs.
- Add PreCompact + SessionStart hooks (portable): replaces the dead pi memory-flush / post-compaction re-prime. Directly addresses today's context-blowup incident.
- Revive the Dream Diary writer (portable): candidate data (memory/dreaming/light/*.md) is still produced nightly; only the DREAMS.md last-mile render broke. A tiny nightly cron closes it.
Operator Runbook / Cheat-Sheet
Config files map
| File | Role |
|---|---|
| ~/.openclaw/openclaw.json | LIVE OpenClaw config: model, agents, crons routing, tools |
| ~/.openclaw/config.json | LEGACY / dead: do not edit, not loaded |
| ~/.claude/settings.json | The only CLI settings file loaded (user source). Hooks go here. |
| ~/.claude/.credentials.json | Max OAuth creds the proxy intercepts (never quote contents) |
| ~/.openclaw/cron/jobs.json | Cron fleet (6 enabled, 27 disabled) |
| ~/.claude/projects/-home-botbox--openclaw-workspace/*.jsonl | Session transcripts (grow unbounded) |
Key paths
- CLI binary: /home/botbox/.local/bin/claude (v2.1.156)
- Workspace: /home/botbox/.openclaw/workspace
- Memory store: ~/.openclaw/lcm.db · dreams memory/.dreams/
- Wiki vault: ~/.openclaw/wiki · MCP projects/wiki-mcp/src/server.mjs
- Billing proxy: http://127.0.0.1:18801 · gateway :18789 (loopback)
Common commands
openclaw gateway restart # or: /restart from Telegram (restart:true)
# Edit ~/.openclaw/openclaw.json:
#   agents.defaults.model.primary -> "anthropic/claude-opus-4-8"
#   agents.list[main].model       -> "anthropic/claude-opus-4-8" (or remove to inherit)
# Note: the --model CLI flag stays "opus"; real model is pinned at the proxy.

# Fresh session = drop --resume, new UUID (OpenClaw session rotation).
# From Telegram, end the topic with the close skill, then start fresh.
# Old transcripts live in ~/.claude/projects/.../*.jsonl (prune large stale ones).

openclaw cron list   # enabled vs disabled
# Re-enable a ritual cron: set "enabled": true in ~/.openclaw/cron/jobs.json

systemctl status hermes-gateway.service
systemctl restart hermes-gateway.service
journalctl -u hermes-gateway.service -n 100 --no-pager
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| "999% context / 0 compactions", crashes each turn | No compaction runs under claude-cli | Rotate to a fresh session; add PreCompact/SessionStart hooks |
| Rook seems to run the wrong model | agents.list[main].model still opus-4-7 | Set it to opus-4-8 or remove the override |
| Rook killed mid-deep-think | Watchdog 8-min no-output timeout | Raise noOutputTimeoutMs for long silent turns |
| Per-repo .claude/settings.json ignored | --setting-sources user forced | Put settings in ~/.claude/settings.json |
| Cron output written but never read | Write-mostly automation | Route output to Wake Digest / HQ flag / no-op |
| Bolo: TypeError 'NoneType' object is not iterable | Hermes outage: stale gateway + model allowlist | Update Hermes + add the model to the allowlist, then restart the service (the recent fix) |
Bolo / Hermes runbook
Bolo is a separate agent on the Hermes gateway
(systemd: hermes-gateway.service), billing via OpenAI Codex (gpt-5.5).
The recent outage surfaced as TypeError: 'NoneType' object is not iterable;
the fix was updating Hermes and adding the in-use model to the model allowlist,
then restarting the service. Standard triage: systemctl status →
journalctl -u hermes-gateway.service → verify the model is allow-listed →
restart. Bolo coordinates with Rook through the wiki three-surface model
(canonical / coordination queue / scratch), not a shared runtime.
Your Next Moves
Prioritized. Quick wins (green) first, bigger plays (blue) after.
- Re-enable the ritual crons. Daily Log Close, Cortex Distill, Morning Brief, Patrol: flip enabled:true one at a time, verify each fires.
- Add PreCompact + SessionStart hooks. In ~/.claude/settings.json: flush working-surface on compact, re-inject the Every Session / Memory / Safety blocks on start.
- Move subagents + routine wiki crons to sonnet-4-6. ~5× cost cut on fan-out and recurring work; reserve Opus for interactive Rook.
- Fix the model override. Set agents.list[main].model to claude-opus-4-8 (or remove it) so Rook actually runs 4-8, not 4-7.
- Adopt fresh-session discipline + disable the Voice-Note Backfill cron. Rotate sessions at thread forks / high pressure; kill the every-30-min backlog cron.
- Add a workspace CLAUDE.md + a PreToolUse Bash denylist. Cache-friendly project memory; guard YOLO mode against destructive commands.
- Revive the Dream Diary writer. Tiny nightly cron renders DREAMS.md from memory/dreaming/light/; the data already exists.
- Auto-generate the volatile half of MEMORY.md. Daily cron from live queries (crons, services, model, project manifest), which kills freshness debt.
- Exercise native parallelism. Subagent fan-out, plan mode, and git-worktree parallel project work on the next Tend build.
- Experiment with Cortex under claude-cli. It injects context via prompt injection, making it the one engine that could improve context without pi compaction.
Sources: research files 01–04 in
projects/operator-manual/research/. This page is hosted on Surge and contains no
secrets; credentials are referred to by path only.