Skip to content

Add claude-code harness runtime — agents into running processes#11

Merged
orveth merged 2 commits into
masterfrom
agent-runtime
May 27, 2026
Merged

Add claude-code harness runtime — agents into running processes#11
orveth merged 2 commits into
masterfrom
agent-runtime

Conversation

@orveth

@orveth orveth commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Wires declared agents into actual running processes — closing the loop from #7's schema and #9's design to a deployable system.

What's in

  • modules/harnesses/claude-code.nix (new) — per-harness module that, for each services.forge.agents.<name> with harness = "claude-code":
    • generates a wrapper script forge-agent-<name> (sets up state dir, writes CLAUDE.md from the declared role, plumbs the discord bot token from the configured secret file, execs claude through a shared tmux server)
    • registers a systemd service running as the agent's runAs user, with restart-on-failure and a start-rate limit
  • flake.nixclaude-code input (github:sadjow/claude-code-nix); package passed to modules via specialArgs
  • modules/default.nix — imports the new harness
  • deployments/agicash-team-forge/configuration.nix — commented-out example agent paired with the sops bot-token declaration, ready to uncomment after the secrets bootstrap
  • flake.lock (new) — first lockfile committed; pins all current inputs

Runtime model

  • PTY via tmux. claude-code is interactive; bare systemd kills stdin. The wrapper execs through tmux -L forge new-session -A -s agent-<name> — same socket for every agent on a user, so tmux -L forge ls enumerates them.
  • System services (not user services). User = agent.runAs, wantedBy = multi-user.target. Native autostart, no linger gymnastics. Logs via standard journalctl -u forge-agent-<name>.
  • State dir at ~/.local/state/forge/agents/<name>/ (XDG, per-user, no privilege elevation).
  • Restart = on-failure, RestartSec=5s, StartLimitBurst=3 / StartLimitIntervalSec=60.

Decisions on PR #9 open questions

# Question Decision
Q1 Linger Not needed — system services with User= autostart natively
Q2 State dir ~/.local/state/forge/agents/<name>/
Q3 Restart on-failure, 5s, burst-3
Q4 Flake attr (deferred — wrappers live in systemPackages; nix run form is a follow-up)
Q5 Identity separation Deferred — Mercury/Nostr modules not in this PR
Q6 tmux Always on (claude-code needs PTY); not opt-in
Q7 Multi-host registry Deferred

Demo flow

After bootstrap (per docs/secrets-bootstrap.md):

  1. Operator generates age key, encrypts secrets.yaml with the bot token
  2. Uncomment the secrets+bot+agent block in configuration.nix
  3. nix run .#deploy.agicash-team-forge
  4. systemd starts forge-agent-coordinator on boot; the agent joins Discord and responds to @mentions
  5. sudo -u gudnuf tmux -L forge attach -t agent-coordinator to read along

What's NOT in (deferred)

  • nix run .#agent.<name> form — flake-app form for manual invocation. Wrappers are in systemPackages so forge-agent-<name> works on PATH; flake-app form is cosmetic and can land in a follow-up.
  • Mercury, pikachat, Nostr — separate modules when they arrive.
  • Multiple harnesses — codex / custom processes slot in as modules/harnesses/<name>.nix later, no changes to the agent schema.
  • Cross-machine forge CLI — punt per the runtime design doc until pain shows up.

Validation

nix eval .#nixosConfigurations.agicash-team-forge.config.system.build.toplevel.drvPath evaluates cleanly to a valid derivation (with a placeholder deploy-config.nix for testing only — gitignored, not committed).

Wires declared agents into running processes. For each
services.forge.agents.<name> with harness = "claude-code", the new
modules/harnesses/claude-code.nix generates:
- a wrapper script forge-agent-<name> that sets up the agent's state
  dir, writes CLAUDE.md from the declared role, plumbs the discord bot
  token from the secret file (if discordBot is set), and execs claude
  through a shared tmux server (-L forge) for PTY + attachability
- a systemd service running as the runAs user with restart-on-failure
- claude-code, tmux, and the wrappers in environment.systemPackages

Adds claude-code-nix (github:sadjow/claude-code-nix) as a flake input
and passes it to modules via specialArgs.

Decisions baked into this PR (open questions from PR #9):
- linger NOT used; system services with User=<runAs> autostart natively
  via wantedBy = [ "multi-user.target" ]
- state dir at ~/.local/state/forge/agents/<name>/ (XDG, per-user)
- Restart=on-failure, RestartSec=5s, StartLimitBurst=3
- tmux always on (claude-code needs a PTY); shared "forge" socket
- harness lives in modules/harnesses/<harness>.nix so codex and future
  custom processes swap in cleanly without touching the schema

Updates deployments/agicash-team-forge/configuration.nix with a
commented-out example agent paired with the sops bot-token declaration,
ready to uncomment after the secrets bootstrap.

Commits the first flake.lock for reproducibility.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread modules/harnesses/claude-code.nix Outdated
export DISCORD_STATE_DIR="$STATE_DIR"
''}

export CLAUDE_CODE_MAX_THINKING_TOKENS=-1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this does anything anymore. Can you verify that?

Verified against claude-code 2.1.150 unwrapped binary: only
CLAUDE_CODE_EFFORT_LEVEL and CLAUDE_CODE_SIMPLE are read.
CLAUDE_CODE_MAX_THINKING_TOKENS is not referenced anywhere in the
binary; the setting was carried over from v1's launch-agent but no
longer does anything.

Replace with `--effort max` on the claude invocation, matching v1's
default effort level. Per-agent override via an `effort` agent
option is a follow-up if/when we want it.

Addresses gudnuf's review comment on PR #11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@orveth orveth merged commit 7a988cf into master May 27, 2026
@orveth orveth deleted the agent-runtime branch May 27, 2026 17:41
orveth added a commit that referenced this pull request May 27, 2026
Verified: --bare with HOME set but no ANTHROPIC_API_KEY reports
"Not logged in" — confirms --bare is mutually exclusive with
subscription OAuth (which is the working auth model on turtle and
the desired flow for laptops). Per gudnuf: "when we auth now, we
use OAuth to log into our subscription account."

Drop --bare from the default recipe. systemd User= sets $HOME, and
the wrapper only changes CWD (not HOME), so claude-code reads
~/.claude/.credentials.json automatically. ONE LOGIN PER LINUX USER,
ALL AGENTS INHERIT — for free, no env-var plumbing or apiKeyHelper.

Tradeoff documented: ~12 binary-bundled skills appear in system
prompt regardless of declared whitelist. Acceptable noise; the
bounded-context property is "mostly achieved" without --bare given
gudnuf's setup (no user-level skills, per-agent CWD isolates MCPs,
--setting-sources project blocks settings leak).

Added an Authentication section explaining the one-login-per-user
property and prerequisite (`claude login` once per Linux user).

Recipe also adds explicit --effort max (matches v1 default, ties
into the dead CLAUDE_CODE_MAX_THINKING_TOKENS fix from PR #11).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
orveth added a commit that referenced this pull request May 27, 2026
…s) (#12)

* Spec: per-agent environment isolation (skills + MCPs + plugins + tools)

Designs how forge v2 makes the trust gradient enforceable and bounded
context concrete at the agent boundary — each agent runs claude-code
under --bare with explicit per-agent skills/MCPs/plugins/permissions,
drawn declaratively from a shared library.

Uses existing claude-code flags (--bare, --strict-mcp-config,
--plugin-dir, --add-dir, --settings, --allowedTools) — no new
abstractions. Submodule-merging pattern mirrors modules/discord.nix.

Includes nix develop .#agent.<name> as a debug affordance, validation
criteria, and seven open questions for review.

This is a spec PR (markdown only). Implementation lands in a follow-up
after the runtime PR (#11) merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Rewrite spec to ground in v1 launch-agent reality

Three corrections after auditing /home/gudnuf/forge/scripts/launch-agent
and /srv/forge/.claude/skills:

1. Skills are discovered from CWD/.claude/skills/, not user-level.
   v1 launch-agent already cd's into /srv/forge so /srv/forge/.claude
   /skills/ resolves as project-level. The v2 seam is per-agent CWD,
   not --add-dir or --bare. Drop --bare entirely.

2. Skills, MCP servers, and plugins are three distinct mechanisms in
   claude-code. Spec now separates them: skills via CWD symlink farm,
   MCPs via --strict-mcp-config + .mcp.json, plugins via --plugin-dir.
   v1 used --channels for plugins; v2 prefers --plugin-dir pointing at
   Nix-managed sources, but the open question of whether the official
   discord plugin loads identically under --plugin-dir is flagged.

3. Added an explicit "How v1 already does this" section grounding the
   four seams (per-agent state dir, CWD discovery, --mcp-config,
   --channels) the design extends. v2 is a declarative reorganization
   of mechanisms that already work, not new abstractions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Synthesize review findings into spec amendments

Technical review (claude-code 2.1.150) refuted three load-bearing
claims; design review surfaced critical deferrals. Major changes:

- --bare + add-back flags, NOT no-bare. Without --bare, 12+
  binary-bundled skills + auto-MEMORY.md + user settings leak
  through regardless of other flags. --bare is the only way to
  actually achieve bounded context. Add --add-dir, --setting-sources
  project, --append-system-prompt-file to restore what we want.
- Drop --strict-mcp-config. It silently strips plugin-bundled MCPs
  (discord, mercury, pikachat, nwc), breaking those plugins entirely.
  --bare already gates user-level MCPs; strict is redundant + harmful.
- --setting-sources project required to prevent user settings.json
  merging in.
- Skill discoverability resolved via generated skill-catalog.md
  appended to system prompt — narrow context + discoverable invocation.
- --dangerously-skip-permissions stays default per gudnuf's call.
  Scoped-down agents need their own design pass.
- Defer: per-agent Linux user, plugin/comms split, hooks family,
  library namespacing, harness portability. All named as known gaps.
- source type discipline: always Nix path.
- Implementation order rewritten with trimmed demo minimum (steps
  1+2+3+4+6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Distinguish marketplace plugins from local MCP servers

Empirical verification of v1 plugin sources on turtle revealed v1's two
"channel" mechanisms load structurally different things:

- --channels plugin:discord@claude-plugins-official loads a REAL plugin
  with .claude-plugin/plugin.json + bundled .mcp.json. Maps cleanly to
  v2's services.forge.plugins.<name> + --plugin-dir.

- --dangerously-load-development-channels server:mercury loads a Bun
  MCP server with NO plugin manifest (just server.ts + package.json).
  Maps to services.forge.mcpServers.<name> + entry in agent's .mcp.json.

The spec's prior example wrongly put Mercury under plugins. Corrected:
Mercury / pikachat / nwc all belong under mcpServers. Real plugins
(discord from the marketplace, future custom claude-code plugins) under
plugins.

Both v1 flags are undocumented in claude-code 2.1.150 --help — on
borrowed time. The v2 mapping uses stable documented flags
(--plugin-dir, --mcp-config) for both.

Added allowedTools example showing the two distinct tool-name prefixes:
mcp__plugin_<plugin>_<server>__* (from plugin-loaded MCPs) vs
mcp__<server>__* (from .mcp.json-loaded MCPs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop --bare from default recipe to preserve OAuth auth

Verified: --bare with HOME set but no ANTHROPIC_API_KEY reports
"Not logged in" — confirms --bare is mutually exclusive with
subscription OAuth (which is the working auth model on turtle and
the desired flow for laptops). Per gudnuf: "when we auth now, we
use OAuth to log into our subscription account."

Drop --bare from the default recipe. systemd User= sets $HOME, and
the wrapper only changes CWD (not HOME), so claude-code reads
~/.claude/.credentials.json automatically. ONE LOGIN PER LINUX USER,
ALL AGENTS INHERIT — for free, no env-var plumbing or apiKeyHelper.

Tradeoff documented: ~12 binary-bundled skills appear in system
prompt regardless of declared whitelist. Acceptable noise; the
bounded-context property is "mostly achieved" without --bare given
gudnuf's setup (no user-level skills, per-agent CWD isolates MCPs,
--setting-sources project blocks settings leak).

Added an Authentication section explaining the one-login-per-user
property and prerequisite (`claude login` once per Linux user).

Recipe also adds explicit --effort max (matches v1 default, ties
into the dead CLAUDE_CODE_MAX_THINKING_TOKENS fix from PR #11).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants