Backtalk6858 db6cbbdec1 init: add claude-config and agent-builder context files

Initial commit tracking session context, playbooks, and automation specs
for claude-config and agent-builder Claude Code conversations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-17 23:08:23 -05:00

project_name: agent-builder

Agent Builder — Session Context

What this project does

Design, build, and test autonomous N8N agents on server-01 sandbox before any production promotion. First two agents: Agent Builder Agent + N8N Builder Agent.

Scheduled work (2026-06-16, running behind — started ~6:38 PM)

Vision Alignment Grill-Me — Agent Builder + N8N Builder vision + testing methodology
Agent Builder Agent — Deploy + Test (server-01 sandbox)
N8N Builder Agent — Deploy + Test (server-01 sandbox)

Architecture

Agents run as N8N workflows on server-01 (n8n-sandbox, port 5679)
Sandbox-first: all agents tested in sandbox before any production promotion
server-01 sandbox stack: n8n-sandbox, postgres-sandbox, vault-sandbox, bitwarden-bridge-sandbox, vaultwarden-sandbox
Sandbox N8N API key: stored in prod Vault at secret/sandbox/n8n
Sandbox reachable at 192.168.1.90

Key decisions (set during vision grill-me — 2026-06-16)

Agent Builder Agent: builds claude_agent and script types — Ollama (llama3.1:8b) does the building, claude -p is overseer/validator
N8N Builder Agent: builds n8n_automation types — Ollama generates workflow JSON, imports via N8N API, assigns credentials
automation_ideas schema changes needed: rename description → task_description (full structured spec), add type (n8n_automation/claude_agent/script), add builder_status
New agent_test_results table needed in api_business DB
Sandbox must mirror production: AppRole, Vaultwarden, bridge all configured before any agent deploys
Promotion = user approval required after all 4 test levels pass (not auto-promote in v1)
Dedicated backfill session needed for all 48 existing automation_ideas rows (type + task_description)
claude -p uses SDK credits (Pro = $20/month hard limit) — use sparingly, Ollama does the heavy lifting
Local model: llama3.1:8b already pulled on server-01 (4.9GB, fits in RTX 2060 Super 8GB VRAM)
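The split between the two builders can be sketched as a lookup on the planned type column. This is a minimal illustration only: the three type values come from the decisions above, while the AutomationIdea class, the builder labels, and the "pending" default are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Type values are from the notes; the builder labels are hypothetical.
BUILDER_BY_TYPE = {
    "n8n_automation": "n8n_builder_agent",
    "claude_agent": "agent_builder_agent",
    "script": "agent_builder_agent",
}


@dataclass
class AutomationIdea:
    """One automation_ideas row after the planned schema change."""
    id: int
    type: str
    task_description: str
    builder_status: str = "pending"  # assumed default, not confirmed by the notes


def pick_builder(idea: AutomationIdea) -> str:
    """Return the builder label for a row, rejecting unknown types."""
    try:
        return BUILDER_BY_TYPE[idea.type]
    except KeyError:
        raise ValueError(f"unknown automation type: {idea.type!r}") from None
```

Rejecting unknown types up front means rows that have not been backfilled yet fail loudly instead of being routed to the wrong builder.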

Testing methodology

Four levels: Structure → Deployment → Smoke → Assertion
LLM outputs validated on structure/side-effects only, never exact string match
All results logged to agent_test_results table
NTFY notification on pass and fail
Full methodology: .claude/playbook_testing_methodology.md
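A rough sketch of how the four levels and the promotion rule could fit together, assuming each level is supplied as a zero-argument check function. The function names and the result record shape are hypothetical; the full methodology lives in the playbook, not here. The structure check inspects only the shape of a workflow dict (name, nodes, connections) and never compares generated text.

```python
from typing import Callable

LEVELS = ["structure", "deployment", "smoke", "assertion"]


def check_structure(workflow: dict) -> bool:
    """Structure-level check: validate shape only, never exact LLM strings."""
    nodes = workflow.get("nodes")
    return (
        isinstance(workflow.get("name"), str)
        and isinstance(nodes, list)
        and len(nodes) > 0
        and all(isinstance(n, dict) and "type" in n for n in nodes)
        and isinstance(workflow.get("connections"), dict)
    )


def run_levels(checks: dict[str, Callable[[], bool]]) -> list[dict]:
    """Run levels in order and stop at the first failure."""
    results = []
    for level in LEVELS:
        passed = bool(checks[level]())
        results.append({"level": level, "passed": passed})
        if not passed:
            break
    return results


def ready_for_promotion(results: list[dict], user_approved: bool) -> bool:
    """v1 rule: all four levels passed AND the user approved."""
    all_passed = len(results) == len(LEVELS) and all(r["passed"] for r in results)
    return all_passed and user_approved
```

Each record returned by run_levels could be written to agent_test_results and sent to NTFY; both steps are left out here because their interfaces are not specified in these notes.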

Agents

Agent Builder Agent

Status: pending — prereqs not complete
Purpose: Receives automation spec from automation_ideas DB, uses Ollama to build claude_agent or script type automations, deploys to sandbox, runs automated tests, notifies user for promotion approval
Builds: claude agents (via claude -p) and Python scripts (Docker containers)
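For the Ollama build step, a request to the local model might be assembled as below. This is a sketch under assumptions: it targets Ollama's /api/generate endpoint on its default port 11434 at localhost, which these notes do not confirm for server-01, and it passes the task description straight through as the prompt. The function only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default port on the same host.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(task_description: str,
                           model: str = "llama3.1:8b") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,
        "prompt": task_description,
        "stream": False,  # ask for one JSON response instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Keeping request construction separate from sending makes the builder testable without a running model, which fits the structure-only validation approach.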

N8N Builder Agent

Status: pending — prereqs not complete
Purpose: Receives automation spec from automation_ideas DB, uses Ollama to generate N8N workflow JSON using n8n_automations playbook as context, imports to sandbox N8N via API, assigns credentials, runs automated tests
Will be used to build: id=12 (Media Pipeline Learning), id=7 (Friday Research Session Prep)

id=4: N8N Workflow Builder Script (pending, weekend_block1)
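The import step of the N8N Builder Agent could look roughly like this. It assumes the sandbox exposes the n8n public API over plain HTTP at 192.168.1.90:5679 (host and port appear in the Architecture notes, but combining them and the scheme are assumptions) and that the API key has already been read from Vault. The request is built but not sent, and credential assignment is not covered.

```python
import json
import urllib.request

# Assumption: host and port from the notes, combined, over plain HTTP.
SANDBOX_BASE = "http://192.168.1.90:5679"


def build_import_request(workflow: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a create-workflow request for the n8n public API."""
    return urllib.request.Request(
        f"{SANDBOX_BASE}/api/v1/workflows",
        data=json.dumps(workflow).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-N8N-API-KEY": api_key,  # n8n public API key header
        },
        method="POST",
    )
```

The workflow dict would normally carry at least name, nodes, connections, and settings; check the required fields against the n8n version running in the sandbox.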

Prereq checklist (must complete before any agent deployment)

Schema: rename automation_ideas.description → task_description, add type, add builder_status, add priority
Create agent_test_results table in api_business
Sandbox Vault: set up AppRole auth method (credentials at /opt/appdata/docker/docker-compose/vault/approle/ on server-01)
Sandbox Vault: store sandbox N8N API key at secret/sandbox/n8n (key name: claude-sandbox, verified working)
Verify sandbox Bitwarden bridge ↔ Vaultwarden sandbox end-to-end (bridge on port 8080, returns [] for empty vault — correct)
Write Agent Builder Agent playbook → .claude/playbook_agent_builder_agent.md
Write N8N Builder Agent playbook → .claude/playbook_n8n_builder_agent.md
[~] Backfill session: resume from priority 28 (id=34, CalDAV Auto-Refresh Trigger) next session. ~30 automations remain.
- Reviewed this session: ids 28, 40, 53, 4, 7, 5, 14, 11, 44, 33, 29, 20, 41, 12, 1, 19, 39, 6
- Merged: id=7 and id=50 into id=18 (full 10-stage business pipeline)
- Expanded: id=18 now includes Business Research Agent, Development Agent, and the parked idea email-to-Tyler flow
- Blocked: id=14 and id=20 (already built by schedule workflows); id=12, id=1, and id=41 (redundant); id=19 (pending Jenkins); id=5 (Obsidian vault not set up yet)
- Pending: id=40, with Jenkins conditional note
- New rows added: id=54 (NTFY Topic Provisioner, p54), id=55 (Business Research Agent, p51), id=56 (Business Development Agent, p52), id=57 (Sandbox Environment Deployment Completion, p9)
- NOTE: id=57 conflicts with id=28 at priority 9; fix next session
- Calendar events pushed to Nextcloud: Thu 6/18 12:45-3PM backfill and 3-4:30PM readiness; Fri 6/19 12:45-4:30PM agent builds
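The two schema items on the checklist can be sketched as a short migration. Everything beyond the column and table names listed above is an assumption: the agent_test_results columns are hypothetical, the column types are guesses, and the demo runs against in-memory SQLite purely so the statements can be exercised without touching api_business. The real database may need different syntax.

```python
import sqlite3

# Column and table names are from the checklist; types and the
# agent_test_results layout are hypothetical.
MIGRATION = [
    "ALTER TABLE automation_ideas RENAME COLUMN description TO task_description",
    "ALTER TABLE automation_ideas ADD COLUMN type TEXT",
    "ALTER TABLE automation_ideas ADD COLUMN builder_status TEXT",
    "ALTER TABLE automation_ideas ADD COLUMN priority INTEGER",
    """CREATE TABLE agent_test_results (
        id INTEGER PRIMARY KEY,
        automation_id INTEGER NOT NULL,
        level TEXT NOT NULL,
        passed INTEGER NOT NULL,
        details TEXT
    )""",
]


def migrate(conn: sqlite3.Connection) -> None:
    """Apply each statement in order, then commit."""
    for statement in MIGRATION:
        conn.execute(statement)
    conn.commit()


def column_names(conn: sqlite3.Connection, table: str) -> list:
    """List column names of a table (SQLite-specific PRAGMA)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Running the statements against a throwaway copy first gives a cheap check that the rename does not break anything before the backfill session depends on it.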

Final readiness check items (scheduled June 18)

All 8 prereq checklist items verified complete
Sandbox mirrors production: AppRole, bridge, Vaultwarden all confirmed functional
Sensitive output interception system — design and implement before any agent goes live:
- Agents must scan their own stdout/logs before writing/sending output and redact anything matching secret patterns (tokens, keys, passwords, API keys)
- Pattern list at minimum: hvs\., eyJ, bearer tokens, anything from known env var names (BRIDGE_API_KEY, VAULT_TOKEN, N8N_ENCRYPTION_KEY, etc.)
- Root cause: docker inspect --format '{{range .Config.Env}}...' dumps all env vars including secrets; agents will reach for broad diagnostic commands without filtering — local models even more so
- Production exposure is a serious risk; sandbox exposure is acceptable but still undesirable
- This system needs to exist at the agent level (not just Claude Code rules) because once agents run autonomously the user will not be watching
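A minimal sketch of the redaction step, assuming agents pass every outgoing string through one function before writing or sending it. The patterns cover only the minimum list above (hvs. tokens, eyJ-prefixed strings, bearer tokens, and three known env var names); the regexes, the REDACTED placeholder, and the function name are illustrative, not an agreed design.

```python
import re

REDACTED = "[REDACTED]"

# Env var names from the notes; extend as more are identified.
SENSITIVE_ENV_NAMES = ["BRIDGE_API_KEY", "VAULT_TOKEN", "N8N_ENCRYPTION_KEY"]

# "Bearer <token>": keep the word, drop the token.
_BEARER = re.compile(r"\b(bearer\s+)\S+", re.IGNORECASE)

# "NAME=value" or "NAME: value": keep the name, drop the value.
_ENV = re.compile(
    r"\b(" + "|".join(map(re.escape, SENSITIVE_ENV_NAMES)) + r")(\s*[=:]\s*)\S+"
)

# Bare tokens: hvs. prefix and eyJ-prefixed (JWT-like) strings.
_TOKENS = [
    re.compile(r"hvs\.[A-Za-z0-9_\-]+"),
    re.compile(r"eyJ[A-Za-z0-9_\-]+(?:\.[A-Za-z0-9_\-]+)*"),
]


def redact(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    text = _BEARER.sub(lambda m: m.group(1) + REDACTED, text)
    text = _ENV.sub(lambda m: m.group(1) + m.group(2) + REDACTED, text)
    for pattern in _TOKENS:
        text = pattern.sub(REDACTED, text)
    return text
```

For example, redact("VAULT_TOKEN=hvs.abc123") yields "VAULT_TOKEN=[REDACTED]", so the variable name stays visible for debugging while the value does not. Pattern matching alone will miss secrets with no recognizable prefix, so it complements rather than replaces avoiding broad commands like an unfiltered docker inspect.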

Update instructions

Update at the end of every agent-builder session. Keep agent status, key decisions, and prereq checklist current.
