# Playbook: N8N Builder Agent

## Purpose

Builds `n8n_automation` type automations from the `automation_ideas` table. Uses Ollama (llama3.1:8b) to generate N8N workflow JSON, imports it to the sandbox N8N via API, assigns credentials, runs all 4 test levels, and notifies the user for promotion approval.

## Trigger

Scheduled or manual. Queries `automation_ideas` for the next row where:
- `type = 'n8n_automation'`
- `status = 'ready_to_build'`
- `builder_status = 'not_started'`
- Ordered by `priority ASC NULLS LAST, id ASC`

Only processes one automation per run.

## Infrastructure

- Runs on: server-01 (n8n-sandbox, port 5679)
- Ollama endpoint: http://localhost:11434 (server-01 local)
- Model: llama3.1:8b
- Claude overseer: `claude -p` (non-interactive, uses SDK credits; use sparingly)
- Sandbox N8N API: http://192.168.1.90:5679 (API key from Vault at secret/sandbox/n8n)
- Vault: sandbox AppRole at /opt/appdata/docker/docker-compose/vault/approle/
- Database: production api_business (read automation_ideas, write agent_test_results)
- NTFY: production NTFY instance for notifications

## N8N Workflow JSON Structure (required knowledge)

Every valid N8N workflow JSON must include:

```json
{
  "name": "Workflow Name",
  "nodes": [...],
  "connections": {...},
  "active": false,
  "settings": {"executionOrder": "v1"},
  "tags": []
}
```

Nodes have: `id` (UUID), `name`, `type` (e.g. n8n-nodes-base.httpRequest), `typeVersion`, `position` ([x, y]), `parameters`.

Connections map node outputs to node inputs by node name.

All workflows are imported as `active: false`. Never activate a workflow automatically in sandbox.

## Step-by-Step

### Step 1 — Claim the automation

```sql
UPDATE automation_ideas
SET builder_status = 'queued'
WHERE id = {id}
  AND builder_status = 'not_started';
```

Here `{id}` is the id of the row selected by the trigger query. If 0 rows are updated, another builder has already claimed it: stop, notify, exit.

### Step 2 — Fetch sandbox N8N API key from Vault

Use the sandbox AppRole to read secret/sandbox/n8n. Extract `api_key` and `base_url`. Never log the key value. Pass it in memory only.
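One way to honor the "never log the key value" rule in Step 2 is to hold the secret in a type whose `repr` is masked. The sketch below is illustrative only: `SandboxN8NSecret` and `parse_vault_secret` are hypothetical names, and it assumes the Vault read response nests the fields under `data` (KV v1) or `data.data` (KV v2), since the playbook does not say which secrets engine version is mounted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SandboxN8NSecret:
    """Holds the sandbox N8N connection details in memory only."""
    base_url: str
    # repr=False keeps the key out of repr()/str(), and therefore out of
    # log lines and tracebacks that format this object.
    api_key: str = field(repr=False)

    def headers(self) -> dict:
        """Headers for N8N API calls (header name taken from the playbook)."""
        return {"X-N8N-API-KEY": self.api_key}


def parse_vault_secret(response: dict) -> SandboxN8NSecret:
    """Extract api_key and base_url from a Vault read response.

    Assumption: the payload is either {"data": {...}} (KV v1) or
    {"data": {"data": {...}}} (KV v2).
    """
    payload = response.get("data") or {}
    if isinstance(payload.get("data"), dict):  # KV v2 nesting
        payload = payload["data"]
    missing = [k for k in ("api_key", "base_url") if not payload.get(k)]
    if missing:
        # Report missing field names only, never values.
        raise KeyError(f"Vault secret is missing fields: {missing}")
    return SandboxN8NSecret(
        base_url=payload["base_url"].rstrip("/"),
        api_key=payload["api_key"],
    )
```

Later steps could pass `secret.headers()` to the HTTP client, so the raw key never needs to appear in a log message or a formatted string.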
### Step 3 — Discover available N8N credentials

Before generating, query the sandbox N8N for existing credentials so the generated workflow can reference them by name:

```
GET {base_url}/api/v1/credentials
X-N8N-API-KEY: {api_key}
```

Extract credential names and types. Pass this list to the Ollama prompt so the generated workflow uses real credential names.

### Step 4 — Build the prompt for Ollama

```
You are an expert N8N workflow engineer. Generate a valid N8N workflow JSON for the following automation.

Name: {name}
Infrastructure available: {infrastructure_requirement}
Available N8N credentials: {credential_names_and_types}

Specification:
{task_description}

Requirements:
- Output ONLY valid N8N workflow JSON. No explanation, no markdown fences, no commentary.
- The workflow must be importable via the N8N API without modification.
- Set active: false.
- Reference credentials by the exact names listed above — do not invent credential names.
- Use realistic node positions (spread nodes 200px apart on x-axis starting at x=250).
- Every node must have a unique UUID for its id field.
- The workflow must fully implement the specification — do not stub or placeholder any steps.
```

### Step 5 — Generate with Ollama

```
POST http://localhost:11434/api/generate
{
  "model": "llama3.1:8b",
  "prompt": "{prompt}",
  "stream": false
}
```

Here `{prompt}` is the prompt built in Step 4. Set builder_status = 'building' before calling.

Extract the JSON from the response, stripping any surrounding text Ollama adds. Validate that it parses as JSON before proceeding.

If invalid JSON: set builder_status = 'failed', log error, notify, stop.
If the Ollama call fails or times out (>120s): set builder_status = 'failed', log error, notify, stop.

### Step 6 — Overseer validation with claude -p

Pass the generated JSON to `claude -p` for structural review. Keep the prompt minimal to conserve SDK credits:

```bash
claude -p "Review this N8N workflow JSON for the following only:
1. Is it valid N8N workflow JSON with required fields (name, nodes, connections, active, settings)?
2. Do all nodes have id, name, type, typeVersion, position, parameters?
3. Do connections reference node names that exist in the nodes array?
4. Does the workflow logic match this spec: {name} — {task_description[:200]}

Respond with: PASS or FAIL, then one sentence explaining why. Do not rewrite the workflow."
```

If FAIL: log claude's reason, set builder_status = 'failed', notify via NTFY with the reason, stop.
If PASS: proceed.

### Step 7 — Import to sandbox N8N

```
POST {base_url}/api/v1/workflows
X-N8N-API-KEY: {api_key}
Content-Type: application/json

Body: {generated workflow JSON}
```

On success: capture the returned workflow `id` from N8N. Store it in notes or a temp variable.
On failure (non-2xx): set builder_status = 'failed', log the N8N error response, notify, stop.

### Step 8 — Assign credentials

Fetch the imported workflow and check each node that references a credential:

```
GET {base_url}/api/v1/workflows/{workflow_id}
```

Verify that the credential references resolved correctly. If any reference is broken (credential name not found), attempt to match by type. If it is still unresolvable, set builder_status = 'failed', notify the user with the list of missing credentials, stop.

### Step 9 — Run 4-level automated tests

Run each level in order. Stop and fail if any level fails. Log every result to `agent_test_results`.
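The run-in-order, stop-on-first-failure rule can be sketched as a small driver. This is a minimal illustration rather than the agent's actual code: `LevelResult`, `run_levels`, and the `record` callback (standing in for an insert into `agent_test_results`) are hypothetical names, and the `'pass'` status value is an assumption, since the playbook only names `'fail'`.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A level check returns (passed, detail). The checks themselves are
# hypothetical stand-ins for the Level 1-4 logic described in Step 9.
LevelCheck = Callable[[], Tuple[bool, str]]


@dataclass
class LevelResult:
    test_level: int
    status: str  # 'pass' or 'fail' ('pass' is an assumed value)
    detail: str


def run_levels(checks: List[LevelCheck],
               record: Callable[[LevelResult], None]) -> bool:
    """Run checks in order, record every result, stop at the first failure."""
    for level, check in enumerate(checks, start=1):
        try:
            passed, detail = check()
        except Exception as exc:  # a crashing check counts as a failure
            passed, detail = False, f"{type(exc).__name__}: {exc}"
        record(LevelResult(level, "pass" if passed else "fail", detail))
        if not passed:
            return False  # later levels are never executed
    return True
```

Calling `run_levels([structure, deployment, smoke, assertion], record)` with four check functions returns `True` only when all four pass. The failing level is always the last result recorded, which is what the "Test level fail" notification needs.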

**Level 1 — Structure**

Validate the imported workflow via the N8N API:
- `GET {base_url}/api/v1/workflows/{workflow_id}` returns 200
- Response contains correct node count
- All required fields present
- Insert result to agent_test_results (test_level=1)

**Level 2 — Deployment**

- Verify workflow exists in sandbox N8N and is not active
- Verify all credential references are valid (no broken credential links)
- Insert result to agent_test_results (test_level=2)

**Level 3 — Smoke**

- Trigger a manual execution via N8N API:
  ```
  POST {base_url}/api/v1/workflows/{workflow_id}/run
  ```
- Poll execution status until complete or timeout (60s)
- Must reach status 'success' or 'waiting' (not 'error' or 'crashed')
- Insert result to agent_test_results (test_level=3)

**Level 4 — Assertion**

- Verify the correct side effect occurred based on what the workflow is supposed to do
- Check system state, not output strings: DB row written, API called, file created, webhook fired, etc.
- The specific assertion depends on the automation; derive it from task_description
- Insert result to agent_test_results (test_level=4)

### Step 10 — Notify user for promotion approval

If all 4 levels pass:
1. Set builder_status = 'awaiting_approval'
2. Send NTFY notification:
   ```
   Title: N8N Workflow Ready for Promotion — {name}
   Body: All 4 test levels passed in sandbox.
   Automation id={id} (n8n_automation) is ready for production promotion.
   Sandbox workflow id={n8n_workflow_id}.
   Reply to approve or reject.
   ```

User must explicitly approve before production import. No auto-promotion in v1.

### Step 11 — On approval

1. Import the same workflow JSON to production N8N (port 5678)
2. Assign production credentials (different credential names from sandbox)
3. Set builder_status = 'deployed'
4. Update automation_ideas status = 'deployed'

## Error handling

- Any unhandled exception: set builder_status = 'failed', log to agent_test_results (test_level=0, status='fail'), send NTFY alert
- Always release the claim (reset to 'not_started') if failing before Step 5 so another run can retry
- After Step 5: leave as 'failed'; this requires manual review before retry
- If workflow was imported before failure: delete it from sandbox N8N to keep sandbox clean
  ```
  DELETE {base_url}/api/v1/workflows/{workflow_id}
  ```

## NTFY notification patterns

- Build started: `[N8N Builder] Building {name} (id={id})`
- Overseer FAIL: `[N8N Builder] FAIL — Overseer rejected {name}: {reason}`
- Import FAIL: `[N8N Builder] FAIL — {name} failed N8N import: {error}`
- Missing credentials: `[N8N Builder] BLOCKED — {name} needs credentials: {list}`
- Test level fail: `[N8N Builder] FAIL — {name} failed Level {n}: {error}`
- Ready for approval: `[N8N Builder] READY — {name} passed all tests, awaiting your approval`
- Unhandled error: `[N8N Builder] ERROR — {name}: {exception}`

## SDK credit budget

`claude -p` is called once per automation (Step 6 only). Keep the overseer prompt under 500 tokens. Do not call claude -p for retries or debugging; use it only for the initial validation pass.

## N8N credential naming convention

Sandbox credentials must be named with a `-sandbox` suffix to distinguish from production:
- `postgres-sandbox` (not `postgres`)
- `vault-sandbox` (not `vault`)
- `n8n-internal-sandbox` (not `n8n-internal`)

This prevents the N8N Builder Agent from accidentally referencing production credentials when building sandbox workflows.
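
The naming convention lends itself to a mechanical check before import. The following is a minimal sketch, not part of the playbook: `non_sandbox_credentials` is a hypothetical helper, and it assumes each node may carry a `credentials` object keyed by credential type, with each value holding a `name` field.

```python
from typing import List

SANDBOX_SUFFIX = "-sandbox"


def non_sandbox_credentials(workflow: dict) -> List[str]:
    """List 'node name: credential name' for references lacking the suffix.

    Assumption: a node's "credentials" value maps a credential type to an
    object with a "name" field, e.g. {"postgres": {"name": "postgres-sandbox"}}.
    """
    offenders = []
    for node in workflow.get("nodes", []):
        for cred in (node.get("credentials") or {}).values():
            name = cred.get("name", "") if isinstance(cred, dict) else str(cred)
            if not name.endswith(SANDBOX_SUFFIX):
                offenders.append(f"{node.get('name', '?')}: {name}")
    return offenders
```

An empty list means every credential reference follows the convention. A non-empty list could feed the `{list}` placeholder of the "Missing credentials" notification.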