Every Monday at 7:00 AM Eastern, a Hetzner VPS wakes up, runs Claude Code against a 536-line prompt, refreshes competitive + VoC + AEO signals across 5 LLM platforms, generates publish-ready content drafts, and deploys an updated showcase preview.
Infrastructure → Trigger → Execution. Each layer hands off cleanly to the next.
One Monday morning, end-to-end. Read top to bottom.
For each step: what runs, why, and which tool category (color-coded below).
Seeds the durable state: 7 tracked competitors, 5 default subreddits, 8 buyer-intent AEO prompts, last_run_timestamp = 30 days ago, auto_deploy_to_prod = false.
Load .routine_state.json → record now_utc and previous_date for diff baselines.
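Roughly what that seed and load look like. last_run_timestamp and auto_deploy_to_prod are named above; the other field names and values are illustrative placeholders, not the real file's schema.

```bash
# Illustrative seed of the durable state (placeholder values only).
cat > .routine_state.json <<'EOF'
{
  "competitors": ["competitor-1", "competitor-2"],
  "subreddits": ["marketing"],
  "aeo_prompts": ["best <category> tool for <buyer persona>"],
  "last_run_timestamp": "2025-01-01T07:00:00Z",
  "auto_deploy_to_prod": false
}
EOF

# At run start: capture now_utc and the previous baseline for diffs.
now_utc=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
previous_date=$(jq -r '.last_run_timestamp' .routine_state.json)
```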
Two-phase: scripts collect data, the agent analyzes. Hitting competitor sites directly with WebFetch trips bot protection, so we route through Firecrawl plus dedicated scrapers.
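A sketch of the kind of Firecrawl call the scraper makes. The endpoint and payload shape may differ by Firecrawl API version, and the URL and output path are placeholders.

```bash
# Scrape one competitor page to markdown via Firecrawl (sketch only).
curl -s https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://competitor.example.com/pricing", "formats": ["markdown"]}' \
  | jq -r '.data.markdown // empty' > snapshots/competitor_pricing.md
```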
Same scripts-collect / agent-analyzes pattern. Reddit is auth-walled to bots, so we use Pullpush (a free archive). When that's down, we fall back to NotebookLM or WebSearch with site:reddit.com.
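A sketch of the kind of query the collection script makes against Pullpush's Pushshift-style API; the search phrase and subreddit are placeholders.

```bash
# Search recent submissions mentioning a pain-point phrase (placeholders).
curl -s "https://api.pullpush.io/reddit/search/submission/?q=%22example+pain+point%22&subreddit=marketing&size=25" \
  | jq '[.data[] | {title, score, created_utc, permalink}]'
```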
⚠ Pulse #2 detected: Pullpush returned results for only 2 of 8 queries for the 2nd consecutive week, so the run pivoted to Congressional hearing testimony as a fallback VoC source. The routine degrades gracefully instead of failing.
Tests whether the brand shows up when buyers ask AI for recommendations. Single API (DataForSEO) hits 5 surfaces: ChatGPT, Claude, Gemini, Perplexity, Google AI Overview. Same 8 prompts every week → real time-series.
Hard cap: 40 calls (8 × 5). Soft cap: 30 when budget is tight (Claude dropped first as the lowest-priority surface). A sketch of the cap logic follows.
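A minimal sketch of how the call budget might be enforced. query_aeo, AEO_BUDGET, and aeo_prompts.txt are stand-ins, not the real script in execution/; the point is the priority ordering, where the lowest-priority surface loses calls first when the soft cap bites.

```bash
# Stub for the real DataForSEO wrapper in execution/ (sketch only).
query_aeo() { echo "[stub] would query $1 for: $2"; }

surfaces=(chatgpt gemini perplexity google_aio claude)   # claude last = dropped first
budget="${AEO_BUDGET:-40}"                               # set to 30 when budget is tight
calls=0
for surface in "${surfaces[@]}"; do
  while IFS= read -r prompt; do
    [ "$calls" -ge "$budget" ] && break 2
    query_aeo "$surface" "$prompt"
    calls=$((calls + 1))
  done < aeo_prompts.txt    # the 8 tracked buyer-intent prompts, one per line
done
```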
After Steps 2-4 finish, a Python script diffs this week's data against last week's frozen snapshot. Every subsequent step that needs "how many changed" reads from pulse_diff_latest.json — never recomputes. Replaced LLM-counted deltas (which drifted).
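Downstream steps read those counts with jq rather than recomputing them; the field names below are illustrative, not the diff script's actual schema.

```bash
# Read precomputed deltas from the frozen diff (field names illustrative).
jq '{competitor_changes, new_voc_phrases, aeo_visibility_shifts}' pulse_diff_latest.json
```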
First time the LLM gets to do real synthesis. Writes weekly_pulse_<date>.md with 5 sections: executive summary, competitive moves, VoC trending pains, AEO visibility status, recommended actions. Grounded in pulse_diff_latest.json — counts come from the deterministic diff, not LLM eyeballing.
Two outputs for the client-facing showcase:
Converts the pulse's findings into social posts, blog drafts, and sales battle cards the client can copy/edit/publish. Never auto-publishes — drafts land in the Content Inbox.
Voice is non-negotiable. Reads strategy.json + content_guardrails.json + avatars.json before drafting.
Aggregates counts (competitor count, VoC phrase count, AEO entry count, last_updated) for the showcase's top-of-page hero.
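Conceptually the aggregation is just counting. The input file names and output path here are assumptions for illustration; only the four hero fields come from the description above.

```bash
# Assemble the hero counts (file names assumed, fields per the step above).
jq -n \
  --argjson competitors "$(jq '.competitors | length' .routine_state.json)" \
  --argjson voc "$(jq 'length' voc_phrases.json)" \
  --argjson aeo "$(jq 'length' aeo_entries.json)" \
  --arg updated "$(date -u +%F)" \
  '{competitor_count: $competitors, voc_phrase_count: $voc,
    aeo_entry_count: $aeo, last_updated: $updated}' > showcase/hero_counts.json
```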
Persists last_run_timestamp, last_pulse_id, consecutive_failures per source, last_preview_url. This is how the next run knows what changed.
Creates branch claude/weekly-pulse-<date>, commits all changed JSON / MDs / snapshots, pushes to GitHub. Never pushes to main — Chase promotes manually (or via auto_deploy flag).
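Roughly the shell sequence involved; the pathspecs are assumptions, the branch naming is as described above.

```bash
date_tag=$(date -u +%F)
git checkout -b "claude/weekly-pulse-${date_tag}"
git add -- '*.json' '*.md' snapshots/        # changed state, pulse report, frozen snapshots
git commit -m "weekly pulse ${date_tag}"
git push -u origin "claude/weekly-pulse-${date_tag}"   # never main; promotion is manual
```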
Pushes the updated showcase to a Cloudflare Pages preview with a date-stamped subdomain. Auto-promotes to prod only if auto_deploy_to_prod == true (currently false — manual promotion).
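Roughly what the deploy looks like with the pre-installed wrangler binary. The project name and output directory are assumptions; auth comes from CLOUDFLARE_API_TOKEN in the environment.

```bash
# Preview deploy; the branch name becomes the date-stamped preview subdomain.
wrangler pages deploy showcase/dist \
  --project-name ets-showcase \
  --branch "weekly-pulse-$(date -u +%F)"
```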
RFC822 message built in bash, base64url-encoded, and created as a draft via gws gmail users drafts create. Auth: pre-installed service account (project apt-rope-495015-a5, scope gmail.modify, auto-refreshing token).
Note: this replaces the Gmail MCP connector that worked in Cloud Routines. On the server, the gws CLI is more reliable.
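A sketch of that path, assuming a placeholder recipient and subject. The base64url encoding is what the Gmail API expects for raw messages; the gws flag names are assumptions (only the subcommand comes from the run itself).

```bash
# Build an RFC822 message and base64url-encode it.
raw=$(printf 'From: me\r\nTo: client@example.com\r\nSubject: Weekly Pulse %s\r\n\r\nSummary below.\r\n' "$(date -u +%F)" \
  | base64 -w0 | tr '+/' '-_' | tr -d '=')

# Hand it to the pre-installed gws CLI (flag names assumed).
gws gmail users drafts create --user me --raw "$raw"
```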
Single curl POST to $DISCORD_WEBHOOK_URL. 3-line summary + preview + branch URL, under 2000 chars.
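For reference, the webhook payload is just a JSON object with a content field; the preview URL below is a placeholder that the deploy step would supply in the real run.

```bash
preview_url="https://weekly-pulse-$(date -u +%F).example.pages.dev"   # placeholder
summary="Weekly pulse complete.\nPreview: ${preview_url}\nBranch: claude/weekly-pulse-$(date -u +%F)"
curl -s -X POST "$DISCORD_WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d "{\"content\": \"${summary}\"}"
```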
Final session message with the run summary — what shows in the log footer and the cron's _cron_ets-pulse.log.
In execution/. The deterministic tools the LLM calls. No LLM in these — pure data collection / diffing / validation.
In .claude/skills/. Markdown SOPs Claude reads when it needs that domain expertise. Two groups: intel/research and content generation.
In .claude/agents/. The main session delegates focused work to these — they get a fresh context window and a narrow brief.
Every external dependency, what it does, how authentication works.
| Service | Used in | Auth (env var) | Purpose | Est. cost / run |
|---|---|---|---|---|
| Anthropic / Claude Code | Always | CLAUDE_CODE_OAUTH_TOKEN | The LLM runtime itself | ~$0.30 (subscription token) |
| Firecrawl | Step 2 | FIRECRAWL_API_KEY | Scrape competitor sites, G2, Capterra (bypasses bot protection) | ~$0.05–0.30 |
| SerpAPI | Step 2 (fallback) | SERP_API_KEY | Fallback when Firecrawl can't reach G2/Capterra | ~$0 (rarely fires) |
| Pullpush.io | Step 3 | (no key — free archive) | Reddit thread search (Reddit's auth-walled bot path) | free |
| DataForSEO | Step 4 | DATAFORSEO_LOGIN / _PASSWORD | AEO via ChatGPT / Claude / Gemini / Perplexity / Google AIO | ~$1.00–2.00 (the big spender) |
| GitHub | Step 8 | git over HTTPS w/ stored cred | Push claude/weekly-pulse-<date> branch | free |
| Cloudflare Pages | Step 9 | CLOUDFLARE_API_TOKEN | Preview deploy (+ prod if auto_deploy flag set) | free |
| Google Workspace (Gmail) | Step 10 | gws CLI service account | Create Gmail draft (not send) | free |
| Discord | Step 10 | DISCORD_WEBHOOK_URL | Post summary to channel | free |
How this week knows what last week did. The append-only files are the backbone of the time-series.
Every colored tag in this doc maps to one of these execution patterns.
Direct shell commands the main Claude session runs — curl, jq, mkdir, date.
Deterministic Python in execution/. The LLM calls these but never replicates their logic itself.
Markdown SOP in .claude/skills/. Claude reads these when it needs that domain expertise mid-task.
Fresh-context delegate spun up via the Task tool. In .claude/agents/. Returns a single message back.
HTTP call to a third-party service (Firecrawl, DataForSEO, Cloudflare, etc.). Auth via env vars.
Pre-installed binary on the server (git, wrangler, gws). Cheaper + more reliable than raw HTTP.
Model Context Protocol connector. Not used on the server — MCPs were Cloud-Routine-only. Server replaced them with CLIs.
A file the step produces. Always relative to repo root.