Every Monday at 7:00 AM Eastern, a Hetzner VPS wakes up, runs Claude Code against a 536-line prompt, refreshes competitive + VoC + AEO signals across 5 LLM platforms, generates publish-ready content drafts, and deploys an updated showcase preview.
Infrastructure → Trigger → Execution. Each layer hands off cleanly to the next.
One Monday morning, end-to-end. Read top to bottom.
For each step: what runs, why, and which tool category (color-coded below).
Seeds the durable state: 7 tracked competitors, 5 default subreddits, 8 buyer-intent AEO prompts, last_run_timestamp = 30 days ago, auto_deploy_to_prod = false.
Load .routine_state.json → record now_utc and previous_date for diff baselines.
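Roughly what that seed and load look like. last_run_timestamp and auto_deploy_to_prod are named above; the other field names and values are illustrative placeholders, not the real file's schema.

```bash
# Illustrative seed of the durable state (placeholder values only).
cat > .routine_state.json <<'EOF'
{
  "competitors": ["competitor-1", "competitor-2"],
  "subreddits": ["marketing"],
  "aeo_prompts": ["best <category> tool for <buyer persona>"],
  "last_run_timestamp": "2025-01-01T07:00:00Z",
  "auto_deploy_to_prod": false
}
EOF

# At run start: capture now_utc and the previous baseline for diffs.
now_utc=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
previous_date=$(jq -r '.last_run_timestamp' .routine_state.json)
```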
Two-phase: scripts collect data, the agent analyzes. Hitting competitor sites directly with WebFetch trips bot protection, so we route through Firecrawl plus dedicated scrapers.
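A sketch of the kind of Firecrawl call the scraper makes. The endpoint and payload shape may differ by Firecrawl API version, and the URL and output path are placeholders.

```bash
# Scrape one competitor page to markdown via Firecrawl (sketch only).
curl -s https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://competitor.example.com/pricing", "formats": ["markdown"]}' \
  | jq -r '.data.markdown // empty' > snapshots/competitor_pricing.md
```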
Same scripts-collect / agent-analyzes pattern. Reddit is auth-walled to bots, so we use Pullpush (a free archive). When that's down, we fall back to NotebookLM or WebSearch with site:reddit.com.
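A sketch of the kind of query the collection script makes against Pullpush's Pushshift-style API; the search phrase and subreddit are placeholders.

```bash
# Search recent submissions mentioning a pain-point phrase (placeholders).
curl -s "https://api.pullpush.io/reddit/search/submission/?q=%22example+pain+point%22&subreddit=marketing&size=25" \
  | jq '[.data[] | {title, score, created_utc, permalink}]'
```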
⚠ Pulse #2 detected: Pullpush returned results for only 2 of 8 queries for the 2nd consecutive week, so the run pivoted to Congressional hearing testimony as a fallback VoC source. The routine degrades gracefully instead of failing.
Tests whether the brand shows up when buyers ask AI for recommendations. Single API (DataForSEO) hits 5 surfaces: ChatGPT, Claude, Gemini, Perplexity, Google AI Overview. Same 8 prompts every week → real time-series.
Hard cap: 40 calls (8 × 5). Soft cap: 30 when budget is tight (Claude dropped first as the lowest-priority surface). A sketch of the cap logic follows.
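A minimal sketch of how the call budget might be enforced. query_aeo, AEO_BUDGET, and aeo_prompts.txt are stand-ins, not the real script in execution/; the point is the priority ordering, where the lowest-priority surface loses calls first when the soft cap bites.

```bash
# Stub for the real DataForSEO wrapper in execution/ (sketch only).
query_aeo() { echo "[stub] would query $1 for: $2"; }

surfaces=(chatgpt gemini perplexity google_aio claude)   # claude last = dropped first
budget="${AEO_BUDGET:-40}"                               # set to 30 when budget is tight
calls=0
for surface in "${surfaces[@]}"; do
  while IFS= read -r prompt; do
    [ "$calls" -ge "$budget" ] && break 2
    query_aeo "$surface" "$prompt"
    calls=$((calls + 1))
  done < aeo_prompts.txt    # the 8 tracked buyer-intent prompts, one per line
done
```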
After Steps 2-4 finish, a Python script diffs this week's data against last week's frozen snapshot. Every subsequent step that needs "how many changed" reads from pulse_diff_latest.json — never recomputes. Replaced LLM-counted deltas (which drifted).
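Downstream steps read those counts with jq rather than recomputing them; the field names below are illustrative, not the diff script's actual schema.

```bash
# Read precomputed deltas from the frozen diff (field names illustrative).
jq '{competitor_changes, new_voc_phrases, aeo_visibility_shifts}' pulse_diff_latest.json
```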
First time the LLM gets to do real synthesis. Writes weekly_pulse_<date>.md with 5 sections: executive summary, competitive moves, VoC trending pains, AEO visibility status, recommended actions. Grounded in pulse_diff_latest.json — counts come from the deterministic diff, not LLM eyeballing.
Two outputs for the client-facing showcase:
Converts the pulse's findings into social posts, blog drafts, and sales battle cards the client can copy/edit/publish. Never auto-publishes — drafts land in the Content Inbox.
Voice is non-negotiable. Reads strategy.json + content_guardrails.json + avatars.json before drafting.
Aggregates counts (competitor count, VoC phrase count, AEO entry count, last_updated) for the showcase's top-of-page hero.
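Conceptually the aggregation is just counting. The input file names and output path here are assumptions for illustration; only the four hero fields come from the description above.

```bash
# Assemble the hero counts (file names assumed, fields per the step above).
jq -n \
  --argjson competitors "$(jq '.competitors | length' .routine_state.json)" \
  --argjson voc "$(jq 'length' voc_phrases.json)" \
  --argjson aeo "$(jq 'length' aeo_entries.json)" \
  --arg updated "$(date -u +%F)" \
  '{competitor_count: $competitors, voc_phrase_count: $voc,
    aeo_entry_count: $aeo, last_updated: $updated}' > showcase/hero_counts.json
```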
Persists last_run_timestamp, last_pulse_id, consecutive_failures per source, last_preview_url. This is how the next run knows what changed.
Creates branch claude/weekly-pulse-<date>, commits all changed JSON / MDs / snapshots, pushes to GitHub. Never pushes to main — Chase promotes manually (or via auto_deploy flag).
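Roughly the shell sequence involved; the pathspecs are assumptions, the branch naming is as described above.

```bash
date_tag=$(date -u +%F)
git checkout -b "claude/weekly-pulse-${date_tag}"
git add -- '*.json' '*.md' snapshots/        # changed state, pulse report, frozen snapshots
git commit -m "weekly pulse ${date_tag}"
git push -u origin "claude/weekly-pulse-${date_tag}"   # never main; promotion is manual
```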
Pushes the updated showcase to a Cloudflare Pages preview with a date-stamped subdomain. Auto-promotes to prod only if auto_deploy_to_prod == true (currently false — manual promotion).
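Roughly what the deploy looks like with the pre-installed wrangler binary. The project name and output directory are assumptions; auth comes from CLOUDFLARE_API_TOKEN in the environment.

```bash
# Preview deploy; the branch name becomes the date-stamped preview subdomain.
wrangler pages deploy showcase/dist \
  --project-name ets-showcase \
  --branch "weekly-pulse-$(date -u +%F)"
```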
RFC822 message built in bash, base64url-encoded, and created as a draft via gws gmail users drafts create. Auth: pre-installed service account (project apt-rope-495015-a5, scope gmail.modify, auto-refreshing token).
Note: this replaces the Gmail MCP connector that worked in Cloud Routines. On the server, the gws CLI is more reliable.
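A sketch of that path, assuming a placeholder recipient and subject. The base64url encoding is what the Gmail API expects for raw messages; the gws flag names are assumptions (only the subcommand comes from the run itself).

```bash
# Build an RFC822 message and base64url-encode it.
raw=$(printf 'From: me\r\nTo: client@example.com\r\nSubject: Weekly Pulse %s\r\n\r\nSummary below.\r\n' "$(date -u +%F)" \
  | base64 -w0 | tr '+/' '-_' | tr -d '=')

# Hand it to the pre-installed gws CLI (flag names assumed).
gws gmail users drafts create --user me --raw "$raw"
```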
Single curl POST to $DISCORD_WEBHOOK_URL. 3-line summary + preview + branch URL, under 2000 chars.
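For reference, the webhook payload is just a JSON object with a content field; the preview URL below is a placeholder that the deploy step would supply in the real run.

```bash
preview_url="https://weekly-pulse-$(date -u +%F).example.pages.dev"   # placeholder
summary="Weekly pulse complete.\nPreview: ${preview_url}\nBranch: claude/weekly-pulse-$(date -u +%F)"
curl -s -X POST "$DISCORD_WEBHOOK_URL" \
  -H "Content-Type: application/json" \
  -d "{\"content\": \"${summary}\"}"
```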
Final session message with the run summary — what shows in the log footer and the cron's _cron_ets-pulse.log.
In execution/. The deterministic tools the LLM calls. No LLM in these — pure data collection / diffing / validation.
In .claude/skills/. Markdown SOPs Claude reads when it needs that domain expertise. Two groups: intel/research and content generation.
In .claude/agents/. The main session delegates focused work to these — they get a fresh context window and a narrow brief.
Every external dependency, what it does, how authentication works.
| Service | Used in | Auth (env var) | Purpose | Est. cost / run |
|---|---|---|---|---|
| Anthropic / Claude Code | Always | CLAUDE_CODE_OAUTH_TOKEN | The LLM runtime itself | ~$0.30 (subscription token) |
| Firecrawl | Step 2 | FIRECRAWL_API_KEY | Scrape competitor sites, G2, Capterra (bypasses bot protection) | ~$0.05–0.30 |
| SerpAPI | Step 2 (fallback) | SERP_API_KEY | Fallback when Firecrawl can't reach G2/Capterra | ~$0 (rarely fires) |
| Pullpush.io | Step 3 | (no key — free archive) | Reddit thread search (Reddit's auth-walled bot path) | free |
| DataForSEO | Step 4 | DATAFORSEO_LOGIN / _PASSWORD | AEO via ChatGPT / Claude / Gemini / Perplexity / Google AIO | ~$1.00–2.00 (the big spender) |
| GitHub | Step 8 | git over HTTPS w/ stored cred | Push claude/weekly-pulse-<date> branch | free |
| Cloudflare Pages | Step 9 | CLOUDFLARE_API_TOKEN | Preview deploy (+ prod if auto_deploy flag set) | free |
| Google Workspace (Gmail) | Step 10 | gws CLI service account | Create Gmail draft (not send) | free |
| Discord | Step 10 | DISCORD_WEBHOOK_URL | Post summary to channel | free |
How this week knows what last week did. The append-only files are the backbone of the time-series.
Every colored tag in this doc maps to one of these execution patterns.
Direct shell commands the main Claude session runs — curl, jq, mkdir, date.
Deterministic Python in execution/. The LLM calls these but never replicates their logic itself.
Markdown SOP in .claude/skills/. Claude reads these when it needs that domain expertise mid-task.
Fresh-context delegate spun up via the Task tool. In .claude/agents/. Returns a single message back.
HTTP call to a third-party service (Firecrawl, DataForSEO, Cloudflare, etc.). Auth via env vars.
Pre-installed binary on the server (git, wrangler, gws). Cheaper + more reliable than raw HTTP.
Model Context Protocol connector. Not used on the server — MCPs were Cloud-Routine-only. Server replaced them with CLIs.
A file the step produces. Always relative to repo root.