# Initial GTM Audit

> A first-run growth diagnostic for a DTC brand. Runs end-to-end from a single Claude Code prompt. Produces one consolidated report with access-gated sections that unlock when logged-in tool access is wired up. Feeds the prioritised action items directly into the brand's `actions.md`.

---

## Purpose

Compress what a traditional agency would bill for over 2 to 4 weeks into a single Claude Code session. The audit is the deliverable for Step 1 of the Operating System (Map the Growth Equation), the first-cut of Step 3 (the prioritised view of where the gaps are), the seed of Step 4 (Highest-ICE Initiatives), and the initial load of Step 5 (prioritised action list). It produces a landscape, competitive gaps, benchmarks, hygiene issues, and a prioritised roadmap — not a single declared bet.

## When to run it

- **Week 1 of a new engagement.** Audit output replaces the proposal doc and becomes the shared source of truth with the founder.
- **Standalone diagnostic.** A self-contained run for a brand that wants the full diagnostic without an ongoing engagement.
- **Strategic refresh every 6 months.** Re-run on the brand to check whether the competitive gaps and priorities have shifted.

## Who this is for (ICP)

Premium DTC brands, AOV $500+ or 2x category median, founder-led, US / western / English-speaking market focus. The playbook works outside that ICP but is tuned for it.

---

## Two axes, four modes

The audit flexes on two independent dimensions.

### Axis 1. Data maturity

Does the brand have historical performance data the audit can analyse?

- **Path A. Established.** Has 3+ months of traffic, conversions, and spend data.
- **Path B. New.** Minimal or zero historical data. Pre-launch, pre-revenue, or first-quarter trading.

### Axis 2. Access grant

Has the client shared admin access to their internal stack?

- **Level 1. External only.** Every check runs off operator-side API keys and public data. No logged-in access required. This is the default state at kick-off.
- **Level 2. External plus Internal.** Logged-in viewer or admin access has been granted on GA4, Search Console, Google Ads, Meta Ads, Shopify, Klaviyo, Clarity, and the rest of the stack. The External layer runs first. The Internal layer is layered on top.

### The four modes

Path crosses level. Each mode has a different centre of gravity.

| Mode | Data maturity | Access level | Centre of gravity |
|---|---|---|---|
| A1 | Established | External only | Competitor audit, public-data SEO diagnostic, Meta Ad Library scan, public-data Growth Equation via Ahrefs + SERP + product-count estimates |
| A2 | Established | External + Internal | Full audit. External layer plus monthly correlation analysis across GA4, Shopify, Klaviyo, ad platforms. Daily dashboard becomes possible |
| A2-thin | Established by calendar, thin by data (< 3 months of conversion history OR < 50 sales events in the last 90 days OR < 20 GA4 `purchase` events in the same window) | External + Internal | Same shape as A2 but SEO pivots from leaking-click recovery to targeting; CRO pivots from funnel optimisation to friction removal on traffic that is arriving; the paid section becomes a readiness audit plus starter campaigns, not a performance teardown. Priority calls triangulate across three or more internal tools for confidence. |
| B1 | New | External only | Audience × founder vision × competitor audit carries the weight. Growth Equation converts to a structural map |
| B2 | New | External + Internal | Same diagnostic shape as B1. Internal access informs the readiness checklist and pre-wires the dashboard for day-one feedback |

### Practical overlap

B1 and B2 produce near-identical diagnostic output because Internal tools have little history to analyse on a new brand. The useful difference: B2 lets us pre-wire the daily dashboard so the feedback loop is live from the first experiment.

A1 and A2 differ substantially. A1 alone is a genuinely valuable report on its own. A2 is the full-stack diagnostic that powers the daily dashboard.

The daily dashboard requires Level 2 access. A1 and B1 stop at the audit report.

### Mode detection

At kick-off, detect the mode:

1. **Path classification.** Check for GA4 credentials in `.env` plus Shopify order history in the client folder. 3+ months of history defaults to Path A. Otherwise Path B. Confirm with the user.
2. **Data-volume check (Path A only).** If Shopify order count is below 50 over the last 90 days, OR GA4 `purchase` events are below 20 over the same window, re-classify from A1/A2 to **A2-thin**. A2-thin runs with full Internal access but adapts section depth per the row in the modes table. Do not quietly run as A2 and produce a report that apologises for thin data.
3. **Access classification.** Inventory the Internal tool list below. If 3 or more Internal tools have working credentials in `.env`, default to Level 2. Otherwise Level 1. Confirm with the user.
4. **Mode sticker.** Record the mode (A1, A2, A2-thin, B1, or B2) in the report header.

---

## Tool-availability gate (strict, two layers)

The audit runs in two layers. The External layer always runs. The Internal layer runs only when the client has granted access and the credentials live in `.env`. Both layers honour the same rule: no fabricated data, no placeholders, no guessed numbers. Mark missing checks N/A and file an access-grant item.

### External layer (operator-side API keys, public data, always runs)

These tools run on any brand without needing the brand owner to grant access. Credentials are held by the operator and live in `.env`. The External layer alone produces a valuable report.

| Tool | Used for | Data source |
|---|---|---|
| DataForSEO | Ranked keywords, SERP data, backlinks for client and competitor domains | Public SERPs |
| Ahrefs | Keyword footprint, backlink profile, content gap, historical rank changes | Public SEO graph |
| Serper | Live SERP snapshots, paid-ad copy capture | Public SERPs |
| Keywords Everywhere | Search volume, trend, related queries | Public keyword data |
| Screaming Frog CLI | Full-site crawl, technical SEO, schema, broken links | Public-facing site |
| PageSpeed Insights | Core Web Vitals per template | Public-facing site |
| Meta Ad Library | 30-day competitor ad snapshot | Public ad library |
| Public audience research (Reddit, Quora, Hacker News, X, YouTube comments, niche forums, blog comments) | Audience insights one-pager | Public conversations |
| AEO citation checks (ChatGPT, Claude, Perplexity) | AEO presence on commercial queries | Public LLM answers |

### Internal layer (logged-in tool access, runs in Level 2 only)

These tools require the brand owner to grant the operator access on their own accounts.

| Tool | Used for | Access required |
|---|---|---|
| GA4 | Traffic, conversions, funnel, attribution | Viewer on GA4 property for our service account |
| Google Search Console | Owned-domain queries, CTR, indexation | Restricted access on GSC property for our service account |
| Google Ads | Account-level spend, campaigns, search terms | MCC invite or user access |
| Google Merchant Center | Feed health, Shopping approvals | User access |
| Meta Ads API | Account-level spend, campaigns, creatives | Ad-account partner invite |
| Meta Pixel / CAPI | Event health on the client's domain | Business Manager asset access |
| Shopify Admin API | Orders, products, customers, AOV, LTV, cohort data | Custom app token or collaborator account |
| Klaviyo | Email and SMS flows, revenue per flow | Private API key |
| Brevo | Transactional and bulk email | API key |
| Microsoft Clarity | Session recordings, heatmaps, rage clicks | Project token |
| Google Tag Manager | Tracking-tag inventory | Container access |
| Airtable | CRM, pipelines, custom records | Base access |

The daily dashboard requires the Internal layer. No dashboard without Level 2.

### Gate behaviour

For every check the audit plans to run:

1. If the check uses External tools only, run it. External tools are always available.
2. If the check uses Internal tools, verify the credential exists in `.env`. If present, run it. If absent, skip — and **do not stub the section with "Requires access" framing**. Sections 7 / 8 / 9 lead with what was observed publicly; Internal-layer gaps surface as forward-looking notes inside the section, not as access-grant CTAs. See `feedback_audit_no_access_asks_in_audit.md`.
3. Track every missing Internal tool **internally** (operator working notes). Access asks land in the SOW deliverable, not the audit. The audit itself does not enumerate access asks. (2026-05-13 update: the Access-Grant Checklist section is removed from the audit deliverable entirely.)
4. Never block the audit on a missing Internal tool. Ship the External layer.

Honest gaps beat invented numbers every time.

### On-page-signal fallback when External SEO tools are all unavailable

Rare but real: a first-meeting audit on a prospect's laptop where DataForSEO, Ahrefs, and Serper are all missing. Keyword research does not stub in that scenario, it degrades to on-page-signal fallback.

For each competitor and for the brand itself, WebFetch:

1. Homepage: title, meta description, H1, H2 block, schema markup, primary nav anchor text, og:title and og:description.
2. Robots.txt and XML sitemap: count indexed-looking paths, log the top 20 URLs by directory prefix.

   **Verify before asserting absence (added 2026-06-16 after the Vicinus false positives).** Never report a technical element "missing" off a single homepage fetch or a redirected/errored request — the most damaging audit error, and an instant credibility loss with a sophisticated reader.
   - **XML sitemap:** fetch `/sitemap.xml` *following redirects* (it routinely 301s to `/sitemap_index.xml`). A 200 means it EXISTS — never say "no sitemap" or "publish a sitemap". A sitemap not declared in `robots.txt` is at most a minor hygiene note (the site likely submitted it in Search Console; the robots.txt line mainly helps crawlers without a submission channel) — never a Key issue or Top 5 item.
   - **Page-type existence:** enumerate the sitemap before claiming any page type is absent ("no product / category / industry / comparison pages", "build the category page set"). A matching URL prefix means the pages EXIST — reframe to "optimise / retarget / consolidate", never "build". Only comparison/"vs" and about pages are commonly genuinely absent; verify each.
   - **Analytics / tags:** if Google Tag Manager is present, GA4, conversion tracking and ad pixels load through it client-side and an outside-in fetch cannot see them — mark "via GTM, confirm in container", never "no analytics / install GTM". Same logic for a cookie-consent gate.
   - **Hygiene ceiling:** sitemap-in-robots, llms.txt, Twitter cards, meta-description presence, alt text and title casing are hygiene — Quick wins at most, never Top 5 / P0, and only when verified.
3. Top 10 internal-nav links: anchor text as a head-term proxy.
4. Top 5 collection or category pages: title, H1, H2/H3, URL slug tokens, schema, faceted-filter labels.
5. Top 5 product or service pages: same extraction plus pricing and trust-signal copy.
6. Blog/resource index (if present): 20 newest post titles and slugs.

Produce `{data_folder}/fallback-onpage-signals.csv` per domain with columns: page_type, url, title, meta_description, h1, h2_list, slug_tokens, schema_types, nav_anchor_text. Manually assign intent per page based on title and H1 patterns. Cluster slug tokens and H1 keywords across competitors to surface the category's head-term vocabulary.

Label the section clearly: "on-page signals only; volume, difficulty, and rank position unavailable without external keyword API access." Section 4C runs on this reduced dataset, 4C.5 consolidation still runs but with the same caveat.

---

## Inputs the playbook accepts

The canonical input is a **structured payload**. Only three fields are guaranteed; everything else the audit derives from public signals and never narrates as missing.

| Field | Required | Notes |
|---|---|---|
| `store_url` | Yes | Lead domain for crawls + enrichment |
| `business_type` | Yes | Sets the analytical frame (see below) |
| `revenue_stage` | Yes | Calibrates realism + proportionality (see below) |
| `business_type_other` | No | Free text when `business_type` = Other; choose the closest frame from it, never run a frameless audit |
| `sells` | No | Derived from the scraped site if blank |
| `audience` | No | Derived from what the brand sells and who it sells to |
| `channels` (multi) | No | Weights recommendations (see below) |
| `primary_market` (multi) | No | Frames keyword / SERP / competitor / AEO / language work |
| `competitors` (3–5) | No | Each entry is a brand **name or a URL**. A typed name is resolved to its canonical domain and validated by two independent signals — a homepage content check (right product/industry) and a cross-check against the client's keyword-overlap competitor set; accepted on either, dropped silently if both fail (never narrated — see the no-data-artefact rule). If the usable set falls below three (blank, or names that did not resolve), derive real category competitors to top it up — Section 3 is never thin for a missing or partial list |
| Client name + project folder | No | Used in output paths/headings on the interactive path; the headless runner uses the audit token |
| Founder-volunteered free text | No | A hypothesis to test against the evidence, never evidence itself, never quoted back as research |

**Blank optional fields are derived, never narrated.** No "the founder did not specify", "not provided", "no competitors were given". The reader never learns a field was blank.

**The structured fields shape the audit, they do not just describe it:**
- **`business_type` sets the frame and what "good" looks like.** A multi-product catalogue gets attribute-gap depth; a subscription brand gets retention/LTV framing; a financial-services / fintech brand (lending, payments, insurance, investing) gets a regulatory/compliance-constrained frame where trust and E-E-A-T are a primary ranking *and* conversion lever and the comparison-site / aggregator / broker backlink ecosystem is the core authority channel; a non-ecommerce type gets the closest-fit treatment and the report says what it would need to go deeper.
- **`revenue_stage` calibrates realism** (budget assumptions, what "good" looks like at this size, motion maturity). It does **not** predetermine the channel — paid acquisition can be the fastest path to first revenue even pre-launch. Derive the levers from the evidence, never from the stage alone.
- **`primary_market`** frames keyword / SERP / competitor / AEO / language work and supersedes any generic "geo strategy" question.
- **`channels`** *weight* the recommendations (do not optimise a thin DTC site for a brand that is mostly wholesale or B2B); they do not set the frame. If "Subscription / recurring revenue" is among them, treat it as a monetization signal, not a distribution one, and weight retention/LTV accordingly.

Every finding is **derived from data**. Do not crown a single "binding constraint", "biggest lever", or "the one thing" — the audit's value is the landscape, the competitive gaps, the benchmarking tables, the hygiene issues, the obvious fixes, and a prioritised roadmap built from them. A founder whose stated problem the evidence refutes is a high-value insight, not an error, and is never restated as a finding.

The two-layer tool gate auto-detects the mode (`A1` / `A2` / `B1` / `B2`); it is recorded in the report header, never asked for as an input. See the **Execution modes** subsection below: an interactive (operator-present) run may additionally gather a founder interview and geo / capital / channel constraints and run Level-2 data — optional deepening, captured as hypotheses, that never gates the run and never changes the output contract.

---

## Execution modes

The audit runs in one of two modes. **The output contract is identical in both** — same two deliverables, same structure, same sections, same tables, same voice and readability rules. Mode changes only the process latitude, never what ships.

- **Headless (the cloud audit-runner).** Driven by a structured form payload, scraped homepage data for the store and competitors, and an enriched external-API bundle. One pass, one response, no follow-up questions. The enriched bundle and scrape stand in for the founder interview and any Level-2 access; there is no interview and no operator present. The runner wraps the model output for parsing — that wrapper contract lives in the runner harness, not in this spec.
- **Interactive (operator-present, e.g. run from the client folder via Claude Code).** An operator may go deeper on the same contract: conduct the founder interview, gather geo / capital / channel constraints, pull Level-2 data (granted GA4 / Search Console / ad-account / Shopify / Klaviyo / Clarity access), and run more or deeper tool passes. All of that is **optional deepening**, captured as hypotheses to test against the evidence; none of it gates the run, and none of it changes the output contract or licenses a different report shape.

A finding's confidence comes from the evidence behind it, not from the mode. Neither mode narrates which mode it ran in, what access it had, or what it did or did not gather (see the Founder-readability rules).

---

## Spec authority (single source of truth)

This spec is the single authoritative source for everything the audit *produces*: structure, sections, required tables, the synthesis pass, voice, the Founder-readability rules, confidence calibration, source attribution, and the input model. The cloud audit-runner's system header is a thin **runtime adapter** only — it states the execution mode, the input bundle, and the output-wrapper parsing contract, then defers to this spec. The header must not re-specify, summarise, or override any output rule, and carries no clause that overrides this spec. Any change to what the audit produces lands here, in this file, and propagates to the runner automatically (it composes this spec at runtime via `compose_playbook.py`) and to the public and Managed copies on sync. This generalises the byte-identical-rubric contract already stated for the confidence pass below ("byte-identical to the runner's `RUBRIC` constant so the two surfaces never drift"). If the runner header and this spec ever disagree on output, this spec wins and the header is the bug.

---

## Output: two deliverables per run

Every audit ships two artefacts. The full report is the reference document; the email overview is the headline read inside the founder's Gmail thread. Both required on every run. **Do not produce a standalone executive summary document.** The report's own front matter (Executive Summary + Top 5 Actions to Ship First, ~2 pages) is engineered to be the founder skim and replaces the separate exec-summary doc that earlier versions of this playbook produced. **Front matter is two sections, not four** — Cross-cutting Themes and Key Findings as separate blocks are dropped (they duplicated content in the Exec Summary and the body section openings). See `feedback_audit_front_matter_two_sections.md` (added 2026-05-13 from the Forest of Chintz audit).

```
{client_folder}/docs/gtm-audit/{YYYY-MM-DD}/
├── {client-slug}-gtm-audit-report.md                 (1) full report, canonical markdown source
├── {client-slug}-gtm-audit-report.html               (1) full report, branded HTML (also hosted at entirecommerce.ai/audits/{slug}.html)
├── {client-slug}-gtm-audit-report.pdf                (1) full report, PDF for email attachment
└── {client-slug}-reopener-email-draft.txt            (2) paste-ready email overview, plain text
```

**Deliverable 1: the full report (.md + .html + .pdf, plus a hosted HTML link).** The ten-section canonical document. Report-voice register (no em-dashes, technical register in Sections 1-10; plain English in the front matter), version-controlled in Git, what future weekly-review cycles read from. The front matter — Executive summary and Top 5 actions to ship first — is explicitly designed as the founder skim, ten minutes between meetings. Sections 1 through 10 are the marketing-team deep read at 40 to 45 minutes and the reference document the weekly-review cadence reads from going forward. The `.html` and `.pdf` are produced by the render stack documented under "HTML + PDF render" below. A parallel hosted HTML version is published at `entirecommerce.ai/audits/{client-slug}-{YYYY-MM-DD}.html` so the founder can read on a phone or tablet without opening an attachment.

**Deliverable 2: the email overview.** A reply-to-thread email (~600-700 words) that summarises the findings in twelve high-impact bullets for the founder who may not open the attached PDF until later. Personal voice, "here you go" register per `feedback_warm_reopener_no_meeting_ask.md` in curated memory. No meeting-ask, no CTA. Saved as `.txt` for clean paste into Gmail (per `feedback_email_draft_format.md`). Structure and voice specified in the "Email overview" section below.

**All report blocks (header, executive summary, top 5 actions to ship first, sections 1 through 11, conclusion) live in the one report markdown file.** Front matter is two sections only (executive summary + top 5 actions); there is no separate cross-cutting-themes or key-findings block (see `feedback_audit_front_matter_two_sections.md`). Do not split sections into separate files. Sub-agents drafting individual sections must return their output to the main session for merge into the single consolidated file. Per-section files in `data/gtm-audit/{YYYY-MM-DD}/` are raw-evidence artefacts only (JSON pulls, CSV exports, per-competitor crawl data, the 15-quote verbatim appendix), never a substitute for the consolidated report. Do not produce a table-of-contents index file that links out to separate section files.

Length target: 4,000 to 8,000 words. Exceeds 8,000 only when data volume genuinely demands it.

Supplementary evidence lives alongside the report in:

```
{client_folder}/data/gtm-audit/{YYYY-MM-DD}/
```

The report is the one thing the founder reads. Everything else is appendix data the report links to.

### HTML + PDF render (Chrome headless via `render-proposal.py`)

After the voice enforcement gate passes on the markdown, render the branded HTML and PDF in a single pass using the agency's render stack at `EntireCommerce/playbooks/render-proposal.py`. The same script powers proposals and audits; both run through HTML + CSS + Chrome headless. The pandoc + LibreOffice path that earlier versions of this playbook prescribed is **deprecated for first-touchpoint client artefacts** (see `feedback_html_render_for_client_artefacts.md` in curated memory). LibreOffice flattened tables, mis-aligned headings, and dropped page-break safety on bullet-heavy content; HTML + Chrome gives full design control with embedded Google Fonts, accent colours, numbered chips, callouts, polished tables.

Standard invocation:

```bash
python3 EntireCommerce/playbooks/render-proposal.py \
  {client_folder}/docs/gtm-audit/{YYYY-MM-DD}/{client-slug}-gtm-audit-report.md \
  {client_folder}/docs/gtm-audit/{YYYY-MM-DD}/{client-slug}-gtm-audit-report.pdf \
  --client-logo EntireCommerce/site/public/logos/clients/{client-slug}.png \
  --client-logo-tint none \
  --eyebrow "Go-to-Market Audit"
```

**Client-logo tint flag.** Inspect the source logo before rendering. If it is white-on-transparent (designed for a dark brand background, e.g., Protelicious's SVG with `fill="#fff"` throughout), pass `--client-logo-tint dark` so the renderer applies `filter: brightness(0)` and the logo lands as solid black on the light cover band. If the logo is dark-on-transparent (most logos), keep `--client-logo-tint none` (the default). Quick SVG check: `grep -oE 'fill="#[0-9a-fA-F]+"' {logo}.svg | sort -u`. PNGs need a visual inspection. Skipping this check produces invisible white-on-white logos.

Output: `{client-slug}-gtm-audit-report.html` (branded, fonts inlined, client logo embedded as base64) and `{client-slug}-gtm-audit-report.pdf` (Chrome headless print) in the same directory.

**Required render flags for audits:**

- `--client-logo {path}`. Path to the client logo PNG / SVG / JPG / WEBP. Convention: drop the asset into `EntireCommerce/site/public/logos/clients/{client-slug}.png` so the same asset is reusable across the playbook page, future proposals, and any other touchpoint. Embedded as base64 inline so the rendered HTML is portable. Existing client logos live there: `protelicious.svg`, `riyanika.png`, `fourth-frontier.png`, `incred-golf.png`, `sandr.png`. Add the new client's logo before the first audit ship.
- `--eyebrow "Go-to-Market Audit"`. Overrides the default proposal-eyebrow ("Proposed Scope of Work") that the same template uses for SoWs. The default would be wrong on an audit.

**Hosted HTML.** Copy the rendered `.html` to `EntireCommerce/site/public/audits/{client-slug}-{YYYY-MM-DD}.html`. Next.js serves it as a static asset at `entirecommerce.ai/audits/{client-slug}-{YYYY-MM-DD}.html`. The template carries `<meta name="robots" content="noindex, nofollow">` so the link stays private (only those who get it can find it). Email both the PDF (attachment) and the hosted-HTML link (in body) — some clients prefer one or the other.

**Title format on the rendered cover.** The H1 in the markdown source must read **"Go-to-Market Audit: {Client Name} ({YYYY-MM-DD})"** — write it out in full. Do **not** use the abbreviation "GTM Audit" in the H1, because batch jargon-replacement passes that gloss `GTM` as `Google Tag Manager` will catch the title and break the cover. The eyebrow above the title carries "Go-to-Market Audit" as well; the title and eyebrow read together as a clear, founder-readable header pair.

**Install check.** `python3 -c "import markdown"` and `ls "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"` should both succeed. If markdown is missing, `pip install markdown`. Chrome headless is the default macOS path on the operator's Mac.

**Clickable URLs and emails.** The render script's `autolink_urls_and_emails` post-processor wraps every bare URL (`https://...`) and email address (`x@y.z`) in an anchor tag automatically. Bare URLs in the markdown source are fine; the renderer handles them. Markdown link syntax `[Book a call](https://entirecommerce.ai/book-a-call)` and angle-bracket form `<https://entirecommerce.ai/book-a-call>` also work — both produce clickable output. The `a` style in `proposal-template.html` is teal-accented + underlined so links are visibly clickable. Bare URLs that ship as plain text in the rendered HTML are a render defect; verify with a grep on the rendered HTML if in doubt.

### Email overview (reopener-style paste-ready draft)

A paste-ready email summarising the full audit in twelve high-impact bullets. Lands in an active Gmail thread with the founder. Does the heavy lifting when the founder will not open the attached PDF until later.

**Length target**: 550 to 700 words of body, plus sign-off.

**Structure** (use this template for every client; adapt bucket names if findings warrant):

1. **Opener** (1 to 2 sentences): relationship-aware greeting.
2. **Framing paragraph**: one paragraph validating what the brand has genuinely built right. Anchors the reader in trust before critique.
3. **Transition line**: "Quick version of what I'd do differently this time" or equivalent.
4. **Three buckets of 4 bullets each (12 bullets total, 10 to 15 acceptable if density warrants)**:
   - **What you've built, hidden in plain sight** — strategic opportunities (founder credentials buried, positioning unclaimed, category whitespace, competitor wedges the client uniquely can exploit).
   - **What's slipping away right now** — decay vectors (branded-traffic dependency fading, AI Overviews compressing clicks, funnel leaks, missed capture, stock-outs without follow-up).
   - **Money already on the floor** — tactical quick wins (review schema, unlinked press reclaim, shipping friction, striking-distance content).
5. **Close** (1 paragraph): "this isn't urgent because you've done anything wrong" frame + the real decay drivers + why the timing matters.
6. **Execution-tag summary line** (1 sentence): name the count of Claude-executable, Hybrid, and Human items so the prospect sees what we run for them. Format: "Of the N action items in the audit, X are Claude-executable with API/MCP access — we set up access once and then run them for you. Y are Hybrid (we build, you approve and ship). Z need your team's hands directly." This is the speed-thesis made tangible and a natural lead into the close.
7. **Sign-off**: "here you go" register per `feedback_warm_reopener_no_meeting_ask.md`. A relationship-anchored line then a first name.

**Subject line rule**: when replying inside an existing live thread with the founder (default for warm re-engagement prospects), omit the subject line entirely. When sending cold to a contact with no active thread, include one. Ask the operator if unclear.

**Voice**: plain English per the Step 6B gate in the prompt. Em-dashes allowed. No unexplained jargon. No meeting pitch, no CTA, no book-a-call link.

**File path**: `{client_folder}/docs/gtm-audit/{YYYY-MM-DD}/{client-slug}-reopener-email-draft.txt`. Plain text, not markdown. Clean paste into Gmail.

### Report structure (fixed order)

**Front matter: two sections only** (not four as in the prior spec). Cross-cutting Themes and Key Findings are dropped as separate blocks; they duplicated content in the Exec Summary and body section openings. See `feedback_audit_front_matter_two_sections.md`.

```
# Go-to-Market Audit: {Client Name} ({YYYY-MM-DD})

[2026-05-15: the reading-frame quoted block that used to sit here is DROPPED. The report opens straight into the Executive Summary. See `feedback_audit_synthesis_first_compression.md`.]

## Executive summary

One short framing sentence (not a 3-to-4-sentence paragraph), then bold sub-blocks each carrying a **bullet list** (What the brand built right / Key issues identified). The Executive Summary sub-blocks are bullets, never prose paragraphs. **There is no "What we recommend" sub-block.** It was dropped because it restated the "Top 5 actions to ship first" table that immediately follows and the Section 11 roadmap (three copies of the same list). The forward-looking priorities live once in the "Top 5 actions to ship first" table; the Executive Summary never restates them as its own bullet list. See `feedback_audit_exec_summary_bullets_and_dejargon.md` and `feedback_audit_front_matter_two_sections.md`. Then the bold blocks:

**Key issues identified.** Three bullets in plain English. Each bullet states the issue, the consequence, and one supporting fact from the audit body. No more than 3 issues — pick the dominant ones. Use plain language; banned consultant jargon (sweep before render): load-bearing, productize, architectural move, leaking conversion, structurally non-negotiable. PDP → product page. Schema markup → page-level tags (with one-clause gloss on first use).

**Commercial-frame lead (mandatory).** The **first** bullet of "Key issues identified" must be a commercial, positioning, competitive, or business-model issue, drawn from Section 2 (Vision / Positioning), Section 3 (Competitor Audit), or the cross-cutting synthesis. Examples that qualify: a control-of-funnel gap (the brand does not own its booking flow / customer relationship / margin), a positioning collision with a category leader, a channel-power asymmetry (marketplace, distributor, franchise, OTA dependency), a missing wedge against the incumbent set. Examples that do NOT qualify for the first bullet (move to the second / third bullet or to "Quick wins"): homepage title tag, mobile LCP, missing schema, llms.txt 404, per-template metadata gaps. Founders have received countless SEO audits; leading the Executive summary with SEO hygiene reads the audit as an SEO audit, not a growth diagnostic. The technical findings are real and stay in the body — they are not the headline.

**Quick wins sub-block (new, after "Key issues identified").** A two-to-three-bullet callout titled **"Quick wins"** sitting between "Key issues identified" and the "Top 5 actions to ship first" table. Each bullet names a single 30-second-credibility item the audit surfaced — a fix the team could ship inside ten minutes that demonstrates the audit was specific and read the site closely. Source these from the SEO / AEO / technical findings (homepage title literally saying "Home page", a missing meta description, an llms.txt 404, an inverted canonical, a logo with double-domain in JSON-LD). These items belong here, not in "Key issues identified" — they are tactical credibility, not strategic findings. Render as plain bullets, no table, no headers other than the **"Quick wins."** bold lead.

The 3 to 5 forward-looking priorities the evidence points to are **not** written as an Executive Summary bullet list. They are expressed once, as the rows of the "Top 5 actions to ship first" table that immediately follows (concrete task + the actual fix + execution Type), and again at full depth in the Section 11 roadmap. The Executive Summary stops at the "Quick wins" sub-block and never adds a recommendations block. **Never crown a single move** ("the single most useful / most important / highest-impact move is X") in the Top 5 table or anywhere else. The audit's value is the landscape, the competitive gaps, the benchmarks, the hygiene issues, and a prioritised roadmap, not one prescriptive bet we might have wrong.

Frame everything as evidence-ranked recommendations, not dictation. **Avoid strong-opinion language**: do not write "mandatory", "non-negotiable", "must do", "not subject to experimentation". Also banned: any "the single most/biggest/highest- … move/lever/thing" singular-crown phrasing. The audit reports evidence and ranks; the founder decides. Use "the priorities the evidence points to", "the highest-confidence fixes", "the audit evidence is unanimous", "strongly recommended" instead.

## Top 5 actions to ship first

Legend immediately above the table, rendered as this exact three-item list (never a prose paragraph; the no-paragraph-over-3-sentences rule applies here too):

- **Claude:** run end-to-end by Claude Code with the right access; a one-time grant, zero ongoing team load.
- **Hybrid:** Claude Code builds; the team approves, refines, or adds creative direction.
- **Human:** the team's hands directly, e.g. founder conversations, design-taste decisions, internal-only credentials.

Then the 3-column markdown table:

| # | Task | Type |
|---:|---|---|
| 1 | **Bolded task title.** Plain-English description with the actual fix specified. *Why this first*: one-line rationale tied to the audit evidence. | Claude / Hybrid / Human |
| 2 | ... | ... |
| 3 | ... | ... |
| 4 | ... | ... |
| 5 | ... | ... |

**Top 5 content mix (mandatory).** The five rows must include at least two commercial, positioning, competitive, or business-model items — drawn from Section 2 (Vision / Positioning), Section 3 (Competitor Audit), the cross-competitor synthesis, or Section 6 (paid creative and account structure when commercial in nature). A Top 5 that is five SEO / AEO / technical-hygiene rows fails this mix requirement; surface the top-2 SEO/AEO items but rerank to make room for the commercial ones. The "Quick wins" sub-block above the table absorbs the additional sub-ten-minute hygiene items the Top 5 cannot accommodate. Rationale: SEO hygiene is real and fixable, but the audit's value to a founder is the strategic frame, not the title-tag sweep. The Top 5 is the founder's skim list — it sets the perceived shape of the audit.

Close the front matter with one neutral line pointing to Section 11 for the full ranked list. **Do not point to an Access-Grant Checklist** — that section is removed from the audit deliverable; access asks move to the SOW. See `feedback_audit_no_access_asks_in_audit.md`.

## 1. Audience
## 2. Vision
## 3. Competitor Audit
## 4. SEO
## 5. AEO
## 6. Paid Ad Recommendations
## 7. Email, Lifecycle, and CRM
## 8. Conversion Optimisation
## 9. Analytics and Tracking
## 10. Organic Social Presence
## 11. Prioritised Action Items

## Conclusion

[REPORT ENDS HERE — at the last line of the Conclusion. The runner
deterministically appends "## Next Steps" (the strong CTA + booking
button), the single supporting-data availability line, and the
signature, AFTER the gate and voice-cleanup passes. Do NOT model-write a
"## Next Steps" section, a "## Methodology and reading frame" block, a
"## Supporting data files" list, or any sign-off: a model-written copy
is stripped, and any drift in the conversion-critical CTA is the single
worst thing that can ship on a cold prospect's first touch.]
```

### Required report structure (hard contract — non-droppable)

The report is incomplete and is rejected and retried by the quality gate if any item below is absent. This contract overrides brevity: the universal ≤3-sentence / bulletise rule governs the **prose within** a section; it never licenses dropping a section, a sub-section that carries a required table, or a required table. Compress prose to fit everything in; never delete a required block to stay short.

**Front matter, two sections only, in order:** Executive summary (bulleted, never prose paragraphs) · "Top 5 actions to ship first" (bulleted Claude / Hybrid / Human legend, then the 3-column table). There is no separate Cross-cutting-themes or Key-findings block; the synthesis-pass cross-cutting insights are folded into the Executive summary narrative and the Section 11 re-rank (see `feedback_audit_front_matter_two_sections.md` and the Synthesis pass section below).

**Executive summary required sub-blocks, in order:** "What the brand built right" (3 bullets) · "Key issues identified" (3 bullets, **first bullet must be commercial / positioning / competitive / business-model**, not SEO hygiene) · **"Quick wins"** (2-3 bullets, sub-ten-minute fixes the team could ship today — homepage title, missing meta, llms.txt 404, schema gaps — distinct from "Key issues" because they are tactical credibility, not strategic findings).

**Top 5 content-mix requirement:** the five-row table must include at least two commercial / positioning / competitive / business-model items. A Top 5 dominated by SEO / AEO / technical hygiene fails the mix; rerank to make room for commercial items, push surplus hygiene into the Executive summary Quick wins block.

**The eleven sections, every one present and substantive, in order** (titles de-jargoned per founder-readability rule 5 — Section 5 is "Visibility in AI Answer Engines", Section 7 names the customer database not "CRM", Section 8 is "Conversion Optimisation"): 1 Audience · 2 Vision/Positioning · 3 Competitor Audit · 4 SEO · 5 Visibility in AI Answer Engines · 6 Paid Ad Recommendations · 7 Email, Lifecycle, and Customer Database · 8 Conversion Optimisation · 9 Analytics and Tracking · 10 Organic Social Presence · 11 Prioritised Action Items.

**Required tables, every one present as a real markdown table** (the section that owns each is marked REQUIRED — non-droppable below):

- Section 2: **commercial benchmark** table (columns: Company | Revenue | Employees | Funding | Ownership; brand-first, competitors after) when public firmographics exist, with basis-tagged figures (filed / reported / estimate) and a scale-context sentence; omit silently when no data.
- Section 2: **Recent signals** — a short bulleted current-news snapshot (brand first, then competitors with material news; at most two dated, source-linked items each). Light, not a structural table; render only when public news exists, omit silently otherwise.
- Section 2: **problem → feature → benefit** table (columns: # | Problem statement | Feature/attribute that solves it | Benefit; top 5 rows).
- Section 3: **competitor positioning** table (columns: Website | SEO Title | SEO Meta Description | Tagline/key headline; audited brand as the first row).
- Section 3: cross-competitor **ranked-keyword distribution** table (brand in the first column or row).
- Section 3: **pricing-and-positioning ladder** table (brand first).
- Section 4B: **backlink / domain-authority benchmark** table (brand first).
- Section 4C.5: **industry keyword universe page-map** — the 6 to 12 cluster summary table plus the top-10 build recommendations.
- Section 4E: **content-scoring rubric** table (factor / weight / question).
- Section 5E: **per-template metadata + schema matrix** — ten page archetypes (homepage, collection/category index, product detail page, blog index, blog article, about/team, FAQ page, technology/methodology, refund/policy, comparison/vs page) as rows against the full field list (title tag, meta description, open-graph and twitter pairs, canonical, every required structured-data type per archetype) as columns, one ✅/❌/⚠️ per cell.

Then the Conclusion. **The report ends at the last line of the Conclusion** (the runner appends Next Steps + supporting-data + signature).

**Required is not licence to fabricate.** Populate every required table from measured or evidenced data. A cell or row with no data is marked "not sampled" / "—" / "not detected" — never an invented number, competitor, or keyword. An honest empty cell is correct; a fabricated row is a failure. A required table the data cannot fill is still rendered with its columns and an honest empty body — the empty table is itself the finding (it shows the gap to close). For §5E, a page archetype with no measured record is a "not sampled" row, never a hedge-prose paragraph.

**Section 11 task tables.** Section 11A keeps the foundational-fixes narrative (the structural argument, evidence bullets, sequencing paragraph, expected impact). The granular, execution-tagged task items live in the Section 11B prioritised roadmap — 11A summarises, 11B is the list.

Section 11B presents three tables, one per bucket:

```
**Top priority** (top 5 by ICE score).

| # | Task | Type |
|---:|---|---|
| 1 | ... | Claude |

**Near-term.**

| # | Task | Type |
|---:|---|---|
| 6 | ... | Hybrid |

**Backlog.**

| # | Task | Type |
|---:|---|---|
| 15 | ... | Claude |
```

Numbering is continuous across the three tables (1-5, 6-N, N+1-end). Close 11B with a one-line tag-distribution summary: "Of the N total items, X are Claude, Y are Hybrid, Z are Human." See `feedback_audit_execution_tags.md` (updated 2026-05-13).

**Plain-English voice (mandatory in the Executive summary block).** The executive summary is the founder skim. Most founders are not marketing or web-engineering experts. Translate every technical term the first time it appears, or substitute a plain-English equivalent. Apply the "Standard plain-English glossary" (see Voice conventions section below) to the Executive summary block. The body of the report (Sections 1 through 10) can use technical register since marketing-team readers will be reading those.

Examples of the rewrite the executive summary needs:

| Technical (banned in exec summary) | Plain English (use in exec summary) |
|---|---|
| PDP | product page |
| Schema markup | page-level tags (with one-clause gloss: "the small bits of code that AI engines and Google read to decide which pages to cite") |
| Productize the international storefront | Build a proper international shopping experience |
| Zero server-rendered titles in the SSR HTML | When Google or an AI engine asks for the homepage, the response is an empty shell with no title. Content only appears after JavaScript runs in a browser. |
| Zero structured data anywhere on the marketing surface | None of the small structured-data tags that Google and the AI engines need to cite the page, anywhere on the site. |
| Every blog post canonicalised to a non-indexable URL | Every blog post carries a tag in its HTML telling Google "do not index this page". |
| AEO whitespace | The AI engines have not yet picked a winner in this category, so moving early is the play. |

No kill criteria anywhere in the audit deliverable — not in the executive summary, not in Section 11A, not in Section 11B. The audit surfaces insights and action items. The weekly review cadence is the kill mechanism for the audit's recommendations: every week reads what shipped, scores progress, and reranks. Hard-coded kill criteria at the audit stage read as overkill before any budget has been deployed.

**Hard timelines are banned in the audit body.** No "inside 90 days", "first thirty days", "first ninety days", "inside the first 30 days", etc. Use directional sequencing ("first wave / second wave / third wave", "as the work compounds") or the Section 11B bucket names (Top priority / Near-term / Backlog). External-system pacing references stay welcome ("the AI-search window is open for the next twelve to eighteen months", "Google indexing latency is 1 to 8 weeks"). SOW-stage timeline language still applies per `feedback_compress_timelines_for_ai_native_delivery.md`. The audit-body restriction is from `feedback_audit_no_hard_timelines.md`.

**"Claude Code" not "we" in the audit-stage register.** The audit may ship before any engagement is signed. Say "Claude Code can run end-to-end with the right API or MCP access" rather than "we run them for you". Named actor is the agent; engagement is conditional. "We" flips back in at SOW stage. See `feedback_audit_claude_code_not_we.md`.

## Conclusion
Bullet-led structure, not a wall of prose paragraphs. The Conclusion is the last block the founder reads; it is a target for the bullets-over-paragraphs rule, not an exception to it. Standard pattern:

- **Opening framing paragraph** (3 sentences max — universal rule, see Output formatting). Names the asset the brand sits on and the one-sentence narrative the audit reveals. No enumeration of action items inside this paragraph.
- **"The N top-priority moves to ship first"** — bullet list of the top 5 from Section 11B's Top priority table, each one short and declarative. One sentence per bullet.
- **"Tag distribution across the full action list"** — three bullets: X Claude, Y Hybrid, Z Human. No fourth conditional bullet (the four-tag bracketed system is dropped).
- **"How this routes into ongoing work"** — short paragraph (2 to 3 sentences) on the weekly review cadence and how the audit becomes the baseline that the cadence reranks weekly.
- **No CTA inside the Conclusion.** The call to action is the `## Next Steps` section that the runner appends deterministically after the Conclusion (do not model-write it). The Conclusion ends on the routing paragraph, and the report ends with the Conclusion.

Banned in the Conclusion: any paragraph that enumerates 3 or more action items, top-priority moves, or tag-distribution numbers inline. Convert to bullets per the bullets-over-paragraphs rule. The Conclusion's job is to be skim-readable on the founder's last scroll; a wall of prose at the end of an 8,000-word report is the regression.

## Next Steps — appended deterministically by the runner. DO NOT WRITE IT.

The "## Next Steps" section is no longer model-written. The runner appends a constant closing block — the strong CTA, the styled "Book a call" booking button, the single supporting-data availability line, and the signature — verbatim AFTER the gate and voice-cleanup passes, so it is never scrubbed, gated, or re-worded and is word-perfect every run. Any model-written `## Next Steps` (or anything after the Conclusion) is stripped before the constant is appended. End the report at the last line of the Conclusion.

Reason: the CTA is the conversion-critical surface on a cold prospect's first touch; it must not vary with model output. Final brand-polished copy, the product name, and the booking link are owned by the runner constant and the focused SaaS-conversion session (`EntireCommerce/saas/03-audit-funnel-saas-conversion-brief-2026-05-16.md`), not the model.

## Methodology and reading frame — REMOVED from the deliverable

Do not write a "## Methodology and reading frame" block or a "## Supporting data files" list. Both are banned in the body (founder-readability rule 2 — no methodology trailer, no enumerated supporting-data list, no internal filenames). The single "supporting data available on request" line is part of the runner-appended closing block, not model-written. Mode / Access-level notation is internal-operator metadata and never appears in the founder-facing report.
```

The report opens straight into the Executive summary (the title line is on the rendered cover) and ends at the Conclusion. Everything after the Conclusion is runner-appended.

Sections 7, 8, and 9 are Internal-layer-dependent. In Level 1 mode they appear as short stubs with a "grant access to populate" CTA. Sections 1 through 6 and section 10 always populate.

### Synthesis pass (mandatory before writing the report to disk)

After drafting all eleven sections in first-pass form, **do not write to disk yet**. Assemble the complete draft (exec summary placeholder + sections 1 through 11) as a single document in your working context and read it end-to-end as one block. Then ask:

- What pattern emerges across three or more sections that no single section captures on its own?
- What does a finding in Section X imply for the recommendations in Section Y? (For example: audience language in Section 1 that the Section 4 SEO cluster ignores; competitor whitespace in Section 3 that Section 6 paid recs do not exploit; Section 5 AEO gap that Section 4 content plan can fill.)
- Is there a strategic narrative connecting the data that the Executive Summary (as drafted) is missing?
- Does any Section 11 action address multiple section findings at once? Promote it.
- Does any high-ranked Section 11 action solve only one problem? Consider demoting it in favour of multi-leverage plays.

Produce 3 to 5 cross-cutting insights from this synthesis. These insights do **not** get their own block. The front matter is two sections only (Executive summary + Top 5 actions). Fold the synthesis in two places:

- The **Executive Summary** leads with the narrative these insights reveal. A cross-cutting insight that spans sections becomes a "Key issues identified" bullet (or, when it is an action to take, a "Top 5 actions to ship first" row), phrased as the insight itself, not as a meta-statement that an insight exists. There is no "What we recommend" block to fold into. State each insight once, at full depth, in the body section that owns it; the Executive summary names it in one sentence and never re-argues its evidence.
- **Section 11** re-ranks actions that hit multiple leverage points: promote actions that address two or more of these cross-cutting insights, demote single-leverage actions previously ranked above them. Do not add or delete items; only re-order.

Do not write a "Cross-cutting themes" heading, a "Key findings" heading, or any standalone front-matter block for these insights. They were dropped (they duplicated the Executive summary and the body section openings); see `feedback_audit_front_matter_two_sections.md`.

Only after the synthesis pass completes do you write the final consolidated file to disk.

### Section-by-section layer coverage

Tags: `[E]` External layer, runs on all modes. `[I]` Internal layer, runs only in A2 or B2. `[E+I]` runs both, with External as the fallback when Internal is absent.

| Section | A2 (full) | A1 (external) | B2 | B1 |
|---|---|---|---|---|
| 1. Audience | `[E]` Full | Full | Full | Full |
| 2. Vision | Founder input | Founder input | Founder input | Founder input |
| 3. Competitor Audit | `[E]` Full | Full | Full | Full |
| 4. SEO | `[E+I]` Full | 85% coverage | Full | 85% coverage |
| 5. AEO | `[E]` Full | Full | Full | Full |
| 6. Paid Ad Recommendations | `[E+I]` Full | Competitor layer only | Readiness check | Readiness check |
| 7. Email, Lifecycle, and CRM | `[I]` Full | Stub | Readiness check | Stub |
| 8. Conversion Optimisation | `[I]` Full | Trust-signal + heuristic only | Checklist framing | Checklist framing |
| 9. Analytics and Tracking | `[I]` Full | Lightweight public check | Readiness check | Readiness check |
| 10. Prioritised Action Items | Full | Full | Full | Full |

---

### Action item execution tags — three-type table system (updated 2026-05-13)

Every action list in the audit renders as a 3-column markdown table with a **Type** column carrying one of three values: **Claude**, **Hybrid**, **Human**. No bracketed prefixes inside item titles. No fourth conditional type. The Type column does the work the prior `[Claude]` / `[Hybrid]` / `[Human]` / `[Hybrid → Claude on access]` bracketed system was doing. See `feedback_audit_execution_tags.md` (rewritten 2026-05-13 from the Forest of Chintz audit).

```
| # | Task | Type |
|---:|---|---|
| 1 | ... | Claude / Hybrid / Human |
```

Legend immediately above the table, rendered as this exact three-item list (never a prose paragraph; the no-paragraph-over-3-sentences rule applies here too):

- **Claude:** run end-to-end by Claude Code with the right access; a one-time grant, zero ongoing team load.
- **Hybrid:** Claude Code builds; the team approves, refines, or adds creative direction.
- **Human:** the team's hands directly, e.g. founder conversations, design-taste decisions, internal-only credentials.

**Voice register:** say "Claude Code can run end-to-end" not "we run them for you". The audit may ship before the engagement; named actor is the agent, engagement is conditional. "We" flips back in at the SOW stage. See `feedback_audit_claude_code_not_we.md`.

| Type | Definition | Examples |
|---|---|---|
| **Claude** | Claude Code can run end-to-end with the right API or MCP access. Zero load on the team beyond the one-time access grant. | Schema injection; slug typo fix; image compression with width/height attributes; ad-account configuration; dashboard build; robots.txt edit; Conversions API setup; brand-mention reclaim sweep; ongoing review monitoring; press blog (`/blogs/press`) scaffold and posts; AI-bot allowlist; policy-page indexability review; technical SEO fixes via GitHub PR. |
| **Hybrid** | Claude Code builds the work; the team approves, refines, or contributes creative direction. Most client-facing deliverables involving founder voice or operational decisions live here. | International storefront setup with founder-approved payment / fulfilment choices; occasion-led shop with founder-voiced editorial; worn-on-model video production (requires creator / filming); plating warranty terms requiring founder / legal sign-off; stockist page with founder-approved retailer list; craft journal with founder editorial voice; CMS migration where the founder picks the platform. |
| **Human** | The founder or in-house team's hands are required directly. Judgement calls, founder conversations, customer interviews, board approvals, design-taste decisions, vendor negotiations. Typical count: 1-2 items per audit. | Founder Vision interview (Section 2 confirmation); customer-interview wave; board-level positioning call; partnership negotiations; legal sign-off; in-person investor meetings. |

**Classification rule.** The Type names the work's dominant nature, not the current access state. An item that "could be Claude once access is wired" is still classified Claude. The access ask becomes a separate engagement-scoping conversation that belongs in the SOW.

**Mandatory tag-distribution close.** After every action-list table (Top 5 + Section 11B Top priority + Near-term + Backlog), write a one-line summary: "Of the N total items, X are Claude, Y are Hybrid, Z are Human." Repeat in the Conclusion and in the email overview close. Speed-thesis is the closing pitch.

**Banned in the audit body** (deprecated from 2026-05-13):
- Bracketed prefixes inside task titles: `[Claude]`, `[Hybrid]`, `[Human]`, `[Hybrid → Claude on access]`. Use the Type column instead.
- "We run them for you" framing in the legend or tag-distribution lines. Use "Claude Code can run end-to-end" instead.
- The fourth `[Hybrid → Claude on access]` tag. Items either are Claude (dominant nature is automatable) or Hybrid (dominant nature needs human creative or approval). Access state lives in the SOW conversation.

**Edge cases:**

- An item that requires an external vendor (designer, lawyer, agency) is Human. We do not classify external-vendor work as Hybrid unless the vendor is part of the EC delivery.
- When in doubt between Hybrid and Claude, default to Hybrid — most knowledge work has a founder-voice or approval component even if Claude Code can draft the artefact.

---

## Section 1. Audience

Runs entirely on the External layer. The canonical master prompt lives at `EntireCommerce/playbooks/audience-insights-master-prompt.md`. That prompt is the authoritative spec. Do not rewrite it inline. Edits to the master prompt propagate to every audit.

### How to execute

1. Read the master prompt in full.
2. Feed it the three required inputs: target audience, #1 struggle, what the brand sells and how it helps. These come from the founder interview and the brand brief.
3. **First pass — review walls and standard sources.** Mine Wedmegood, Wanderlog, Trustpilot, Reviewfeeder, Reviews.io, brand-owned review pages, ProductReview.com.au, SmartCustomer (ex-Sitejabber), and any Reddit / Indie Hackers / Hacker News / X long-form / Quora / YouTube comments / niche forums that index the segment. Last 12 to 24 months. Target: 15 verbatim quotes.
4. **Second pass (mandatory when first pass yields under 20 quotes).** Apify `apify/instagram-scraper` on the brand-and-competitor Instagram accounts. Last 30 posts per account, all comments, target long emotional comments of 10+ words. Plus brand-direct review walls not captured in the first pass. Common case where this is mandatory: Indian, South Asian, MENA, and Southeast Asian segments where Reddit / forum discourse is genuinely thin and conversation happens on Instagram comments and WhatsApp groups. Cost: ~$1-3 per audit. See `feedback_audit_apify_defaults.md` (2026-05-13 from Forest of Chintz).
5. Combined first-pass + second-pass target: 30-60 verbatim quotes total. Log platform, approximate date, primary emotion, and any recurring phrase per quote.
5. Produce the Audience section. Front page ≤500 words. Nine sub-headings as specified: avatar, core tension, what they tried and why it broke, market gap, recurring language, three headlines, top three objections and responses, positioning angle, belief shift.
6. Append the 15-quote verbatim table in the supplementary data folder. Cross-link from the report.

### Quality bar

- Every claim in the section traces to a quote in the appendix.
- No speculation, no invented stats, no fabricated sources.
- Uses the audience's own register in headlines and objection responses.

### Founder review gate

The founder reviews and approves this section before downstream work ships. Positioning drift downstream is more expensive than a 20-minute founder review upstream.

### Supplementary Voice-of-Customer mining (when first-party reviews exist)

If the brand has a review corpus (Shopify product reviews, Amazon, Google Business, niche forums like GolfWRX, competitor review pages the client's product solves against), layer a 5-category VOC mining pass on top of the master-prompt output. This is the quarterly section from the weekly review, seeded at first-run.

Categories:

1. **Pain points (before product).** Specific pain in the reviewer's exact words. How long they suffered. What they tried before. Emotional state (frustrated, embarrassed, desperate, skeptical).
2. **Desires and outcomes (after product).** Specific outcomes mentioned. Timeline to results. Emotional state after using the product. Unexpected benefits.
3. **Objections and hesitations.** Price concerns and how they overcame them. Skepticism about claims. Competitor comparisons. Fear the product would not work for them.
4. **Exact customer language.** Direct quotes usable verbatim in ads. Vivid descriptions of the problem. Emotional language about results. Phrases starting with "I finally...", "I wish I had...", "I can't believe...".
5. **Transformation stories.** Reviews with a clear before-and-after arc. UGC script goldmines.
6. **Post-purchase / product-experience friction.** Recurring complaints reviewers report *after* buying — hardware faults, app or software problems, durability, fit, shipping and returns, support quality. A different lens from the pre-purchase pain in category 1: it is the operational signal that may be driving returns and weakening word-of-mouth, not copy raw material. Capture the exact phrasing and rank by frequency.

Preserve exact customer language. Do not paraphrase or clean up grammar. Rank everything by frequency. Save to `{data_folder}/audience-voc-mining.md` alongside the main quote table.

**Product-experience themes (required in-report block when the corpus supports it).** Beyond the verbatim buyer-language block, Section 1 carries a short **"Product-experience themes"** sub-block that synthesises category 6 above into 3 to 5 named, frequency-ranked themes, each one plain sentence, with a representative verbatim phrase where a clean one exists. Frame it as an operational signal — these may raise return rates and weaken word-of-mouth, and they point at product / support fixes the brand owns — and frame the caveat without limitation language: it is review-derived qualitative signal, and quantifying the return-rate impact is the confirming next step. Omit silently when the corpus holds no material post-purchase complaints; never invent friction. This is the post-purchase lens; the verbatim quote block remains the pre-purchase buyer-language artefact.

When reviews are mined from competitor products, note which complaints the client's product solves. That cross-reference feeds the objection-handling sub-section of Audience and the whitespace map in the Competitor Audit.

### Self-audit and operator-authored variant (narrow)

The 15-quote VOC mining is **mandatory** on every run, including standalone audits and self-audits. The operator is not a substitute for the audience, even when the operator belongs to the audience. Operator knowledge describes the audience; the 15 verbatim quotes produce the raw language (exact phrases, emotional register, recurring idioms) that becomes copy for headlines, ads, and email. Skipping the appendix sacrifices the copy raw material.

The only permitted skip: when the operator has mined this exact audience in a prior engagement and that mining lives on disk at `{operator_folder}/audience/{audience_slug}.md` with a date within the last 6 months. In that case: cross-link to the prior mining from Section 1.10 of the report, note "audience mined in prior engagement dated {YYYY-MM-DD}", and proceed. Otherwise always run the master prompt.

---

## Section 2. Vision

Short structured section from a 30-minute founder conversation. Captures the things the data cannot reveal. Neither layer is required. The founder is the source.

### Fields to fill

| Field | Prompt |
|---|---|
| 12-month revenue goal | What does the P&L look like 12 months from now if this works? |
| Target customer (founder's own words) | Who does this product exist for? Describe in the founder's vocabulary |
| Positioning hypothesis | Why does this brand win? What does it do that nobody else does? |
| Product origin story | Why did the founder build this? What was the triggering pain or insight? |
| Known constraints | Capital, team, time, supply chain, regulatory |
| What has been tried | Every channel and experiment run so far, success or failure |

### How to execute

- Either conduct the interview live (founder present) or ask the founder to fill a structured form.
- Transcribe verbatim where the founder's phrasing is distinctive. Exact words become positioning raw material.
- If the founder volunteered a hypothesis about what is holding growth back, cross-reference it against the evidence in Section 11. If the data disagrees, flag the disagreement explicitly. That flag is load-bearing.

### Commercial benchmark (firmographics — render when data is available)

Before the commercial-structure table, size the prospect against the field on real commercial facts, not just the marketing surface. A "commercial view" means external firmographics — revenue, headcount, funding, ownership — pulled from public sources (company filings, press, data brokers), NOT a re-labelled framework table. Without it the audit can silently measure a 17-person seed-stage challenger against a venture-scaled leader on backlinks alone and read as if it did not understand who the brand is.

Render as a benchmark table, the audited brand FIRST (client-first), competitors after:

| Company | Revenue | Employees | Funding | Ownership |
|---|---|---|---|---|

Calibration is mandatory and mirrors the rest of the audit. Tag every figure by basis: a company-registry or annual-report number (e.g. UK Companies House) is **filed** and citable as fact; a named-press or company figure is **reported**; a third-party data-broker number (Growjo, GetLatka, ZoomInfo, RocketReach) is an **estimate** and is labelled in the cell ("~$2.8M (est.)"). Never state an estimate as fact, never average or infer a missing number — an unknown cell is an honest "—". After the table, add one or two sentences of scale context that reframe the roadmap: when the prospect is materially smaller than the category leaders, say so plainly and justify a winnable-wedge strategy over a match-the-leader one.

Conditional: render this table only when public firmographics exist for at least the brand or one competitor. When no firmographic data is available, omit the table and the scale-context framing silently — never narrate the absence.

### Recent signals (render when public news exists)

After the commercial benchmark table, give a brief, current snapshot of what the open web is saying about the prospect and its competitors right now — funding, launches, expansion, leadership changes, partnerships, results, acquisitions, regulatory moves. This is a factual current-state read, not a strategic framework: it tells the founder what is live in their landscape this quarter, and it surfaces competitor moves the rest of the audit should be read against.

Source from a news search over roughly the last 12 months (the runner supplies `commercial_news`). Render as a compact bulleted list, the audited brand first, then each competitor that has material news:

- **Brand** — one-sentence development *(date)* — [source]
- **Competitor** — one-sentence development *(date)* — [source]

At most two items per entity, most material first. One plain sentence each; keep the date and link from the source. Drop low-signal results (listicles, generic pages, unrelated same-name companies) and drop any entity with no material news rather than writing "no recent news". Report only what a result states — never extrapolate a trend or a motive from a single headline.

Conditional and light: render only when public news exists for at least the brand or one competitor. When no news data is available, omit the block silently — never narrate the absence. This is not a required structural table and is not gate-checked.

### Problem → feature → benefit table (REQUIRED — non-droppable)

Section 2 ships a five-row table that connects the buyer's problem to the product and to the payoff. Exactly four columns, in this order:

| # | Problem statement | Feature/attribute that solves it | Benefit (problem restated positively) |
|---:|---|---|---|
| 1 | … | … | … |

Rules:

- **Top 5 rows.** The five problems that matter most to the buyer, highest-frequency or highest-stakes first.
- **Problem column is grounded in real customer language where it exists.** Draw it from the verbatim Voice-of-Customer set (the mined Reddit / Trustpilot / review quotes that also feed Section 1). Use the buyer's own short phrasing. Where no VOC evidence covers a problem, infer it from what the brand sells and category knowledge and mark that row's problem `(inferred)`. Never invent a customer quote or attribute a problem to customers the evidence does not support. This stays inside the no-overconfidence rule and the VOC-lens pass: a problem with no VOC backing is inference, labelled as inference, never stated as a measured customer finding.
- **Feature/attribute column** names the specific product feature, material, design choice, service, or guarantee that addresses that problem. Concrete, not a slogan.
- **Benefit column is the problem restated positively** — the mechanical inversion of the problem once the feature is applied. This is the safe column: it follows from the first two.
- Render as a real markdown table. It is checked by the structural-completeness quality gate; compress prose elsewhere if length is tight, never drop this table.

### No-interview variant

When no founder interview has happened, build this section from public signals: homepage copy, About page, LinkedIn/X of the founder, podcast appearances, press mentions. Label the section "Inferred Vision" and note that founder confirmation is pending. Every inferred claim must link to its public source. The problem → feature → benefit table still ships in this variant; its Problem column still leans on the VOC set first and marks non-VOC rows `(inferred)`.

---

## Section 3. Competitor Audit

Runs entirely on the External layer. For each of the 3 to 5 competitor URLs the founder provides, produce a full in-depth teardown. Quality bar: full catalogue crawl, keyword footprint, content gap, attribute-tuple gap analysis for catalog businesses, ad-library scan, paid-SERP intel, and cross-competitor synthesis. Anything shallower gets rejected.

**Step 0 — confirm the competitor set + benchmarking opt-in.** Run the competitor-set decision tree from `PLAYBOOK-TEMPLATE.md` ("Benchmarking tables (conditional, opt-in)") before touching Section 3. If `{client}/CLAUDE.md` already has a populated `## Competitors` block, use it. Otherwise auto-propose a set from the SERP and keyword data this audit has already pulled (Section 4 typically runs before Section 3 in the Internal-layer ordering, but if not, kick off a Serper head-term scan first). Present the proposed set to the user with a one-line rationale per competitor, and ask: "Confirm, swap, or skip benchmarking?" If the user provides or confirms, write the set back into `{client}/CLAUDE.md` so future runs inherit it. If the user skips, render Section 3 without BTs and note "Benchmarking skipped — no competitor set provided."

**Conditional BT deliverables for Section 3+ (when a competitor set exists):**
- `BT-VOC-trustpilot-google.md` — Trustpilot + Google Maps + Amazon review counts and ratings per competitor. Stats-only refresh: `kaix/trustpilot-reviews-scraper` with `maxItems=1` per domain (~$0.001 per BT refresh). See PLAYBOOK-TEMPLATE for schema and pull paths.
- `BT-SEO-organic-visibility.md` — Section 4 deliverable, populated alongside the Section 4 SEO work.
- `BT-PAID-meta-google-presence.md` — Section 6 deliverable.
- `BT-EMAIL-lifecycle-coverage.md` — Section 7 deliverable.
- `BT-CRO-page-experience.md` — Section 8 deliverable.
- `BT-SOCIAL-channel-strength.md` — captured here in Section 3 if not already in a parallel competitor-creative-audit run.
- `BT-CATALOG-breadth-depth.md` — for Shopify competitors (`autofacts/shopify` Actor).

Each BT is conditional on the Step 0 outcome. Skip cleanly if the user opted out.

### Benchmarking-table layout rule (universal across the audit)

Every benchmarking table in the report places the **client's metrics in the first column** (when entities are columns) or the **first row** (when entities are rows). Competitors fill the columns or rows after that. Empty cells are fine — `—`, `0`, "not detected", or a brief honest framing — but the client's column or row is always present, even when every cell is empty. The empty column itself is the narrative: it shows the gap to close, the headroom available, the growth potential of catching up to the competitive field.

Applies across (at minimum): Section 3.3 cross-comp synthesis, Section 4B backlinks comparison, Section 4C ranked-keywords distribution, Section 5 AI search visibility citation counts, Section 6 paid-creative volume comparison, Section 8 Core Web Vitals comparison. Do not bury the client in a "for comparison" paragraph below the table — that defeats the purpose of the benchmarking format. See `feedback_benchmarking_table_client_first.md` in curated memory.

### Per-competitor sub-section

**2026-05-15 — report-output rule (overrides the structure below for the founder-facing document).** Section 3 in the rendered report is **synthesis-first**: lead with the whitespace map, the shared-weakness map, the pricing-and-positioning ladder, and the benchmarking tables (client first). Each competitor then gets a **3-4 line highlight only** (the one or two findings that matter for the client). The six-subsection teardown below is the *internal research depth* the sub-agent produces; it does NOT ship verbatim in the founder-facing report. Compress on merge. The granular per-competitor keyword/backlink/ad data lives in the supporting-data bucket, referenced only by the single closing email line. See `feedback_audit_synthesis_first_compression.md`. The subsections below define the research the analyst runs:

Every competitor gets six research subsections. Every check is External layer.

**3.1 Technical SEO.** Screaming Frog crawl. Core Web Vitals by template via PageSpeed Insights. Schema coverage. robots.txt and XML sitemap check.

**3.2 Keyword footprint.** Top 100 ranking keywords, monthly organic traffic estimate, top landing pages by traffic. Source order: Ahrefs first (primary), DataForSEO Labs second (fallback). If Ahrefs is absent, DataForSEO Labs alone is acceptable. DataForSEO Backlinks is **not** a keyword source and is being retired 2026-06-06 — keyword footprint runs on Ahrefs or DataForSEO Labs only.

**3.3 Content gap.** Keywords the competitor ranks for that the client does not. Quantified by volume and difficulty. Cluster the gap by intent (informational, commercial, transactional). **Tool note:** do not use `seo_toolkit.py keyword-research` gap mode directly. Its underlying DataForSEO `labs_keyword_intersection` call returns keywords BOTH domains rank for (intersection), not the set difference (true gap). Correct pattern: pull `labs_ranked_keywords` on the client domain and each competitor separately, then compute `competitor_keywords - client_keywords` in Python. Quantify from the resulting delta. Confirmed bug across four runs. Save per-competitor output as `content-gap-vs-{competitor}.csv` and a top-30 shortlist as `content-gap-vs-{competitor}-top30.csv`. Do not treat per-competitor CSVs as the final keyword artefact; Section 4C.5 consolidates them into the category universe with page-mapping.

**3.4 Meta Ad Library scan.** 30-day snapshot of active ads. Apply the 6-dimension analysis specified in Section 6 (hooks, messaging angles, formats, production style, CTA and offer strategy, volume and rotation). Flag any ad running more than 14 days as a probable winner. Output the whitespace map and the "steal this" list for this competitor.

**Default method (updated 2026-05-13): `apify/facebook-ads-scraper` via the Apify API.** Input via `startUrls`, one URL per brand: `https://www.facebook.com/ads/library/?active_status=all&ad_type=all&country=ALL&q={brand_name}&search_type=keyword_unordered`. Count = 30 per brand. Run cost: well under $5 for a 7-brand scrape returning 4,000-5,000 ads. Filter the raw dataset to the actual competitor pages by matching `snapshot.pageName` to the competitor brand list (the keyword search returns noise from unrelated advertisers that happen to match the keyword). Surface long-runner count (active 14+ days = probable winner), media mix (carousel / image / video), top CTAs, and sample copy per brand. See `feedback_audit_apify_defaults.md` and the FOC 2026-05-13 worked example.

**Fallback (rare):** if `APIFY_API_TOKEN` is genuinely missing, use a Playwright or headless-browser script. Manual screenshot capture is the last resort and is no longer the default. Do not ship the section with "manual screenshot pass required" framing — that was deprecated 2026-05-13.

**3.5 Google Ads intel.** Pull paid SERP copy via Serper on the top 20 commercial queries. Log headline patterns, description patterns, sitelink usage, offer language.

**3.6 Summary verdict.** Three things this competitor does better than the client. Three things the client can beat them on. Two paragraphs max.

### Cross-competitor synthesis

**REQUIRED — non-droppable.** The **competitor positioning table**, the cross-competitor **ranked-keyword distribution table**, and the **pricing-and-positioning ladder table** (all three with the brand in the first column or row) are required in every report and are checked by the structural-completeness quality gate. Compress prose elsewhere if length is tight; never drop these tables.

**Competitor positioning table (lead the synthesis with this).** Four columns, in this exact order: **Website | SEO Title | SEO Meta Description | Tagline/key headline**. The audited brand is the **first row** (its homepage was scraped). Each competitor follows, one row each: provided competitors come from the scraped competitor homepages; when the founder supplied none, the runner auto-derives 3 to 5 category competitors and fetches their homepages, so this table populates either way. Source mapping per row: SEO Title = the homepage `<title>` tag; SEO Meta Description = the homepage meta description; Tagline/key headline = that site's `h1` or `og:title`, bounded to 1 to 3 short lines (never the whole hero block). A cell with no fetched value is an honest `—`; never invent a title, description, or tagline. If a derived competitor turns out to be the wrong entity (a same-name company in a different industry, a parked or error page), drop that row silently. Never add a sentence explaining the mismatch, that a URL resolved to a different company, or that the competitor set was auto-built or "should be defined more precisely" — that is a banned data-collection-artefact narration (see the no-limitations rule below). The reader never learns the set was machine-derived. This models the supplied S&R Jewellers reference layout.

After the 3 to 5 individual teardowns, produce a combined summary:

- Competitor positioning table: the four-column Website / SEO Title / SEO Meta Description / Tagline snapshot specified above, audited brand first row. This is the opening artefact of the synthesis.
- Whitespace map: what none of the competitors are doing that the client could own.
- Shared weakness map: what all the competitors do badly that the client can arbitrage.
- Pricing + positioning ladder: where each competitor sits on price and positioning, where the client sits. Render as a table with the client in the first column and competitors arranged in the columns to the right, per the benchmarking-table layout rule above.
- Catalog attribute gap: for product-catalog businesses (jewellery, apparel, accessories), produce a full attribute-tuple gap analysis. Normalise the competitor catalogue and the client catalogue into attribute tuples, then compute the gap table (tuples the competitor has and the client does not), matched tuples (both have), and client-only tuples. Rank gap tuples by competitor depth as a proxy for market demand. Tuple definition varies by category. Jewellery example: (product_type, subtype, stone_type, stone_shape, metal_type, metal_colour, carat_range). Apparel example: (product_type, fabric, cut, colour, size_range).

Per-competitor supporting data lands in `{data_folder}/competitors/{competitor-slug}/`.

### Competitive creative (Meta Ad Library — rendered when `creative_audit` is present)

A live read of each advertiser's **active** ads from the Meta Ad Library, folded into the Section 3 synthesis. The runner pulls the active ads per advertiser and a synthesis pass maps them (`creative_audit`). Render as a table after the positioning synthesis — Brand | Dominant format | Typical CTA | Messaging angle (verbatim hooks) — audited brand first, then each competitor; quote 1-2 hooks verbatim in the last column. Then one or two sentences on the whitespace (`creative_audit.whitespace`): the positioning angle the field leaves open that the brand could own. These are real scraped ads — quoting the hooks verbatim is required; never invent a format, CTA, or hook. A competitor with no active ads is an honest "—" row or dropped. Conditional: omit silently when no `creative_audit` data exists; never state that ad creative was not scraped. Not gate-checked. (The deeper creative-fatigue and hook-pattern teardown stays in Section 6 when paid access is granted; this Section 3 block is the outside-in landscape read.)

---

## Section 4. SEO

Technical SEO, keyword and content diagnostics, backlink profile, on-page opportunity patterns. External layer covers most of it. GSC is the key Internal-layer add-on when access exists.

### 4A.0. Stack and infrastructure identification (run first, before any technical recommendation)

**Why this is here.** The audit makes structural recommendations (CMS migration, SSR retrofit, headless commerce, etc.). Wrong recommendations land the agency in a credibility hole when the founder reads "migrate to a CMS" and we already know the blog is on Sanity. **Always identify the stack before recommending a stack change.** This subsection runs first in Section 4. Every other 4A finding interprets through it.

**Domain-disambiguation pre-flight (runs before anything else in 4A.0).** Before any stack work, confirm the submitted `storeUrl` is the brand's primary commercial site — not a sister property, a brochure split, a parent-group page, a legacy domain, or a marketplace listing. Signals that the submitted domain may be wrong:

- The homepage redirects (301/302) to a different root domain than the one submitted.
- The homepage names a different domain as the "shop", "book", "buy", or "store" link target (e.g. submitted `whalesborough.co.uk` but the booking CTA links to `whalesboroughliving.co.uk`; submitted a brand page on a parent group site but the actual brand domain is different).
- The submitted domain's homepage has no commerce / lead-capture / booking flow and the brand's commercial flow lives elsewhere.
- The submitted domain returns a thin landing page or a coming-soon shell while a parallel domain is the live commercial site.
- The founder brief lists a different domain as the "primary URL" than the form-submitted one.

When the pre-flight surfaces a likely-wrong primary, **do both of the following, in this order:**

1. Add a single bullet to the Executive summary "Key issues identified" block (or as the very first item before any other content if the Key issues block is otherwise commercial-frame heavy) reading **"Audit scoped to {submitted-domain}, but the brand's primary commercial site appears to be {detected-primary-domain}. The structural and SEO findings below describe {submitted-domain}; running the audit against {detected-primary-domain} would change most of Sections 3 to 9."** Render the alternative domain as a clickable link.
2. Continue running the audit against the submitted domain as scoped — do not silently switch. The founder is paying attention to the domain they submitted; switching surfaces is the model's call to make explicit, not silent. Note the alternative-domain finding once in the Executive summary and let the founder request a re-run.

Mark `none` and skip the bullet when the pre-flight clears (submitted domain IS the primary commercial site).

**Inputs to gather (in this order):**

1. **Ask the founder directly** (highest-confidence source). One-line questions, paste the answers in `CLAUDE.md` under "Tech stack":
   - What CMS is the marketing site on? What CMS is the blog on? (They often differ.)
   - What framework / runtime? (Next.js, Astro, Hugo, Gatsby, WordPress + theme, Shopify Liquid, Webflow, Framer, custom.)
   - What rendering mode? (SSR, SSG, ISR, CSR / SPA — confirm per route if there is a mix.)
   - Where is it hosted? (AWS, Vercel, Netlify, Cloudflare Pages, Render, self-hosted, Shopify-managed, WordPress.com, etc.)
   - What is the deploy pipeline? (GitHub-wired CI/CD, manual deploys, theme editor, etc.)
   - Is the codebase open-source friendly (Payload, Ghost, WordPress) or proprietary?
   - Any legacy redirect / old-domain history? (Domain consolidations are a common source of canonical bugs — see Askria 2026-05 case study below.)

2. **Verify via live HTML inspection** (always run, even when the founder has answered). For each major route archetype (homepage, blog post, persona page if applicable):
   - `curl -sL -H "User-Agent: Mozilla/5.0" {URL} -o /tmp/page.html`
   - Inspect the head for: `<title>`, `<link rel="canonical">`, `<meta name="description">`, Open Graph tags, JSON-LD blocks, `<meta name="generator">` (often reveals CMS).
   - Inspect the body for actual rendered text vs an empty Next.js `__next_error__` shell.
   - Inspect the script src paths for framework fingerprints: `_next/static` (Next.js), `_nuxt/` (Nuxt), `_astro/` (Astro), `wp-content/` (WordPress), `cdn.shopify.com` (Shopify), `assets.webflow.com` (Webflow), etc.
   - Inspect the head for CMS-specific meta tags: `name="generator"` for WordPress, `next-head-count` for Next.js Pages router, `data-precedence="next"` for Next.js App router, `framer-` classnames for Framer, etc.
   - Run on each route archetype because the site may use different stacks per route (Askria: marketing pages on bespoke Next.js, blog on Sanity).

3. **Verify via DNS and headers**:
   - `dig +short {domain}` and `dig +short www.{domain}` — reveals hosting (Vercel CNAME, Cloudflare, AWS CloudFront, etc.).
   - `curl -sI {URL}` — `Server` header, `X-Powered-By` header, `X-Vercel-*` headers, `CF-Ray` (Cloudflare), `via` header (CDN), etc.
   - WHOIS on legacy / variant domains the founder mentions (often reveals canonical-bug root causes).

**Output: a `stack-evaluation.md` file in the supporting data folder, following this template:**

| Surface | Stack |
|---|---|
| Marketing pages (`/`, persona pages, `/pricing`, `/about`) | e.g. Next.js App Router, bespoke React, no CMS |
| Blog (`/blog/...`) | e.g. Sanity (headless CMS) + Next.js renderer |
| Product app (if applicable) | e.g. Next.js + custom auth + AWS |
| Hosting | e.g. AWS (marketing + blog), separate AWS for product app |
| Rendering mode per surface | e.g. marketing = SSR (body) + empty head; blog = SSR with Sanity-driven head |
| Deploy pipeline | e.g. GitHub → GitHub Actions → AWS |
| Source-of-truth: founder confirmation Y/N | If N, mark every claim "verified via live HTML inspection only" |

**The audit's structural recommendations in Section 4A and Section 11 reference this stack-evaluation table.** Do not recommend "migrate to CMS X" if the surface in question is already on a comparable CMS. Do not recommend "retrofit SSR" if the route already SSRs. Do not recommend "switch hosting" without naming a concrete cost-benefit. **Every structural recommendation must name the surface it applies to (marketing pages, blog, product app, etc.) and acknowledge what is already in place.**

**Worked example (Askria, 2026-05-08).** May 7 audit recommended a CMS migration as the highest-leverage move. May 8 founder note: "we are running the blog on Sanity already, not sure why it's not indexing". Live HTML inspection confirmed: blog on Sanity (SSR + full head metadata + Article JSON-LD); marketing pages on bespoke Next.js (SSR body, empty head). The actual problems were a canonical-domain bug in the Sanity blog template (5-min fix) and a `generateMetadata` gap on the marketing pages (3-5 dev days). The original CMS-migration recommendation was misframed for the blog and overscoped for the marketing pages. **Cause: stack identification was run via Screaming Frog crawl alone, which did not capture the per-surface stack split.** Stack-evaluation step would have caught this on day one.

### 4A. Technical SEO and site health

> **Runner context:** when the enriched data carries an `onpage` array (the headless funnel runner — Screaming Frog cannot run there), that MEASURED per-template data (see §5E) is the technical-crawl source for §4A. Build the technical findings from it; do not narrate a missing crawl. The Screaming Frog bullet below is for the desktop / interactive flow where the licensed app is available. Site-wide full-crawl breadth (every URL, orphan pages) is deferred to EC Operator, not the free funnel.

- `[E]` **Screaming Frog crawl (mandatory, desktop/interactive flow).** Run a Screaming Frog SEO Spider headless crawl on the brand's primary domain. EntireCommerce holds a paid licence so cost is zero at the margin. Standard invocation:
  ```bash
  "/Applications/Screaming Frog SEO Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher" \
    --crawl https://{brand-domain}/ \
    --headless \
    --output-folder {cwd}/data/gtm-audit/{YYYY-MM-DD}/seo/screaming-frog \
    --export-format csv \
    --overwrite \
    --export-tabs "Internal:All,Internal:HTML,Response Codes:All,Response Codes:Client Error (4xx),Response Codes:Redirection (3xx),Page Titles:All,Page Titles:Missing,Page Titles:Duplicate,Meta Description:All,Meta Description:Missing,H1:All,H1:Missing,H2:All,Images:All,Images:Missing Alt Text,Canonicals:All,Directives:All,Structured Data:All" \
    --bulk-export "Links:All Inlinks,Links:All Outlinks,Web:All Page Source" \
    --save-report "Crawl Overview,Issues Overview,Orphan Pages,Redirects:All Redirects,Structured Data:Validation Errors & Warnings Summary"
  ```
  Default rendering (HTML, no JS) is sufficient for SSR-hydrated stacks (Shopify, Kajabi, Webflow, WordPress). For SPAs (React/Vue/Next.js client-rendered, headless commerce, etc.) re-run with a config file that enables JavaScript rendering. Typical crawl time: 30 seconds for a 10-page site, 5-30 minutes for 100-1000 pages, longer for larger.
  Findings to surface in the report from the crawl: full page inventory (often reveals pages the founder forgot existed), H1 issues (missing / duplicate / multiple), structured-data status at scale, image hygiene (alt text + width/height attributes — the latter is a frequent CLS root cause), broken internal links, redirect chains, and the canonical paid-offer / checkout endpoint set (often visible as auth-gated 4xx responses).
  Save a `screaming-frog-findings.md` summary in the same folder with: site inventory table, material discoveries, critical issues by priority, page-speed root causes, redirect map, and any active-paid-offer endpoints surfaced. The full CSV export set goes alongside as raw evidence.
- `[E]` **Screaming Frog crawl on one peer-tier competitor (optional).** Worth doing when a same-band benchmark on H1 hygiene, schema adoption, page count, and image discipline would sharpen the Section 3 cross-competitor synthesis. Skip when the scale gap is too wide for the comparison to be actionable (e.g., crawling Kotter when the brand is a 3-person boutique). Pick the closest peer-by-style match. Output to `data/gtm-audit/{YYYY-MM-DD}/seo/screaming-frog-competitors/{competitor-slug}/`.
- `[E]` Meta tag audit: titles, descriptions, canonicals, duplicates.
- `[E]` **Core Web Vitals per template (mandatory).** Run PageSpeed Insights API on at least 4 page archetypes (homepage, persona / category page, product / blog page, pricing or contact). Both mobile and desktop strategies. Endpoint: `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={URL}&strategy={mobile|desktop}&category=PERFORMANCE&category=ACCESSIBILITY&category=BEST_PRACTICES&category=SEO&key=$GOOGLE_AI_KEY`. Save raw Lighthouse JSON per URL plus a summary table. Surface in Section 4A as a CWV table covering Performance score, LCP, CLS, TBT, SEO score per URL × strategy. Thresholds: LCP < 2.5s, CLS < 0.1, TBT < 200ms, mobile Perf ≥ 70, desktop Perf ≥ 80. **Listing PageSpeed as a tool is not the same as running it; the May 2026 Askria audit had PageSpeed in the tool list but the actual run was skipped, which missed a 0.911 mobile CLS catastrophe on `/pricing` and a 15-second mobile LCP on a blog post.** Make this a hard requirement, not an implicit one. **PageSpeed Insights IS Lighthouse.** When the enriched data carries PageSpeed / Core Web Vitals numbers, the lab and field values are authoritative — state them plainly in Section 4A as facts about the brand's pages (load time, layout shift, responsiveness) with the fix. Never hedge a measured Core Web Vitals number with "verify against PageSpeed Insights", "confirm against real-user monitoring", or "subject to the operating baseline", and never use the jargon "RUM" or "operating baseline" in the report.
- `[E]` Schema markup validation: Product, Organization, Breadcrumb, FAQ, Review, HowTo, Article. Tested against Google Rich Results Test. The Screaming Frog `Structured Data:All` export covers most of this at scale.
- `[E]` robots.txt plus XML sitemap coverage.
- `[E]` AI-bot robots.txt allow-list check. Must allow: GPTBot, ChatGPT-User, PerplexityBot, ClaudeBot, anthropic-ai, Google-Extended, Bingbot. Missing any of these blocks AEO surface and feeds Section 5.
- `[E]` Broken internal links plus orphan pages. Both surfaced directly by the Screaming Frog crawl above.
- `[E]` E-E-A-T signals audit: author credentials visible on content, About page with team info, contact information accessible, privacy policy and terms present, original research or expert quotes in content, external citations in the niche.
- `[E]` Shopify-specific redirect hygiene (Shopify stores only): scan for redirect chains A→B→C and duplicate targets via the `shopify-admin-url-redirect-audit` skill. Critical after platform migrations.
- `[E]` Shopify-specific catalog hygiene (Shopify stores only): orphan products (in zero collections) and over-collected products via the `shopify-admin-collection-membership-audit` skill.
- `[I]` GSC indexation report and coverage errors, any 404s or soft 404s the External crawl missed.

### 4B. Backlink and domain authority

**REQUIRED — non-droppable.** The backlink / domain-authority benchmark table (brand in the first column or row, competitors after) ships in every report and is checked by the structural-completeness quality gate. An honest empty/"not detected" cell is fine; the empty column is the gap-to-close narrative **when it is the client side that is empty**. When the entire competitor side comes back empty (no competitor benchmark could be measured), do NOT write the prose as a "vs competitors" comparison and do NOT narrate that competitor data was unavailable (no-limitations rule). Instead frame the section as the client's standalone authority profile read against the category-typical range (e.g. "a domain rank of N and N referring domains is a moderate profile for a brand of this age and regulatory class; the priority is X"). The table still renders with its competitor rows present and honest `n/a` cells; the prose simply does not claim a comparison that the data cannot support. With auto-derived competitors now wired through the same backlink + ranked-keyword pulls as supplied ones, an all-empty competitor side is rare; this is the safety net, not the default.

**Link-gap prospect table (rendered output when `link_gap` data is present).** The competitor-comparison method below ("surface domains linking to competitors but not the client") now ships as a real table in the report, grounded in the `link_gap` enrichment (referring-domain lists for brand + competitors, intersected and de-noised). Render it as a sub-block **after** the benchmark table: one method sentence (these domains already link into the category, so they are pre-qualified outreach targets), then a table — Domain | Authority (Ahrefs DR 0-100) | Links to (which competitors) — using the supplied ranked order, capped at the strongest 12 to 15 rows. Close with one sentence tiering the list into quick-wins (directories, unlinked mentions, review/podcast placements) versus earned digital PR (named publications, journals). Conditional: omit silently when no `link_gap` data exists; never narrate the absence. Not gate-checked.

- `[E]` Serper brand-mention scan. Runs a `backlink_discovery.py` script adapted from the internal EntireCommerce reference archive. Outputs every domain mentioning the client brand plus top competitor queries. Cost roughly $0.05 per run.
- `[E]` Ahrefs backlink snapshot for the client domain (**primary**): `site-explorer/backlinks-stats` (live + all-time referring domains and backlinks) and `site-explorer/domain-rating` (DR). Capture new and lost links over the last 90 days and any spam/disavow signal. Ahrefs is the sole backlink source (DataForSEO Backlinks retired 2026-06-06).

**Spam-score reporting rule (mandatory).** When any backlink table or finding reports a spam score (Moz Spam Score, Ahrefs spam metrics, DataForSEO spam_score), the cell or finding must be followed by a one-line caveat sentence in its own line, not a buried parenthetical: **"Tool-reported spam scores are noisy heuristics — they are not a Google ranking signal and do not indicate a manual penalty."** Place the caveat directly under the spam-score line/cell where it cannot be missed. A founder reading "Spam score: 32" without that caveat reads it as "Google has penalised my site"; that misreading is the failure mode the rule prevents.
- `[E]` Competitor backlink comparison (Ahrefs — sole backlink source): referring-domain gap vs the 3 to 5 competitors from Section 3. Surface domains linking to competitors but not to the client.
- `[E]` Unlinked brand mentions via `"{client-brand}" -site:{client-domain}` SERP check. Each unlinked mention becomes a reclaim opportunity.
- `[E]` Build a 10-tier backlink target list for the first 90 days (directories, niche publications, podcast placements, HARO responses, data-driven PR).
- `[I]` GSC "Top linking sites" report to cross-validate the External view and catch links only GSC sees.

### 4C. Keywords and content gaps

Apply the **Domain-Rating-anchored winnability clamp** from `seo-audit.md` Section 5.0 before scoring any opportunity:

- Pull DR (Ahrefs `site-explorer/domain-rating`, **primary**). Ahrefs DR is the sole authority source (DataForSEO Backlinks retired 2026-06-06).
- Set `KD_ceiling = max(DR + 5, 30)` for `DR ≥ 10`; `KD_ceiling = 30` for `DR < 10`.
- Tag every keyword row `winnable / stretch / unwinnable_today`. Exclude `unwinnable_today` from the page-map and from Section 11 action items.
- Carry `DR` and `KD_ceiling` into the report header.

- `[E]` Top 100 ranking keywords with 90-day position trend. Source order: Ahrefs first (primary), DataForSEO Labs fallback.
- `[E]` Lost keywords: positions dropped by 10 or more in the last 90 days.
- `[E]` Content gap vs top 3 category competitors. Long-tail opportunities with estimated volume. **Do not use `seo_toolkit.py keyword-research` gap mode directly** — see Section 3.3 tool note for the correct set-difference pattern. Prioritise keywords where difficulty is below 40 and a competitor ranks in the top 20.
- `[E+I]` Branded vs non-branded traffic split. External estimate via Ahrefs. Precise split via GSC when Internal is available.
- `[E+I]` Cannibalisation check. External first pass via Ahrefs. Query-level confirmation via GSC.

**Multi-domain cannibalisation rule (mandatory when the brand operates more than one domain).** When the scraped input or the founder brief shows the brand running content across multiple domains (sister sites, a brochure + a transactional split, a parent + sub-brand split, a country-specific TLD set), the cannibalisation analysis runs **per-domain first, cross-domain second** — never combined. Report two distinct columns or sub-tables: (a) within-domain duplicate-intent pages competing on the same query inside the same domain; (b) cross-domain duplicate-intent pages competing on the same query across the brand's domains. The two cases have different fixes: within-domain is a redirect / canonical / merge call, cross-domain is a strategic call about which domain owns which intent and which lives on as a redirect or a re-positioned property. Conflating the two reads to the founder as "Google is penalising us for our own site competing with itself", which is rarely the actual issue when the duplicates sit on different domains.

### 4D. Opportunity-patterns table

Surface these patterns in the first-run audit, re-check in every weekly review.

| Pattern | Threshold | Meaning | Action |
|---|---|---|---|
| High impressions, low CTR | ≥100 impressions + CTR below 1% + position 1 to 10 | Page ranks but title and meta do not earn the click | Flag for title and meta rewrite. Run `seo_toolkit.py ctr-optimize` for current-vs-suggested output |
| Striking distance | Position 4 to 15 + impressions ≥50 | One content push away from page 1 | Strengthen H1 and H2 to match query, add a direct-answer paragraph in the first 100 words, expand thin sections, add internal links from authority pages |
| High impressions, zero clicks | ≥200 impressions + 0 clicks + position above 10 | Ranking too deep, title likely off-topic | Content-gap finding. Create a dedicated page if none exists |
| Accidental rankings | Query maps to homepage or an unrelated post | Page is ranking for the wrong intent | Create a dedicated page for the query, link from the accidental ranker |
| Losing clicks over 90 days | Drop of 30%+ on a query that was a top-10 click earner | Ranking slipped or SERP feature captured the click | Check SERP manually for AI Overview, featured snippet, ad block. Re-optimise as direct-answer snippet |
| Brand query CTR below 50% | Query contains brand name, CTR below 50% | Competitors bidding on brand or homepage title failing to own the brand SERP | Flag Brand Defence Google Ads campaign as a high-priority action and tighten the homepage title |

### 4C.5. Industry keyword universe and page-map

**REQUIRED — non-droppable.** The 6-to-12-cluster summary table plus the top-10 build recommendations ship in every report and are checked by the structural-completeness quality gate. When the keyword APIs are unavailable, run the on-page-signal fallback (Section 3.3) and render the table on the reduced dataset, labelled accordingly — never omit it.

The consolidation step. Produces one deduped master dataset covering every query a buyer in the category uses, mapped to a specific existing URL or page-to-build.

Inputs: every `content-gap-vs-{competitor}.csv` from Section 3.3, the brand's own `labs_ranked_keywords` pull, and (where available) the `fallback-onpage-signals.csv` files per competitor.

Methodology:

1. Union every competitor's gap CSV by the `keyword` column via pandas or a Python `set`. Dedupe.
2. For each keyword, compute: `num_competitors_ranking` (how many of the audited competitors rank in positions 1 to 50), `aggregate_volume` (max volume observed), `max_difficulty`, `best_competitor_position`, `brand_ranks` (boolean).
3. Score: `opportunity = aggregate_volume × num_competitors_ranking × max(1, 101 - max_difficulty) × (1 if not brand_ranks else 0.3)`. The `(101 - max_difficulty)` factor keeps scores positive across DataForSEO's 0-100 KD scale. The `0.3` brand-factor deprioritises keywords the brand already ranks for so the universe surfaces whitespace first.
4. Classify intent: informational (how, what, guide, why, learn), commercial (best, top, review, comparison, vs, alternative), transactional (buy, price, shop, order, deal), navigational (brand terms). Pattern-match on phrase tokens, flag ambiguous cases.
5. Cluster by head-term semantic similarity. Aim for 6 to 12 clusters per audit, each with 15 to 150 keywords. Name each cluster by its canonical head term.
6. Page-map each cluster:
   - If the brand has an existing page ranking top 30 on the cluster's head term, action is `expand` (add sub-topic blocks, internal links, PAA answers).
   - If the brand has an existing page ranking below position 30, action is `rewrite` (angle is wrong or content is thin).
   - If the brand has no dedicated page for the cluster, action is `build` (specify page type: pillar 2000+ words, comparison hub, FAQ, buying guide, or category landing).
7. Output artefacts:
   - `{data_folder}/industry-keyword-universe.csv` with columns: keyword, aggregate_volume, max_difficulty, num_competitors_ranking, best_competitor_position, brand_ranks, brand_position, intent, cluster, opportunity_score, priority_rank.
   - `{data_folder}/industry-clusters-page-map.md` with the cluster-level table: cluster_name, intent, size, existing_page_url_or_missing, recommended_action, top_5_keywords_in_cluster, opportunity_score, priority_rank.
8. Inline in the report body: a summary table of the 6 to 12 clusters, plus the top 10 `build` recommendations in priority order. Detail lives in the CSV.

### 4C.6. SERP-feature mining (PAA, related searches, autocomplete)

Pure ranking data answers "what do competitors rank for". PAA, related searches, and autocomplete answer "what does the buyer actually type". They frequently diverge. Mining all three fills the long-tail and AEO-extractable layer the ranked-keyword pull misses.

For the top 20 to 30 head-term queries from 4C.5 (sorted by opportunity score, skipping any the brand already owns position 1):

1. Serper `/search` on each head term. Parse `peopleAlsoAsk[]` (4 to 8 questions) and `relatedSearches[]` (8 to 12 suggestions).
2. Serper `/autocomplete` on each head term plus three pattern-completions: `{head_term}` (raw), `best {head_term}`, `how to {head_term}`.
3. If Serper is unavailable: Google SERP via `WebSearch` works for related searches, and the autocomplete fallback is a manual incognito-Chrome capture exported to CSV. Log the gap honestly.
4. Compile output: `{data_folder}/serp-features-mined.csv` with columns: head_term, feature_type (paa / related / autocomplete), text, implied_intent, semantic_cluster.
5. Cluster the mined phrases. Many will attach to existing 4C.5 clusters. Some will surface new long-tail clusters the ranked-keyword pull missed (flag these explicitly).
6. Cross-check each mined phrase against the brand's existing content. Any phrase not addressed on-site becomes a candidate for: an FAQ block, an H3 section inside an existing page, or a dedicated page if the cluster warrants one.
7. Inline in the report body: 3 to 5 highest-opportunity new angles from the mining step, with recommended action and page target.

Deliverables from 4C.5 and 4C.6 are mandatory on every audit run where at least DataForSEO or Serper is configured. When neither is configured, the on-page-signal fallback in Section 3.3 runs and 4C.5 operates on the reduced dataset, labelled accordingly.

### 4E. Content-scoring rubric

**REQUIRED — non-droppable.** The factor / weight / question rubric table ships in every report and is checked by the structural-completeness quality gate.

Use when converting gap findings into ranked actions.

| Factor | Weight | Question |
|---|---|---|
| Customer Impact | 40% | Does this directly address a buyer question or pain point? |
| Content-Market Fit | 30% | Can we write this better than what already ranks? |
| Search Potential | 20% | Volume plus achievable difficulty for the domain authority? |
| Resources | 10% | Can we produce this in under a week with existing assets? |

---

## Section 5. AEO (answer-engine optimisation)

Answer-engine presence across ChatGPT, Claude, Perplexity, and Google AI Overviews. External-layer only.

### 5A. Presence audit

- `[E]` Test 10 to 20 commercial queries across ChatGPT, Claude, Perplexity, and Google AI Overview. For each query, record: is the brand cited, which competitors are cited, does an AI Overview exist. Use `seo_toolkit.py aeo-serp` for the Google AI Overview scan.

### 5B. Extractability checklist per priority page

| Check | Pass criterion |
|---|---|
| Clear definition in first paragraph | Yes or No |
| Self-contained answer blocks (work without surrounding context) | Yes or No |
| Statistics with sources cited | Yes or No |
| Comparison tables for `[X] vs [Y]` queries | Yes or No |
| FAQ section with natural-language questions | Yes or No |
| Schema markup (FAQ, HowTo, Article, Product) present | Yes or No |
| Expert attribution (author name and credentials) | Yes or No |
| Updated within last 6 months | Yes or No |
| Heading structure matches query patterns | Yes or No |
| AI bots allowed in robots.txt | Confirmed in Section 4A |

### 5C. Princeton GEO visibility-boost table

<!-- include-body: sections/aeo/princeton-geo.md -->

### 5D. Whitespace

Queries where the brand has no presence but at least two competitors are cited. Each row is a potential AEO content brief.

### 5E. Per-template metadata + schema matrix (REQUIRED — non-droppable)

**REQUIRED — non-droppable.** The ten-archetype × full-field-list matrix table ships in every report as a real markdown table and is checked by the structural-completeness quality gate (it must have all ten archetype rows and the full field-list columns; a template with no measured data is a "not sampled" row, never an omission or a hedge-prose paragraph).

Run the per-template matrix from `seo-audit.md` Section 6G — the same grid is mandatory inside the GTM audit. Inspect ten page archetypes (homepage, collection, PDP, blog index, blog article, about, FAQ, technology/methodology, refund/policy, comparison) against the full field list (`<title>`, meta description, og/twitter pairs, canonical, all required JSON-LD types per archetype, JSON-LD URL validity). Output `per-template-matrix.md` in the audit folder with ✅/❌/⚠️ per cell.

**MEASURED data — use it, do not infer (2026-05-16).** When the enriched data includes an `onpage` array, those are MEASURED per-template records (real fetched pages: `template`, `url`, `status_code`, `onpage_score`, `title`/`title_length`, `meta_description`/`description_length`, `canonical`, `h1`, `ld_json_types`, `og_present`, `twitter_present`, `duplicate_title`/`description`/`content`, `broken_links`, `click_depth`, `issues`). Build the §5E matrix **and** the §4A technical findings from these measured values. For any template that has an `onpage` record this is a HARD rule: state the measured fact, never infer or hedge. Banned for a measured template: "do not appear to", "without seeing a product page", "presumably", "likely", "the inferred friction points". `ld_json_types` is the structured-data column — if it lists `Product`/`Offer` the PDP HAS Product schema; say so; if `AggregateRating` is absent from a page with visible reviews, that is a measured gap, state it plainly. A template with NO `onpage` record may be marked "not sampled" in the matrix (no hedge prose). Catches: homepage `<title>` falling back to `shop.name`, double-domain logo URLs in Organization JSON-LD, FAQPage missing on dedicated FAQ pages, Product schema with incomplete offers, Person schema reduced to just `{name}` on about pages, Review schema absent on testimonial blocks, `llms.txt` and `llms-full.txt` 404. Most "LLM readiness" gaps live here, not in the AEO presence audit.

Also verify (root-level checks):
- `https://{domain}/llms.txt` returns 200
- `https://{domain}/llms-full.txt` returns 200

Both files are static markdown summaries that AI agents fetch when crawling the brand. Spec: <https://llmstxt.org>.

**llms.txt 404 promotion rule.** When the root-level check finds `llms.txt` or `llms-full.txt` missing (4xx response), the fix lands in the Executive summary **Quick wins** sub-block (the ship-it-today list), not in Section 11 Backlog. Rationale: shipping these two files is a ten-minute job (write the spec-compliant markdown, upload to the site root), the AI-citation surface gain is real, and a 404 here is the cleanest "look what we found in 30 seconds" credibility item. Treat the same way for any other sub-ten-minute hygiene fix the audit surfaces (homepage `<title>` literally saying "Home page", missing meta description on the homepage, a logo with double-domain in JSON-LD): these belong in Quick wins, not Section 11 Backlog.

---

## Section 6. Paid Ad Recommendations

Paid search, paid social, shopping, and competitor creative intelligence. Internal-heavy when access is granted. External layer alone still produces competitor intel and creative briefs.

### 6A. Account structure and health (Internal layer)

- `[I]` Account structure map: campaigns, ad sets, audiences, creatives.
- `[I]` CTR, CPC, ROAS, Quality Score, frequency by campaign. DTC benchmarks to test against: CTR above 5% on search, CPC under $2, CPA under $100 as starting reference points. Flag any campaign outside these bands.
- `[I]` Audience overlap analysis: wasted spend on competing sets.
- `[I]` Learning-phase diagnostic for Meta: any ad set stuck in "Learning Limited" (budget too low, audience too narrow, too many changes per week).
- `[I]` Pixel and CAPI event health in Events Manager.

### 6B. Creative fatigue scoring (4-tier framework)

Applied to every active creative. Same framework works for Google and Meta.

| Signal | Threshold | Status flag |
|---|---|---|
| CTR decline from peak | Over 20% drop within a 5-day window | Warning |
| Frequency | Above 2.5 and rising (Meta) | Warning |
| CPA increase from best | Over 25% within a 5-day window | Fatigued |
| CVR decline | Trending down with impressions stable | Fatigued |
| Platform spend shift | Platform moving delivery away from the creative | Dead |

Score each active creative as **Healthy**, **Early Warning**, **Fatigued**, or **Dead**. For every Fatigued or Dead creative, specify the refresh lever: hook type (curiosity, problem-agitation, result-first, social-proof, controversy), proof mechanism (testimonial, data, before-and-after, authority, demonstration), format (UGC, studio, talking-head, carousel, static), copy length (shift short to long or long to short).

### 6C. Google Merchant Center deep-dive (Shopping-active clients only, Internal layer)

- `[I]` Product status counts via Content API: Approved, Pending, Disapproved counts per destination (Shopping, Display Ads, Surfaces Across Google). Flag if disapproved is above 5% of the catalogue. The v2.1 `productstatuses` endpoint can return stale data. Always cross-check against the Merchant Center UI Products tab.
- `[I]` Reporting API impressions and clicks (`MerchantPerformanceView`) daily for the last 30 days. Sparkline reveals dark periods the UI chart can hide. Zero-impression days mean the feed was suspended even if the UI shows approved now.
- `[I]` "Needs attention" triage. Categorise every warning into one of: `item_missing_required_attribute` (fix in Shopify product fields), `low_image_quality` (swap to higher-res), `landing_page_error` (404, redirect, or blocked URL), `price_mismatch` (feed price ≠ landing-page price, often caused by currency or discount apps), account-level misrepresentation or trust issues (requires appeal via UI).
- `[I]` Shipping and returns settings sanity check. Lead time must match production reality. For custom-build products, the lead time is weeks, not days.
- `[I]` Feed URL protocol check. Must be HTTPS.
- `[I]` Sales-channel activation check in Shopify. New products missing from the feed for more than 48 hours usually means the Google and YouTube sales channel is unticked on the product.

### 6D. Competitor creative intelligence (External layer)

Runs at first-run, refreshes monthly in the weekly review.

- `[E]` 30-day Meta Ad Library snapshot on the 5 competitors from Section 3, scraped via `apify/facebook-ads-scraper` (the default method updated 2026-05-13). See Section 3.4 for the full input shape and filtering pattern. Flag any ad running more than 14 days as a probable winner.
- `[E]` 6-dimension analysis per competitor:
  - **Hook patterns.** Categorise opening hooks (curiosity, problem-agitation, result-first, social-proof, controversy, listicle). Count frequency.
  - **Messaging angles.** Primary claims and value props. Pain points targeted vs desires sold to. Feature-focused, benefit-focused, or identity-focused.
  - **Ad formats.** Mix of video vs static vs carousel. Video style breakdown (UGC, studio, talking-head, b-roll, screen-recording). Static style (product shot, lifestyle, text-heavy, before-and-after, testimonial card).
  - **Production style.** High-production vs lo-fi UGC. Pacing, text overlays, palette. Are they testing new formats or running the same style?
  - **CTA and offer strategy.** Hard sell vs soft sell. Offer structure (discount, free trial, bundle, urgency). Seasonal patterns.
  - **Creative volume and rotation.** Active-ad count. Rotation cadence. Same-concept rotation vs entirely new concepts.
- `[E]` Whitespace map: 3 to 5 angles, formats, or approaches no competitor is running. These are differentiation opportunities.
- `[E]` "Steal this" list: 5 brief-ready ideas inspired by competitor creative. Each entry names a hook, an angle, and a suggested format.
- `[E]` Paid SERP saturation for the client's head terms via Serper, as a proxy for competitive intensity.

### 6E. Starter campaign recommendations (always produced)

The deliverable a prospect actually wants: three to five concrete campaigns the brand should run next, each with objective, audience, budget order-of-magnitude, creative brief, and expected KPI band. At least one campaign must be executable without Internal access (for example: Google Search brand-defence once the account is created, Meta prospecting from a fresh seed audience).

---

## Section 7. Email, Lifecycle, and CRM (Internal layer)

Internal-heavy. **In Level 1 mode, lead the section with what was observed publicly** — do not stub with "Requires access" framing. See `feedback_audit_no_access_asks_in_audit.md` (2026-05-13).

Level 1 mode pattern (three to four public-surface observations, each one paragraph, each ending with a forward-looking note on what the data-wired diagnostic adds):

```
## 7. Email, Lifecycle, and CRM

{Client} runs {Mailchimp / Klaviyo / Brevo} for email. {Observation about pixel presence, list-growth mechanism, lead-capture pattern.} The audience research in Section 1 names email follow-up as the area where direct channels lose customers to {boutique stockists / category alternative}. Three observations from the public surface that matter for the email roadmap.

- **The current email programme {observation}.** {Concrete public-surface evidence.} Once the data-wired diagnostic begins, an audit of the existing flow coverage (welcome, abandoned cart, browse, post-purchase, win-back) is the first piece of work. Building or improving the flows from there delivers compounding repeat-purchase revenue.
- **The lead acquisition mechanism is {observed banner / popup / form}.** {Diagnostic implication for top-of-funnel.}
- **Cross-tag conversion measurement.** {Note tying to Section 9 dashboard work.}
```

**Tool-migration rule:** if the client is on a working tool today (Mailchimp, Brevo, etc.), do **not** recommend migrating to a different tool as a Top-5 action. Tool-migration decisions move to a post-engagement evaluation. See `feedback_audit_no_tool_migration_push.md`.

In Level 2 mode:

- `[I]` Flow coverage: welcome, abandoned cart, browse, post-purchase, win-back.
- `[I]` Revenue per flow, open rate, click rate, unsub rate.
- `[I]` Segmentation depth beyond "all subscribers".
- `[I]` Deliverability health: bounce rate, spam complaints, domain reputation.
- `[I]` SMS layer check where applicable.
- `[I]` CRM hygiene: duplicates, missing attributes, mis-tagged records.

---

## Section 8. Conversion Optimisation (Internal layer, with External fallback)

**In Level 1 mode, lead the section with what was observed publicly.** The audit ran the live-site trust-signal audit and the heuristic review of high-traffic templates from the public surface. Present those findings. Do not stub the Internal-layer checks with "requires access" framing — name the conversion bottleneck the public surface reveals (e.g. checkout currency snap-back for a diaspora visitor; missing review schema; absent worn-on-model video on product pages). See `feedback_audit_no_access_asks_in_audit.md` (2026-05-13).

In Level 2 mode:

- `[I]` Session recording review: top friction moments on PDP and checkout.
- `[I]` Rage-click and dead-click heatmaps per template. Also capture quickbacks (land on page, immediately go back) and excessive scrolling (up-and-down hunting signals confusion).
- `[I]` Scroll-depth on hero, PDP, checkout.
- `[I]` Funnel dropoff: product view to cart to checkout to purchase.
- `[I]` Mobile vs desktop conversion gap by template.
- `[I]` Top 5 exit pages. Cross-reference with the site's own session-replay / heat-map tool (whatever it runs — Hotjar, PostHog, FullStory, Microsoft Clarity, etc.) rage and dead clicks to identify UX-driven exits.
- `[I]` Shopify checkout analytics: abandoned-cart rate, cart recovery rate (via Klaviyo flow), average checkout completion time, any checkout errors or payment failures.
- `[E]` Trust-signal audit on the live site: reviews, returns, shipping, payment icons, guarantees.
- `[E]` Heuristic CRO review of PDP, category pages, checkout against conversion principles, executed by reading the public site.

### DTC checkout-funnel benchmarks

Apply to the funnel dropoff bullet above. Flag any step performing more than 20% below benchmark as a CRO priority.

| Step | Benchmark |
|---|---|
| Product page to Add-to-Cart | 5 to 10% |
| Add-to-Cart to Checkout | 40 to 60% |
| Checkout to Payment | 70 to 85% |
| Payment to Order | 90 to 95% |

### Page-CRO 8-dimension framework

Use when producing specific page-level recommendations.

1. Value-prop clarity above the fold.
2. Headline effectiveness and match to intent.
3. CTA prominence, repetition, and copy.
4. Visual hierarchy.
5. Trust signals (reviews, guarantees, payment security, returns).
6. Objection handling (price, fit, risk, timing).
7. Friction points (form length, login walls, unclear pricing, shipping surprises).
8. VOC coverage. Cross-reference the live PDP / homepage / checkout copy against the recurring objections and the buyer's own language mined in Section 1 (the audience VOC set). Name the top mined concerns the site copy does not visibly address, and flag where the copy uses brand or marketing language in place of the buyer's own register. Where no VOC corpus exists for this brand, say so in one line and skip — never invent customer language to score against.

---

## Section 9. Analytics and Tracking (Internal layer)

**In Level 1 mode, lead with the public tag inventory** — do not stub with "Requires access" framing. View-source the homepage for pixel / GA4 / Google Ads / Meta / Klaviyo / Mailchimp / heat-mapping or session-replay tool (Hotjar, PostHog, FullStory, Microsoft Clarity, Mouseflow, etc.) / Conversions API presence. Present a tag-inventory table with status (Wired / Not present / Suspicious) and a one-line note per tag on what the data-wired diagnostic adds. **Heat-mapping / session-replay is a category, not a product.** Detect whichever vendor the site already runs (the scraped platform hints carry it). If the site already runs one, recommend getting more out of that existing tool; never recommend switching and never default to naming Microsoft Clarity. Only if no such tool is detected anywhere, recommend adding one, named as the category with two or three example vendors, the choice left to the brand. Close the section with two or three concrete easy improvements visible from the public surface (e.g. if no session-replay / heat-mapping tool is detected, add one; set up Meta server-side Conversions API). See `feedback_audit_no_access_asks_in_audit.md` (2026-05-13).

In Level 2 mode:

- `[I]` Event coverage: add-to-cart, checkout-start, purchase, custom events.
- `[I]` Conversion tracking integrity: GA4 vs Shopify vs Meta vs ad platforms.
- `[I]` UTM hygiene: missing tags, inconsistent naming, lost attribution.
- `[I]` Cross-device plus cross-session stitching.
- `[I]` Attribution model review: last-click vs data-driven vs MMM-lite.
- `[I]` Dashboard readiness: what is reportable today, what needs fixing.

---

## Section 10. Organic Social Presence

An outside-in read of the brand's and competitors' **organic** social footprint — not a paid-social or owned-account audit (paid sits in Section 6; account-level performance needs internal access and lives in the data-wired tier). The point is to show the founder where the category's organic attention actually sits and how the brand's footprint compares, from public signals alone.

### How to execute

- **Platform-presence map (the backbone, always available).** Each homepage links its own social profiles; the scrape captures them (`social_links`). Build a presence comparison — which networks each brand is active on, audited brand first. This renders even when nothing else does.
- **Instagram metrics (layered on where sampled).** For the discovered Instagram handles, the runner pulls follower count, posting cadence (posts per week from recent-post timestamps), average engagement (likes + comments ÷ followers), and the post-format mix (`social_presence.instagram`). Render a small table — Brand | Followers | Posts/week | Avg engagement % — brand first; a missing metric is an honest "—".
- **Strategic read.** Two or three sentences: where the category's organic attention concentrates, any single-channel concentration risk for the brand (e.g. a market whose discovery rests almost entirely on one unowned, algorithm-driven channel), and the single highest-value organic-social move. Keep it tight.

### Quality bar

Use only the supplied numbers; never invent a follower count, a cadence, or an engagement rate. Where only the presence map exists (no Instagram metrics sampled), write a one-paragraph footprint read and stop — never narrate what was not measured. This section is conditional on outside-in signal and is not gate-checked.

---

## Section 11. Prioritised Action Items

Every finding in Sections 3 through 9 converts into a candidate initiative. Each initiative is scored on ICE. The ranked list becomes the client's operating plan for the next 30 to 60 days.

### 11A. Foundational fixes + the prioritised roadmap

Open Section 11A with one to two short paragraphs. **Foundational fixes** are structural and hygiene items where the audit evidence is unanimous and high-confidence (e.g. CMS migration when SSR is missing entirely, canonical-tag fixes when blog posts are de-indexed, schema markup when AEO citations are blocked). Everything else is a prioritised opportunity drawn from the gaps the audit found. Do NOT crown a single "biggest lever", "binding constraint", or "the one thing" the founder must do — the audit's value is the landscape, the competitive gaps, the benchmarks, the hygiene issues, the obvious fixes, and a prioritised roadmap built from them, not one prescriptive bet we might have wrong.

Frame foundational fixes as foundational and high-confidence. **Avoid strong-opinion language** — do not call them "mandatory", "non-negotiable", "must ship", or "not subject to experimentation". The audit ranks; the founder decides. Reserve "experiment" and "hypothesis" framing for items that genuinely test a belief.

Structure:

- **Foundational fixes (audit evidence unanimous, high-confidence).** Bullet list. Each item carries an execution tag (`[Claude]`, `[Hybrid]`, `[Human]`, or `[Hybrid → Claude on access]`). One short reason per item explaining why the audit ranks it high (the structural evidence from Section 4 or 5).
- **Evidence.** Three to five bullets pulling the strongest data points from the prior sections. Label each as `[E]` or `[I]` so the founder sees which came from public data and which from their own.
- **Expected revenue impact (where quantifiable).** Quantified in monthly revenue with assumptions stated. A1 caps at medium confidence by default. Attach impact to specific roadmap items, never to a single elevated bet.

When the site is already structurally healthy (no clear foundational-fix layer), Section 11A still leads with the prioritised roadmap of opportunities and gaps the audit found, not a single lever. Open with one short paragraph framing the strongest gap clusters, then go straight to the ranked list in 11B.

On Path B (new brand, no revenue history), frame the roadmap as **prioritised hypotheses to test** rather than fixes:

- **Key hypotheses (prioritised).** Several, ranked, not one elevated bet. For each: one sentence — "If we [action] for [audience], they will [measurable response] because [audience insight]."
- **Evidence.** Pull from Sections 1, 2, and 3.
- **The experiment.** One sentence per hypothesis describing the smallest test that moves it from belief to data.
- **Effort and third-party costs.** Effort in days or weeks. Third-party costs only — tooling subscriptions, paid amplification, software licences. **Always close with an explicit clarifier: "EntireCommerce engagement fees are separate and quoted at the engagement scoping stage — none of the dollar figures above are EntireCommerce charges."** The audit must not read as if EntireCommerce is offering the work at the listed dollar amounts. Section 6 paid-campaign sub-sections can use "budget order-of-magnitude" framings since those are clearly ad-spend numbers; Section 11A should not.

Audit-level kill criteria are not used. The audit surfaces insights and action items; the weekly EC Operator review cadence is the kill mechanism — every week reads what shipped, scores progress, and reranks. Per-experiment kill criteria attach to specific paid campaigns once those campaigns are live under the weekly review, not at the audit deliverable stage.

### 11B. Prioritised roadmap

Present every initiative as a flat numbered list, highest-impact first (the underlying ranking is the internal ICE score; never surface the words "ICE" or "ranked by ICE" in the deliverable). This list is the deliverable's core: a gap-driven roadmap of 5 to 20+ items, not one or two emphasised bets. Group into three priority buckets:

- **Top priority** — top 5 items by ICE score. Work begins as soon as the engagement starts.
- **Near-term** — next 10 items. Pulled up as the top 5 complete.
- **Backlog** — everything else. Long-tail items, instrumentation tasks, access-grant items.

Bucket names use **priority language**, not calendar language. The audit may be delivered before the engagement is agreed, so committed-timeline bucket names ("This week", "Next 30 days") read presumptuously. Use "Top priority", "Near-term", "Backlog".

**Every item is tagged with an execution tag** — `[Claude]`, `[Hybrid]`, or `[Human]` — at the start of the title (see the "Action item execution tags" subsection for the spec). Close 11B with a one-line summary: "Of N total items across all three buckets, X are Claude-executable with API/MCP access, Y are Hybrid, Z need your team's hands. The Claude-executable count rises to N+M once [specific access grant] lands."

No P0/P1/P2 labels in the output. The bucket names are the priority signal. ICE scoring stays internal as the ranking mechanism.

### 11C. ICE scoring (rubric)

| Axis | Scale | Definition |
|---|---|---|
| Impact | 1 to 10 | Expected revenue or strategic weight. 1 = cosmetic. 10 = unlocks or protects more than 20% of revenue |
| Confidence | 1 to 10 | How sure the initiative works. 1 = pure guess. 10 = near-certain, tested elsewhere, obvious fix |
| Effort | 1 to 10 | Inverse scale. 1 = months of work. 10 = ships in under a day |

**Score = Impact × Confidence × Effort.** Maximum 1000. Rank all candidate initiatives by score descending. Keep the raw scores in working notes; the report shows the ranked list.

### 10D. Workstream granularity

Each item is a workstream (not a checkbox). Sub-tasks live underneath the item:

```
### 1. Close the SEO title/meta gap on the top 20 commercial pages
- [ ] Pull current titles and metas from GSC
- [ ] Generate CTR-optimised rewrites via `seo_toolkit.py ctr-optimize`
- [ ] Ship rewrites to the CMS
- [ ] Re-score CTR 14 days later
```

One workstream occupies one position in the ranking regardless of how many sub-tasks it contains.

### 10E. Handoff to internal tracking

In the audit deliverable, this section reads as "Top 5 priority items become workstreams on EntireCommerce's side once the engagement starts. The audit becomes the baseline; every weekly review diffs against it. Internal P-tier caps (max 5 active workstreams) are respected so focus stays on the highest-leverage items." **Do not name internal file paths** like `actions.md` or `EntireCommerceClients/{Client}/actions.md` in the audit body — those are internal infrastructure references that the founder cannot navigate and that leak our directory structure.

Internally (post-engagement, after the audit is delivered and signed off), merge the **Top priority** items into `{client_folder}/actions.md` as new items in the Next Actions section. Respect the caps in that file: if actions.md already has 5 P0s, rescore the audit's new candidates against the existing P0s by ICE. Promote the strongest, demote the displaced. Never exceed the caps. This step is internal-only and lives outside the audit deliverable.

### 10F. SEO build-tracker seed (when Section 11's action list contains pSEO build phases)

If Section 11 contains any **build-pattern** items (`alternatives`, `comparisons`, `use-cases`, `pillar-playbooks`), seed the persistent SEO build tracker per `seo-audit.md` Section 9G. Write three files under `{client_folder}/docs/seo-build/`:

- `seo-build.md` — phase tracker with every build-pattern action as a numbered phase (status, pattern, target_template, evidence, ice_score, ship_target, quality_gate). Order by descending ICE.
- `link-inventory.md` — seeded from Section 4A crawl (mature pages + ranking pages + cluster hubs) with anchor reservations for pending phases.
- `config.json` — `client_slug`, `dr`, `kd_ceiling` (from Section 4C), `last_run`, `last_mode`, `competitor_set`, `locale`, `publish_early` (true if `dr < 10`).

This is the handoff from one-shot diagnostic to ongoing build loop. Subsequent `/seo-audit` invocations detect the tracker and execute the next phase end-to-end with the Section 9H per-phase quality gate. The weekly review (`playbooks/weekly-marketing-review.md` Section 12) and EC Operator pick up phase status from this file too.

---

## Access-grant checklist (deprecated 2026-05-13 — moved to the SOW)

**The Access-Grant Checklist section has been removed from the audit deliverable.** Access asks move to the SOW phase, once an engagement is opened. Inside the audit, Sections 7 / 8 / 9 lead with what was observed publicly (see those sections' updated guidance above). The audit itself does not enumerate access asks.

**Reason:** A free first-touch audit is an artefact showing the depth of findings, the rigour of the diagnostic, and the growth potential the audit reveals. Surfacing a 10-row access-grant table at the end reads as a transactional ask before the relationship is established. The SOW is the right place to list what platforms EC needs viewer access on, with the precise scope and service-account email. See `feedback_audit_no_access_asks_in_audit.md` (2026-05-13).

**Where the access asks now live:**

- SOW deliverable produced once the engagement is opened. The SOW template enumerates the platforms with the precise scope, service-account email, and the unlocked-section mapping.
- Audit body can include one neutral conditional line in the Conclusion or Methodology pointing forward: "Access asks for the data-wired upgrade are listed in the SOW once an engagement opens." Drop the line entirely if the audit has no asks.

---

## Execution cadence (Kanban handoff to the weekly review)

Section 11 is the plan. There is no 30-60-90 calendar overlay. Execution follows Kanban pull-flow: the top 5 priority items are the only work in play at any moment, and the ranking is refreshed at the end of every weekly review session. When a top priority is resolved (metric proves it, or the action ships), the next priorities become visible and Section 11 refreshes.

Handoff to the weekly review:

- **Week 1 (post-engagement).** The audit is the week-1 deliverable. Top-priority items begin once the engagement starts.
- **Ongoing.** The weekly review playbook at `playbooks/weekly-marketing-review.md` runs every Monday on the active client. It reads the audit's prioritised roadmap, scores the prior week's actions, refreshes the ranking, and pushes new items into `{client_folder}/actions.md`.
- **Quarterly.** The Growth Equation correlation analysis is re-run as a dedicated quarterly section in the weekly review. If the top-3 drivers have shifted, the audit's priorities are re-ranked and a fresh Section 11 replaces the current one.
- **Daily dashboard.** In Level 2 mode only. `scripts/update-dashboard.py` refreshes the shared Google Sheet every morning. The dashboard is the replacement for every weekly status meeting a traditional agency would run.

The audit is one artefact. The execution engine is the weekly review.

---

## Autonomy and execution layer (data-wired runs — added 2026-06-12)

Data-wired audit runs **close by executing the safe set**: every action that is (a) directly executable by Claude via an API we hold, (b) safe, and (c) reversible is executed in-run, recorded before/after, and marked done. The canonical rules — execution modes (auto / handover / client-decision), the per-tenant `execution_surface` config block, the safety rails (paced sequential CMS writes, reversible-only, live-verify before acting, never emails or spend), the deliverable formats (hosted HTML report, Google Doc handover, Sheet worklists, Superhuman draft delivery), the client-type adapter, live-verify + data-provenance, and the Section 0.5 reconciliation standard — live in `gtm-audit-prompt.md` › "Autonomy and execution layer". This spec inherits them; edit there only.

What changes in the deliverable on a data-wired run:

- The report **opens with an "Items completed as part of this audit run" table** (action | scope | before | after | verification). Executed items move out of the action list into this table, and the remaining actions are re-ranked by ICE with the done items removed (the engine's `--finalize` pass handles both — the deliverable is assembled after it, never from the pre-execution action list).
- Repeat runs open with the **Section 0.5 reconciliation**: metric delta vs the prior run plus a live-verified "what did the team ship since last time" matrix (done / partial / not started), diffed by days-since-prior-run, never "last week".
- Remaining (unexecuted) actions keep the three-type table (Claude / Hybrid / Human). The execution-mode tags map: handover items render as Claude-or-Hybrid pending access; client-decision items render as Hybrid or Human.
- For SaaS / B2B / services clients, commerce / paid / email / CRM sections render **N/A as a site-fact**, never as a limitation, and the conversion scorecard adapts to the model (signup/trial friction, demo CTA, trust signals — never returns/checkout/product-feed). No action may assert broken tracking for a source the tenant config intentionally lacks.

---

## Founder-readability rules (2026-05-15, absolute)

These are hard rules from the Pickle Shickle rework. They override any prior guidance in this spec they conflict with — in particular, rule 5 overrides every "the body of the report keeps technical register" line elsewhere in this document. Each maps to a curated-memory feedback entry and the voice-enforcement gate below enforces them mechanically. The cloud runner composes this spec at runtime and enforces these rules in code (`voice_cleanup.py`, `quality_gate.py`, `confidence_pass.py`); its system header does not restate them (see "Spec authority" above), so the automated and human paths cannot drift.

1. **No limitation, gap, or tool-failure language. Anywhere in the report.** Banned: "X is not installed so Y is not available", "what we cannot see", "without GA4 we cannot answer", "what unlocks once data wiring lands", "Apify hit its limit", "the keyword search collided / a re-pull is needed", "where data is signal-based", "Requires access", "becomes available once Clarity is wired in Phase 1". Three things die: our-process limitations (tool failed / rate-limited / scrape noisy), paywall-tease framing ("available once you pay for Phase 1"), and data-collection-artefact narration (a competitor URL resolved to a different company, a same-name brand in another industry, the competitor set was auto-built or "should be defined more precisely" — drop the bad row silently, never explain it). Never name an internal tool in the body (Apify, DataForSEO, Screaming Frog, Serper, Ahrefs, Keywords Everywhere, PageSpeed Insights). State client-site facts plus recommendations; process notes go to the dogfood file only. See `feedback_audit_no_limitations_no_internal_refs.md`.
2. **No internal filenames, file paths, or appendix references in the body.** No `.md` / `.csv` / `.json` names, no `BT-*` artefact names, no `data/gtm-audit/...` or `competitors/{slug}` paths, no "the per-template-matrix appendix", no Cross-references block, no Section 3.7 Data-files block, no enumerated Supporting-data-files list. Do not write the supporting-data availability sentence, a "Supporting data files" heading or list, or a "Methodology and reading frame" trailer: the runner appends the single supporting-data line and the signature deterministically after the Conclusion. The body ends at the last line of the Conclusion with nothing after it. This is the single most-resisted rule across runs — re-check it before you close. The `[E]` / `[I]` / `[E+I]` execution-layer tags this spec uses in the section specs are operator notation only and never appear in the founder-facing body. Same memory entry as rule 1.
3. **Section 3 is synthesis-first; competitor profiles compress to 3-4 line highlights.** Lead Section 3 with the whitespace map, shared-weakness map, pricing ladder, and benchmarking tables (client first). Then one short highlight paragraph per competitor. Never ship the full six-subsection per-competitor teardown in the founder-facing report; the granular data goes to the supporting-data bucket. Drop the top-of-report reading-frame quote block entirely. See `feedback_audit_synthesis_first_compression.md`.
4. **Executive Summary is bullets, not prose.** One framing sentence, then the bold sub-blocks (What the brand built right / Key issues identified) each carrying a bullet list. No "What we recommend" block. Never a wall of paragraphs. **Markdown mechanics (A.14 — the PDF renderer flattens anything else into one grey wall of text):** the bold label is its own line, then ONE blank line, then each bullet on its own line starting with `- `. NEVER `**Label.** - a - b - c` on one line; NEVER bullets on the line directly under the label with no blank line. One sentence per bullet, ~25 words max — every Top 5 actions table cell included; a dense paragraph stuffed into a bullet or a cell is the same failure as a wall of prose. See `feedback_audit_exec_summary_bullets_and_dejargon.md`.
5. **De-jargon the whole body, not just front matter.** Global replace `PDP` → "product page". Drop or gloss everywhere: `SERP-feature opportunity`, `AEO`, `GEO`, `extractability`, `JSON-LD`, `schema markup` (gloss as "structured-data tags"), `SKU`, `CRO`, `GA4`, `GSC`, `GTM`, `MCC`, `DTC`, `SERP`. Section titles too ("Visibility in AI Answer Engines", not "AEO"). Applies to the WHOLE report body, not just front matter — this overrides the "body keeps technical register" guidance elsewhere in this spec. Same memory entry as rule 4.
6. **Ruthless-concision pass before the voice gate.** A dedicated read-through that cuts redundant, irrelevant, metaphorical, and filler language and makes every sentence direct. Bias to deletion. Soft target ~6,000 words / ~20 pages, achieved structurally, never by truncation.
7. **Trustpilot is not a primary signal.** Especially not in India. At most one optional cell in a review benchmark if data exists. Never a headline, a subsection, or a "no Trustpilot presence" gap call-out.

8. **No playbook-rule names in the body** (commit f953a0d3). Never write "benchmarking layout rule", "placed first per", "per the <X> rule / convention / layout", "layout rule", or any sentence explaining why a table is ordered the way it is. A benchmarking table just has the client first; the report never names the rule. The output is the proof of voice; rules are instructions for the operator, never captions for the reader.
9. **No overconfident / epistemically-overreaching statements** (commit 5312dda1, `feedback_no_overconfident_statements.md`). Distinct from the no-strong-opinion rule (which bans *prescriptive* force: "mandatory", "must"); this bans *epistemic* overreach. A finding resting on one tool, one pull, or inference is hedged ("indicates / suggests / appears / on the data available") AND the uncertainty is named — framed as a fact about the client's own site or market plus the fix, never as an audit-process limitation. Banned: outcome / timeline guarantees ("wins this surface", "takes the category", "first mover takes/wins", "will win / convert / compound", "guarantee(s/d)", "definitely", "certainly", "proves" asserting a conclusion); false-precision multipliers from a soft single source ("by ~60x"); absolutes ("the only X still/not/untargeted", "no competitor anywhere / at all", "every competitor"). Outcome claims become "is the play / is well positioned to". NOT a licence to hedge everything: ranked recommendations with a concrete basis stay direct and specific, never wishy-washy, never padded.
10. **No source citation without backing data** (strict). Every named source — Reddit, Trustpilot, Instagram, Amazon, YouTube, Quora, PriceScope, "customer reviews", "the review corpus", a named competitor's reviews, etc. — must be backed by data the run actually gathered (the enriched external-API bundle in headless mode; the mined corpus in interactive mode). If a source returned nothing or was not pulled, never name it: state the substantive finding from general category knowledge with no attribution, or omit the subsection. Unattributed category inference is acceptable ("buyers in this category typically weigh three things: X, Y, Z"); an attributed claim ("Reddit users say X", "from Trustpilot reviews of competitors") requires the backing data. This is the evidence-integrity twin of rule 9.

## Voice conventions (two registers)

The audit ships two artefacts in two different voice registers. The full report uses the strict report voice. The email overview uses the personal-voice register. Both gates run before any file is written to disk.

### Report voice (full report only)

- No em-dashes. Use period, comma, or colon.
- No negative parallelisms. Never "X, not Y". Never "rather than X". Never "instead of X". Never "opposite of X". State the positive.
- Four-word sentence floor. No one- or two-word sentences.
- Spelling: "ecommerce" one word, no hyphen. Never "e-commerce".
- ICP language discipline: "premium DTC", "AOV $500+", "founder-led", "US-market focus".
- No unearned revenue claims on the agency's own numbers.
- No AI clichés: "Here's the...", "Most people...", "The uncomfortable truth...", curiosity-gap tricks.
- The report is one markdown file in the project's GitHub repo. Supplementary data lives alongside. Nothing in private dashboards.

### Personal voice (email overview only)

- **Em-dashes allowed.** They match the operator's natural voice on email and social. Required to make the email read as personal, not brochure-like. See `feedback_personal_voice_posts.md` in curated memory for the pattern.
- No negative parallelisms ("X, not Y" / "rather than" / "instead of" / "opposite of" / "not only").
- "ecommerce" one word, no hyphen.
- No AI clichés.
- Four-word sentence floor.
- **No unexplained industry jargon.** The founder is a domain expert in their own category (wellness, finance, music, retail), not in digital marketing. Translate every marketing or technical term the first time it appears, or drop it for a plain-English equivalent.
- No meeting-ask or CTA in the email close (per `feedback_warm_reopener_no_meeting_ask.md`).
- When replying inside a live Gmail thread, the email has no subject line.

### Standard plain-English glossary (for the email overview AND the report's Executive summary front matter)

The glossary applies to **two surfaces**: the email overview (always personal voice), and the **front matter of the report** — Executive summary and Top 5 actions to ship first. The body of the report (Sections 1 through 10) keeps technical register since the marketing-team reader expects it. Translate every term in this glossary the first time it appears in the front matter, or substitute the plain-English equivalent.

Apply consistently across every audit so clients see stable vocabulary.

| Industry term | Plain-English substitute |
|---|---|
| PDP / product detail page | product page |
| SKU | product, or "flagship product" for the head-term product |
| AEO / answer-engine optimisation | the AI answer engines (ChatGPT, Perplexity, Google AI) |
| AI Overview | those AI-generated answers at the top of Google results |
| AggregateRating schema / review schema | the small bit of code Google needs to show your star ratings in search results |
| AOV | size of a typical order |
| CTR | click-through rate |
| CPC | cost per click |
| CPA | cost per customer acquired |
| CRO | conversion rate improvements |
| organic traffic | visitors from Google |
| branded traffic | visitors typing your brand name into Google |
| unbranded commercial queries | shopping searches where someone does not know your brand yet |
| keyword | search phrase |
| optimise / optimisation | tune; or "a few weeks of focused writing" |
| content gap | search phrases competitors rank for where you do not |
| backlink | a link from another site back to yours |
| head term | the most-searched category phrase |
| position 30 | page three of Google |
| position 4 to 10 | top of page one |
| cannibalisation | two of your own pages competing for the same search phrase |
| Klaviyo flow | an automated email sequence |
| back-in-stock capture | a way for shoppers to leave their email and get notified when a product is back |
| cart drawer | the panel that slides out when someone adds to cart |
| nine-figure business | multi-hundred-million-dollar business |
| Core Web Vitals / PageSpeed | how fast your pages load on a phone or desktop |

### Voice enforcement gate (mandatory before writing anything to disk)

The voice rules above are stated on every run and still leak into every run. Stating them is not enough. Before the synthesis pass writes to disk, run a mechanical voice-scan on **each** artefact (full report with the report-voice gate; email overview with the personal-voice gate). Grep for each banned pattern and rewrite every match.

**Report-voice gate (full report)**:

| Banned pattern | What to grep for | Fix |
|---|---|---|
| Em-dash | `—` (U+2014) and `--` (double hyphen used as em-dash) | Replace with period, comma, or colon. |
| Negative parallelism (comma / connective form) | `, not `, ` rather than `, ` instead of `, `opposite of `, `not only ` | Rewrite to state the positive claim alone. Delete the negative half. |
| Negative parallelism (contraction-led, sneaky form) | `\bisn't\b`, `\baren't\b`, `\bwasn't\b`, `\bweren't\b`, `\bdoesn't\b`, `\bdon't\b`, `\bdidn't\b`, `\bwon't\b`, `\bwouldn't\b`, `\bcan't\b`, `\bcouldn't\b`, `\bshouldn't\b`, `\bhasn't\b`, `\bhaven't\b`, `\bhadn't\b`, plus the spelled-out equivalents `is not`, `are not`, `was not`, `does not`, `do not`, `has not`, `have not`. **For each match, ask: is this a load-bearing fact (e.g. "AI engines do not run JavaScript") or a setup-and-contrast pattern (e.g. "This isn't urgent because X. It's urgent because Y")?** Setup-and-contrast → rewrite to state the positive claim alone. Load-bearing fact → leave. |
| Banned spelling | `e-commerce`, `E-commerce`, `E-Commerce` | Replace with `ecommerce`. |
| AI clichés | `Here's the`, `Here's what`, `Here's why`, `Most people`, `The uncomfortable truth`, `The brutal truth`, `The breakthrough` | Rewrite to make the specific claim without the cliché frame. |
| Sub-four-word sentence | Sentences of 1 to 3 words (standalone) | Merge into adjacent sentence or expand. |
| Internal drafting / voice rules in body | `\bplain English\b`, `\bnegative parallelism`, `\bIndia-specific register\b`, `\bfour-word floor\b`, `\bem-dashes? allowed\b`, `\boperator voice\b`, `\breport-voice\b`, `\bpersonal-voice\b`, `\bvoice register\b`, `\bvoice gate\b`, `\bdrafting mode\b`, `\bsynthesis pass\b`, `\bAI cliché`, plus stage directions like `\bthis section runs\b`, `\bthe audit reads them\b`, `\bin drafting mode\b`. These are instructions for the operator, never captions or prefaces in the artefact body. **Fix:** Delete the leak. The output is the proof of voice; the rules do not need to be announced. See `feedback_no_internal_voice_rules_in_client_body.md`. |
| Internal credentials / `.env` references | `\.env`, `service-account email`, specific API key variable names (e.g. `APIFY_API_TOKEN`, `OPENAI_API_KEY`), GCP project IDs, `nish-claude@` and similar service-account addresses that name another client | Rewrite per the "Acceptable framings" list under the credentials acceptance criterion below. |
| Internal file paths | `data/gtm-audit/`, `actions.md` (referenced as a path), `EntireCommerceClients/{Client}/`, `.config/`, anywhere the body cites a literal file path the founder cannot navigate to | Replace with descriptive references in the body ("the 30-quote audience appendix", "our internal task tracker"). The model writes no supporting-data block; the runner appends the single "supporting data available on request" line after the Conclusion. |
| Strong-opinion language | `\bmandatory\b`, `\bmust do\b`, `\bmust ship\b`, `\bnon-negotiable\b`, `\bnot subject to experimentation\b`, `\bstructurally non-negotiable\b` | Substitute "highest-impact", "highest-confidence", "the audit evidence is unanimous", "strongly recommended", "foundational fix", "the audit ranks this first". Do NOT substitute "highest-leverage" or "biggest lever" — A.14 bans both as internal jargon (see "Do not lead with internal-to-EntireCommerce jargon"). The audit reports evidence and ranks recommendations; the founder decides. |
| Specific dates / hard timeline commitments | named days (`\bMonday\b`, `\bTuesday\b`, `\bWednesday\b`, `\bThursday\b`, `\bFriday\b`), `lands Monday`, `starting May`, `by 22 July 2026`, plus calendar-bucket headers `This week`, `Next 30 days` | Replace with directional horizons ("inside 30 days", "compounds in weeks") or with the **Top priority / Near-term / Backlog** Section 11B bucket names. |
| Limitation / gap / tool-failure language (2026-05-15, absolute — rule 1) | `not installed`, `not available`, `cannot see`, `we cannot`, `what unlocks`, `once .* (is )?wired`, `becomes available once`, `\bunavailable\b`, `hit its (monthly )?(hard )?limit`, `keyword search (collided|returned noise)`, `re-pull`, `signal-based`, `manual sweep needed`, internal tool names in body (`Apify`, `DataForSEO`, `Screaming Frog`, `Serper`, `Ahrefs`, `Keywords Everywhere`, `PageSpeed Insights`) | State client-site facts plus recommendations only. Process notes go to the dogfood file, never the deliverable. See `feedback_audit_no_limitations_no_internal_refs.md`. |
| Internal filenames / appendix references (2026-05-15, absolute — rule 2) | `\.md\b`, `\.csv\b`, `\.json\b`, `\bappendix\b`, `data/gtm-audit`, `competitors/`, `BT-[A-Z]`, `Cross-references`, `Supporting data files` (as an enumerated list) | One permitted closing line only: supporting data available on request, email nishant@entirecommerce.co. Same memory entry as rule 1. |
| De-jargon the whole body (2026-05-15, absolute — rule 5) | `\bPDP\b`, `\bAEO\b`, `\bGEO\b`, `SERP-feature`, `\bextractab`, `\bJSON-LD\b`, `schema markup`, `\bSKU\b`, `\bCRO\b`, `\bGA4\b`, `\bGSC\b`, `\bGTM\b`, `\bMCC\b`, `\bDTC\b`, `\bSERP\b` | Substitute plain English across the WHOLE report body and section titles, not just front matter. Overrides the "body keeps technical register" rows above. See `feedback_audit_exec_summary_bullets_and_dejargon.md`. |
| Trustpilot as a primary signal (2026-05-15 — rule 7) | `Trustpilot` as a heading / subsection / headline finding, `no Trustpilot presence` gap call-out | At most one optional cell in a review benchmark if data exists. Same memory entry as rule 5. |
| Reading-frame block at top of report (2026-05-15 — rule 3) | `the first page covers the situation`, `marketing-team deep read` | Dropped. Report opens straight into the Executive Summary. See `feedback_audit_synthesis_first_compression.md`. |
| Internal playbook-rule leak (commit f953a0d3) | `\bbenchmarking layout rule\b`, `\bplaced first per\b`, `\bper the .{0,25}(rule|convention|layout)\b`, `\blayout rule\b` | Delete the leak. A benchmarking table just has the client first; the report never announces why. See `feedback_no_internal_voice_rules_in_client_body.md`. |
| Overconfident / epistemically-overreaching statements (2026-05-15, commit 5312dda1 — distinct from strong-opinion above, which bans prescriptive force) | `\bwins this surface\b`, `\btakes the category\b`, `\bfirst mover (takes|wins)\b`, `\bwill (win|convert|compound)\b`, `\bguarantee[ds]?\b`, `\b(definitely|certainly|undoubtedly)\b`, `\bproves\b` (asserting a conclusion), `\bby (roughly |about )?\d+x\b` (soft single source), `\bthe only [A-Za-z-]+ (still |not |untargeted)`, `\bno competitor (anywhere|at all)\b`, `\bevery competitor\b` | Hedge the finding ("indicates / suggests / appears / on the data available") AND name the uncertainty as a fact about the client's own site or market plus the fix, never as an audit-process limitation. Outcome claims → "is the play / is well positioned to". Not a licence to hedge everything: ranked recommendations with a concrete basis stay direct and specific. See `feedback_no_overconfident_statements.md`. |

**Personal-voice gate (email overview)**:

| Banned pattern | What to grep for | Fix |
|---|---|---|
| Negative parallelism (comma / connective form) | `, not `, ` rather than `, ` instead of `, `opposite of `, `not only ` | Rewrite to state the positive. |
| Negative parallelism (contraction-led, sneaky form) | `\bisn't\b`, `\baren't\b`, `\bwasn't\b`, `\bweren't\b`, `\bdoesn't\b`, `\bdon't\b`, `\bdidn't\b`, `\bwon't\b`, `\bcan't\b`, `\bcouldn't\b`, `\bhasn't\b`, `\bhaven't\b`, plus spelled-out forms. Same load-bearing-vs-setup-and-contrast test as above. The sneaky-form pattern is "This isn't X because Y. It's X because Z." — drop the negation half, lead with the positive. |
| Banned spelling | `e-commerce` | `ecommerce` |
| AI clichés | `Here's the`, `Here's what`, `Most people`, `uncomfortable truth`, `brutal truth`, `breakthrough` | Rewrite. |
| Sub-four-word sentence | Sentences of 1 to 3 words standalone | Merge or expand. |
| Unexplained industry jargon | `\bPDP\b`, `\bSKU\b`, `\bAEO\b`, `\bAOV\b`, `\bCTR\b`, `\bCPC\b`, `\bCPA\b`, `\bCRO\b`, `\bCMS\b`, `\bCRM\b`, `schema` (without gloss), `organic traffic` (without gloss), `cannibalise`, `top-of-funnel`, `MOFU`, `BOFU` | Translate per the glossary above or drop. |
| Em-dashes | **Allowed** in this register. Do not remove. | — |
| Meeting-ask in email close | "happy to jump on", "let's discuss", "book a time", "book a call", "schedule a call" | Remove. Sign off on the relationship instead. |
| Internal drafting / voice rules in body | `\bplain English\b`, `\bnegative parallelism`, `\bIndia-specific register\b`, `\bfour-word floor\b`, `\boperator voice\b`, `\bvoice register\b`, `\bvoice gate\b`, `\bdrafting mode\b`, `\bsynthesis pass\b`, `\bAI cliché` | Delete. The same rule as the report gate applies: drafting rules are instructions for the operator, never captions in the artefact. See `feedback_no_internal_voice_rules_in_client_body.md`. |
| Time-investment phrasing (banned in the AI-transparent opener) | `\bspent (this|the) week\b`, `\bfinally sat down with\b`, `\bproper chunk of time\b`, `\bI poured\b`, `\binvested time on\b` | Replace with AI-transparent framing: "I ran my go-to-market-audit AI playbook on [site]" or equivalent. See `feedback_ai_transparent_in_client_outreach.md`. |
| Specific dates / hard timeline commitments | named days, `lands Monday`, `starting May`, `by 22 July 2026` | Replace with directional horizons ("inside 30 days", "compounds in weeks"). The email is sent before any engagement is locked. |

`not` and any contraction of it are permitted only when the negation is genuinely load-bearing and cannot be rewritten. Default to rewriting. **Worked example of the contraction-led sneaky form** (caught after it slipped past the May 2026 voice gate): "This isn't urgent because anything you're doing is wrong. It's urgent because you're publishing weekly content into a site Google can't read." → "The urgency is structural. You're publishing weekly content into a site Google can't read." (Drops the "isn't... it's" setup-and-contrast scaffold; keeps the load-bearing "Google can't read" since that is descriptive fact.)

Every gate must hit zero matches before the artefact is written to disk. Sub-agents run the relevant gate on their returned output before handoff to the main session. The main session re-runs the gate before final write.

### Step 6.7. Pre-send confidence-calibration pass (mandatory — runs after the voice gates, blocks render)

This is a **judgment** review, not a regex sweep. The voice gates above catch banned patterns mechanically; this pass catches miscalibrated confidence the patterns cannot see. Read the rendered-ready markdown **cold**, one claim at a time, apply the rubric below, produce a flagged-line list with a calibrated rewrite for each, apply the rewrites in place, then proceed to render. A non-empty flag set blocks render until every flag is calibrated.

> **Confidence-calibration rubric.** A cold read of the finished deliverable, one claim at a time, against a single question: could the client's own data or the client's own domain knowledge embarrass this sentence? Flag and rewrite:
>
> - any number stated as fact that rests on one tool, one pull, or inference
> - any outcome or timeline guarantee
> - any competitor or market claim the founder could disprove from their own knowledge
> - any "the only / every / no competitor" absolute
> - any single-source finding presented without a hedge and without the uncertainty named
>
> The bar: if a confident claim could be disproven by the founder's own numbers, it is miscalibrated. Calibrated rewrites stay direct and specific. Hedge the certainty, name the uncertainty as a fact about the client's own site or market, keep the recommendation. No wishy-washy hedging, no filler. Ranked recommendations with a concrete basis are fine as-is.

In the cloud audit-runner this pass is automated as `audit-runner/confidence_pass.py`, which runs after `voice_cleanup` and before `quality_gate`; a non-empty uncalibrated-flag set it cannot self-correct fails the quality gate and forces an Opus retry with the flags as feedback, exactly like the existing no-negatives hard gate. The rubric wording above is byte-identical to the runner's `RUBRIC` constant so the two surfaces never drift. See `feedback_audit_pre_send_confidence_pass.md`; it sits alongside `feedback_no_overconfident_statements.md` (the voice gate enforces the pattern set; this pass is the cold-read judgment layer on top).

---

### Step 6.8. Adversarial verification pass (mandatory before render — added 2026-06-14)

This pass is **distinct from every pass above it**. The synthesis pass builds the narrative, the voice gates strip banned patterns, Step 6.7 calibrates *how confidently a claim is worded*. None of them asks whether the claim is **true**. This pass does, and it does so by **refutation, not re-reading** — the doer just wrote the report, so a confirming re-read inherits the doer's blind spots (self-preferential bias). The frame is inverted: assume each material finding is wrong and try to break it. On Opus 4.8 this is an explicit step the model performs, because the model does not separate doer from judge on its own.

**What counts as a material finding** (verify every one; skip hygiene): any quantitative claim (a number, a multiplier, a benchmark, a ranking position), any competitive or market claim, any compliance-sensitive claim (health / medical / GLP-1 / supplement efficacy, financial or returns claims, certification claims such as GIA vs IGI, "clinically", "FDA", "patented"), and any claim that drives a Top 5 action or a P0.

**The refutation rubric.** For each material finding, ask in a cold, skeptical frame:

- What single piece of the gathered data would have to be wrong for this finding to be false? Is that piece actually in the run's evidence, or inferred?
- Is the source named in the body actually backed by data the run pulled (the evidence-integrity twin of rule 10)? If not, the finding is refuted as stated — strip the attribution or cut it.
- Could the founder disprove this from their own numbers or domain knowledge? (overlaps Step 6.7 — there the fix is hedging; here, if the claim is load-bearing for an action, the fix may be **deletion**.)
- For a compliance-sensitive claim: is it literally accurate, and does it conflate two distinct claims? (e.g. "GIA-certified diamonds" vs "IGI-trained founder" are independent — never merge; see `feedback_riyanika_gia_diamonds_vs_igi_founder`. A product efficacy or health claim with no substantiation in the gathered evidence is refuted and cut, not hedged.)

**Verdict per finding: survive / hedge / cut.** Survive (evidence holds) → leave it. Hedge (real but single-source or soft) → hand to the Step 6.7 calibration wording. Cut (cannot be defended from the run's own evidence) → remove the claim and any action that depended only on it, then re-rank Section 11. A finding that drives a P0 and is refuted is the highest-value catch in the whole pass.

**No-early-stop / completeness loop (the agentic-laziness guard).** Before declaring the pass done, count: every required non-droppable table present (cross-checked against the structural quality gate list), every section that owns a material finding actually carrying one, every Top 5 / P0 action traceable to a surviving finding. If any section was addressed shallowly, any required table is thin or missing, or the action list shrank below the floor after cuts, **do not stop — return to that section and complete it.** Partial coverage declared complete is itself a defect. Only when the count is whole does the pass end.

**Surface notes.** In a **local run**, this pass runs as a *separate sub-agent* (true doer/judge separation) — see the gtm-audit command `gtm-audit-prompt.md` Step 6.9 (the live invocation path; `.agents/skills/gtm-audit/SKILL.md` is not loaded). In the **cloud audit-runner**, Opus performs the refutation rubric in-context during the pre-render passes; a dedicated skeptic-model stage (`verification_pass.py`, a second-model refutation call mirroring `confidence_pass.py`) is a **pending** automation, not yet wired — until it lands, the in-context refutation above is the contract. When it lands, its rubric must be byte-identical to the wording here so the two surfaces never drift, exactly as the confidence pass requires. Mandatory for every run; weight it hardest on compliance-sensitive clients (Protelicious, Riyanika).

---

### Tool-utilisation checklist (mandatory at audit start AND before render — added 2026-05-13)

The helper script at `EntireCommerce/playbooks/audit-tool-utilisation-check.py` runs at two points in the playbook:

**Step 2 (audit start, plan mode):** the script prints the expected-tool-usage matrix for the run — every required and optional external-layer tool, with `configured? / sections it covers / expected evidence files`. This becomes the operator's (and Claude's) planning document for the run.

```bash
python3 EntireCommerce/playbooks/audit-tool-utilisation-check.py --plan \
  --env-file /Users/nishantkapoor/Library/CloudStorage/Dropbox/Claude-Projects/.env
```

**Step 6.5 (post-draft, pre-render, verify mode):** the same script runs again, now against the populated data folder, and checks that every required tool the plan promised was actually used. Exits non-zero on any miss without an explicit dogfood-notes override.

```bash
python3 EntireCommerce/playbooks/audit-tool-utilisation-check.py \
  --data-folder {client_folder}/data/gtm-audit/{YYYY-MM-DD} \
  --env-file /Users/nishantkapoor/Library/CloudStorage/Dropbox/Claude-Projects/.env \
  --dogfood-notes EntireCommerce/playbooks/_gtm-audit-dogfood-notes.md
```

The script exits non-zero when one or more **required** external-layer tools are missing without an explicit override in the dogfood notes under this run's `## Run — {Client} ({YYYY-MM-DD})` section. Override marker (only this exact form counts):

```
- **Skipped tool: Ahrefs.** Rate-limited on this run; DataForSEO covered keyword footprint adequately for Section 3.2.
```

**Required tools (default set):** DataForSEO Labs, Ahrefs (backlinks + DR + link-gap), Serper, Apify Meta Ad Library, Screaming Frog CLI, PageSpeed Insights, AEO citation tests (saved transcript at `data/gtm-audit/{date}/aeo-citations.md` with at least 10 query-rows across ChatGPT / Claude / Perplexity / Google AI Overviews).

**Optional tools** (run when conditions warrant; absence is acceptable): Keywords Everywhere, Apify Instagram (VOC second pass — mandatory when first-pass quote count is under 20), Apify Trustpilot.

**WebFetch / WebSearch are not substitutes** for configured specialised tools. When a tool exists for the job (Ahrefs for keyword footprint, Serper for SERP features, Apify for Meta Ad Library / Instagram comments), the audit must use it. Falling back to WebFetch is acceptable only when the specialised tool actually fails (rate-limited, API down, monthly cap hit) — log the fallback in the dogfood notes using the `Skipped tool:` marker. The 2026-05-13 Forest of Chintz audit shipped without Ahrefs, without Serper, and with only 2 AEO queries (no saved transcript) precisely because the operator drifted to generic tools. The checklist catches this pattern. See `feedback_audit_tool_utilisation_checklist.md` in curated memory.

---


## Output formatting: bullets over paragraphs (mandatory for every deliverable)

This is a hard rule, recurring violation across runs. When this playbook generates any deliverable — full report, email overview, weekly review, action list, brief — the default output structure is bullet lists, not prose paragraphs. Long opening paragraphs of any kind are the most common violation; flag and convert them first.

**Universal hard cap (2026-05-16, Nishant): no prose paragraph anywhere in any deliverable exceeds 3 sentences.** Past three sentences, break into a new paragraph or convert to bullets. This is measured deterministically by the `paragraph_length` quality-gate check and fed back on retry — it is not aspirational. The Conclusion and Executive Summary are targets for this rule, never exceptions.

**Rules (apply in order):**

- **The Executive summary itself is bullets, not a paragraph.** This is the single most common regression. The opening of the report is a one-line sentence followed by bulleted constraints, bulleted unlocks, and bulleted kill-criterion. Never a wall of prose. The reader takes ten seconds to scan, not three minutes to parse.
- **Any time the text enumerates three or more items** (findings, fixes, gaps, recommendations, criteria, signals, things-they-do-better) those items become separate bullets. Two items can stay inline with a comma or "and"; three items always become bullets. Sub-bullets when an item itself enumerates more than two pieces of evidence.
- **Reserve paragraphs for genuinely connected reasoning** where prose flow carries the argument. Cap those paragraphs at four sentences. Cap opening framing paragraphs (any section) at one paragraph total. If the section needs more setup than that, the setup is itself bullet content.
- **Multi-clause sentences with semicolons, parentheticals, or commas separating discrete points** are a signal to break into bullets. If a sentence reads "X (zero this, zero that, every other thing canonicalised), and Y, and Z" — break it.
- **Four-word sentence floor still applies inside bullets.** Do not split into staccato three-word bullets to honour the bullets rule and break the four-word rule.
- **Canonical pattern**: heading + one bold lead-in sentence + bullet list. The lead-in frames the list. The bullets carry the load.
- **Real markdown lists only (A.14, the highest-leverage mechanical fix).** A bold label is its own line, then ONE blank line, then each item on its own line starting with `- `. Two shapes render as a single flattened wall-of-text paragraph in the PDF and are banned: (a) inline-dash — `**Key issues.** - a - b - c` on one line; (b) no-blank-line — bullets starting on the line directly below the label. One sentence per bullet, ~25 words max; the same cap applies to every Top 5 actions table cell. The renderer auto-repairs both shapes, but emit clean markdown anyway — the `.md` artefact ships alongside the PDF.

**Mechanical paragraph-density check before write:**

- Grep for any paragraph longer than four sentences. Convert.
- Grep for any sentence containing three or more comma-separated enumerated items (e.g. "..., zero ..., zero ..., zero ..."). Convert.
- Grep the Executive summary. If it is one or more paragraphs of prose with no bullets, rewrite to the bulleted structure spec'd above before writing to disk.

**Applies to:** full reports (especially the front matter — Executive summary and Top 5 actions to ship first), email overviews, weekly reviews, dashboards, action-item lists, and any narrative deliverable.

**Does not apply to:** the single opening framing paragraph of a major section (one paragraph allowed, four-sentence cap), conclusion paragraphs (one or two allowed), or quoted founder/buyer voice (verbatim quotes are sacred).

Render pipeline for bullet-heavy deliverables: Markdown → HTML + CSS (Poppins / Montserrat / teal accent) → Chrome headless → PDF, via `EntireCommerce/playbooks/render-proposal.py`. The renderer runs `normalize_pseudo_lists()` (A.14) on the markdown before conversion: it expands inline-dash pseudo-lists and inserts the blank line a label-then-list block needs, so a model wobble does not flatten the front matter into a wall of text. This is a safety net, not a licence to emit sloppy markdown. Replaces the deprecated pandoc → docx → LibreOffice path for first-touchpoint client artefacts (proposals + initial GTM audits). LibreOffice flattened tables, mis-aligned headings, and dropped page-break safety on bullet-heavy content; HTML + Chrome gives full design control. See `feedback_html_render_for_client_artefacts.md` and `reference_proposal_render_stack.md` in curated memory for the implementation details and the script's `--client-logo` and `--eyebrow` flags.

## Reference material

- `audience-insights-master-prompt.md`: canonical audience research prompt, runs as Section 1.
- `gtm-master-playbook.md`: parent playbook this audit feeds.

---

## Acceptance criteria

- Runs end-to-end from a single Claude Code prompt. No manual copy-paste between steps.
- Detects the correct mode (A1, A2, A2-thin, B1, B2) and adapts the report.
- Honours the two-layer tool gate. External layer runs on every mode. Internal layer runs only when credentials are present.
- Skips missing Internal tools cleanly. Sections 7 / 8 / 9 lead with what was observed publicly, not with "Requires access" stubs. The Access-Grant Checklist is removed from the audit deliverable entirely (deprecated 2026-05-13); access asks move to the SOW. See `feedback_audit_no_access_asks_in_audit.md`.
- **Ships two deliverables every run**: full report (.md + .html + .pdf, plus a hosted HTML link at entirecommerce.ai/audits/{slug}.html) and email overview (.txt). Both required. Standalone executive summary documents are no longer produced — the report's own front matter (Executive Summary + Top 5 Actions to Ship First, ~2 pages) replaces it. **Front matter is two sections, not four** (Cross-cutting Themes and Key Findings dropped as separate blocks per `feedback_audit_front_matter_two_sections.md`).
- Writes a single consolidated report to `{client_folder}/docs/gtm-audit/{YYYY-MM-DD}/{client-slug}-gtm-audit-report.md` with all eleven sections plus the Methodology-and-reading-frame appendix at the end (not the top).
- **Task tables (mandatory):** every action list in the audit renders as a 3-column markdown table (`# / Task / Type`) with Type values Claude / Hybrid / Human. No bracketed prefixes inside item titles. Tag-distribution close-line after each table. See `feedback_audit_execution_tags.md` (rewritten 2026-05-13).
- **Audit-stage register:** "Claude Code can run end-to-end" not "we run them for you". The audit may ship before any engagement is signed; named actor is the agent. See `feedback_audit_claude_code_not_we.md`.
- **No hard timelines in the audit body:** "inside 90 days", "first thirty days" are banned. Use directional sequencing ("first wave / second wave / third wave"). See `feedback_audit_no_hard_timelines.md`.
- **No tool-migration push:** if the client is on a working tool (Mailchimp, Brevo), don't pre-recommend a migration (Klaviyo) inside the audit. Tool-switch decisions are post-engagement. See `feedback_audit_no_tool_migration_push.md`.
- **Pre-audit geo / capital / channels clarification:** ask the founder in writing before running the audit. If skipped, the cover email must surface the geo question. See `feedback_audit_pre_clarify_geo_and_constraints.md`.
- **Apify defaults:** Meta Ad Library via `apify/facebook-ads-scraper` (not screenshots). VOC mining second-pass via `apify/instagram-scraper` when first-pass quote count is under 20. See `feedback_audit_apify_defaults.md`.
- Generates `{client-slug}-gtm-audit-report.html` and `{client-slug}-gtm-audit-report.pdf` via `EntireCommerce/playbooks/render-proposal.py` (Markdown → HTML + CSS → Chrome headless → PDF). The render call passes `--client-logo EntireCommerce/site/public/logos/clients/{client-slug}.png` (drop the asset there before the first audit ships) and `--eyebrow "Go-to-Market Audit"` (overrides the default proposal eyebrow). The H1 in the markdown source reads "Go-to-Market Audit: {Client Name} ({YYYY-MM-DD})" — written out in full so batch jargon-replacement passes do not mangle "GTM" as "Google Tag Manager". The rendered `.html` is also copied to `EntireCommerce/site/public/audits/{client-slug}-{YYYY-MM-DD}.html` so it serves as a hosted link the founder can open on a phone. Pandoc + LibreOffice is deprecated for these artefacts.
- Writes the email overview to `{client_folder}/docs/gtm-audit/{YYYY-MM-DD}/{client-slug}-reopener-email-draft.txt` with ~600 words, 12 bullets in three buckets (opportunity / decay / quick wins), no meeting-ask, no CTA, no subject line when replying inside a live thread.
- Stores supplementary evidence (quote tables, competitor crawl data, raw SERP pulls) under `{client_folder}/data/gtm-audit/{YYYY-MM-DD}/` and cross-links from the report.
- Competitor audit matches the in-depth teardown standard (full catalogue crawl, keyword footprint, content gap, attribute-tuple gap for catalog businesses, ad-library scan, paid-SERP intel, cross-competitor synthesis).
- **Benchmarking tables place the client first.** Every benchmarking table in the report puts the client's metrics in the first column (when entities are columns) or the first row (when entities are rows). Empty cells are fine; absence of the client row or column is a spec violation. Applies across Sections 3, 4B, 4C, 5, 6, and 8.

- **No agency comparisons in audit body.** Banned: "most agencies", "unlike traditional consultants", "unlike other tools", "agencies skip / miss". Public copy states what we deliver, never by contrast to other agencies. See `feedback_no_agency_comparisons.md` in curated memory.

- **No other-client references in audit body.** Banned: "we run this at [Client X]", "as we do at [Client X]", any other client name from the agency's portfolio. Cross-client validation stories belong on calls and pitches, not in client-facing deliverables. Generic "previous business" framing without naming the client is OK if explicitly wanted. See `feedback_no_other_client_references.md`.

- **Mid-level individual names use role labels in audit body.** Use "the ecommerce lead", "the in-house design lead", "the marketing manager", "the technical lead". Founders, board members, deal sponsors, and external partner brands keep their full names. The names of mid-level employees who may leave or change role get stripped — the role label travels with the position, the name does not. See `feedback_no_individual_names_in_client_artefacts.md`.

- **No unknown-baseline promises in audit body.** Drop "Targets" / "KPIs" tables when the audit has no baseline data + constraint visibility to back them. Commit to deliverables (work being done), not outcomes (revenue lifts, conversion-rate forecasts, roster sizes). Public market facts and audit-observed current state are OK; forecasts that assume a baseline we do not have are not. See `feedback_no_unknown_baseline_promises.md`.

- **Stack evaluation runs first before any structural recommendation.** Before drafting any structural recommendation in Section 4A or Section 11, run the stack-evaluation step from Section 4A.0. Recommendations made without this step are spec violations. Every structural recommendation must name the surface it applies to (marketing pages, blog, product app) and acknowledge what is already in place. See `feedback_audit_stack_evaluation_first.md`.

- **Conclusion is bullet-led, not a wall of prose.** Standard pattern: opening framing paragraph (3 sentences max, no enumeration) + bulleted Top-priority moves + bulleted tag distribution + short cadence paragraph. The CTA is NOT in the Conclusion — it is the separate `## Next Steps` section that follows. Any Conclusion paragraph that enumerates 3 or more items inline is a spec violation. The Conclusion is a target for the bullets-over-paragraphs rule, not an exception.

- **All URLs and emails render clickable.** The render script's `autolink_urls_and_emails` post-processor wraps bare URLs and email addresses in anchor tags automatically. Plain-text URLs in the rendered HTML are a render defect. See `feedback_clickable_urls_in_client_artefacts.md`.
- Audience section conforms to the master prompt spec: ≤500 words front page, 15-quote appendix, nine sub-headings.
- Every finding is ICE-scored. Top-priority items flow into `{client_folder}/actions.md` as new Next Actions once the engagement is live.
- No P0/P1/P2 or "Binding Constraint" labels in any deliverable. ICE scoring is the ranking mechanism; the report shows the ranked bucket list.
- Both voice gates (report and personal) honoured end-to-end. Zero banned-pattern matches on each of the two artefacts before write.
- **Executive summary is bullets, not prose.** The Output formatting rule below is the single most-violated rule in this playbook; the front matter must be skim-readable as bulleted constraints + bulleted unlocks, never a wall of paragraphs.
- Methodology-and-reading-frame metadata block lives at the **end** of the report, not the top. The reader opens the document for insight, not plumbing.
- **Every action item is execution-tagged** with `[Claude]`, `[Hybrid]`, `[Human]`, or `[Hybrid → Claude on access]` per the Action item execution tags subsection. The rule is universal: anywhere an action item appears in the report — Executive summary unlocks, the Top 5 actions table, Section 11A foundational fixes, Section 11A bets, Section 11B ranked list, Section 11B backlog, the email overview close — it carries an execution tag at the start. No untagged action items.
- Tag distribution summary line appears at the close of the Top 5 actions table, the close of Section 11B, the Conclusion, and the email overview close. Speed-thesis made tangible: prospects see exactly which items we run for them once API/MCP access is wired.
- **Executive summary written in plain English.** Translate every technical term the first time it appears or substitute the glossary equivalent. The body sections (1 through 10) can use technical register; the front matter cannot.
- **Strong-opinion language is banned in audit deliverables.** Do not write "mandatory", "non-negotiable", "must ship", "must do", "not subject to experimentation", or similar dictation language. The audit reports evidence and ranks recommendations; the founder decides. Substitute: "highest-impact", "highest-confidence", "the audit evidence is unanimous", "strongly recommended", "foundational fix", "the audit ranks this first". Do NOT substitute "highest-leverage" or "biggest lever" — A.14 bans both as internal jargon. Reserve hypothesis / experiment framing for items that genuinely test a belief — not for structural fixes where the evidence is settled.
- **The nine Founder-readability rules (2026-05-15, absolute) are honoured end-to-end.** No limitation / gap / tool-failure language anywhere; no internal filenames / paths / appendix lists in the body; Section 3 synthesis-first with compressed competitor highlights; Executive Summary bulleted (no What-we-recommend block); whole-body de-jargon including section titles; ruthless-concision pass run; Trustpilot never a primary signal. See the "Founder-readability rules (2026-05-15, absolute)" section above and `feedback_audit_no_limitations_no_internal_refs.md` / `feedback_audit_synthesis_first_compression.md` / `feedback_audit_exec_summary_bullets_and_dejargon.md`.
- **No playbook-rule names leak into the body.** Banned: "benchmarking layout rule", "placed first per", "per the <X> rule / convention / layout", "layout rule", or any sentence explaining why a table is ordered the way it is. A benchmarking table just has the client first; the report never names the rule. See `feedback_no_internal_voice_rules_in_client_body.md` and the matching voice-gate row.
- **No overconfident / epistemically-overreaching statements.** Distinct from strong-opinion language (prescriptive force) — this bans epistemic overreach. Single-tool / single-pull / inferred findings are hedged AND the uncertainty is named as a client-site fact plus the fix, never as an audit-process limitation. No outcome / timeline guarantees ("wins this surface", "will convert", "guarantee", "proves", "definitely"), no false-precision multipliers from a soft single source, no "the only / every / no competitor" absolutes. Outcome claims become "is the play / is well positioned to". Not a licence for wishy-washy hedging; ranked recommendations with a concrete basis stay direct. See `feedback_no_overconfident_statements.md` and the matching voice-gate row.
- **Do not lead with internal-to-EntireCommerce jargon.** Anywhere the audit uses an internal shorthand or term, define it on first use in plain English, or substitute the plain-English equivalent. Banned-without-explanation terms include: execution tags (`[Claude]`, `[Hybrid]`, `[Human]`, `[Hybrid → Claude on access]`) — these need a legend block at first use; "EC Operator" — substitute "weekly review cadence" or gloss as "the EntireCommerce weekly review automation"; "speed-thesis" — substitute "AI-native delivery" or describe the substantive claim ("more than half the work runs on Claude alone"); "binding constraint" — substitute "the one thing holding growth back" if used outside an audience-quote context; "biggest lever" / "highest-leverage" — substitute "highest-impact" or "where the biggest gain is" (A.14); "ranked by ICE" / "ICE-ranked" / "by ICE" — substitute "prioritised" / "prioritised by impact", and never surface the word "ICE" in the deliverable (A.14); mode codes (A1/A2/B1/B2) — keep out of the report body, retain only in the Methodology appendix; "Path A / Path B" — gloss on first use as the playbook's two execution paths. Apply the **Executive summary unlock format** to plain-English descriptions: lead with the task, then in parentheses say "a hybrid task that Claude and the team do together" / "a task Claude can run end-to-end" / "a task that only the team can do" / "a hybrid task today; Claude can run it end-to-end once [specific access] is wired".

- **Do not expose internal file paths in audit deliverables.** The reader cannot navigate paths like `data/gtm-audit/2026-05-07/audience-voc-mining.md` or `EntireCommerceClients/{Client}/actions.md` — they leak our internal directory structure and look broken to the founder. Use descriptive references instead ("the 30-quote audience appendix", "the full Screaming Frog crawl export", "our internal task tracker"). The model writes no "Supporting data files" subsection and no "Methodology and reading frame" appendix — both are banned in the body (rule 2). The report ends at the Conclusion; the runner appends the single "supporting data available on request, email **nishant@entirecommerce.co**" line and the signature deterministically. Drop the file-path inline references throughout the body (e.g. `data/...`, `actions.md`, `.env`, `.config/`). Internal infrastructure stays internal.

- **Do not expose internal credentials, API keys, or service-account specifics in audit deliverables.** Banned in audit output: ".env" references; specific service-account email addresses (especially ones containing other client names — e.g. `nish-claude@...` exposes another client); specific API key variable names (`APIFY_API_TOKEN`, `OPENAI_API_KEY`, etc.); GCP project IDs; "credentials live in `.env` already"; "the agency operator account already has [API] wired". These are internal infrastructure details that read as inside-baseball and risk cross-client leakage. Acceptable framings instead: "we share the service-account email at the engagement scoping stage"; "we have existing model-API access on our side"; "Brevo is part of the agency tooling we can wire on your behalf"; "our existing tooling stack covers [X]". The audit can describe what access we ask for without naming the specific credential plumbing on our side.

- **Do not write internal drafting / voice rules into the body of the audit.** The voice rules in this playbook (plain English, no negative parallelisms, no AI clichés, four-word floor, India-specific register, em-dashes allowed in personal voice, etc.) are instructions for the operator drafting the document. They never appear as captions, prefaces, footnotes, sub-headings, or stage directions inside the rendered artefact. The output is the proof of voice; the rules do not need to be announced. Banned patterns include: caption-style sentences immediately above headline / copy lists describing how the candidates were written ("Plain English, India-specific register, no negative parallelisms, no AI clichés."); section prefaces describing the drafting register ("operator voice in this section"; "personal voice register"); stage directions to the production process ("this section runs:"; "the audit reads them"; "in drafting mode"; "during the synthesis pass"). See `feedback_no_internal_voice_rules_in_client_body.md` in curated memory and the matching row in the voice enforcement gate table.
- **No specific dates or hard timeline commitments in audit deliverables.** The engagement may not yet be agreed when the audit is delivered. **Directional timeframes are fine and welcome** — founders want to know roughly when effects compound. Distinguish:
    - **Banned (specific dates / hard commitments):** "lands Monday", "starting May 18", "by 22 July 2026", "Mon May 11", named days, named months, calendar commitments that assume the team is on a calendar.
    - **Banned (calendar bucket names):** "This week" / "Next 30 days" as Section 11B priority bucket headings — these read as commitments the engaged team will work this week. Use **Top priority** / **Near-term** / **Backlog** instead.
    - **Allowed (directional horizons that describe expected signal timing):** "inside 30 days", "inside 60 days", "the first 30 days", "the first 90 days", "compounds in weeks", "lift in the first 60 days post-foundation". These describe when effects show up, not when the team will work.
    - **Allowed (effort estimates):** "two-week sprint", "0.5 dev day", "1 hour" — describe duration, not a calendar.
    - **Allowed (conditional framing):** "if the engagement starts", "once access is wired", "post-foundation".
    - **Allowed (factual references with their actual dates):** "on the May 7 call", "Adhrita committed access on the May 7 call" — these reference past events.
