Content Audit Services

A content audit is a systematic review of every page on a website to identify which pages rank, which have thin or duplicate content, where keyword cannibalization occurs, and which pages need to be updated, consolidated, or removed — for both Google search and AI-generated answer visibility.

  • 6–12 months: recommended full audit cycle for most sites
  • 30–90 days: content freshness cadence for core pages (AI citation signal)
  • ~30%: share of pages on average sites that qualify as thin or underperforming
  • 44%: share of LLM citations that come from the first 30% of a page (GEO research)

What a Content Audit Assesses

A content audit evaluates every indexed URL against search performance data — not just word count or publish date. The goal is to surface pages that are dragging down domain authority and pages that, with targeted improvements, can capture significantly more organic traffic and AI-generated referrals.

The audit starts with a full site crawl using Screaming Frog to inventory all indexed URLs. Each URL is then matched against Google Search Console data (impressions, clicks, average position) and Google Analytics 4 (sessions, engagement rate, conversions) to establish a performance baseline. Ahrefs adds backlink count and referring domain data per URL.

Every page receives an action recommendation — keep, update, consolidate, redirect, or remove — based on traffic potential, content quality, and cannibalization risk. For clients optimizing for AI search, each page also receives a GEO citation readiness score based on structural factors known to increase LLM citation probability.
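
As a rough illustration of how an action recommendation could be derived from the baseline data, the sketch below maps a page's metrics to one of the five actions. The `PageMetrics` fields, the `recommend_action` helper, and every threshold are assumptions made for this example, not the exact rules used in the audit; in practice each recommendation is also reviewed manually.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageMetrics:
    url: str
    impressions: int          # GSC impressions, trailing 3 months
    clicks: int               # GSC clicks, trailing 3 months
    word_count: int           # from the crawl export
    referring_domains: int    # from a backlink tool export
    cannibalized_by: Optional[str] = None  # stronger URL competing for the same query


def recommend_action(page: PageMetrics) -> str:
    """Return one of the five audit actions using illustrative thresholds."""
    if page.cannibalized_by:
        # A stronger page already targets the same query: merge into it.
        return f"Consolidate with {page.cannibalized_by}"
    if page.impressions == 0 and page.referring_domains == 0:
        # No search visibility and no link equity worth preserving.
        return "Remove & redirect"
    if page.impressions == 0:
        # No visibility, but backlinks exist; the redirect target is chosen manually.
        return "301 Redirect"
    if page.word_count < 300 or page.clicks / page.impressions < 0.01:
        # Visible in search but thin or rarely clicked (assumed cutoffs).
        return "Update & expand"
    return "Keep as-is"


print(recommend_action(PageMetrics("/services/audit", 1000, 50, 1200, 3)))
```

The order of the checks matters: cannibalization is tested first so that a weak page with a stronger sibling is merged rather than expanded.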

The Four Content Audit Dimensions

A comprehensive content audit analyzes four distinct performance dimensions, each requiring a different remediation approach.

🔍

Content Gap Analysis

Compares your current content inventory against competitor topical coverage and search demand data from Ahrefs or SEMrush. Identifies keyword clusters where competitors rank but you have no page — these are direct traffic opportunities. Also flags internal gaps where pillar pages exist but supporting cluster articles are missing.
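
At its simplest, a gap analysis is a set difference between the keywords competitors rank for and the keywords your own pages already cover. The sketch below assumes keyword lists have already been exported from a tool such as Ahrefs or SEMrush; the `content_gaps` function and the sample data are hypothetical.

```python
def content_gaps(own_keywords, competitor_keywords):
    """Return competitor keywords with no matching page, highest volume first.

    own_keywords: iterable of keywords your pages already target.
    competitor_keywords: dict mapping keyword -> monthly search volume.
    """
    own = {kw.lower().strip() for kw in own_keywords}
    gaps = [
        (kw, volume)
        for kw, volume in competitor_keywords.items()
        if kw.lower().strip() not in own
    ]
    # Sort by search volume so the largest opportunities come first.
    return sorted(gaps, key=lambda item: item[1], reverse=True)


own = ["content audit", "Keyword Cannibalization"]
competitors = {
    "content audit": 1900,
    "keyword cannibalization": 880,
    "content pruning": 320,
    "content refresh strategy": 590,
}
print(content_gaps(own, competitors))
```

Exact-match comparison is a simplification; real keyword clusters usually need grouping by intent before gaps are counted.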

📋

Keyword Cannibalization

Cross-references Google Search Console query data against page URLs to identify cases where two or more pages compete for the same primary keyword. Cannibalization dilutes ranking signals — fixing it (via consolidation or 301 redirect) concentrates authority on the best-performing URL and typically produces measurable ranking gains within 4–8 weeks.
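
The cross-referencing step can be sketched as a group-by on query. The example assumes GSC performance data has been exported with one row per query and page pair; the `find_cannibalization` function, the row keys, and the `min_impressions` cutoff are illustrative assumptions rather than a fixed part of the audit.

```python
from collections import defaultdict


def find_cannibalization(rows, min_impressions=50):
    """Find queries where two or more pages receive meaningful impressions.

    rows: iterable of dicts with 'query', 'page', 'impressions', 'clicks',
    assumed to contain one row per (query, page) pair.
    """
    by_query = defaultdict(list)
    for row in rows:
        # Ignore pages that barely appear for the query (assumed noise cutoff).
        if row["impressions"] >= min_impressions:
            by_query[row["query"]].append(row)

    conflicts = {}
    for query, pages in by_query.items():
        if len({p["page"] for p in pages}) < 2:
            continue
        # Treat the page with the most clicks (then impressions) as the keeper.
        ranked = sorted(
            pages, key=lambda p: (p["clicks"], p["impressions"]), reverse=True
        )
        conflicts[query] = {
            "keep": ranked[0]["page"],
            "consolidate": [p["page"] for p in ranked[1:]],
        }
    return conflicts


rows = [
    {"query": "content audit", "page": "/audit", "impressions": 900, "clicks": 40},
    {"query": "content audit", "page": "/blog/audit-guide", "impressions": 400, "clicks": 5},
    {"query": "seo pricing", "page": "/pricing", "impressions": 300, "clicks": 20},
]
print(find_cannibalization(rows))
```

Picking the keeper by clicks is only a starting point; backlinks and conversion data may justify consolidating onto a different URL.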

📄

Thin Content Identification

Flags pages with insufficient depth relative to competing pages ranking for the same query. Thin content is assessed by word count thresholds, content uniqueness, search intent match, and Google Search Console (GSC) impressions relative to index status. Pages that are indexed but receive near-zero impressions are primary candidates for expansion or consolidation.
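
Two of these checks, word count and impressions relative to index status, lend themselves to a simple first-pass filter. The sketch below is a minimal example under assumed thresholds; the `flag_thin_pages` function, its field names, and the `max_impressions` value are hypothetical, and uniqueness and intent match still require manual review.

```python
def flag_thin_pages(pages, min_words=300, max_impressions=10):
    """Flag pages that look thin on word count or search visibility.

    pages: iterable of dicts with 'url', 'word_count', 'indexed', 'impressions'.
    """
    flagged = []
    for page in pages:
        reasons = []
        if page["word_count"] < min_words:
            reasons.append("low word count")
        if page["indexed"] and page["impressions"] <= max_impressions:
            reasons.append("indexed but near-zero impressions")
        if reasons:
            flagged.append({"url": page["url"], "reasons": reasons})
    return flagged


pages = [
    {"url": "/guide", "word_count": 1800, "indexed": True, "impressions": 2400},
    {"url": "/tag/seo", "word_count": 120, "indexed": True, "impressions": 3},
    {"url": "/landing-v2", "word_count": 650, "indexed": True, "impressions": 0},
]
for item in flag_thin_pages(pages):
    print(item["url"], item["reasons"])
```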

🕑

Content Freshness Scoring

Assesses when each page was last substantively updated versus competitors' update frequency. For time-sensitive queries, Google weights recency, and AI answer engines that retrieve live web content tend to favor recently updated sources. Pages not updated in 6+ months on competitive topics are flagged for a freshness refresh, which means adding new data, statistics, examples, or expanded sections, not cosmetic date changes.

Content Freshness Cadence for AI Citation

AI language models are trained on web data with cutoff dates — but they also weight recent, high-quality sources for retrieval-augmented generation (RAG). Pages that consistently receive substantive updates are more likely to appear in AI Overviews, ChatGPT answers, and Perplexity citations.

The recommended content freshness cadence is:

  • Core service/pillar pages: Substantive update every 30–60 days — new statistics, updated tools list, expanded FAQ
  • Supporting cluster articles: Review and update every 60–90 days — refresh examples, add internal links to newer content
  • Blog/news content: Evergreen posts reviewed annually; time-sensitive posts archived or redirected when no longer accurate
  • "Last updated" timestamps: Displayed in visible page content (not just metadata) so both users and AI crawlers see recency signals

Important: Changing a publish date without adding new information does not improve freshness. Google and AI crawlers assess how much the content actually changed, not the timestamp alone. Cosmetic date updates can erode trust if quality raters or AI models compare the content against the claimed update date.
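
The cadence above can be tracked with a small overdue check. This sketch assumes each page has a recorded date of its last substantive update; the `CADENCE_DAYS` mapping uses the upper bound of each range listed above, and the page type labels and `days_overdue` helper are hypothetical names chosen for the example.

```python
from datetime import date

# Maximum days between substantive updates, using the upper bound of each range.
CADENCE_DAYS = {
    "pillar": 60,           # core service/pillar pages: every 30–60 days
    "cluster": 90,          # supporting cluster articles: every 60–90 days
    "evergreen_blog": 365,  # evergreen posts: reviewed annually
}


def days_overdue(page_type, last_substantive_update, today):
    """Return how many days a page is past its update window (0 if on schedule)."""
    limit = CADENCE_DAYS[page_type]
    age = (today - last_substantive_update).days
    return max(0, age - limit)


today = date(2024, 3, 31)
print(days_overdue("pillar", date(2024, 1, 1), today))   # 90 days old, limit 60
print(days_overdue("cluster", date(2024, 1, 1), today))  # 90 days old, limit 90
```

The date recorded here should reflect the last substantive change, not the last time the timestamp was edited.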

AI Citation Readiness Assessment

In addition to standard SEO metrics, the content audit evaluates each page against GEO (Generative Engine Optimization) citation criteria. Pages are scored on:

Citation Readiness Factor | What Is Evaluated | AI Impact
Answer Capsule presence | Does the page open with a 40–60 word direct definition or answer? | High: cited in AI overviews for definitional queries
BLUF H2 structure | Does each H2 section open with a standalone answer sentence? | High: 44% of LLM citations from first 30% of page
FAQPage schema | Is structured FAQ markup implemented and validated? | Medium-high: FAQ content appears directly in AI answers
Verifiable stats with sources | Are statistics cited with source attribution in-text? | Medium: AI models prefer citable, sourced claims
Internal cluster links | Does the page link to and from its topical cluster? | Medium: topical authority concentration increases SoM
Content freshness | Was the page substantively updated in the last 90 days? | Medium: AI models weight recent, authoritative sources
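
One way to turn the factors above into a single per-page score is a weighted checklist. The sketch below is an assumption about how such a score might be computed: the numeric weights (3 for High, 2 for Medium-high, 1 for Medium), the factor keys, and the `readiness_score` function are invented for illustration and are not the scoring model used in the audit.

```python
# Illustrative weights derived from the impact labels in the table above.
WEIGHTS = {
    "answer_capsule": 3,  # High
    "bluf_h2": 3,         # High
    "faq_schema": 2,      # Medium-high
    "sourced_stats": 1,   # Medium
    "cluster_links": 1,   # Medium
    "fresh_90_days": 1,   # Medium
}


def readiness_score(checks):
    """Return a 0-100 score from a dict of factor -> bool (missing = False)."""
    unknown = set(checks) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown factors: {sorted(unknown)}")
    earned = sum(weight for factor, weight in WEIGHTS.items() if checks.get(factor, False))
    return round(100 * earned / sum(WEIGHTS.values()))


page_checks = {"answer_capsule": True, "faq_schema": True, "fresh_90_days": True}
print(readiness_score(page_checks))
```

A weighted checklist keeps the score explainable: each missing factor maps directly to a specific fix in the deliverable.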

What the Content Audit Deliverable Includes

The audit deliverable is a structured Google Sheets workbook with one row per indexed URL, covering all dimensions needed to prioritize improvements by traffic impact.

  • URL Inventory with Performance Data
    Full crawl export with GSC impressions, clicks, average position (trailing 3 months), GA4 organic sessions, and engagement rate per URL
  • Content Classification
    Each URL categorized by content type (pillar, cluster, blog, landing page, category), word count, and last modified date
  • Action Recommendations
    Every URL assigned one of five actions: Keep as-is, Update & expand, Consolidate with [target URL], 301 Redirect to [target URL], or Remove & redirect
  • Cannibalization Report
    List of keyword conflicts with recommendation: which URL to consolidate authority on and how to handle the other (redirect or noindex)
  • Content Gap Opportunities
    Prioritized list of new page opportunities based on competitor keyword gaps, sorted by search volume and estimated traffic impact
  • GEO Citation Readiness Score
    Per-page scoring (for GEO audit clients) on Answer Capsule, BLUF structure, FAQPage schema, stats citations, and cluster linking — with specific fixes listed
  • Content Freshness Calendar
    Scheduled update timeline for core pages based on competitive freshness benchmarks and query time-sensitivity

Who This Service Is For

A content audit is most valuable for websites that have been publishing content for 12+ months and are experiencing ranking stagnation, organic traffic decline, or inconsistent performance across pages. Common scenarios:

  • Established B2B SaaS and service businesses with 50–500+ pages that have never conducted a systematic performance review
  • E-commerce sites with category and product pages competing for the same keyword clusters across multiple URLs
  • Local service businesses where multiple location pages or service pages are inadvertently cannibalizing each other in Google Local results
  • Businesses preparing for a GEO strategy who need to assess existing page AI citation readiness before investing in new content
  • Sites recovering from Google algorithm updates where specific content quality or spam policy actions caused traffic drops

Efryll Carmelo's Experience with Content Audits

At Jumpfactor, a B2B SaaS and professional services SEO agency, content audits were a standard component of every new client engagement. Auditing sites with 200–800+ pages required systematic approaches to cannibalization identification and action recommendation prioritization — work that directly informed content consolidation strategies that produced measurable ranking improvements.

At Ampry, content auditing identified thin content across product and landing page variants that had accumulated over multiple development cycles. Consolidation recommendations reduced indexed thin pages by over 40% while concentrating authority on the highest-converting URLs.

Content freshness cadences were established for Business Training Team's pillar-based content program, with structured quarterly reviews ensuring that high-value pages maintained competitive freshness scores against industry benchmarks.

Results Achieved Through Content Auditing

  • +2,200%: domain authority growth for a local business client; content consolidation and internal link restructuring were central to the strategy
  • +80%: revenue increase for an e-commerce client following content gap analysis and thin content remediation over 12 months
  • 40%+: reduction in thin/duplicate indexed pages at a SaaS client, concentrating ranking signals and improving overall domain health

Frequently Asked Questions — Content Audit

What is a content audit?

A content audit is a systematic review of every page on a website to assess SEO performance, content quality, and search visibility. It identifies which pages rank, which have thin or duplicate content, where keyword cannibalization exists, and which pages need to be updated, consolidated, or removed. A modern content audit also evaluates AI citation readiness — whether pages are structured to appear in ChatGPT, Gemini, or Perplexity answers.

How often should a content audit be done?

A full content audit is typically performed every 6–12 months. High-volume sites with 500+ pages may benefit from quarterly rolling audits by section. Individual page freshness should be maintained on a 30–90 day cycle for substantive updates, because both Google and AI models take content freshness into account for time-sensitive queries. Cosmetic updates (changing a date without adding new information) do not improve rankings and can erode trust.

What does the deliverable look like?

The deliverable is a structured Google Sheets workbook with one row per indexed URL. Each URL has: GSC impressions and clicks (trailing 3 months), organic traffic from GA4, word count, last modified date, content type classification, cannibalization flags, content gap opportunities, action recommendation (keep, update, consolidate, redirect, or remove), and estimated traffic impact. For GEO clients, it also includes an AI citation readiness score per page.

What is keyword cannibalization and how does the audit find it?

Keyword cannibalization occurs when two or more pages on the same site target the same primary keyword, causing Google to split ranking signals between them instead of consolidating authority on one page. A content audit identifies cannibalization by cross-referencing GSC search queries with page URLs — when the same query drives impressions to multiple pages, those pages are candidates for consolidation or redirect.

What is thin content and how is it identified?

Thin content refers to pages with insufficient depth to satisfy user search intent: typically pages under 300 words, with no original analysis, duplicated from other sources, or auto-generated. In a content audit, thin content is identified through word count thresholds, GSC performance data (low impressions despite indexation), and manual sampling. These pages are flagged for expansion, consolidation with related content, or removal with a 301 redirect.

What is content freshness cadence?

Content freshness cadence is a scheduled review cycle where existing pages receive substantive updates — new data, expanded sections, additional examples, or updated statistics — rather than cosmetic edits. AI language models weight recency in their training data, and Google's QRG (Quality Rater Guidelines) treat "freshness" as a quality signal for time-sensitive queries. The recommended cycle is every 30–90 days for core pages, with "last updated" timestamps visible in page content and schema markup.

How does a content audit improve AI citation readiness?

AI models (ChatGPT, Gemini, Perplexity) preferentially cite pages that: (1) answer the target question in the first 40–60 words, (2) use structured headings with direct answers below each H2, (3) include FAQPage schema, (4) cite verifiable statistics with sources, and (5) maintain consistent topical authority within a content cluster. The audit evaluates each existing page against these criteria and produces a prioritized list of pages to restructure for AI citation eligibility.
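
For criterion (3), FAQPage markup is usually published as JSON-LD using the schema.org `FAQPage`, `Question`, and `Answer` types. The sketch below shows one way to generate that markup from question and answer pairs; the `faq_jsonld` helper and the sample text are hypothetical, and the output should still be validated with a structured data testing tool before deployment.

```python
import json


def faq_jsonld(pairs):
    """Build schema.org FAQPage JSON-LD from (question, answer) tuples."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


markup = faq_jsonld([
    ("What is a content audit?",
     "A content audit is a systematic review of every page on a website."),
])
# Embed the result in the page inside a <script type="application/ld+json"> tag.
print(markup)
```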

What tools are used in a content audit?

A comprehensive content audit uses: Screaming Frog SEO Spider (full site crawl and URL inventory), Google Search Console (impressions, clicks, average position per URL), Google Analytics 4 (organic traffic, engagement rate per page), Ahrefs or SEMrush (backlink count per URL, referring domains, organic keyword rankings), and Surfer SEO or Clearscope (content depth scoring vs. top-ranking competitors). Results are compiled in Google Sheets or Looker Studio for client review.

Audit Your Content Before You Create More

Most sites have more content than they need — and the wrong pages competing for the same keywords. A content audit identifies exactly where to focus before investing in new production.

Book a Free Consultation