When 270 Articles Produce 65 Indexed Pages

A case study in content inflation, indexation decay, and the (hidden) cost of publishing more.

KEY TERMS:

content inflation, indexation recovery, auto-generated blog SEO, commodity content, content portfolio utilization rate.


I recently reviewed a young SaaS company that used programmatic SEO to produce its content. In fact, it was not only the content: virtually everything (the site, the app, the brand voice, and the relationship advice the company provides) was generated automatically with AI.


On its auto-generated blog, the company had published more than 270 articles, and the publishing process was apparently efficient. Yet Google had indexed only around 65 blog pages. The real figure is even lower: of those 65 indexed pages, 23 are tag pages, which leaves about 42 indexed articles. All in all, at least three quarters of the content portfolio was effectively ignored.


Content inflation


What I call content inflation is active publishing on the web that does not result in clicks or mentions. It typically occurs when the number of published documents grows faster than the amount of unique information those documents contribute.


Signals:

  • weak differentiation between the site and competition
  • topic overlap
  • low information gain


Patterns that help categorize content inflation:


  • Repetitive structures, including openings, headers, etc.
  • Commodity content (especially titles)
  • FAQ farms


The result of content inflation:

The site becomes larger without becoming more informative.

In our case, the most visible pattern was “commodity content”: content produced without demonstrating any visible authority or first-hand experience with the topic. The titles follow a repetitive structure, generic and easily identifiable, and the articles cite no real people and have no named author, even though the site operates in the medical / psychology domain.
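One rough way to spot a repetitive title structure is to measure how many titles share the same opening words. The sketch below is only an illustration: the function name `opening_share`, the three-word window, and the sample titles are all assumptions, not data from the audited site.

```python
from collections import Counter


def opening_share(titles, n_words=3):
    """Return the most common title opening and the share of titles using it.

    The opening is the first `n_words` words, lowercased. This is a crude
    heuristic for templated titles, not a measure of content quality.
    """
    if not titles:
        raise ValueError("titles must not be empty")
    openings = Counter(" ".join(t.lower().split()[:n_words]) for t in titles)
    opening, count = openings.most_common(1)[0]
    return opening, count / len(titles)


# Hypothetical titles, invented for illustration only.
sample_titles = [
    "How to Deal With Jealousy in a Relationship",
    "How to Deal With Trust Issues",
    "How to Deal With a Distant Partner",
    "Signs Your Partner Is Losing Interest",
]

top_opening, share = opening_share(sample_titles)
print(f"{top_opening!r} opens {share:.0%} of titles")
```

A high share for a single opening suggests the titles come from one template, which is the kind of pattern worth reviewing by hand.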


The causation


I claim that roughly 75% of the pages weren't indexed because of content inflation. Other possible explanations were explicitly ruled out in the course of the research (I skip that part here), including:


  • crawl budget limitations on a young domain
  • internal linking gaps, etc.


The portfolio perspective


The most important point in the client's brief was this: a single page, a review of a competitor brand, drives 30% of the site's organic clicks. The whole cluster associated with this competitor (review, advice, "brand-vs-competitor", etc.) accounts for roughly half of all non-branded traffic.


Importantly, the competitor’s team reached out and asked the client to stop publishing about them. So the client was looking for a way to fill the traffic “crater” that removing this cluster could leave behind.


So, if we think of content not as individual pages but as a portfolio, we can ask a sequence of qualifying questions whose answers define the quality of that portfolio:


  1. How many pages generate traffic?
  2. How many pages rank?
  3. How many pages convert?
  4. How many pages are indexed?
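The four questions above can be answered per cluster with a simple count. The following is a minimal sketch, assuming page-level data has already been exported from analytics and search tooling; the `Page` fields, the cluster names, the URLs, and all numbers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    cluster: str
    indexed: bool
    ranking: bool      # ranks for at least one query (assumed definition)
    clicks: int
    conversions: int


def portfolio_summary(pages):
    """Count, per cluster, how many pages pass each portfolio question."""
    summary = defaultdict(lambda: {
        "published": 0, "indexed": 0, "ranking": 0,
        "with_traffic": 0, "converting": 0,
    })
    for p in pages:
        row = summary[p.cluster]
        row["published"] += 1
        row["indexed"] += int(p.indexed)
        row["ranking"] += int(p.ranking)
        row["with_traffic"] += int(p.clicks > 0)
        row["converting"] += int(p.conversions > 0)
    return dict(summary)


# Hypothetical pages, invented for illustration only.
pages = [
    Page("/blog/competitor-review", "competitor", True, True, 300, 4),
    Page("/blog/brand-vs-competitor", "competitor", True, True, 120, 0),
    Page("/blog/generic-advice-1", "advice", True, False, 0, 0),
    Page("/blog/generic-advice-2", "advice", False, False, 0, 0),
]

summary = portfolio_summary(pages)
for cluster, row in summary.items():
    print(cluster, row)
```

Comparing these counts across clusters makes it visible when one cluster carries the portfolio while the others barely register.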


Notably, in this case no other portfolio of pages (the blog as a whole, any smaller article cluster, service pages, informational pages, etc.) could compare with the competitor-focused portfolio. Knowing that, the client intentionally used this “competitor-takedown” content strategy (as many other SaaS companies do, by the way), but the problem was that their other portfolios could not even come close to it. This demonstrated a core thing:

The website keeps accumulating content assets while simultaneously losing portfolio quality outside the competitor-driven content cluster.

Taken to the extreme, all clusters except the competitor-focused one(s) tend not to demonstrate incremental value; therefore, adding more content inside these clusters reinforces the negative feedback loop. As described in the wlw.de case study on content portfolio optimization, this feedback loop works like this:

“Bad” signals coming from a number of pages are factored into the performance of all the pages in the segment (cluster). So, when the old pages have “sunk”, the new ones share the same fate, i.e. they fail to pick up in search performance.

From this perspective, the problem comes down to two things:


  1. Cutting down the competitor-related cluster without incurring large traffic losses
  2. Securing indexation recovery of other (own) article clusters


The former seemed easier. My recommendation was to drop the “competitor takedown” content strategy bet and stick to one page per competitor instead. The rest of the cluster should be consolidated or redirected. I don’t see this kind of content pruning as a large traffic risk.


The latter — indexation recovery of URLs from native clusters — is much trickier. 


What might Google be seeing?


Possible signals:

  • weak uniqueness
  • thin differentiation
  • low perceived usefulness
  • lack of demonstrated expertise
  • low engagement


Why this matters more in the AI era


Gen AI search systems:

  • need information gain
  • need unique observations
  • need differentiated sources


If ten pages say the same thing, the eleventh adds little value.


The action plan: stop publishing and de-commoditize


My main advice was to de-commoditize the content, i.e. to move away from generic content toward what their app can do (versus other similar tools). This would demonstrate the value of maker-specific content and should result in better indexation and more keywords per page.


While most companies track:


  • articles published
  • impressions
  • clicks


I proposed:

Portfolio Utilization Rate

Example: 270 published pages result in 65 indexed pages, which equals 24% effective utilization. This is the Portfolio Utilization Rate that a content maker needs to watch.
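The arithmetic can be sketched in a few lines. The function name `utilization_rate` is my own, and the stricter articles-only variant assumes that the 270 figure counts articles only and that the 23 tag pages are not part of it.

```python
def utilization_rate(indexed: int, published: int) -> float:
    """Share of published pages that are indexed, as a percentage."""
    if published <= 0:
        raise ValueError("published must be positive")
    return 100.0 * indexed / published


published_articles = 270   # figure from the case study
indexed_pages = 65         # indexed blog URLs, tag pages included
indexed_tag_pages = 23     # tag pages among those 65

# Headline rate, as quoted in the example above.
headline = utilization_rate(indexed_pages, published_articles)

# Stricter rate: indexed articles only (assumption: tag pages are
# not counted among the 270 published articles).
articles_only = utilization_rate(
    indexed_pages - indexed_tag_pages, published_articles
)

print(f"Headline utilization: {headline:.1f}%")
print(f"Articles-only utilization: {articles_only:.1f}%")
```

Tracking the same ratio after each pruning round shows whether consolidation is actually improving the share of the portfolio that Google keeps.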


This matters for the client's case (and in general) because of the high, imminent risk of broader deindexation. Patterns like this point in that direction; it is one of the AI strategies that backfire. My somewhat contrarian bet was to stop adding new content altogether and focus on trimming the low-performing content (redirecting, removing, or consolidating articles) as well as manually rewriting the blog titles. This is where watching the Portfolio Utilization Rate metric may be especially helpful.

About Bohdan Lytvyn

Full background and approach — bohdanlytvyn.com

Bohdan Lytvyn

"WASTELESS GROWTH" BOOK AUTHOR

17 years in SEO and growth strategy. Former Senior SEO Manager at Alibaba's European subsidiary. Worked with B2B marketplaces, SaaS platforms, eCommerce businesses, and digital-first companies across Europe.


Based in Paris. Working in English and French.


I don't run an agency that assigns you to a junior team. I'm the person who does the diagnostic, designs the strategy, and delivers the work.