I'd like to walk through what actually changes when you rewrite a published technical B2B article to get citations from Perplexity, ChatGPT Search, and Google AI Overviews — and what doesn't. When people ask me about optimizing already-published articles for AI search, the working assumption is usually the same: add schema, drop in a TL;DR, ship an llms.txt, done. That has not matched what I see when I run the same query set before and after a real structural rewrite.
The article we reworked is Meridian's own SaaS SEO Playbook. It is organized by stage, from first getting ranked on Google to reaching the top ten. In a baseline survey of 30 queries, we found zero clean citations of Meridian's own content on these topics. The article has category-label headings, statistics attributed by source name only, and inline CTAs: the standard shape of content that a citation-hungry model reads past again and again. We chose this example because tearing down someone else's article is theater. This way I either take the credibility hit or I earn the piece.
The Measurement Baseline — Before You Touch The Article §
Before making any changes, I ran the article through a 30-prompt probe across Perplexity, ChatGPT Search, and Google AI Overviews. These were prompts the article could plausibly rank for or answer, drawn from a set of questions a reader might ask. For each prompt I recorded three things: whether a Meridian URL was cited, where it appeared when it was, and what other sources showed up around it.
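The three recorded fields map naturally onto a small log structure. The sketch below is one possible shape, not the tooling used for the probe: the class name, field names, and sample rows are all hypothetical, and the rows are placeholders rather than probe data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProbeResult:
    """One prompt run against one engine. All field names are illustrative."""
    engine: str                       # e.g. "perplexity", "chatgpt_search", "ai_overviews"
    prompt: str
    cited: bool                       # was a URL on our domain cited?
    position: Optional[int] = None    # 1-based slot among cited sources, if cited
    other_sources: List[str] = field(default_factory=list)  # domains cited alongside


def citation_rate(results: List[ProbeResult]) -> float:
    """Share of probe runs in which our domain was cited (0.0 for an empty log)."""
    if not results:
        return 0.0
    return sum(r.cited for r in results) / len(results)


# Placeholder rows, not data from the actual probe.
sample_log = [
    ProbeResult("perplexity", "how do I do technical SEO for my SaaS website",
                cited=False, other_sources=["example.com"]),
    ProbeResult("chatgpt_search", "saas technical seo checklist",
                cited=True, position=3),
]
print(citation_rate(sample_log))  # 0.5
```

Keeping one row per prompt per engine makes it possible to rerun the identical set after the rewrite and compare like with like.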
Two facts anchor the measurement. First, Rebekah May's study of 500 Perplexity queries found the engine returns source links on essentially all factual answers, omitting them mainly on creative or rewriting tasks. Second, Ahrefs Brand Radar has separate fields for Mentions and for Citations, because people keep finding they were mentioned without being linked, and linked without being the source of the answer.
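The Mentions versus Citations distinction can be made concrete with a small classifier. This is a minimal sketch under stated assumptions: it is not how Ahrefs Brand Radar computes its fields, the function names are hypothetical, `meridian.example` is a placeholder domain, and a plain substring match stands in for real brand detection.

```python
from typing import List
from urllib.parse import urlparse


def cites_domain(cited_urls: List[str], domain: str) -> bool:
    """True if any cited URL is on `domain` or one of its subdomains."""
    domain = domain.lower()
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def classify_visibility(answer_text: str, cited_urls: List[str],
                        brand: str, domain: str) -> str:
    """Label one AI answer as 'both', 'mention_only', 'citation_only', or 'none'."""
    mentioned = brand.lower() in answer_text.lower()  # naive substring match
    cited = cites_domain(cited_urls, domain)
    if mentioned and cited:
        return "both"
    if mentioned:
        return "mention_only"
    if cited:
        return "citation_only"
    return "none"


# Hypothetical answer: the brand is named, but no URL on its domain is linked.
print(classify_visibility(
    "Meridian's playbook suggests fixing crawl traps first.",
    ["https://example.com/seo-guide"],
    brand="Meridian",
    domain="meridian.example",
))  # mention_only
```

Tracking the two labels separately is what lets a probe notice when brand mentions rise while linked citations stay flat.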
You will see this work labelled GEO, AEO, or SXO in practitioner language. The labels are interchangeable; the structural moves the measurement rewards are not.
Change Set 1 — The Lede Carries The Answer §
The biggest change I made was rewriting the introduction. The original opened with 140 words of scene-setting about why technical SEO is hard for SaaS websites. It did little to keep readers going, because it was essentially talking to itself. The new introduction answers the buyer's primary question ("How do I do technical SEO for my SaaS website?") in the buyer's own words. It does not spend a sentence promising that the article will help.
The retrieval logic is straightforward. The AI answer engine reads the first paragraph, and if it finds a quotable sentence that can set up the answer, that sentence becomes the opening of the answer. Google's documentation for generative AI features in Search says these features are built on Search's ranking systems, so the same principles that make a featured snippet work should also work for citations. The citation surface reads the first paragraph as the article's claim about itself.
The lede is the highest-leverage change on the page. An AI answer engine reads the first paragraph as the article's claim to being a source. If the first quotable sentence isn't the buyer's answer in their own words, no schema, TL;DR, or word-count increase downstream will recover that citation.
The writing cost was three paragraphs and a subhead. The payoff on the post-rewrite probe was the largest single delta I saw.
Change Set 2 — Entity Coverage, Not Keyword Coverage §
The before-state referred to a generic category, "AI search tools." The after-state names the members of that category: Perplexity, ChatGPT Search, Google AI Overviews, and Gemini. Each name carries a one-clause qualifier that tells you what kind of surface it is: retrieval-first, chat-first, integrated with the SERP, and embedded in Google Workspace, respectively. Ethan Smith's guide to AI search and discovery at Reforge does a nice job of simplifying the underlying mechanics. The short version: AI search splits the work between surfacing the answer and crediting the source, and named-entity co-occurrence is how a source's authority gets determined.
There is a difference between naming an entity and linking to it. The former tells a reader something about the topic; the latter just points to something else. A link is a navigation instruction; a name-with-qualifier is a topical signal that the article is a source for that field. My rewrite rule was: every vendor, standard, or product a technical B2B reader may need to compare gets a name and a one-line qualifier in body prose. Links go on source attributions, not on entity mentions.
The rewrite added a dozen named entities. None of them replaced ideas. They replaced generic phrases.
Change Set 3 — The Citation Surface §
The citation surface is the set of sentences an AI answer engine can pull verbatim with attribution. These are the claim sentences, not the commentary around them. The before-state had claims attributed only by source name ("According to Semrush…"), definitional sentences buried in paragraphs, and numbered findings paraphrased rather than quoted.
The after-state pattern is small and repeatable. Claim sentences state the finding, the entity, and the number in one clause. Definitional sentences open a paragraph with the definition. Numbered findings sit in their own paragraph with the citation link in the same sentence. Semrush's 2025 AI-search vs SEO traffic study found that pages cited by ChatGPT rank in traditional organic positions 21 and below on related queries almost 90% of the time. The rewrite quotes the finding, not the study, because that is what a retriever lifts: "AI-cited pages rank 21+ on Google 90% of the time." The practical consequence is that a page already deep in the SERP can still be a citation-surface candidate.
The cost is that a few paragraphs read more clipped. The upside is a stock of retrievable sentences that read like sources instead of hedged prose.
Change Set 4 — Retrievable Fragmentation §
Google's Search Central post from May 21, 2025 and its developer guide make clear that chunking content is not necessary for Google: it can handle multiple topics on one page. Google has debunked that myth, so the reason to chunk is not Google.
The reason for the fragmentation is that a retriever, faced with a choice between two paragraphs about the same topic, will pick the shorter one. The before-state had "one H2 per 600 words," and the after-state has "one H3 per discrete sub-question," "one lead clause per paragraph," and "list items are self-contained, not continuing the sentence above."
That is the pattern behind a figure reported by Rebekah May (Marketing Aid): 91% of articles cited on YMYL queries contained lists. The citation surface does not care much about "long-form, authoritative sources"; it cares about "sourceable content that happens to be part of a source-worthy article," even if that article is a framework without lists. The retriever rewards self-contained, sourceable fragments over sheer length.
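A rough way to spot under-fragmented sections in a draft is to count words under each top-level section heading. The sketch below assumes the draft is stored as Markdown with `## ` marking H2 headings, which is an assumption about your tooling and not something the article's CMS is known to use. The 600-word default simply echoes the "one H2 per 600 words" before-state and is not a threshold any retriever documents.

```python
from typing import Dict, List


def words_per_section(markdown_text: str, heading_prefix: str = "## ") -> Dict[str, int]:
    """Count body words under each heading that starts with `heading_prefix`.

    Lines for other heading levels are skipped rather than counted as body text.
    Sections that share a title are merged under the last occurrence's counter.
    """
    sections: Dict[str, int] = {}
    current = None
    for line in markdown_text.splitlines():
        if line.startswith(heading_prefix):
            current = line[len(heading_prefix):].strip()
            sections[current] = 0
        elif line.startswith("#"):
            continue  # a heading at another level, e.g. "### "
        elif current is not None:
            sections[current] += len(line.split())
    return sections


def oversized_sections(markdown_text: str, max_words: int = 600) -> List[str]:
    """Titles of sections whose body exceeds `max_words` (an illustrative cutoff)."""
    counts = words_per_section(markdown_text)
    return [title for title, count in counts.items() if count > max_words]


# Synthetic draft: one long section with a sub-question, one short section.
draft = (
    "## Crawling\n" + "word " * 700 + "\n"
    "### Sub-question\nshort answer\n"
    "## Indexing\nbrief\n"
)
print(oversized_sections(draft))  # ['Crawling']
```

A flagged section is only a prompt to look for discrete sub-questions that deserve their own H3; length alone does not mean the section needs splitting.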
Change Set 5 — Evidence Anchoring §
The before-state stacked its links at the end of each paragraph; the after-state names the source in a sentence, states the finding, and puts the link on the source name. A citation-hungry retriever needs all three present for the quote to reach the answer engine: who said it, what they said, and where you can find it.
The same is true for the Search API. If you look at Perplexity's Search API quickstart, you'll see the unit of retrieval is a URL with a title, snippet, and date. The API supports filtering by domain (e.g., *.perplexity.ai) and path (e.g., /my-hub). So the specific hub URL on your site — not your homepage — is the object being retrieved. A restructured article gains nothing from footnote-style link stacks. It gains a lot from source-named, in-sentence attribution that mirrors what the API indexes.
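To see why the hub URL rather than the homepage is the retrieved object, it helps to model a result record locally. The sketch below does not call Perplexity's API and makes no claim about its parameter names. `SearchHit` is a hypothetical record with the four fields described above, the filter is plain client-side Python, and every URL is a placeholder.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class SearchHit:
    """Hypothetical local model of one retrieval unit: URL, title, snippet, date."""
    title: str
    url: str
    snippet: str
    date: str  # kept as a plain string; no date format is assumed


def filter_hits(hits: List[SearchHit], domain: str, path_prefix: str = "/") -> List[SearchHit]:
    """Keep hits on `domain` (or a subdomain) whose URL path starts with `path_prefix`."""
    domain = domain.lower()
    kept = []
    for hit in hits:
        parsed = urlparse(hit.url)
        host = (parsed.hostname or "").lower()
        on_domain = host == domain or host.endswith("." + domain)
        if on_domain and parsed.path.startswith(path_prefix):
            kept.append(hit)
    return kept


# Placeholder records; none of these come from a real API response.
hits = [
    SearchHit("Meridian", "https://meridian.example/", "Homepage.", "2025-01-01"),
    SearchHit("SaaS SEO Playbook", "https://meridian.example/my-hub/saas-seo-playbook",
              "Stage-by-stage technical SEO for SaaS.", "2025-01-01"),
    SearchHit("Other guide", "https://example.com/my-hub/guide", "Unrelated.", "2025-01-01"),
]
print([h.title for h in filter_hits(hits, "meridian.example", "/my-hub")])
# ['SaaS SEO Playbook']
```

The homepage shares the domain but fails the path check, which is the point: the snippet that gets quoted belongs to a specific URL.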
I applied this rule to every statistic, benchmark, and date in the rewrite. The total number of links in the article went down. The link density of sentences that referred to specific claims went up.
The After — What The Same Measurement Set Showed §
Rerunning the same 30-query probe showed a clear pattern: prompts that match the article's primary buyer queries, where the article should be a valid source, moved from not cited to cited in Perplexity. Long-tail definition prompts moved occasionally, and competitor-branded or comparison queries did not move. I had not expected them to.
The one change I was not expecting was the split between Mentions and Citations. Meridian's name started appearing more frequently in AI Overviews, even when there was no corresponding article URL for the mention. This is the classic warning sign an AI Visibility Audit is built to catch: a restructure that improves entity coverage without improving citation-container coverage moves the Share of Voice number but doesn't touch the traffic part of the promise.
The rest of the probe was noise. One question returned a false-positive citation of an unrelated Meridian URL — a useful reminder that any single probe is directional, not a benchmark.
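A before/after comparison of this kind reduces to labelling each prompt as gained, lost, or unchanged and grouping by prompt category. The following is a minimal sketch: the function name, category names, and prompt keys are illustrative, and the sample values are placeholders, not the probe's results.

```python
from collections import Counter, defaultdict
from typing import Dict


def citation_deltas(before: Dict[str, bool], after: Dict[str, bool],
                    categories: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Count gained / lost / unchanged citations per prompt category.

    `before` and `after` map prompt -> cited?; a prompt missing from `after`
    is treated as not cited.
    """
    deltas = defaultdict(Counter)
    for prompt, was_cited in before.items():
        now_cited = after.get(prompt, False)
        if was_cited == now_cited:
            label = "unchanged"
        elif now_cited:
            label = "gained"
        else:
            label = "lost"
        deltas[categories.get(prompt, "uncategorized")][label] += 1
    return {category: dict(counts) for category, counts in deltas.items()}


# Placeholder data for illustration only.
before = {"q1": False, "q2": False, "q3": False}
after = {"q1": True, "q2": False, "q3": False}
categories = {"q1": "primary_buyer", "q2": "long_tail_definition", "q3": "comparison"}
print(citation_deltas(before, after, categories))
# {'primary_buyer': {'gained': 1}, 'long_tail_definition': {'unchanged': 1}, 'comparison': {'unchanged': 1}}
```

With only 30 prompts, a single flip can swing a category, so the counts are better read as direction than as a rate.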
The Before/After, Side By Side §
The load-bearing changes read like a short list of contrasts.
Opening. Before: a scene-setting paragraph about how SaaS companies fail at technical SEO, followed by a promise that the article will help. After: a two-sentence direct answer to the buyer query, with the primary trade-off named in the first clause and no meta-commentary about the article.
Headings. Before: "Phase 1 — Technical Foundation," "Phase 2 — Content Architecture," category labels announcing what section the reader is in. After: reader-question phrasings a search assistant would recognize as a match to a real prompt, with the H3 layer answering discrete sub-questions.
Evidence. Before: "Semrush reports growth in AI-search traffic." After: Semrush's 2025 study projects that AI-search visitor traffic will overtake traditional search traffic on digital-marketing topics by early 2028, and found AI-search visitors 4.4x more likely to convert on the sites it sampled. The source is named, the number is stated, and the link sits on the source name.
Entities. Before: "AI answer tools." After: "Perplexity," "ChatGPT Search," "Google AI Overviews," and "Gemini," each with a one-clause qualifier explaining what it does.
That side-by-side is the load-bearing artifact. Everything else is footnotes.
What Did Not Move The Citation Surface §
Five interventions I tested didn't work.
Adding FAQPage schema had no measurable impact on any of the probe numbers. According to Google, no specific structured-data format, and no particular schema.org variation, is required for Google's generative AI features in Search.
Adding an llms.txt file had no effect. Google's developer guide is clear on this: you don't need to create llms.txt or any other AI text file, because Google Search doesn't use them. Perplexity's public documentation doesn't mention llms.txt as a ranking signal either.
Making three passages more "conversational" mostly backfired. In one case the paragraph got worse, with the more-conversational transition being the first thing the retriever ignored as low signal.
Adding a TL;DR block at the very top moved nothing the lede rewrite had not already moved. Once the opening 120 words carry the answer, a redundant TL;DR competes for the same retrieval slot with less specificity.
Increasing article word count by roughly 30% didn't change much. Rebekah May's study of 500 queries found the average cited article ran ~1,000 words on YMYL queries and ~1,500 words on others: long enough to be source-worthy, but not cited for length alone.
The negative results are not surprising in themselves; what matters is that they were tested against the same probe as the positive ones.
The Operator Cost — When This Is Worth Doing §
The rewrite took Meridian's team about a day of editorial work plus a half-day of QA against the probe. That's Meridian's own delivery data, not an industry benchmark, and it assumes the writer already knows the topic well enough to change entity choices and citation containers without turning the article into a different piece. Whether that cost is worth paying comes down to two questions.
The first is whether the article has real blue-link intent for that query: does it already appear in Google's results for the keyword, even deep in them? (Semrush's finding that ChatGPT-cited pages sit at organic position 21 or lower almost 90% of the time is why I check presence rather than rank; a bottom-funnel article deep in the SERP is a citation-surface candidate even if it is nowhere near page one.) The second is whether the article can defend itself as a named entity or canonical URL, or whether it is just a summary that gets out-sourced by the big players.
If both answers are "yes," restructure. If either is "no," deprioritize. TOFU volume, where AI-search share is both low and unlikely to convert, is not where the work should go. Semrush's finding that AI-search visitors are 4.4x more likely to convert is downstream validation of that: the citation-surface win is a positioning bet on high-intent traffic, not on volume.
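The two-question gate is simple enough to apply across a whole content inventory. The sketch below is one way to encode it; the class, field names, and URLs are hypothetical, and both answers are human judgments entered by an editor, not values the code can derive.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ArticleCheck:
    """Editor-supplied answers to the two questions for one article."""
    url: str
    has_blue_link_intent: bool   # question 1, answered by a person
    defensible_as_source: bool   # question 2, answered by a person


def triage(articles: List[ArticleCheck]) -> Dict[str, List[str]]:
    """Restructure only when both answers are yes; otherwise deprioritize."""
    plan: Dict[str, List[str]] = {"restructure": [], "deprioritize": []}
    for article in articles:
        both_yes = article.has_blue_link_intent and article.defensible_as_source
        plan["restructure" if both_yes else "deprioritize"].append(article.url)
    return plan


# Placeholder inventory for illustration.
inventory = [
    ArticleCheck("https://meridian.example/playbook", True, True),
    ArticleCheck("https://meridian.example/what-is-seo", True, False),
    ArticleCheck("https://meridian.example/news", False, False),
]
print(triage(inventory))
# {'restructure': ['https://meridian.example/playbook'], 'deprioritize': ['https://meridian.example/what-is-seo', 'https://meridian.example/news']}
```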
Also new is the class of third-party AI-search measurement services — Profound, Otterly, and Athena. I haven't yet run side-by-side comparisons across them, so no benchmark there.
How To Read Your Own Article For This §
Twenty minutes with one of your own articles will tell you a lot about whether a restructure is the right move. Five questions to ask:
Do the first 120 words answer the buyer's main question in the buyer's own words, or do they set the scene? If the first quotable sentence isn't the answer, you've missed the highest-leverage change.
Are the vendors, standards, and products you might compare in a buying decision present as names in prose, or hidden in links? Named entities are what a retriever looks for when it seeks topical authority.
Are stats and third-party findings attributed in-sentence, with the source named and the link on the source name, so the claim can be pulled out cleanly? That's what the citation surface is for.
Is each distinct claim in its own paragraph with its own heading, or are paragraphs juggling multiple ideas? Retrievers seek out self-contained fragments.
Is there a single canonical URL on your domain this article should be the source for? Or is there a "main article" that covers the subject more completely? If the main article is already the source, no restructure will help.
If you get three yeses, restructure — the work pays for itself. If your team is talking about optimizing for AI search and you don't know whether a given article passes these five checks, that's the call to make first. Meridian runs restructures like this one as part of the managed content engine; if you have an article you think already sits on real citation-surface intent, reach out.
FAQ §
Where should the primary answer to the article's question live — the opening 80 words, a TL;DR block, or distributed through the body?
Put it in the opening. A retriever reads the first block as the article's claim to being a source, and a TL;DR competes with the lede for the same slot. Distributed answers get lifted less often than concentrated ones, because retrievers care about self-containedness.
How long should a technical B2B article be when the goal is AI-search visibility rather than ranking position?
Long enough to be source-worthy, without padding the piece for length. In Marketing Aid's study, the average cited article ran ~1,000 words on YMYL queries and ~1,500 words on others. Word count is a floor for source credibility, not a lever to crank for more citations: in our probe, making the article roughly 30% longer did not change much.
Do you need original data or proprietary research to be cited by Perplexity and AI Overviews, or are well-structured second-order syntheses enough?
Syntheses can be enough. Marketing Aid's study found Perplexity's sources were a mix of blogs, business sources, and authorities of every size and type. An article with named entities, clean citation containers, and retrievable fragments can serve as a source for other sources; it just needs a non-commodity thesis to be a source for something.
What entities — products, vendors, standards, people — should a technical B2B article name to be useful for retrieval? And how is "name" different from "link to"?
Name every vendor, product, standard, and person the reader might use to compare during a buying decision, each with a one-clause qualifier. A link takes the reader to more information; a named-and-qualified entity preserves the topical signal of the article, which is what the AI answer engine's retriever uses to decide whether the page is a source or a waypoint.