The 10-gate AI search pipeline: Find where your content fails

The AI engine pipeline has 10 gates between your content and a recommendation:
- Discovered.
- Selected.
- Crawled.
- Rendered.
- Indexed.
- Annotated.
- Recruited.
- Grounded.
- Displayed.
- Won.
Confidence at each gate multiplies, which means your worst gate sets your ceiling, and a single near-zero anywhere in the chain drags the whole result down with it.
That dynamic leads to a simple rule, the “Straight C” principle: in any multiplicative system, the weakest stage sets the ceiling for the whole, and the highest-leverage fix is always the near-zero, not the near-perfect.
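The multiplicative model can be sketched in a few lines. This is a minimal illustration, not a measurement: the gate names come from the pipeline above, but every score is a made-up grade.

```python
# Each gate contributes a confidence score in [0, 1]; the pipeline's
# overall confidence is the product of all ten, so the weakest gate
# dominates the outcome.
from math import prod

# Illustrative grades: mostly solid gates with one near-zero "F".
gates = {
    "discovered": 0.9, "selected": 0.8, "crawled": 0.9, "rendered": 0.1,
    "indexed": 0.8, "annotated": 0.7, "recruited": 0.8, "grounded": 0.7,
    "displayed": 0.8, "won": 0.7,
}

overall = prod(gates.values())          # ~0.011: the F drags everything down
weakest = min(gates, key=gates.get)     # "rendered"
print(f"overall: {overall:.4f}, weakest gate: {weakest}")

# Fixing the F (0.1 -> 0.7) multiplies the whole result by 7x;
# polishing an A (0.9 -> 0.95) barely moves it.
```

Note that the product can never exceed the lowest single score, which is the “weakest stage sets the ceiling” claim in arithmetic form.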
Brent D. Payne nailed it in Sydney in 2019: “better to be a straight C student than three As and an F.” Gary Illyes had been sketching out Google’s multiplicative ranking model, and I scribbled the lot from memory on split beer mats while everyone else went to the bar for another round. The principle stuck with me even though the beer mats didn’t.
Applied to the 10-gate pipeline, the principle makes the work order obvious: find your F grades, fix them first, then find your D grades, and only then worry about pushing your other gates from C to B to A. Below, I’ll walk you through how to identify the weak gates and prioritize them by scope.
The pipeline runs in two phases with different logic
Phase 1 (discovered through indexed) is infrastructure- and bot-centric. It’s mostly pass or fail: either the system has your content, or it doesn’t. The fixes are technical and well-documented: sitemaps, structured data, rendering, and quality signals.
Phase 2 (annotated through won) is competitive and algorithm-centric. Your content is measured against every alternative the system has for the user’s needs.
Passing all five gates in Phase 1 means the system has your content in stock. Winning Phase 2 end to end means the system chooses you over your competition.
Each stall pattern points to its fix
Fix what’s weak. In DSCRI (the infrastructure gates: discovered, selected, crawled, rendered, indexed), the fixes are mechanical, and success is relatively easy to measure.
In ARGDW (the competitive gates: annotated, recruited, grounded, displayed, won), the fixes are less obvious and more indirect, and the cause-and-effect relationship is harder to demonstrate. That’s why so many brands and practitioners focus too much on mechanical fixes and not enough on competitive ones.
Each of the 10 gates is a place where the pipeline can stall. The suggestions below are not exhaustive; use the strategies you already know, too.
| No. | Gate name | Stall | First-party (Entity Home Website) | Second-party (semi-controlled) | Third-party (independent) |
| --- | --- | --- | --- | --- | --- |
| 1 | Discovered | Bots never find the content | Sitemaps, IndexNow, internal linking, and inbound links | Link from your Entity Home Website with clear anchor text | Outbound links from owned properties and second-party content |
| 2 | Selected | Found but ignored | Internal links, inbound links, anchor text, content around links, and Publisher and Author N-E-E-A-T-T | Anchor text, content around the link, and link back to your Entity Home for context | Outbound links from owned properties and second-party content, anchor text, and content around the link |
| 3 | Crawled | Retrieval fails | Server performance, redirect chains, pruning, and canonicals | Choose reliable platforms; keep URLs clean and stable | Prioritize coverage on sites with strong crawl reputation |
| 4 | Rendered | Retrieved, but the system can’t process it | Server-side rendering, reduce external resources, and JavaScript discipline | Use platform-native formatting; avoid embeds that block render | Prioritize coverage on properly rendered sites |
| 5 | Indexed | Rendered, but not stored | Site structure, content quality, pruning, and canonicalization | Content quality and original perspectives | Prioritize coverage on fully indexed sites |
| 6 | Annotated | Inaccurate, low-confidence annotations | HTML5, structured data, schema markup, site structure, content quality, and unambiguous entity signals | Unambiguous entity signals, and link to your Entity Home for disambiguation | Outreach to clarify entity references, clear anchor text from your owned properties and second-party content |
| 7 | Recruited | Missing from one or more layers of the Algorithmic Trinity | Provide what each layer wants: recency, originality, clarity, information gaps, helpful framing, etc. | Fresh perspectives, original content, and regular updates | Outreach for coverage and updates from news, trade, and industry sites |
| 8 | Grounded | Not selected as a reference for the topic (not Top of Algorithmic Mind) | Entity identity optimization, Publisher and Author N-E-E-A-T-T, and explicitly connect claims to proof | Consistency of identity, credibility signals, and link claims to proof | Outreach for citations from authoritative sources, and build N-E-E-A-T-T through coverage |
| 9 | Displayed | Not chosen as part of relevant answers in the funnel | Close the Framing Gap at each UCD layer, improve brand N-E-E-A-T-T | Frame content to match each UCD layer | Outreach for coverage that closes the Framing Gap, improve N-E-E-A-T-T through external corroboration |
| 10 | Won | The page was the recommendation, but didn’t get the click, the citation, or the action | Write copy, titles, and descriptions that are easy for the algorithm to extract intact; frame claims so the algorithm can respect the brand narrative without rewriting it; educate the algorithm on the brand narrative so it doesn’t distort it | Use platform fields the algorithm will lift verbatim (titles, summaries, intros), and keep brand narrative consistent across every property | Brief publishers and partners on your brand narrative so coverage frames claims the way you’d frame them yourself, and correct distorted coverage at source |
Reading the table: Across the rows, infrastructure fixes (Gates 1 to 5) are specific, technical, and often binary, while competitive fixes (Gates 6 to 9) point at larger bodies of work (graph presence, proof connection, and framing gap closure) that are strategic rather than technical.
Down the columns, your direct leverage drops as ownership drops:
- On first-party, you can fix anything.
- On second-party, you control content but not infrastructure.
- On third-party, your only real moves are outreach and the links you point at the property.
The further into the pipeline the stall sits, and the further from the entity home website it sits, the more the fix becomes about positioning rather than engineering.
You can buy your way through DSCRI. You have to earn your way through ARGD. Won is its own case. By the time the algorithm reaches won, it has either understood your brand narrative or it hasn’t.
If it has, it respects your titles, your descriptions, and your framing, and the click or citation lands the way you wanted. If it hasn’t fully understood you, it rewrites you, and the rewrite won’t carry your framing. Even if your copywriting is top-notch, that rewrite will lose you clients you should have won.
Educating the algorithm on the brand narrative is the work that decides which of those two outcomes you get, and the work happens across your digital footprint, over time (ongoing), and at every gate.
Work outside-in, because most of what you need already exists
The pipeline runs at three scopes simultaneously: per item, sitewide, and web-wide. Every gate operates at all three. You can’t work on all three at once, which means the order you pick is the single biggest decision in the project, and most brands pick the wrong one because they’re watching their competitors instead of the structure.
Here’s a simple fact most brands miss: most of what you need is already in place.
- You already have claims (you own a website, you’ve published positioning, you’ve explained who you are and what you do).
- You already have proof (clients have written testimonials, journalists have covered you, partners have referenced you, conferences have programmed you).
The two layers exist, they’re just not connected. Joining the dots between existing claims and existing proof is the biggest single piece of leverage available to almost any brand.
Almost nobody is doing it systematically because they’re too busy creating new content from scratch. When I say “join the dots,” that means both bi-directional linking and framing (which I covered in “The framing gap: Why AI can’t position your brand”).
That insight reorders the work. The right sequence is outside-in, and it lines up with claim, prove, and frame at the scope level.
Sitewide first
Get your claims structurally consistent at scale. Templates make it easy for bots to digest your site only if they’re consistent. Get the templates right, and the content taken as a whole reads clearly.
Make sure the categorization is logical, the schema is uniform, the internal linking pattern is predictable, and the HTML5 is built to help bots perform chunking that produces high-confidence, well-bounded representations of every part of every page.
Get the templates wrong, and the algorithms annotate everything with low confidence because the chunking was bad, the categorization was illogical, and the structural signals contradicted each other. That’s a sitewide weakness that the content carries through. This is cascading confidence at scope level.
Content is the input, context is what the templates supply, and confidence is what the system produces when context is consistent enough to make sense of the content. Start at the site level because that’s where the cascade either begins clean or collapses before it starts.
Dig deeper: The funnel flip: Why AI forces a bottom-up acquisition strategy
Web-wide second
Connect the dots to the existing proof. Once your owned property is making consistent, machine-legible claims, the second- and third-party footprint is where those claims get corroborated.
The work here is mostly auditing, not creating: independent journalists who’ve already covered you, client testimonials sitting on client domains, conference programs that name you, partner mentions, and third-party reviews that already exist.
This is the prove layer, and the leverage is enormous because your competitors are mostly not doing it. They’re watching each other’s websites while the independent layer that actually decides who AI recommends sits unattended on the open web. So, update what you can, and insert bi-directional links strategically to “connect the dots physically.”
Per item last
Frame the connection between claim and proof. Once sitewide claims are clean and web-wide proof is surfaced, it’s time to bring it all together in individual items.
Per-item work builds the relational bridge between specific claims and the evidence. It’s up to you to provide the interpretive frame that tells the algorithms how to read the connection and closes the framing gap one page at a time.
Framing only earns its full return once the two layers underneath are solid, because the frame is the connection between things that already exist, and there’s nothing to connect if the claim is incoherent or the proof hasn’t been surfaced.

Fix the earliest broken gate first, or the fix downstream does nothing
The pipeline is sequential. Each gate’s output is the next gate’s input.
First job: get content flowing through every gate without an absolute fail at any point. If discovery is broken, improving your annotation does nothing because your content never reaches annotation.
The rule is simple: find your earliest failing gate, fix it, then re-measure everything downstream on the improved signal. Fixing gates out of order wastes budget because the bottleneck hasn’t moved. I filed a patent for the technical implementation of this principle, but the principle itself doesn’t need the patent — it’s how any sequential system works.
Once nothing is absolutely failing, start fixing the weakest gates one by one, from weakest to strongest, to maximize the effect of each fix on the signal that flows through everything downstream.
If rendering drops 50% of your useful content, every downstream gate inherits the damage, no matter how strong your competitive positioning is. Push that up to 100%, and you’ve doubled the signal for everything that follows.
Below are potential stalls at each gate (single page) with examples of fixes.
| No. | Stall | Problem | Possible fix |
| --- | --- | --- | --- |
| 1 | Not Discovered | Orphaned article about your brand on Poodle Parlours in Paris Monthly | Create a dedicated page on poodleparlour.paris with a TL;DR of the article (use the opportunity to close the Framing Gap), add the publication name, author, date, and an outbound link to the article |
| 2 | Not Selected | The 600th episode of your podcast on your website is ignored by bots despite a link from the pagination | Link to it from the homepage, make the anchor text explicit (not “listen here”), and add the link to the YouTube version description |
| 3 | Not Crawled | Page load time is slow at peak times | Upgrade hosting and use a CDN |
| 4 | Not Rendered | Schema isn’t being ingested by the LLM bots | Move schema inline, or, if that isn’t possible, add the same data to an HTML table on the page |
| 5 | Not Indexed | Rendered, but not stored | Site structure, content quality, HTML5, and schema markup |
| 6 | Badly Annotated | Inaccurate, low-confidence annotations | HTML5, structured data, schema markup, site structure, content quality, and unambiguous entity signals |
| 7 | Not Recruited | Missing from one or more layers of the Algorithmic Trinity | Provide what each layer wants: recency, originality, clarity, information gaps, helpful framing, etc. |
| 8 | Not Grounded | Not selected as a reference for the topics (not Top of Algorithmic Mind) | Entity identity optimization, Publisher and Author N-E-E-A-T-T, and explicitly connect claims to proof |
| 9 | Not Displayed | Not chosen as part of relevant answers in the funnel | Close the Framing Gap at each funnel layer (Understandability, Credibility, Deliverability), and improve brand N-E-E-A-T-T |
| 10 | Not Won | The page was the recommendation, but the algorithm rewrote your title and description | Improve brand Understandability of the brand narrative and framing, tighten the title, description, and intro so the algorithm extracts your version intact rather than rewriting it; these remain the most visible elements at the zero-sum moment in AI |
Reading the table: gate-by-gate example issues at item level. I provide some suggested solutions for each. You’ll see that many of the fixes are actions you’d take at sitewide or web-wide scope, which is the point.
Scope determines whether the fix touches one URL or thousands, but the underlying mechanism at each gate is identical. Per-item work is where the fixes get specific, but the patterns repeat.
The authoritative entity advantage compounds across the competitive gates
One strategy will improve your grade at almost every gate in the AI engine pipeline: entity optimization.
When your brand entity is fuzzy across the three graphs (document, concept, and entity), actively optimizing the entity identity improves clarity, focus, and confidence at almost every gate.
But the advantage you’ll gain isn’t uniform: at the infrastructure gates it does little, but from annotation onward, it will make a huge competitive difference.
Here’s the authoritative entity advantage at each pipeline gate.
| No. | Stall | The authoritative entity advantage |
| --- | --- | --- |
| 1 | Not discovered | Marginal. A recognized entity in an outbound link from a third party is slightly easier to identify and trace, but discovery itself is infrastructure-driven. |
| 2 | Not selected | Significant. A recognized, trusted entity in anchor text (or near the link) increases the probability of selection. |
| 3 | Not crawled | None. Crawling is purely server, redirect, and rate-limit mechanics. |
| 4 | Not rendered | None. Rendering is purely technical processing. |
| 5 | Not indexed | Moderate. Entity clarity helps the system make canonicalization and deduplication calls with confidence; fuzzy entities produce fuzzy storage decisions. |
| 6 | Badly annotated | Major. Entity confidence is the foundation of accurate annotation. A fuzzy entity produces low-confidence, often inaccurate annotations across every dimension. A clear entity produces clean, high-confidence annotations. |
| 7 | Not recruited | Major. Recruitment into the entity graph, document graph, and concept graph is entity-driven. Clear entities get recruited — fuzzy ones get passed over for clearer alternatives. |
| 8 | Not grounded | Major. Top of algorithmic mind is entity-driven: topical ownership, N-E-E-A-T-T, knowledge graph presence, and more. The system grounds in references it trusts. |
| 9 | Not displayed | Significant. Entity recognition reduces hedging at display. The system speaks confidently about entities it understands well and hedges on the ones it doesn’t. |
| 10 | Not won | Major. Entity confidence decides whether the algorithm respects your brand narrative or rewrites it. High confidence means titles, descriptions, and framings get extracted intact. Low confidence means the algorithm fills in the gaps from training data, and that won’t be the narrative you carefully crafted. |
Reading the table: entity advantage is zero or marginal at most of Gates 1 to 5 (infrastructure), with selection and indexing as the exceptions, then carries the heaviest load through Gates 6 to 9 (the competitive phase). At won, it’s the mechanism that decides whether the algorithm respects your brand narrative or rewrites it.
This is the most underrated insight in the whole diagnostic. Optimizing any single gate gives you one gate’s worth of improvement. Optimizing the entity gives you compounding improvement across all five gates from annotated through won, which is why entity-led optimization outperforms page-led or keyword-led optimization in AI search.
The authoritative entity advantage names that compounding effect, and it’s the structural reason brands whose entities remain fuzzy pay a confidence tax at every competitive gate.
Before you create anything new, audit what you already have
Once you know which gate is failing, the first question to ask yourself isn’t “what do I need to create?” It’s “what do I already have that would fix this?”
The content on your website already makes most of the claims you need, but those claims are rarely presented clearly and consistently. And every brand has more existing proof than it’s fully leveraging.
Look at things like conference programs, client case studies, trade publications, podcasts, social media, reviews, and third-party mentions. There might be a lot that you have never explicitly connected back to your brand.
Audit-first beats create-first on every metric that matters. Audit-first is cheap and fast. Create-first is expensive and slow.
The diagnostic tells you which gate needs the work, the audit tells you what you already own that could do the work, and the audit also tells you where the genuine gaps are, so when you do create something new, you’re filling a gap the diagnostic identified rather than guessing.
That principle drives the temporal triad: ROPI, ROI, ROFI.
The temporal triad turns the diagnostic into a working plan: ROPI, ROI, and ROFI
- Return on past investment (ROPI) is the audit-first work itself: linking existing claims on your website to existing proof scattered across your digital footprint so the assets you’ve already paid for start paying you back. It’s the cheapest, fastest, and almost always the highest-leverage move available, because the asset has already been built and you’re paying only for the connection.
- Return on investment (ROI) is the present-tense work: expanding on content that’s already live, filling the gaps the audit reveals, and creating new pieces in the short term to support what you’re doing today. This is the layer most brands jump to first, and it’s the most expensive of the three when run in isolation, because new creation without ROPI underneath means you’re paying full price to build assets that are already partially in place.
- Return on future investment (ROFI) is the planning layer, and it’s where brand strategy and pipeline strategy converge. If you have a clear sense of where the business is going (which categories you’ll own in three years, which positioning you’ll claim, which framings you’ll need supporting evidence for), you can plant seeds today that won’t serve you this quarter but will be load-bearing in 12 or 24 months.
At my company, we plant seeds constantly: claims and framings published now that aren’t doing visible work today but will be the corroborated proof we’ll need when the next phase of our long-term strategy rolls out. The brand that runs ROFI consistently is shaping the frame against which competitors will be measured in the future.
Because you’re educating and training the algorithms, ROFI actually shapes, in your favor, the criteria by which the market will judge you.
Three time horizons for your content (wherever it lives online): ROPI extracts value from what you’ve already built, ROI improves the present, and ROFI engineers the future.
The same diagnostic works across every AI engine
The 10 gates describe what search engines, assistive engines, and assistive agents actually do, in order, every time they decide whether to recommend you.
Crawl, index, rank was the right model for a 1998 search engine. It hasn’t been the right model for a long time. The brands that are still optimizing for three steps when the systems run on 10 are optimizing for a model that the engines don’t use.
This isn’t my framework. It’s the engines’ framework.
The engines don’t care what you find easy to measure, fun to do, or impressive at the next conference. They care whether your content survives all 10 gates with high confidence at each, and they reward the brands that build for the gates with citations, recommendations, and the actions that follow.
So treat the pipeline as a system and run it like one. Fix your F grades first and your D grades next. Work outside-in, because that’s where the leverage already lives, and watch the rest compound on top of work you’ve barely had to pay for.
Follow the system, and AI search pays you back, year on year, engine after engine, long past the lifespan of any acronym fashion.