In November 2024, with SE Ranking's research team, we began a 16-month experiment to test how AI-generated content performs in organic search. We launched 20 websites across different niches and tracked their performance over time.
But we didn't stop there.
We wanted to look beyond rankings and understand how AI systems discover, interpret, and cite information. So we expanded the project into a more ambitious set of experiments on AI search and LLM visibility.
For the next phase, we created a new fictional brand in a real niche with real competition to see how quickly AI systems would pick it up and whether it could be cited alongside or above trusted industry leaders and government sources.
After the first month, several patterns became clear.
Methodology behind the experiment
We created a fictional brand and published content about it across:
- A brand-new website representing the brand, registered specifically for the experiment.
- 11 additional domains, all over a year old, with prior history and existing rankings.
Across these sites, we tested seven content formats:
- Deep guides.
- "Alternatives" listicles.
- "Best of" listicles.
- Review articles.
- Comparison ("vs") pages.
- How-to/tutorial content.
- Clickbait-style articles.
We started publishing in March 2026 and tracked how five AI systems responded: ChatGPT, Google's AI Overviews, Google's AI Mode, Perplexity, and Gemini.
In total, we tracked 825 prompts across different query types and scenarios, which generated 15,835 AI answers during the first month.
For each prompt, we looked at three things (see the sketch after this list):
- Whether our brand (or one of our sites) appeared in the AI answer.
- Whether it was cited as a source.
- How often it appeared as the main cited source (position 1).
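To make those three checks concrete, here is a minimal Python sketch of how each captured AI answer could be reduced to these metrics. The data structure, domain names, and function names are illustrative assumptions, not our actual tracking pipeline:

```python
from dataclasses import dataclass

# Hypothetical record of one AI answer captured for one prompt.
@dataclass
class AIAnswer:
    engine: str            # e.g., "Perplexity" or "Gemini"
    mentions_brand: bool   # is the brand named anywhere in the answer text?
    cited_urls: list[str]  # cited sources, in the order the engine shows them

# Assumed set of domains belonging to the experiment (illustrative).
BRAND_DOMAINS = {"fictionalbrand.example", "supporting-site-1.example"}

def domain(url: str) -> str:
    # Naive domain extraction; good enough for a sketch.
    return url.split("//")[-1].split("/")[0].removeprefix("www.")

def classify(answer: AIAnswer) -> dict[str, bool]:
    """Reduce one AI answer to the three tracked metrics."""
    cited = any(domain(u) in BRAND_DOMAINS for u in answer.cited_urls)
    main_source = bool(answer.cited_urls) and domain(answer.cited_urls[0]) in BRAND_DOMAINS
    return {
        "appeared": answer.mentions_brand or cited,  # brand surfaced at all
        "cited": cited,                              # cited as a source
        "position_1": main_source,                   # the main (first) cited source
    }
```

Summing these three flags across all 15,835 answers gives the aggregate counts discussed below.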
This experiment is still ongoing, and the first month was designed to see how AI systems respond to newly created, fully available information tied to a fictional brand.
Key experiment insights
- 96% of all AI visibility for our fake brand came from branded searches. Even in a real niche with relatively low competition, a completely new domain had little chance of competing with established brands for broader, non-branded topics.
- On queries that only our fake brand could realistically answer, we outperformed established competitors (Domain Trust 40+) by as much as 32x and achieved near-exclusive visibility in less than 30 days.
- Even without strong authority, the pages that clearly explained who we were, what we offered, and how we were different (e.g., "[Brand Name] Complete Guide" and "About Us") became the most cited sources from the main domain. This shows that brand positioning can be shaped early in AI search.
- Perplexity was the fastest engine to surface new content. Newly published pages usually reached position #1 within 1–3 days of indexation. However, Perplexity often cited additional domains instead of the main brand site.
- Google's AI Mode was the most stable for branded queries tied to unique claims, showing our brand at #1 for about 90% of prompts.
- Gemini, by contrast, often misidentified the brand. Even for uniquely branded queries, it returned about 60% of answers with no citation to our brand.
- Deep guides, review articles, and comparison pages generated the highest number of AI citations, while more generic formats like how-to articles and listicles showed minimal impact.
- A topical silo made up of one hub page and 10 supporting articles generated no AI citations. Meanwhile, a set of 30 short, repetitive pages (500–750 words each) generated more than 1,800 citations. So, in this test, high-volume content publishing mattered more than internal linking.
Insight 1: New domains may not beat market leaders right away, but they can define their brand narrative in AI search
One of the clearest takeaways from the first month is that a brand-new site has limited chances of competing for broader, non-branded topics, even in a niche with relatively low competition.
AI systems did pick up our fictional brand quickly, but most of that visibility came when the query was already connected to the brand itself, whether through:
- the brand name
- product-specific claims
- or other brand-related angles
Specifically, out of all AI answers, 96% (15,553 out of 15,835) came from branded searches.
Non-branded informational queries produced just 4% of AI answers in total, and even those mostly came through our supporting test domains.
The pattern was even stronger on the main fictional brand site itself. There, we recorded:
- 10,253 AI answers for branded queries
- and just 6 for non-branded ones
That is a 1,700x difference.
This feels familiar because it mirrors classic SEO. New brands still need time to earn trust, build recognition, and compete for broader topics. When AI systems answer general industry questions, they tend to rely on established and authoritative sources.
This is why the strongest results in our experiment came from prompts tied to information only our brand could answer, such as how the product works, how often it updates, and so on.
These queries alone generated 11,430 AI answers with citations to our brand, accounting for 72% of all visibility in the experiment.
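For anyone who wants to sanity-check the math, both headline figures fall out directly from the raw counts above:

```python
# Arithmetic behind the two ratios above (raw counts from the dataset).
branded_main, non_branded_main = 10_253, 6       # main-site branded vs. non-branded answers
total_answers, unique_claim_answers = 15_835, 11_430

print(branded_main / non_branded_main)        # ~1708.8 -> the "1,700x" difference
print(unique_claim_answers / total_answers)   # ~0.722  -> 72% of all visibility
```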
The reason is simple: there is no competition.
If a query is something like "Was [Brand Name] originally built as an internal tool?", only one source can realistically answer it. AI systems don't need to compare sources, evaluate authority, or resolve conflicts.
That gave our fictional brand a major advantage. Even with no domain authority, it outperformed established competitors (DT 40+) by up to 32x on these queries.
What all this means for marketers and business owners is that when users ask about your brand, AI systems are likely to rely on your website as one of the main sources of information. So, the content they cite should be fully aligned with how you want your brand to be positioned.
Our experiment supports this. The "Complete Guide" page on the main site appeared in 1,799 AI answers (the highest result in the dataset) largely because it consolidated key brand information in one place. The "About Us" page followed with 1,500 AI answers. Together, these were the most cited URLs from our main domain, with LLMs relying on them 3–5 times more often than on the additional domains.
In practice, AI systems may learn about your brand quickly, but what they learn depends on what you publish. Your core pages should clearly answer all the questions that are important for your brand: who you are, what you offer, and how you're different.
This way, you can start shaping your narrative in LLMs even as a new or small brand, before you have the authority to compete for broader industry topics.
Insight 2: AI engines behave very differently
Another strong pattern in the experiment is that the five AI systems do not behave alike. They vary not just in how often they mention the fictional brand, but in how quickly they pick it up, how consistently they cite it, and which domains they prefer as sources.
Google's AI Mode: The most stable for branded visibility
Google AI Mode was the most reliable engine in the dataset.
Throughout the experiment, it placed our domain in position 1 for branded queries in about 90% of cases. Unlike other engines, it did not show major fluctuations or dependency on other test domains.
If there was one place where direct brand visibility was predictable, this was it.
Google's AI Overviews: High visibility, lower consistency
Google's AI Overviews also surfaced our tested domain for branded queries, but the pattern was less consistent.
We saw our brand appear in position 1 for 14 days for some prompts, followed by a drop mid-month that didn't recover. More broadly, mentions and links for branded queries fluctuated heavily, appearing and disappearing multiple times each week.
Yet when links were included, it accurately described the brand. When no links were shown, it often claimed there was no public information available.
The takeaway here is not that AI Overviews failed to recognize the brand (it did). It's that this visibility was harder to sustain over time.
Perplexity: The fastest to pick up new content, but not always brand-first
Perplexity was the breakout engine for fresh content.
It picked up newly indexed pages within 1–3 days, which clearly made it the primary driver of early visibility within our experiment.
But this speed comes with a tradeoff.
Instead of consistently citing pages from our main domain, Perplexity often used our supporting test domains as sources.
In early March, our main brand held position 1. But as we published more content on supporting domains, those domains gradually replaced it in AI citations.
By the end of the month, six different domains were being cited: our main brand site and five supporting test domains where we had published additional content about the fake brand.
So while Perplexity increases overall visibility, it doesn't always send that visibility directly to the main brand site.
ChatGPT: Slower to react, stronger over time
ChatGPT showed the most noticeable progression over time.
At the beginning of March, there were no links or mentions of our brand at all. But as the month progressed, visibility steadily increased.
This growth was especially clear across specific content types:
- Unique claims drove the strongest performance, accounting for the majority of visibility, with around 70% of citations appearing in position 1.
- Review articles started with zero presence but quickly gained traction, reaching consistent position 1 rankings by March 17.
- Comparison ("vs") articles achieved the highest consistency overall, with mentions on 29 out of 31 days by the end of the month.
Overall, ChatGPT didn't recognize the brand immediately, but once it did, it began surfacing the brand frequently, especially for branded prompts.
Gemini: Weakest performance and most inconsistent behavior
Gemini was the weakest engine in the dataset and the least consistent.
Initially, it struggled to identify our niche correctly. However, the results improved when we changed how we asked the questions. When prompts were framed as comparisons ("X vs Y") or reviews, Gemini was much more likely to recognize the brand correctly.
Even then, the results were still limited. In the best-performing scenario (queries based on unique claims about the brand), Gemini failed to include any citations to our brand in about 60% of responses.
Insight 3: Content format matters, but so does volume
For this part of the experiment, we tested seven content formats across both our main site and the supporting test domains.
And what we found is that comprehensive, in-depth content earns far more AI citations than shorter articles.
The strongest-performing formats were:
- Deep guides (5,000–6,000 words): ~900 AI answers per page
- Review articles: ~257 AI answers per page
- Comparison ("vs") articles: ~145 AI answers per page
This does not mean there is one ideal content length or that longer pages automatically perform better. The stronger results likely came from the depth, structure, and completeness of the information these formats provided.
This finding also aligns with our broader research, where weβve seen that detailed, well-structured content performs better across platforms like AI Mode and ChatGPT.
Pages with narrower or less comprehensive coverage generated fewer citations overall. For example:
- How-to articles/tutorials: 22 AI answers per page
- Clickbait/skeptical articles: 19
- βBest ofβ listicles: 11
- βAlternativesβ listicles: 4
As part of the experiment, we also tested a "spam" approach: publishing 30 thin pages (500–750 words each) on one of our test domains.
Individually, these pages were weak (averaging just 63 AI answers per page).
But together, they generated 1,897 total AI answers, making this the highest-performing content setup at the domain level.
However, this result does not make thin content inherently "better." It just shows that volume can sometimes compensate for quality by increasing the likelihood of retrieval and citation (especially in AI engines like Perplexity that prioritize freshness).
In simple terms, a few strong pages win on quality, but a large number of weaker pages can still win on overall exposure.
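A quick back-of-the-envelope comparison, using only the figures reported above, makes the tradeoff explicit:

```python
# Per-page quality vs. aggregate volume, using the reported numbers.
deep_guide_per_page = 900            # ~900 AI answers for one deep guide
thin_pages, thin_total = 30, 1_897   # the "spam" set of short pages

print(thin_total / thin_pages)            # ~63 answers per thin page (weak individually)
print(thin_total / deep_guide_per_page)   # ~2.1 -> the thin set out-earned a deep guide ~2x
```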
Insight 4: Topical clustering alone doesn't produce AI visibility
One of the most useful negative findings came from the content structure test.
For this part of the experiment, we created a hub page on one of our test domains and linked it to 10 supporting articles. In theory, this setup should have built strong topical depth and semantic reinforcement. All 11 pages were indexed, properly structured, and internally linked.
Yet, they generated zero AI citations.
This is significant because it challenges a common assumption carried over from traditional SEO: that topical clustering automatically improves authority or increases the likelihood of being retrieved.
At least in this experiment, it did not.
That does not mean topic clusters are useless. It means they are not sufficient alone. Internal linking and semantic breadth may help a search engine understand a site, but AI systems still need a reason to retrieve and cite a specific page for a specific answer.
So, do AI engines reward entity coherence more than truth verification?
Even within just one month, the results point to a clear conclusion:
AI systems appear to respond more strongly to consistency, repetition, and availability than to strict verification.
That should not be overstated. It is not that LLMs "believe anything." But if a claim is:
- Structured clearly
- Repeated across relevant pages
- Phrased like a fact
- Available in retrievable source environments
Then AI systems may surface it surprisingly easily.
We also saw this in manual checks of LLM responses in AI Results Tracker. For prompts such as "is [brand] worth it," some systems responded positively and recommended using our completely unknown fictional brand.
This is not necessarily because LLMs automatically favor every new brand. In some cases, when little or no negative information exists, a system may fill the gap with a neutral or positive-sounding response based on the limited signals available.
But the result is the same: if a completely fictional brand can generate consistent citations and favorable recommendations under certain conditions, then brand narratives in AI search may be more flexible than they seem.
Final thoughts
The most important outcome of this experiment isn't that a fictional brand achieved visibility.
It's that visibility followed a repeatable pattern once specific inputs were introduced: branded context, unique claims, diverse content formats, and sufficient presence across different sources.
That leads to two important conclusions.
- AI search is not random. It follows identifiable signals, and those signals can be studied, tested, and influenced.
- AI is still highly sensitive to manipulation. AIs don't have their own sense of truth, verification processes, or critical thinking. The same factors that help legitimate brands become visible can also be used to simulate credibility.
If there's one lesson here, it's that you can't assume AI systems will accurately represent your company, product, or category by default.
You have to actively shape the information environment they rely on.
And this is only the first month of results. Weβre continuing to collect data, expand the experiment, and monitor how these patterns change over time.