
SEO in 2026: Higher standards, growing AI influence, and a web still catching up

Is it possible to get an accurate view of the current state of SEO?

There have been multiple attempts to reach consensus on what works, predict what might be coming, and identify the factors that may play a role in “good” (or “bad”) SEO.

As useful and productive as some of this may be, none of it offers the same grounded data as the Web Almanac, a project I was honored to be a part of. With the publication of the 2025 SEO chapter, we can now review the data and spot the emerging trends from 2025 and what that could mean for SEO in 2026.

SEO standards on the rise

2025 has been another year of increasingly higher SEO standards — which can only be a good thing:

  • Near-universal adoption of HTTPS (now up to 91%+).
  • Increased use of title tags at nearly 99% adoption, and even viewport meta tags at over 93% adoption.
  • Canonical adoption rose from 65% in 2024 to 67%+ in 2025.
  • HTML validity is slowly improving. For example, invalid <head> elements dropped to 10.1% on desktop and 10.3% on mobile from 10.6% and 10.9%, respectively, in the previous year.
  • Robots.txt error rates fell: 404s declined to 13% from 14% the previous year, and 5xx responses fell to ~0.1%.
  • Meta robots usage has crept up to 46.2% in 2025 from 45.5% the prior year.
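Most of the items above boil down to a few lines of markup. As a reference point, a minimal head fragment that satisfies these hygiene basics might look like this (the title and URL are placeholders):

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Viewport meta tag: over 93% adoption in 2025 -->
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <!-- Title tag: nearly 99% adoption -->
  <title>Example Page Title</title>
  <!-- Canonical tag: 67%+ adoption in 2025 -->
  <link rel="canonical" href="https://www.example.com/example-page/">
</head>
```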

Not all of these statistics represent rapid change, but they do show steady and consistent change, at the very least. The 2025 Web Almanac data presents the web as a more secure and easier-to-crawl place, which is certainly a positive. 

So, can SEOs take a victory lap right now? No, as there is more to do in 2026, even if the basics do feel like they’re stable or steadily improving.

The cementing of SEO ‘defaults’

Content management systems (CMSs) and SEO plugins play a huge role in developing SEO best practices and cementing the “default” or de facto standards.

As the CMS chapter in the 2025 Web Almanac shows, more and more websites are now powered by a CMS.

Of these, the top five most popular systems over the last four years likely aren’t surprising.

Many of these SEO defaults are underpinned by SEO plugins, typically those used on WordPress sites.

That’s not to say that using these platforms or tools ensures a perfect website setup. That said, key elements or functions of these tools can become industry standard due to their ubiquity:

  • Robots.txt.
  • Sitemap.xml.
  • Canonical tags.
  • Semantic HTML.
  • Structured data.

Not all of these are on by default. Sometimes they require inputting basic details or simple implementation. Regardless, their ease of access increases the likelihood that they will become an SEO best practice.
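To make that concrete, several of the defaults listed above are nothing more than a few lines of plain text. A typical plugin-generated robots.txt with a sitemap reference, for example, amounts to this (the domain and paths are illustrative):

```text
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://www.example.com/sitemap.xml
```

That simplicity is exactly why platforms and plugins can switch these features on at scale.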

This is happening, and it’s proving effective. What this means for 2026 and beyond is that:

  • Working with or lobbying major platform and tool makers is one of the key ways to shape SEO’s future direction.
  • SEO tools and platforms will continue to enforce best practices on the front end, but they could also benefit from AI and assistive features behind the scenes. While it may be less visible in the data itself, these tools offer the opportunity to move quickly and gain deeper insight.
  • Structured data usage was previously driven by what Google rewarded in the search engine results pages (SERPs). SEOs and plugin developers alike could be inspired to move beyond what’s beneficial for the SERPs and onto what contributes to a more predictable, structured, and retrievable data set.

Deprecated, but not forgotten

Defaults and best practices help, but they don’t finish the job. While attention often shifts to new features, old or forgotten standards still see widespread use.

There have been many different cases where deprecated settings or standards have prominently appeared in the data.

  • For example, in meta robots bot declarations, “msnbot” is still in the top five, even though it was replaced over 16 years ago.
  • AMP use has plummeted over the years, but it’s still found on over 38,000 homepages. While technically not deprecated, amp.dev has seen no activity for nearly four years.
  • The most common meta robots attributes are “index” and “follow,” which are implicit and largely ignored.
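That last point is easy to illustrate. The first tag below only restates the behavior crawlers apply when no tag is present at all, while the second actually changes how the page is treated:

```html
<!-- Redundant: "index, follow" is the implicit default -->
<meta name="robots" content="index, follow">

<!-- Meaningful: overrides the default -->
<meta name="robots" content="noindex, nofollow">
```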

Web changes — no matter how small — are often neither quick nor easy to get done, and we’ll likely see traces of deprecated features and settings in the data for years to come.

More work is needed

The improvement in SEO standards doesn’t apply to all features and sites. There are some that aren’t moving in the same direction:

  • The mobile performance gap stubbornly lingers — even as it continues to improve.
  • Duplicate content management is still lagging, with nearly 33% of pages missing canonical implementation.
  • Advanced configurations have barely moved from the previous year — nearly 67% of images don’t have loading attributes set, and over 91% of iframes don’t have set loading attributes.
  • Many deprecated standards refuse to go away.
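The loading-attribute gap is especially striking because closing it is a one-attribute change per element. Native lazy loading is supported in all modern browsers and looks like this (URLs and dimensions are placeholders):

```html
<img src="/images/hero.jpg" alt="Hero image" width="800" height="450" loading="lazy">
<iframe src="https://www.example.com/embed" title="Embedded content" loading="lazy"></iframe>
```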

While CMS default settings or configurations can take credit for some of the larger changes, they also bear some of the responsibility for the issues above. For example, median Lighthouse scores for some of the major CMS platforms are still lagging, especially on mobile (while seeing increases over last year).

The long tail of the web is still messy, and this will probably always be the case. The Web Almanac dataset doesn’t exclude websites that are no longer relevant or abandoned.

Site metrics that meet “top” SEO best-practice standards can likely be achieved with an out-of-the-box site built on any major CMS, using a modern theme and 30 minutes of carefully considered configuration. This is one of the most significant opportunities in technical SEO.

In 2026, we’ll likely:

  • Continue to see performance gaps converge between desktop and mobile experiences — but slowly.
  • Still be able to see echoes of past markup and decisions. Even if the collective focus is pulled to the “new world” of AI search, many SEOs won’t abandon proven tactics and approaches from past years. This dataset develops slowly.
  • Observe something that’s mostly “business as usual.”

Charting the impacts of AI

One of the more eagerly awaited elements of the Web Almanac data was whether we can chart the increasing presence and impact of AI search and crawlers in the decisions of SEOs and developers.

Within the data, we observed two major developments:

  • Robots.txt is increasingly used as a policy document rather than a crawler-control mechanism.
  • Creation and adoption of llms.txt is one of the few signs of LLM-first decision-making.

Commenting on the state of SEO is challenging because the definition isn’t fixed. What’s good or bad practice is often hotly debated, and in the world of AI search, another (painful) metamorphosis is now taking place.

In the HTTP Archive data we can observe the influences working on SEO from a “nuts and bolts” point of view, report on what we see, and enable people to make up their own minds.

Specifically, one of the elements we added this year was the analysis of the llms.txt file. 

This is a highly controversial text file, but our inclusion was not an endorsement. It’s a recognition that changing trends may (or may not) shape the web. Whether it’s effective or accepted, its adoption says something, and we felt it was important to review that.

Robots.txt as a bouncer

It’s clear that robots.txt has a more important job now than ever. Until relatively recently, it was largely used for targeted control of crawlers, particularly Googlebot and Bingbot. 

For most SEOs, however, robots.txt was mostly an exercise in both ensuring we weren’t blocking anything by accident and resolving problem areas with Disallow rules. This has changed:

  • GPTBot: 4.5% on desktop and 4.2% on mobile in 2025, up from 2.9% on desktop and 2.7% on mobile in 2024, representing a ~55% increase.
  • CCBot: 3.5% on desktop and 3.2% on mobile in 2025, up from 2.7% on desktop and 2.4% on mobile in 2024.
  • PetalBot: 4.0% on desktop and 4.4% on mobile in 2025 (not separately tracked in 2024).
  • ClaudeBot: 3.6% on desktop and 3.4% on mobile in 2025, up from 1.9% on desktop and 1.6% on mobile in 2024, nearly doubling.

Robots.txt isn’t the only way to manage bots — and arguably isn’t the best — but it introduces a new decision that must be made: How should websites handle LLM crawlbots?
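For sites that do decide to set an explicit policy, the declarations tracked above are ordinary robots.txt entries. One possible configuration, shown purely as an illustration rather than a recommendation, blocks selected LLM crawlers while leaving everything else open:

```text
# Illustrative policy only: block selected LLM crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```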

This will be one of the biggest areas we’ll see change in on the technical side of 2026:

  • Businesses with existing bot strategies will need to evolve them.
  • Businesses that don’t meaningfully manage crawlers will start feeling the pressure to do so.
  • Robots.txt will still be the clearest and easiest way to handle crawlers. We will almost certainly see more good and bad bots alike.

In 2026, SEOs will be drawn into bot management conversations spanning marketing, technology, and security. “Which bots should we allow?” is a question with downstream effects on budgets, revenue, and users, and we’ll need to closely monitor what develops.
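Auditing where a site currently stands in those conversations can start with Python’s standard library. A minimal sketch, assuming the robots.txt content has already been fetched (the inline file and URL are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt content; in practice, fetch it from
# https://example.com/robots.txt
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# LLM crawler tokens reported in the 2025 Web Almanac data
AI_BOTS = ["GPTBot", "CCBot", "PetalBot", "ClaudeBot"]

def ai_bot_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {bot_name: allowed} for each tracked LLM crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}

print(ai_bot_access(ROBOTS_TXT))
# GPTBot is disallowed; the other bots fall through to the "*" rule
```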

LLMs.txt

LLMs.txt is an aspiring web standard that aims to guide LLM crawlbot behavior and make it easier for them to retrieve content before generating an answer. It’s a highly controversial .txt file, and there’s a vigorous debate on whether it actually benefits LLMs, will gain widespread use, and is a possible vector for manipulation.

The rationale or efficacy of this file isn’t something we need to cover here. For this article, the true point of interest with llms.txt is the adoption of this file as a statement of intent. 
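For context, the proposed format is deliberately simple: a markdown file served at /llms.txt with a title, a short summary, and curated links for LLMs to retrieve. An illustrative fragment (site name and URLs are placeholders):

```markdown
# Example Site

> A short, plain-language summary of what this site offers and who it is for.

## Docs

- [Getting started](https://www.example.com/docs/start.md): installation and setup
- [API reference](https://www.example.com/docs/api.md): endpoints and parameters
```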

At the start of 2025, I crawled the Majestic Million, a regularly updated list of the top 1 million websites ranked by backlink authority, in search of llms.txt and found that adoption was extremely low: just 15 sites, or 0.0015%.

While searching one million sites versus 16 million presents some logistical differences, I was expecting a very low level of adoption based on prior experience. I was surprised at how wrong I was.

According to the 2025 data, just over 2% of sites had a valid llms.txt file, and:

  • 39.6% of llms.txt files are related to All in One SEO (AIOSEO).
  • 3.6% of llms.txt files are related to Yoast SEO.

This number is still relatively low, but it’s much higher than I thought it would be and potentially represents a huge acceleration.

The primary reason fueling adoption of llms.txt is SEO plugins that make it easier to enable.

We can see that llms.txt adoption has continued to rise ever since we started collecting data from across the web.

If, however, the implementation of this file is actually a default feature in some scenarios, it could be easy to overvalue its significance.

LLMs.txt will still be a barometer of AI search decision-making in 2026:

  • More tools and plugins will offer this functionality if they don’t already.
  • Yoast and Rank Math (which don’t default llms.txt to “on”) represent more growth opportunities for this file. Many SEOs may decide to switch it on even if there isn’t strong evidence of its efficacy.
  • The rate of adoption will continue to climb, but whether it’ll reach a point where it becomes an accepted best practice is harder to forecast.

FAQ growth

Another interesting trend worth discussing is the increase in the use of the FAQPage schema. 

While this isn’t as explicit a trend as robots.txt or llms.txt usage, the increased adoption of this schema type is particularly interesting.

Since Google said it was limiting the appearance of FAQ snippets in search results, you’d be forgiven for thinking the implementation of this schema type might plateau — or even fall.

However, the last three publications of the Web Almanac show that this isn’t the case.

The use of FAQPage schema is now an emerging trend as AI search heavily cites FAQ content in its outputs.

This could be correlation rather than causation, but the steady increase in FAQPage schema is a strong sign of AI search strategies changing the shape of the web.
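For reference, the markup behind this trend is a small JSON-LD block. A minimal, schematic FAQPage example (the question and answer are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is the Web Almanac?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "An annual report on the state of the web, built on HTTP Archive data."
    }
  }]
}
</script>
```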

To echo another conclusion from earlier, 2026 may well see continued growth of structured data types even if they don’t result in an obvious improvement. While the growth is unlikely to be explosive, making a case for their implementation is easier when we don’t just optimize for Google.

Not a rewrite: A new layer on top of SEO

Will AI search reshape the web in 2026? Unlikely. Will we continue to see signs of its importance? Almost certainly, but let’s not get carried away. 

SEO has a reputation for changing quickly. Sometimes that’s true. More often, it’s the conversation that moves quickly, while the web itself changes at a steadier pace.

The 2025 Web Almanac data clearly reflects that tension. Core SEO hygiene continues to improve year over year, but largely through default features and settings, tools, and platform behavior rather than deliberate optimization.

At the same time, long-deprecated standards linger, advanced configurations remain uneven, and the long tail of the web remains untidy. Progress is real, but it’s incremental — and sometimes accidental.

What has shifted meaningfully is intent.

  • Robots.txt is no longer just crawl housekeeping. It’s becoming a policy surface.
  • LLMs.txt, regardless of whether it proves useful, represents a new class of decision-making entirely.
  • FAQ patterns are on the rise again, not because of SERP features, but because structured, extractable answers have immense value elsewhere.

2026 will not be remembered as the year SEO ended or was reborn. It may, however, be considered the year the AI search layer became more defined. A new patch applied — not a fundamental rewrite.

For a deeper dive into the data behind these trends, explore the 2025 Web Almanac SEO chapter.
