Why now is the time to prepare for WebMCP

New technologies come and go. Early in my career, I often chased shiny new things in an attempt to be on the cutting edge, but it didn’t take more than a few years to realize I was spending countless hours of my time, and my clients’ time, implementing technologies and techniques that went by the wayside. Google Authorship, anyone?

It turns out that if you simply wait for wider — but still early — adoption, learn from the first movers’ mistakes, and catch up quickly, you can avoid wasting time and create greater value for yourself and those you serve. That lesson has served me well.

And then there are those key moments where the early movers stand to not just win in the current landscape, but to shape and lead the next one. Think of the first people reading the PageRank paper and thinking, “I should build some links.” WebMCP feels like one of those moments, only bigger.

It’s not just a revolution in how search works or even in generative engine visibility. We’re at a moment where the very place discoverability occurs is changing, and who (or rather, what) is doing the discovering is changing with it.

Coming soon: Non-human engagement

While SEOs have long debated whether we should be optimizing for search engines or humans (shockingly, it’s both), that paradigm is about to be turned on its head. What happens when discovery shifts from a human to an LLM or agentic system?

This change is already underway. Whenever you visit ChatGPT with a request, it makes decisions, runs supplemental searches, asks follow-up questions, and returns conclusions. The agent is planning and deciding on your behalf, and your resulting output is shaped entirely by what it retrieves and how it interprets it.

We can even see the supplemental (fan-out) queries in DevTools.

I think of this as the latest chapter in a longer story:

  • Discovery v1: People interacted with the world and discovered things firsthand. Experience and word of mouth were the discovery points.
  • Discovery v2: People started writing things down. Libraries and educational institutions became the discovery points, followed by newspapers and books.
  • Discovery v3: The web proliferated information and media at a scale previously unimaginable. Directories, then search engines, rose to aid discovery.
  • Discovery v4 (current): After about 25 years of search engines, LLMs rose and discovery moved to a blended, LLM-forward format. Light agentic capabilities are baked in to assist retrieval. People are still in the loop, but the assistant is doing more of the legwork.
  • Discovery v5 (on the horizon): Agentic systems move beyond being assistants in the retrieval and presentation layer and are given autonomy to act on users’ behalf. Many users will have their own agents. Companies will offer them. Google almost certainly will.

I would argue that the stage we’re entering, Discovery v5, will be the most dramatic since the shift to v2.

Can’t you just imagine a world where basic decisions are offloaded from your brain and body, leaving you room to pursue more important things? I know I’ve seen this utopia before.

I honestly don’t see it resulting in that future, but the world we’re creating right now is fundamentally different from the one marketers operate in today, and WebMCP is one of the first concrete steps in that journey.

Dig deeper: WebMCP explained: Inside Chrome 146’s agent-ready web preview

The trust ratchet only turns one way

Do you accept what you read in an AI Overview and stop your journey there more often than you did on the day it launched? Not every time, but more often? You do. So do I.

For quick, low-risk queries, we’re happy to trust it. If you’re like me, as these systems have evolved and improved, you’ve started trusting them with higher-stakes information.

Would I trust an AI Overview with tax questions or major health decisions? No. Would I trust it to remind me of the benefits of vitamin D or pull together a dinner recipe? Absolutely.

That boundary keeps moving. As it moves, so does what we’re willing to let an agent do on our behalf, not just what we’ll let it tell us.

  • The cost of being wrong when automating the reorder of groceries you’re running low on is small.
  • The benefit of an agent monitoring flight and hotel combinations for an amazing refundable deal, on your days off, within your budget, is very high.
  • The benefit of hopping in an autonomous vehicle with your family after work on a Friday, dinner in hand, playing a game and sleeping, and arriving at Disney World rested just in time for opening — that’s pretty compelling.

You may say you’ll never hand your autonomy to an agentic system. People said the same about search engines, smartphones, and GPS. The path usually goes: 

  • Skepticism (“Who would ever enter their credit card number on a website?!”)
  • Reluctant adoption (“Ugh, it’s an online service, and I trust the company and don’t have a choice. Alright, I’ll give them my card. But just this once.”)
  • Dependency (“I can’t believe I used to actually go into stores!”)

What does this have to do with WebMCP?

Here’s where it gets concrete and actionable.

MCP servers and skills files are early versions of the infrastructure that makes Discovery v5 possible, but the barrier to entry is high, and they apply only in specific contexts. 

WebMCP is different. It’s a browser-native web standard, currently published as a W3C Community Group Draft and in early preview in Chrome 146 beta as of this writing, that gives websites a structured way to expose actions directly to AI agents without scraping, guessing, or brittle automation.

This isn’t a Google-only initiative. The specification is co-authored by engineers from both Google and Microsoft, which matters. When two of the largest browser and AI platform vendors are writing the spec together, it has a different trajectory than a unilateral bet.

Right now, when an AI agent tries to take an action on your website, like filling out a form, booking an assessment, or searching your inventory, it has to figure everything out by reading your page and inferring intent.

It looks at your DOM, guesses what your fields mean, hopes the date format it picks is the one your form expects, and submits. It’s intelligent, but it’s also fragile. One UI change and the whole flow breaks.

WebMCP changes this by letting you tell the agent exactly what your site can do and how to do it. The spec defines two distinct ways to do that: one that closely maps to what you already know, and one that handles more complex, dynamic interactions.

Declarative vs. imperative: You already know this distinction

WebMCP proposes two APIs, and the difference between them will feel familiar to anyone who’s spent time in technical SEO.

The Declarative API is the one that should make you sit up and get to work right away. The idea is straightforward.

  • You annotate your existing HTML forms with attributes that describe what the form does and what each field means.
  • The browser automatically translates that into a structured tool any agent can call. 
  • The form continues working exactly as before for human visitors. 

The agent gets a clean, unambiguous interface.

To be clear, the declarative API is still being formally specified, and the exact attribute names aren’t locked down yet. But the concept is settled, and demos are already running. 

Think of it the way you’d think about schema markup in its early days: the syntax evolved, but the underlying idea, annotating what already exists so machines can understand it, was clear and worth acting on.

The analogy to schema markup is almost exact. You’re not building a new system. You’re making what you already have legible to a new class of visitor. That’s a pattern SEOs understand intuitively.

The Imperative API is more mature in the spec and already available for testing. You register tools directly in JavaScript. Here’s an example for a site taking bookings for an assessment:

navigator.modelContext.registerTool({
  name: "book-assessment",
  description: "Book a free IT assessment for your business.",
  inputSchema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Customer's full name" },
      city: { type: "string", description: "City for the assessment" },
      slot: { type: "string", description: "Preferred time in ISO 8601 format" }
    },
    required: ["name", "city", "slot"]
  },
  execute: async (input) => {
    // your booking logic here
    return { confirmed: true, appointmentId: "APT-001" };
  }
});

This is more powerful and flexible, the right approach for dynamic interactions, multi-step flows, or anything that can’t map cleanly to a single form. Here’s something that makes it genuinely interesting: the tools available on a page can change based on state.

A hotel booking demo from Google Chrome Labs illustrates this well. After an agent runs a search_location tool, a new filter_search_results tool appears. After selecting a hotel, start_booking becomes available. The agent’s toolset evolves as the user’s journey progresses, just as a well-designed interface guides a human through a flow.

Think of declarative as the equivalent of adding schema markup to existing content: low lift, high legibility, great starting point. Imperative is like building a fully structured data feed: it takes more effort, offers more power, and is better suited to complex or dynamic needs. Most sites should start with declarative and extend into imperative as their needs grow.

A quick note on scope: The example below uses the declarative side of WebMCP because, as we’ve discussed, that’s the easiest place for most site owners and SEOs to start. It maps naturally to existing HTML forms. Add clear machine-readable descriptions to the form and its fields, and the page becomes easier for agents to understand. 

The imperative API is more case-specific. It’s better suited to dynamic flows, multi-step interactions, custom JavaScript logic, or cases where an action does not map cleanly to a single form.

What the agent sees: Before and after

The contrast is easiest to see with something every service business already has: a booking or contact form. This form:

<form action="/contact" method="POST">
  <label for="name">Name</label>
  <input id="name" name="name" type="text" required>
  <label for="email">Email</label>
  <input id="email" name="email" type="email" required>
  <label for="city">City</label>
  <input id="city" name="city" type="text">
  <label for="message">Message</label>
  <textarea id="message" name="message" required></textarea>
  <button type="submit">Send</button>
</form>

Now here is the same form prepared for WebMCP using declarative-style annotations:

<form action="/contact" method="POST"
      toolname="submitContactInquiry"
      tooldescription="Submit a contact inquiry for a service business.">
  <label for="name">Name</label>
  <input
    id="name"
    name="name"
    type="text"
    required
    toolparamdescription="The requester's full name."
  >
  <label for="email">Email</label>
  <input
    id="email"
    name="email"
    type="email"
    required
    toolparamdescription="A valid email address where the requester can be contacted."
  >
  <label for="city">City</label>
  <input
    id="city"
    name="city"
    type="text"
    toolparamdescription="The city where the requester is located."
  >
  <label for="message">Message</label>
  <textarea
    id="message"
    name="message"
    required
    toolparamdescription="The requester's question, project details, or service need."
  ></textarea>
  <button type="submit">Send</button>
</form>

The form still works the same way for a human visitor. Nothing about the normal user experience had to change.

The difference is that an agent no longer has to guess what the form does or what each field means. The form declares its action with toolname and tooldescription, and each important input explains itself with toolparamdescription.

That’s the core idea. You’re not rebuilding the site for agents. You’re making the existing interface easier for them to understand.

And critically, this doesn’t have to mean fully automatic submission. For a contact form, you may want an agent to prepare the form and let the user review it before sending. For a low-risk action, you may eventually allow more automation. The point is that the action becomes explicit, structured, and less fragile.

The attributes proposed for forms are:

  • toolname: The name of the tool (in this case, a form tool).
  • tooldescription: The description of the tool (in this case, the description of a form).
  • toolautosubmit: A boolean attribute that lets the agent submit the form on the user’s behalf without requiring consent. That may not seem useful if you’re picturing a simple chat with ChatGPT, but it makes sense once agents are handling complex tasks: hooked up to your email, making reservations, or compiling information that requires more than a login or a single form fill.
  • toolparamdescription: A description of a specific parameter, so the agent is aware of the field it’s engaging with.

You can keep up with the specifics of the declarative API as it evolves in the Declarative API Explainer.

Why this matters for your sites specifically

Think about the types of queries agentic systems will handle on behalf of users in a Discovery v5 world:

  • “Find me an SEO consultant who understands technical SEO, doesn’t talk like a LinkedIn carousel, and has time for a call next week.”
  • “Compare three AI agent observability tools and tell me which one seems most likely to solve my actual problem instead of selling me a chatbot.”
  • “Find a contra dance near me this Friday, check whether beginners are welcome, and add it to my calendar if the band looks fun.”

Which site gets the engagement? The one the agent can interact with cleanly, confidently, and without friction. If your competitor has WebMCP-registered tools and you don’t, the agent completes the action on their site and moves on. The user may never know they had a choice.

There’s a secondary implication worth naming. Tool descriptions are the new meta descriptions. The quality of your tool name, description, and parameter definitions will directly shape whether an agent selects your tool over a competitor’s, understands what it does, and calls it correctly. 

The best practices guidance in the WebMCP documentation reads like conversion copywriting. Use clear verbs, explain the why behind options, and be specific about what each parameter means. If that sounds familiar, it should. You’ve been writing for machine readers for years. This is the next layer.

The window is open, but not forever

I’ve been skeptical of early adoption my whole career. I still am, as a default. But I’ve also learned to recognize the moments that are different in kind, not just degree.

Schema markup was one. SSL was one. Mobile optimization was one. Each time, the window in which early movers earned disproportionate returns was real and finite. In each case, the people who understood the underlying shift, not just the tactic, were the ones who compounded that advantage.

WebMCP is a W3C Community Group Draft today, co-authored by Google and Microsoft, already running in Chrome 146 beta, and already integrated into Cloudflare’s infrastructure. It’s not table stakes yet. But the trajectory is clear:

  • The spec matures.
  • Browsers ship it.
  • Agents learn to prefer sites that expose structured tools.
  • The sites that haven’t caught up become invisible to that class of visitor.

The declarative approach, once finalized, means the barrier to starting will be genuinely low: annotations on your most important forms, not a new backend system. The imperative API is available for testing right now.

That’s the argument. It’s the reason I’m making it now, not in six to 12 months when everyone else is trying to catch up.

How to model non-linear SEO seasonality with Prophet

Forecasting SEO performance means estimating future outcomes from historical data. But search behavior rarely follows stable or linear patterns.

Seasonal demand, anomalies, SERP changes, and measurement issues can all distort your data and lead to unreliable forecasts.

That makes forecasting more complex than running linear regression, exponential smoothing, or asking an LLM to project trends from historical performance.

Here’s how to account for seasonality, detect anomalies, and build more reliable SEO forecasts in Python using models designed for non-linear search data.

SEO forecasting pays the bills, but doesn’t add much value

Decision-makers rely on forecasts to justify investments and align expectations across digital teams. Stakeholders want forward-looking estimates, finance needs revenue projections, and roadmaps require a clear view of expected returns. However, the value of forecasting has diminished today.

AI Mode and AI Overviews created a major disconnect between clicks and impressions as LLM-driven scrapers increased bot activity and inflated impression data in reporting tools.

Additionally, Google reported a logging issue that has affected Search Console impression data since May 2025. As a result, many forecasts end up serving as reassurance rather than guidance. They shield decision-makers from scrutiny while failing to reflect the business’s actual operating context.

From a data analytics perspective, if search performance followed a normal distribution, you could rely on linear regression, exponential smoothing, or even a simple moving average (SMA) with confidence.

However, the average SEO forecast still relies on assumptions that don’t hold in organic search:

  • Stable trends.
  • Normal distributions.
  • Consistent relationships between inputs and outputs.

  • Linear regression: Fits a straight line through historical data to model long-term trends and project future performance. Use it when traffic or rankings show a consistent upward or downward trend with relatively low volatility; it’s useful for baseline forecasting and directional planning. Avoid it when data is highly volatile, seasonal, or affected by frequent algorithm updates, migrations, or campaign spikes.
  • Exponential smoothing: Applies weighted averages where recent data points have more influence than older ones, so it can adapt to short-term changes. Use it when recent performance is more indicative of future outcomes, such as after site changes, migrations, or content updates; it’s useful for short-term forecasting. Avoid it when long-term trends matter more than recency, or when sharp anomalies may distort recent weighting.
  • Simple moving average (SMA): Averages values over a fixed window to smooth noise and highlight underlying trends. Use it when you need to understand data direction, such as smoothing daily traffic for reporting. Avoid it for forecasting future performance, because predictions rely on aggregated historical averages and may miss turning points.

Today’s AI landscape forces a rethink of forecasting as search shifts toward highly volatile and probabilistic outcomes. In other words, today, a 10% increase in effort doesn’t translate into a proportional 10% increase in traffic.

Several structural factors are at play:

  • Long-tail traffic distribution: A small number of pages typically generate most traffic, while most pages contribute very little.
  • Binary user behavior: Many core SEO metrics, such as CTR, are driven by yes/no interactions (click versus no click) that diverge from normally distributed patterns.
  • Zero-click search impact: High rankings don’t guarantee traffic — more queries are resolved directly in the SERP, inflating visibility without corresponding clicks.

If you have to forecast, do it properly. Baseline models still have a role:

  • Linear regression for directional trends.
  • Exponential smoothing for short-term adjustments.
  • Moving averages for noise reduction.

There are ways to apply these techniques in Google Sheets. However, they should be treated as descriptive tools, not decision-making systems. To make forecasting useful, you need to move beyond them.
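If you’d rather run those baselines in Python than in Sheets, a minimal sketch might look like this. It assumes a DataFrame with date and clicks columns; the file name, column names, window size, and smoothing factor are all illustrative.

import numpy as np
import pandas as pd

# Illustrative input: daily clicks with 'date' and 'clicks' columns
df = pd.read_csv('clicks.csv', parse_dates=['date']).sort_values('date')

# Simple moving average: smooths noise and shows direction (descriptive, not predictive)
df['sma_28'] = df['clicks'].rolling(window=28).mean()

# Simple exponential smoothing: recent observations carry more weight
df['ses'] = df['clicks'].ewm(alpha=0.3, adjust=False).mean()

# Naive linear trend: fit clicks against elapsed days and project 90 days ahead
days = (df['date'] - df['date'].min()).dt.days.to_numpy()
slope, intercept = np.polyfit(days, df['clicks'].to_numpy(), deg=1)
projection = intercept + slope * np.arange(days.max() + 1, days.max() + 91)

print(df[['date', 'clicks', 'sma_28', 'ses']].tail())
print('Naive linear projection, day +90:', round(projection[-1]))

Even so, treat the outputs as descriptive context, not a forecast you’d put in front of finance.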

Why LLMs aren’t the answer to SEO forecasting

LLMs and MCP connections only compound the inefficiencies listed above. There are two structural problems with this approach.

They assume data behaves linearly

Pre-configured prompts or skills implicitly assume the data follows a linear distribution. This is misleading because SEO data is dominated by seasonality, cyclical demand, and structural breaks. Any system that treats it as smooth or continuous will systematically misrepresent future performance.

They optimize for plausibility, not statistical accuracy

LLMs aren’t forecasting models. They’re probabilistic text generation systems. They assign probability scores to predict token sequences based on patterns observed during training. They’re trained to reward your thinking, not challenge it.

As a result, they can produce confident but ungrounded outputs that lack the business and domain context required to interpret anomalies.

No matter how well engineered the prompt is, the system can still hallucinate – not because it’s “wrong,” but because it’s optimizing for linguistic plausibility, not statistical validity.

Forecasting requires explicit handling of seasonality, non-linearity, and critical interpretation of outputs. These analytical responsibilities can’t be abstracted away through prompting alone.

LLMs can assist with workflows, accelerate analysis, and even help operationalize models. But they can’t replace the role of an analyst in framing the problem, selecting the methodology, and validating the results.

How to do an SEO forecast that accounts for seasonal effects

Asking the right questions is often the hardest part of any analysis.

SEO forecasts are often requested by enterprise stakeholders or pushed by agencies during new business pitches. This typically makes forecasting more straightforward because the research question is already defined upfront.

Either way, the subject of the analysis is usually one of the following search indicators:

  • Clicks (search demand).
  • Impressions (search visibility).
  • Rankings (position distribution).
  • CTR (SERP behavior).

For this article, we’ll use Python to forecast synthetic clicks for a fictitious website influenced by seasonal demand.

Retrieving and preprocessing seasonal fluctuations

Based on the scope of analysis, gather historical data from Google Search Console through either the API or Google BigQuery.

While a larger dataset with broader historical coverage is technically better, it may not justify the query costs in BigQuery for an SEO forecast.

Carefully assess the tradeoff between cost, resources, time, and data sampling. You might find that using an API to retrieve as much historical data as possible (e.g., via Search Analytics for Sheets) does the job.
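If you prefer to pull the data yourself rather than through Sheets, here’s a minimal sketch against the Search Console API using google-api-python-client. The property URL and date range are placeholders, and the credential setup (OAuth or a service account) is omitted.

from googleapiclient.discovery import build

creds = ...  # load your OAuth or service-account credentials here

service = build('searchconsole', 'v1', credentials=creds)

request = {
    'startDate': '2023-01-01',   # placeholder date range
    'endDate': '2024-12-31',
    'dimensions': ['date'],
    'rowLimit': 25000,
}

response = service.searchanalytics().query(
    siteUrl='https://www.example.com/',  # placeholder property
    body=request
).execute()

rows = response.get('rows', [])
data = [{'date': r['keys'][0], 'clicks': r['clicks']} for r in rows]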

Set up a Google Colab notebook, install the required dependencies, load your dataset with date and clicks as columns, and convert the date column into a datetime index.

Enforce daily frequency to ensure consistency across dates, and quickly fill any missing data gaps using interpolation.

#data viz
!pip install plotly
import plotly.graph_objects as go
import plotly.express as px
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import boxcox

#anomaly detection
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import STL

#timeseries decomposition
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

#data manipulation
import pandas as pd
import numpy as np

#forecasting models and evaluation metrics
from prophet import Prophet
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_absolute_error, mean_squared_error

df = pd.read_excel('/content/input.xlsx')
df.columns = map(str.lower, df.columns)

df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date')

# Set index
df.set_index('date', inplace=True)

# Ensure daily frequency (important for decomposition)
df = df.asfreq('D')

# Handle missing values
df['clicks'] = df['clicks'].interpolate()
df.head()
Raw clicks line for all available dates

Does it look like a linear distribution, or can you already spot anomalies?

Data preprocessing involves standardizing and cleaning your dataset to reduce the impact of outliers on your next forecast. This step is often overlooked, yet it’s critical for improving model reliability.

To prove this, we need to assess stationarity, i.e., whether the relevant measures of central tendency, namely the mean and variance, remain stable over time.

result = adfuller(df['clicks'].dropna())
print(f"ADF Statistic: {result[0]}")
print(f"p-value: {result[1]}")

For context, the smaller the p-value (<0.05), the more confident you can be that patterns in the time series aren’t random.

ADF Statistic: -3.014113904399305
p-value: 0.06246422059834887

The p-value isn’t convincing here (it’s above 0.05), so we can’t treat the series as stationary, and seasonality likely plays a role.

As discussed, assuming SEO data is stationary (i.e., follows a linear distribution) is a flawed heuristic.

SEO data often follows non-linear trends, so relying on simple methods that assume stable data can lead to poor forecasts. Instead, you should decompose the time series and model seasonality.

Seasonality decomposition helps separate true performance trends from recurring patterns such as weekly or monthly cycles.

To do this, we need to zoom in on granular weekly search patterns. 

#If data recorded daily, and you want to analyse weekly seasonality (period=7)
result_weekly = seasonal_decompose(df['clicks'], model='additive', period=7)

#If data recorded monthly, and you want to analyse yearly seasonality (period=12)
#result_monthly = seasonal_decompose(df['clicks'], model='additive', period=12)

# Plot the weekly decomposition
result_weekly.plot()
plt.title('Weekly Seasonal Decomposition')
plt.show()
STL decomposition framework

The trend plot itself is already suggestive:

  • Search interest (clicks) is trending downward.
  • Search interest is likely affected by weekly sales cycles – look at the numerous small peaks.
  • Search interest likely follows seasonal demand – it ebbs and flows at certain times of year.

However, the residuals plot contains clusters of large spikes, both positive and negative, reaching up to 500,000. These represent anomalies, or outliers, that appear connected to the trend’s inflection points.

This means the model made a “mistake” when decomposing the trend line because it didn’t fully capture sudden spikes.

Handling seasonality in your SEO forecast

To decompose and isolate seasonality, you can use several models depending on the level of complexity and flexibility you need:

  • STL decomposition: A robust technique for separating a time series into trend, seasonality, and residuals. Ideal for revealing the underlying structure in data where patterns vary over time, making it useful for anomaly detection.
  • SARIMAX: ARIMA extended to seasonal data. A statistical model that handles non-stationary data, seasonal patterns, and external independent variables such as algorithm updates.
  • Prophet: Built by Meta for real-world data, it handles multiple seasonalities, missing data, and abrupt shifts. Leveraging additive models, it’s particularly suited for time series with strong seasonal patterns.
  • BSTS: A Bayesian model that captures trend and seasonality while incorporating uncertainty. BSTS is commonly used for counterfactual estimation in causal impact analysis (“what would have happened if X never occurred?”), making it suitable for testing applications such as pre- versus post-analysis. Useful if you want to learn R.

For this article, we’re going to use STL decomposition for anomaly detection in a “wobbling” (non-stationary) time series.

# Fit STL decomposition (period=7 for weekly cycle)
stl = STL(df['clicks'], period=7, robust=True)
result = stl.fit()


# Extract residuals and flag anomalies via IQR
resid = result.resid
Q1, Q3 = resid.quantile(0.25), resid.quantile(0.75)
IQR = Q3 - Q1
anomalies = df[(resid < Q1 - 1.5 * IQR) | (resid > Q3 + 1.5 * IQR)]


# Plot
fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(df.index, df['clicks'], label='Clicks', color='steelblue')
ax.scatter(anomalies.index, anomalies['clicks'], color='red', label='Anomalies', zorder=5)
ax.set_title('Click Anomalies (STL + IQR)')
ax.legend()
plt.tight_layout()
plt.show()
Weekly anomaly detection using STL decomposition

The red points are extreme values that aren’t explained by either trend or seasonality. However, detecting anomalies isn’t the same as removing them.

In non-stationary time series, variability changes over time (e.g., seasonality, trends, algorithm updates). Removing outliers outright breaks the time index and introduces artificial gaps that bias the actual seasonal impact.

A more robust approach is to replace anomalies with expected values.

df['trend'] = result.trend
df['seasonal'] = result.seasonal
df['resid'] = result.resid
# --- Define anomaly flag (based on residuals) ---
Q1, Q3 = df['resid'].quantile(0.25), df['resid'].quantile(0.75)
IQR = Q3 - Q1


df['anomaly'] = (
    (df['resid'] < Q1 - 1.5 * IQR) |
    (df['resid'] > Q3 + 1.5 * IQR)
)
# --- Replace anomalies with expected value (trend + seasonal) ---
df['clean_clicks'] = df['clicks'].copy()
df.loc[df['anomaly'], 'clean_clicks'] = (
    df['trend'] + df['seasonal']
)

Because this approach preserves the time series rows, the forecasting baseline is now protected from bias and artificial gaps. You can validate this by applying STL decomposition to the cleaned time series.

result_clean = seasonal_decompose(df['clean_clicks'], model='additive', period=7)
result_clean.plot()
plt.title('Weekly Seasonal Decomposition (Cleaned Data)')
plt.show()
STL decomposition framework without anomalies

What finally stands out is that once a week (every seven observations), there’s a spike. This suggests peak search demand on Saturday or Sunday, indicating stable and consistent interest patterns.

A few scattered residuals, or anomalies, remain, but they’re rare and random, showing no clustering or drift. This confirms that outlier handling has been effective and the model fit is robust.

At this stage, the time series decomposition is clean enough and ready for forecasting.

Plotting a non-stationary SEO forecast

While you could experiment with SARIMAX or BSTS, this synthetic SEO forecast uses Prophet because it’s well-suited for handling time series with strong seasonality.

Using our anomaly-free dataset with a preserved time index, Prophet can forecast click performance over the next 90 days. To add more context, you can introduce a regressor to flag external factors such as Google core updates or measurement issues.

In this example, you can apply a flag to account for the Google Search Console logging issue that artificially inflated impressions between May 2025 and April 2026.

The code below generates a 90-day forecast and outputs a line chart, with the option to export the forecast as an .xlsx table.

Tabular output of Prophet’s 90-day click forecast from the anomaly-free, non-stationary time series.

Note that the lower and upper bounds represent the confidence interval, indicating the range within which clicks are expected to fall over the forecast horizon.

prophet_df = df[['clean_clicks']].reset_index()
prophet_df.columns = ['date', 'clicks']
prophet_df['date'] = pd.to_datetime(prophet_df['date'])
prophet_df = prophet_df.rename(columns={'date': 'ds', 'clicks': 'y'})


# ── GSC INFLATION FLAG ───────────────────
start = pd.to_datetime('2025-05-13')
end   = pd.to_datetime('2026-04-13')
prophet_df['gsc_inflation_flag'] = 0
prophet_df.loc[
    (prophet_df['ds'] >= start) & (prophet_df['ds'] <= end),
    'gsc_inflation_flag'
] = 1


model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)
model.add_regressor('gsc_inflation_flag')
model.fit(prophet_df)


# ── FORECAST────────────────────────────────────────────
future = model.make_future_dataframe(periods=90)


future['gsc_inflation_flag'] = 0
future.loc[
    (future['ds'] >= start) & (future['ds'] <= end),
    'gsc_inflation_flag'
] = 1
forecast = model.predict(future)


forecast_clean = forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].copy()
forecast_clean.columns = [
    'date',
    'clicks_forecast',
    'lower bound',
    'upper bound'
]


# Extract next 90 days only
forecast_90 = forecast_clean.tail(90)


# ── EXPORT OPTION ─────────────────────────────────────
EXPORT = True
if EXPORT:
    forecast_90.to_excel('seo_forecast_90_days.xlsx', index=False)
# ── PLOTLY VISUALISATION ──────────────────────────────
fig = go.Figure()
# Actuals
fig.add_trace(go.Scatter(
    x=prophet_df['ds'],
    y=prophet_df['y'],
    mode='lines',
    name='Actual (Cleaned)',
    opacity=0.6
))
# Forecast
fig.add_trace(go.Scatter(
    x=forecast_clean['date'],
    y=forecast_clean['clicks_forecast'],
    mode='lines',
    name='Forecast',
    line=dict(dash='dash')
))
# Confidence band
fig.add_trace(go.Scatter(
    x=forecast_clean['date'],
    y=forecast_clean['upper bound'],
    mode='lines',
    line=dict(width=0),
    showlegend=False
))
fig.add_trace(go.Scatter(
    x=forecast_clean['date'],
    y=forecast_clean['lower bound'],
    mode='lines',
    fill='tonexty',
    name='Confidence Interval',
    line=dict(width=0)
))
# Highlight inflation period
fig.add_vrect(
    x0=start, x1=end,
    annotation_text="GSC Inflation Period",
    annotation_position="top left",
    opacity=0.2
)
fig.update_layout(
    title='SEO Forecast Adjusted for GSC Impression Inflation Bias',
    xaxis_title='Date',
    yaxis_title='Clicks'
)
fig.show()
Prophet’s 90-day clicks forecast from the anomaly-free, non-stationary time series
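The error metrics imported earlier (mean_absolute_error, mean_squared_error) haven’t been used yet. A quick holdout check gives a rough sense of how far off the model runs. This is a minimal sketch that reuses the prophet_df built above and, for brevity, skips the inflation-flag regressor.

# Hold out the last 30 days to gauge forecast error (regressor omitted for brevity)
train, test = prophet_df.iloc[:-30], prophet_df.iloc[-30:]

m = Prophet(yearly_seasonality=True, weekly_seasonality=True, daily_seasonality=False)
m.fit(train[['ds', 'y']])

pred = m.predict(m.make_future_dataframe(periods=30)).tail(30)

mae = mean_absolute_error(test['y'], pred['yhat'])
rmse = np.sqrt(mean_squared_error(test['y'], pred['yhat']))
print(f"MAE: {mae:,.0f} clicks | RMSE: {rmse:,.0f} clicks")

If the holdout error is wider than the business question can tolerate, say so in the forecast rather than hiding it.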

SEO forecasting isn’t usually linear

SEO forecasting isn’t about projecting neat, linear trends – it’s about understanding messy, non-stationary data shaped by seasonality, anomalies, and external shocks.

By cleaning data properly, modeling seasonality, and accounting for real-world distortions such as SERP changes and tracking issues, forecasts become less about false certainty and more about informed direction.

While the goal isn’t perfect accuracy, a robust approach to forecasting non-stationary time series is essential for framing stakeholder expectations within a realistic range and making better decisions.

How to prioritize technical SEO fixes by business impact

You just ran a crawl of your website. The report flags hundreds of technical issues, many marked by your tool of choice as high priority. You map out a plan based on best practices, and you’re already dreading the email to your developers.

But here’s the catch: Many of those “critical errors” don’t actually matter. You can spend weeks resolving “high-priority” technical issues and still see no meaningful impact on traffic or conversions. 

Some fixes look critical and do absolutely nothing. A 404 buried six levels deep in the site architecture? Probably not worth the fire drill it causes.

Meanwhile, a seemingly minor internal linking issue on high-value category pages might be suppressing millions in revenue.

The problem isn’t technical SEO. It’s the persistent myth that all fixes carry equal weight. They don’t.

One of the biggest maturity shifts you can make as an SEO is moving from issue-based SEO to impact-based SEO. Because the goal isn’t to fix everything. It’s to fix what actually moves the needle.

Why critical doesn’t always mean impactful

Technical SEO tools are incredibly useful. But they’re also incredibly good at creating anxiety. Crawl reports, site health dashboards, and those “critical” red flags often create the illusion that every flagged issue deserves immediate attention.

But a tool may label something as a “critical issue” because it violates a best practice. That doesn’t automatically mean it’s hurting organic performance. 

This is where we lose time. These tools confuse technical correctness with search impact.

A site can be technically imperfect and still perform exceptionally well in search. Likewise, a site can have an impressive CWV score and still underperform because the wrong problems are being prioritized. Some issues are cosmetic, some matter only at scale, and some are tied to old-school best practices that don’t affect rankings. 

Technical SEO should be measured by outcomes, not arbitrary scores from an array of tools.

Not all issues affect search in the same way 

A helpful way to prioritize fixes is to understand which layer of performance the issue affects. Is it indexing, or rendering, or user experience? Or a combination of all of the above?

Indexing issues

These affect whether pages can appear in search at all. Some examples include:

  • Noindex tags.
  • Robots blocking.
  • Sitemap omissions.
  • Canonical conflicts.

These tend to be the highest priority. If search engines can’t access or index the page, rankings are impossible.

Rendering issues

These affect how search engines understand and interpret content across the site. Examples include:

  • JavaScript-delayed content.
  • Lazy-loaded content incorrectly rendered.
  • Blocked JS or CSS resources.

These are especially important for JavaScript-heavy frameworks and dynamic experiences (anything built with React, Angular, a headless CMS, etc.).

UX and performance issues

These influence engagement signals and conversion behavior. Examples may include:

  • Slow page speed.
  • Layout shifts.
  • Intrusive interstitials.
  • Poor mobile usability.

This type of issue may not directly impact a site’s ability to rank well, but it can affect engagement and conversions, which, in turn, can impact organic visibility. Sometimes SEO is less about rankings and more about protecting the traffic you already earn.

Dig deeper: Where to focus technical SEO when you can’t do it all

A practical framework for prioritization

Before moving a technical SEO fix to the top of your list of priorities, it can be helpful to pressure-test it against a simple decision framework. 

Start by asking three questions: 

  • Does this issue affect crawlability or indexing? If search engines can’t access, render, or index the page correctly, that’s usually a priority. 
  • Does it impact high-value pages or sections? A small issue affecting thousands of product pages or top-performing content is rarely small in practice.
  • Is there evidence this issue is suppressing traffic or rankings? Look for signals in performance data, not just audit tools. Ranking drops, stagnant pages, indexing anomalies, and crawl inefficiencies all tell a more useful story than issue counts alone.

Once you have a better understanding of the answers to these questions, you can use a prioritization matrix to help create a plan of action.
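One way to make that matrix concrete is a rough scoring pass over the issue list. Everything below is illustrative: the issue names, the 0–3 scores, and the effort weights are hypothetical judgment calls; the point is the ranking logic, not the numbers.

# Hypothetical issue list: each score (0-3) is a judgment call, not tool output
issues = [
    {"issue": "noindex left on product templates", "indexing": 3, "page_value": 3, "evidence": 3, "effort": 1},
    {"issue": "faceted URLs not canonicalized", "indexing": 2, "page_value": 3, "evidence": 2, "effort": 2},
    {"issue": "multiple H1s on blog posts", "indexing": 0, "page_value": 1, "evidence": 0, "effort": 2},
    {"issue": "404s on legacy campaign pages", "indexing": 0, "page_value": 0, "evidence": 0, "effort": 2},
]

for item in issues:
    impact = item["indexing"] + item["page_value"] + item["evidence"]  # 0-9
    item["priority"] = round(impact / max(item["effort"], 1), 2)       # impact per unit of effort

for item in sorted(issues, key=lambda x: x["priority"], reverse=True):
    print(f'{item["priority"]:>5}  {item["issue"]}')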

High-effort, low-impact fixes

Let’s start with the work that tends to consume a disproportionate amount of our time. These are the fixes that look important in an audit but often produce little measurable lift.

Fixing every 404 on the site

Not all 404s are a problem. Fixing a broken URL may have virtually no impact if it:

  • Has no backlinks.
  • Receives no organic traffic.
  • Is not internally linked.
  • Isn’t part of a key user journey.

This is especially common on large publisher or ecommerce sites with legacy URLs, expired products, or campaign landing pages.

Teams can spend weeks cleaning these up without affecting visibility. The real question is whether the broken page is still being crawled frequently, holding authority, or disrupting conversion paths.

If not, it’s often maintenance work, not growth work.
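That triage can be scripted in a few lines. A sketch, assuming you’ve exported your crawler’s 404 list plus clicks and link data from your own sources; the file names and column names are placeholders.

import pandas as pd

# Placeholder exports: crawled 404 URLs, GSC clicks by URL, link data by URL
crawl = pd.read_csv('crawl_404s.csv')   # columns: url
clicks = pd.read_csv('gsc_clicks.csv')  # columns: url, clicks_90d
links = pd.read_csv('link_data.csv')    # columns: url, referring_domains, internal_links

df = crawl.merge(clicks, on='url', how='left').merge(links, on='url', how='left').fillna(0)

# Only 404s with traffic, backlinks, or internal links earn a redirect ticket
worth_fixing = df[
    (df['clicks_90d'] > 0)
    | (df['referring_domains'] > 0)
    | (df['internal_links'] > 0)
]

print(f'{len(worth_fixing)} of {len(df)} broken URLs are worth redirecting')
worth_fixing.sort_values('clicks_90d', ascending=False).to_csv('404_fix_list.csv', index=False)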

Chasing minor Core Web Vitals fluctuations sitewide

Site speed matters. But not every performance fix deserves equal urgency. A minuscule shift in CLS on low-traffic blog posts is rarely as impactful as improving render speed on revenue-driving pages.

Core Web Vitals scores have become a hot topic in general. They certainly matter, and they’re among the few concrete metrics Google provides for measuring performance. But too often, teams prioritize significant engineering work because a dashboard score dipped slightly, rather than focusing on where performance intersects with rankings and user behavior.

If your category pages, product pages, or article templates are already performing well, marginal speed gains may not produce meaningful lift.

Image alt text on assets with minimal search value

Alt text is essential for accessibility. That alone makes it worth doing. But in terms of SEO impact, not all alt text work deserves equal priority.

Updating alt text across large volumes of legacy images, especially on low-traffic or outdated pages, is often high-effort with little return. If those images aren’t driving visibility through image search and the pages they live on don’t perform well organically, the SEO upside is minimal.

Where alt text does move the needle is on:

  • High-traffic pages.
  • Image-driven content.
  • Ecommerce product imagery.

Treat alt text as a priority where it supports discoverability or user experience at scale, not just because it shows up in an audit.

Header tags that look wrong (but aren’t hurting anything)

Header tags are one of the most over-policed elements in technical SEO. 

Multiple H1s? Skipped heading levels? Tools love to flag them. And on paper, those flags look serious. But in reality, this is often a high-effort cleanup with little to no impact.

Search engines are much better at understanding page structure than they used to be. They don’t rely solely on perfectly nested HTML headings to interpret content hierarchy. In many cases, visual hierarchy, layout, and contextual signals do just as much heavy lifting.

It’s also common for header styles to be defined by design styles rather than strict semantic markup. A page might have multiple H1s, but still present a clear, logical structure to both users and search engines. So it’s not inherently a problem. 

Where header tags do matter is when they create confusion:

  • No clear primary topic or heading on the page.
  • Headings that don’t align with search intent.
  • Structural issues that make content harder to parse (for users or crawlers).

As with most things in technical SEO, the goal with headers isn’t perfection. It’s clarity and impact. If users and search engines already understand the page, this probably isn’t the fix that moves the needle.

Over-optimizing structured data

Schema markup helps search engines better understand content and can unlock rich results. But adding increasingly granular schema types to every page template often yields diminishing returns.

Going from no product schema to valid product schema? Huge.

Adding optional niche properties that don’t change the SERP appearance? Minimal.

Sometimes we treat schema like a compliance exercise rather than part of a visibility strategy. But if it doesn’t influence comprehension or SERP presentation, it may not be the highest-value work.

Dig deeper: How soft 404s and indexing issues caused a 90% traffic collapse

Low-effort, high-impact wins

Now for the work that often drives outsized returns. These are the fixes that directly affect crawlability, discoverability, and user experience.

Internal linking to high-value pages

This is one of the most overlooked technical wins. A few strategic internal links from authoritative pages to underperforming high-intent pages can improve:

  • Crawl frequency.
  • Page discovery.
  • Contextual relevance.
  • Authority flow.

Compared to complex engineering tickets, this is often low lift with measurable impact. Especially for ecommerce category pages, subcategory pages, and seasonal landing pages — these can gain traction quickly with better internal link support.

Duplicate content and canonical issues

Duplicate content coupled with improper canonicals can substantially hurt rankings. This is especially common with faceted navigation, pagination, filtered product collections, and syndicated content. Issues may range from parameter handling to trailing slash inconsistencies.

When search engines are forced to choose among near-duplicate URLs, ranking signals can fragment.

Fixing canonicals, parameter handling, or indexing rules can dramatically improve performance. This is often a relatively small technical change with major implications.

Resolving accidental noindex or robots directives

This sounds obvious, but it happens more than teams admit. Staging directives make it into production. Template updates accidentally apply noindex rules. Important JS resources get blocked.

These are classic low-effort, high-impact issues because they directly affect discoverability. And again, when pages can’t be crawled or indexed, nothing else matters. This isn’t Field of Dreams; traffic isn’t guaranteed just because you built the page.

Rendering and JavaScript issues that hide your content

This is where things can break fast and quietly. If important content isn’t visible in the rendered HTML, search engines (and even LLMs) may not see it at all. That includes:

  • Client-side rendered pages that rely entirely on JavaScript.
  • Delayed content hydration.
  • Critical elements (copy, links, metadata) that only load after initial render.

In these cases, the issue is foundational. Search engines have improved their ability to process JavaScript, but it’s not guaranteed, and it’s not always immediate. If your content depends on perfect rendering to exist, you’re introducing risk to your ability to be indexed and ranked.

Often, the solution isn’t a full rebuild. It’s targeted. Ensure key content is server-side rendered or pre-rendered. Reduce reliance on delayed JS for above-the-fold content and make sure critical elements exist in the initial HTML response. 
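A simple spot check is to compare the raw HTML response (before any JavaScript runs) against the copy and links you expect to be there. A sketch using requests; the URL and the fragments checked are placeholders for your own templates and critical elements.

import requests

# Placeholder URL and key fragments: swap in your own templates and critical copy
url = 'https://www.example.com/category/widgets'
must_exist = ['Blue Widgets', '/product/blue-widget-10', 'Add to cart']

html = requests.get(url, timeout=10).text  # raw response, before any JavaScript executes

for fragment in must_exist:
    status = 'present' if fragment in html else 'MISSING from initial HTML'
    print(f'{fragment}: {status}')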

Dig deeper: No-JavaScript fallbacks in 2026: Less critical, still necessary

Why impact varies by site type

Best practices often get treated like universal truths when, in reality, context changes everything. The same fix can drive meaningful growth on one site and do absolutely nothing on another. Without understanding the business model, site structure, and how organic search actually drives value, prioritization becomes guesswork.

Publisher sites

For publishers, it’s all about speed, scale, and freshness. These sites live and die by how quickly content gets discovered and indexed. That means technical priorities tend to center around:

  • Discoverability via internal linking, recirculation, tagging, etc.
  • XML and dynamic sitemaps.
  • Pagination and archive structure.
  • Render speed for content-heavy templates.

If Google can’t quickly crawl new content, it doesn’t matter how well it’s written. It won’t perform well. Rendering issues are especially risky here. If key content is delayed by JavaScript or not immediately visible, you can miss the window where recency or freshness matters most.

Ecommerce sites

Ecommerce is a different game. Here, technical SEO issues affect visibility and revenue at scale. High-impact areas typically include:

  • Faceted navigation and parameter handling.
  • Duplicate URLs across product and category variations.
  • Product availability and lifecycle management.
  • Internal linking across category and subcategory structures.
  • Crawl waste caused by filters, sorting, and pagination.

For an ecommerce site, a single mistake can impact thousands of product detail pages at once. This is where small technical issues become big business problems.

Lead gen and service sites

Lead gen sites tend to be smaller, but the stakes are just as high. Here, the focus shifts to:

  • Clean indexing (making sure the right pages are eligible to rank).
  • Clear location and service page architecture.
  • Page speed and UX on high-conversion pages.
  • Strong local signals, where applicable.

You don’t need millions of pages to drive impact, but the pages you do have need to perform.

There’s no universal SEO priority list

And there never will be. What matters is context: how an issue intersects with your site’s structure and content model, and how your business actually generates value from search. The same fix can be critical for one site and completely irrelevant for another.

A crawl report full of thousands of errors doesn’t mean you’ve found thousands of opportunities. Sometimes, it means you’ve found noise. And sometimes, one fix — a canonical correction, a rendering issue, a blocked page — outweighs everything else on the list.

Real SEO expertise is knowing the difference.

How soft 404s and indexing issues caused a 90% traffic collapse

When a website migration goes wrong, the consequences can be a devastating loss of organic traffic and revenue. But what happens when the damage isn’t immediately visible? What if Google is silently deprioritizing your content, page by page, until your traffic has evaporated?

This is the case study of how a multinational media organization lost 90% of its traffic following a domain migration, and how addressing a seemingly harmless technical issue — soft 404 errors — helped unlock suppressed traffic potential across 13 country-specific domains.

While this case study examines events from 2021–2023, the lessons learned remain timeless and directly applicable to any site facing indexing challenges today.

The catastrophic drop

In January 2022, the Brazilian localization of a cryptocurrency news website completed a domain migration. After the transition, traffic didn’t just drop — it plummeted. Comparing December 2021 to December 2022, both sessions and pageviews had fallen approximately 90% year-over-year.

According to Google Search Console data, the old domain (xx.com.br) was receiving between 15,000 to 25,000 clicks per day before migration. After migrating to the new subdomain structure (br.xx.com) in January, traffic collapsed and never recovered. It stabilized at around 2,000 to 4,000 clicks per day — a sustained loss that persisted for over a year.

The migration coincided with three major Google algorithm updates in June 2021: the core update, spam update, and page experience update. While these updates caused the expected temporary volatility, the Brazilian site showed no signs of recovery.

The migration problem: More than just redirects

Domain migrations typically show an initial traffic drop as Google recrawls and reassesses the site. That’s expected.

Normally, this traffic recovers within weeks or months. In this case, there were no signs of recovery.

The root cause? The old domain continued to be crawled by Google long after the migration.

According to the team’s analysis, proper redirect implementation and technical migration protocols weren’t fully implemented, causing Google to split its crawl budget between two domains rather than consolidating authority on the new one.

In mid-August 2022, after addressing the migration issues with the SEO and IT teams, there was a subtle uptick — a peak of 12 clicks and 37 impressions on Aug. 29, 2022. While modest, this represented the first signs of recovery and indicated that Google was beginning to properly recognize the new domain.

Using Facebook Prophet forecasting on pre-migration data, the team estimated that without the migration issues, the Brazilian site would have exceeded 2 million monthly clicks by early 2022. Instead, it was generating a fraction of that traffic.

Understanding the indexing bottleneck

While fixing the migration was critical, it revealed a deeper problem affecting not just Brazil, but all 13 of the site’s country domains: a massive indexing backlog.

Google’s page processing follows four stages:

  • Crawl: Google discovers and reads pages.
  • Render: The page code is rendered.
  • Index: Pages wait in a queue to be stored in Google’s index.
  • Rank: Pages appear in search results with rankings.

On the Brazilian site, Google was taking an average of 2 minutes to crawl new articles (an acceptable amount of time for a news site). However, indexing those articles was taking 24 hours. For time-sensitive cryptocurrency news, this delay was catastrophic. By the time the site’s articles were indexed, the news cycle had already moved on.

The scale of the site migration problem: 513,000 crawled, but not indexed, pages

In January 2023, Google Search Console revealed alarming indexing issues across all domains:

  • Crawled – currently not indexed: 513,369 pages (Brazil alone)
  • Soft 404: 1,193 pages and growing rapidly
  • Alternate page with proper canonical tag: 2,532 pages
  • Discovered – currently not indexed: 524 pages

The “Crawled – currently not indexed” issue was particularly concerning. These were pages that Google had successfully crawled but chose not to index. This typically happens when Google considers a page low-quality, duplicate, or not worth the crawl budget.

Upon investigation, the team discovered that converter pages (e.g., “/usd-to-thor?amount=250” or “/eur-to-signaturechain?amount=1000”) were being automatically generated at scale. These thin content pages were consuming Google’s crawl budget, causing it to deprioritize the entire domain.

The soft 404 time bomb

While fixing the migration and removing low-quality pages was important, the most insidious issue was the proliferation of soft 404 errors.

A soft 404 occurs when a page returns a 200 (success) status code but actually contains no meaningful content — essentially a “page not found” that doesn’t properly signal its emptiness to search engines. Unlike hard 404s, which clearly communicate that the page doesn’t exist, soft 404s confuse search engines and waste crawl budgets.
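You can roughly screen for this pattern before Search Console flags it. A heuristic sketch using requests; the URL list, the length threshold, and the “not found” phrases are assumptions to tune for your own site.

import requests

# Placeholder URLs to check; tune the threshold and phrases for your own templates
urls = [
    'https://example.com/usd-to-thor?amount=250',
    'https://example.com/some-retired-page',
]
not_found_phrases = ['page not found', 'no results', '0 results']

for url in urls:
    resp = requests.get(url, timeout=10)
    body = resp.text.lower()
    looks_empty = len(body) < 2000 or any(p in body for p in not_found_phrases)
    if resp.status_code == 200 and looks_empty:
        print(f'Possible soft 404: {url} (returns 200 but thin or "not found" content)')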

The data revealed this wasn’t isolated to Brazil. Soft 404 errors were growing exponentially across multiple domains:

  • xx.com (main site): 90,400 affected pages
  • es.xx.com (Spain): 17,700 pages
  • kr.xx.com (Korea): 15,400 pages
  • fr.xx.com (France): 15,100 pages
  • de.xx.com (Germany): 8,010 pages

Specifically for France, Google Search Console data showed a direct correlation: As soft 404 errors began accumulating in October 2022, total crawl requests dropped from 60,000–70,000 per day to just 20,000–30,000 per day. Google was effectively giving up on crawling the site efficiently.

The crawl budget crisis

The concept of crawl budget is critical to understanding why soft 404s matter so much.

Search engines allocate a finite amount of resources to crawl each website. If Google wastes time crawling broken, empty, or duplicate pages, it has less capacity to discover and index your valuable content.

For news sites publishing dozens of articles daily, this creates a vicious cycle: New content doesn’t get indexed quickly, engagement drops, Google further reduces crawl budget, and the problem compounds.

In January 2023, Google was wasting significant resources crawling pages that provided no value. This meant:

  • Slower indexing of new, timely content.
  • Reduced visibility in search results.
  • Lost traffic opportunities.
  • Degraded domain authority in Google’s eyes.

The systematic fix: Addressing root causes of site migration problems

Starting Jan. 31, 2023, the team implemented a comprehensive technical SEO remediation plan focused on three priorities:

Urgent: Soft 404 resolution

The team identified the source of soft 404 errors and implemented proper HTTP status codes. Pages that truly didn’t exist began returning proper 404 or 410 status codes. Pages with content were fixed to render properly.

High priority: Crawl budget optimization

  • Removed or noindexed automatically generated currency converter pages.
  • Implemented stricter URL parameter handling.
  • Used robots.txt to block low-value URL patterns.
  • Set up proper canonicalization for variant pages.
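
For the robots.txt item above, one low-risk habit is to test candidate rules against known good and bad URLs before deploying them. Here is a minimal sketch using Python’s built-in parser; the rules and URLs are illustrative, and because urllib.robotparser does simple prefix matching, wildcard rules should be verified separately:

```python
# Sanity-check candidate robots.txt rules before deploying them.
# Rules and URLs are illustrative. urllib.robotparser uses prefix matching,
# so keep wildcard (*) rules out of this check and verify them separately.
from urllib import robotparser

CANDIDATE_RULES = """\
User-agent: *
Disallow: /usd-to-
Disallow: /eur-to-
"""

rp = robotparser.RobotFileParser()
rp.parse(CANDIDATE_RULES.splitlines())

checks = {
    "https://example.com/usd-to-thor?amount=250": "should be blocked",
    "https://example.com/news/bitcoin-etf-approved": "should stay crawlable",
}
for url, expectation in checks.items():
    verdict = "allowed" if rp.can_fetch("Googlebot", url) else "blocked"
    print(f"{url}: {verdict} ({expectation})")
```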

Medium priority: Core Web Vitals

While user experience metrics were important, the team recognized that fixing indexing issues would have a more immediate impact than optimizing page speed. Core Web Vitals improvements were addressed, but not at the expense of resolving indexing bottlenecks.

The results: Dramatic recovery across all domains

Weeks after implementing the fixes, the impact was measurable:

Brazil (br.xx.com)

  • Crawled – currently not indexed: Dropped from 513,000 to 220,000 pages (57% reduction).
  • Soft 404 errors: Reduced from 1,193 to 370 pages (69% reduction).
  • Traffic recovery: Visible upward trajectory starting early 2023.

Germany (de.xx.com)

  • Indexed pages: Increased from ~150,000 to 370,748.
  • Total clicks: Rose from ~8,000/day average to a sustained 12,000–15,000/day.
  • Google Discover traffic share: Jumped from 42% to 58%.

Poland (pl.xx.com)

  • Indexed pages: Grew from ~100,000 to 135,556.
  • Total clicks: Increased significantly with multiple traffic spikes above 30,000/day.
  • Google Discover traffic share: Rose from 15% to 86%.

Spain (es.xx.com)

  • Google Discover clicks: Increased from ~450,000 to 912,721 total.
  • Traffic distribution: Discover now represents 65% of total traffic.

All domains combined

By late April 2023, soft 404 errors across all domains had dropped from a peak of approximately 120,000 pages to under 20,000 — an 83% reduction.

Most remarkably, the biggest traffic gains came from Google Discover — Google’s personalized content recommendation feed. As indexing health improved, Google began trusting the domains enough to recommend their content more aggressively to users.

The Core Web Vitals paradox

Interestingly, improvements to Core Web Vitals (page speed, interactivity, and visual stability) showed mixed results:

Desktop improvements:

  • Germany: 25.1% → 97.1% good URLs
  • Poland: 20.5% → 68.9% good URLs
  • Korea: 15% → 84.6% good URLs

Mobile challenges:

  • Brazil: 0% → 0% (no improvement)
  • Argentina: 0% → 0%
  • Thailand: 0% → 0%
  • Korea: 93.4% → 0.5% (severe regression)
  • Turkey: 94% → 0% (severe regression)

The team’s hypothesis: Core Web Vitals performance is heavily influenced by regional factors like CDN proximity, server location, network quality, and device capabilities. Countries with poor mobile infrastructure or greater server distance showed minimal improvement despite technical optimizations.

This reinforced an important lesson: Not all technical SEO issues affect all markets equally. A one-size-fits-all approach would have wasted resources by optimizing for metrics that couldn’t improve without infrastructure investment, while the real wins came from addressing indexing fundamentals.

Key technical SEO lessons

1. Indexing issues trump almost everything else

No amount of content quality, backlinks, or page speed optimization matters if Google isn’t indexing your pages. Before optimizing what’s visible, ensure your content is actually being indexed.

2. Soft 404s are silent killers

Unlike hard 404s that immediately alert you to problems, soft 404s quietly accumulate, degrading your crawl budget until you notice traffic declining. Regular monitoring of Google Search Console’s “Pages” report is essential.

3. Domain migrations require exhaustive validation

The Brazilian site’s migration issues persisted for over a year. A proper migration protocol should include:

  • Complete redirect mapping verification.
  • Confirmation of old domain deindexing.
  • Search Console property setup and validation.
  • Multi-week monitoring of both old and new domains.
  • Crawl rate and indexing speed tracking.

4. Crawl budget is real for high-volume sites

For sites publishing 10+ articles daily across multiple domains, crawl budget optimization is not optional. Automatically generated pages, URL parameters, and infinite scroll implementations can quickly consume available crawl resources.

5. Regional differences demand regional solutions

Core Web Vitals data showed that Brazil, Argentina, and Thailand couldn’t achieve the same performance as European markets. Instead of forcing uniform standards, prioritize fixes tailored to each market that can actually succeed.

6. Google Discover is increasingly critical

For news and timely content publishers, Google Discover accounts for a substantial share of traffic in some markets. But Discover only promotes content from sites Google trusts — and technical issues like soft 404s directly erode that trust.

Practical site migration implementation guide

For teams facing similar challenges, here’s a systematic approach:

Weeks 1-2: Audit and prioritize

  • Access Google Search Console for all properties.
  • Export “Page indexing” reports for all domains.
  • Identify the scale of each issue category.
  • Calculate the trend (growing, stable, or declining).
  • Prioritize based on issue volume and growth rate.
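
For the trend step, something as simple as the following pandas sketch is enough; the file name and column headers are assumptions, so match them to whatever your Search Console export actually contains:

```python
# Rough week-over-week trend on an exported "Page indexing" time series.
# File name and column names are assumptions - adjust to your export.
import pandas as pd

df = pd.read_csv("page_indexing_export.csv", parse_dates=["Date"])
weekly = df.set_index("Date")["Crawled - currently not indexed"].resample("W").last()

print(weekly.tail(8))
print(f"Average week-over-week change: {weekly.pct_change().mean():+.1%}")
```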

Weeks 3-4: Fix soft 404s

  • Sample 20–30 URLs from the soft 404 report.
  • Identify common patterns (empty pages, broken functionality, etc.).
  • Implement proper HTTP status codes (404, 410, or fix the content).
  • Validate fixes in Google Search Console.
  • Monitor for reduction in affected pages.

Weeks 5-8: Address crawled but not indexed

  • Analyze URLs to identify auto-generated content.
  • Implement robots.txt rules or noindex tags for low-value pages.
  • Review and strengthen internal linking to important pages.
  • Ensure proper canonicalization across variants.
  • Request reindexing via Search Console for key pages.

Weeks 9-12: Monitor and optimize

  • Track indexing coverage weekly.
  • Monitor crawl rate changes in Search Console.
  • Measure organic traffic recovery.
  • Identify remaining outlier issues.
  • Document learnings for future migrations.

Calculating the traffic loss from migration issues

How significant was this suppressed traffic opportunity?

According to Facebook Prophet forecasting based on pre-migration data, the Brazilian site was trending toward 20,000+ daily clicks. At the time of fix implementation in early 2023, it was receiving approximately 5,000–7,000 daily clicks. This represented roughly 65–75% of potential traffic being suppressed — or conversely, the site was only achieving 25–35% of its forecasted potential.
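
For context, a Prophet baseline of this kind takes only a few lines of Python. This is a generic sketch rather than the team’s actual model; the file name and column names are assumptions:

```python
# Sketch of a Prophet baseline: fit on pre-migration daily clicks, then
# project forward to estimate what traffic "should" have looked like.
# File name and column names are assumptions.
import pandas as pd
from prophet import Prophet

clicks = pd.read_csv("br_daily_clicks.csv")  # expected columns: date, clicks
df = clicks.rename(columns={"date": "ds", "clicks": "y"})

model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
model.fit(df)

future = model.make_future_dataframe(periods=180)  # forecast ~6 months ahead
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```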

More broadly, across all 13 domains, the soft 404 and indexing issues prevented approximately 500,000 pages from being indexed. Given average click-through rates for indexed pages, this represented millions of potential monthly impressions and hundreds of thousands of potential clicks being left on the table.

Technical debt compounds

The most important lesson from this case study is that technical SEO issues don’t stay static — they compound. What starts as a few hundred soft 404s becomes thousands, then tens of thousands.

Google’s response isn’t immediate punishment, but gradual deprioritization. Traffic doesn’t crash overnight; it bleeds slowly.

For the Brazilian site, it took over a year to recognize the full scope of the problem. During that year, competitors filled the gap, topical authority eroded, and recovery became exponentially harder.

The good news? Once identified and systematically addressed, these issues are fixable. Within 12 weeks of implementing the remediation plan, every domain showed measurable improvement. Some saw traffic double or triple.

Technical SEO is often seen as unglamorous maintenance work. But as this case demonstrates, it’s the foundation upon which all other optimization rests. Before worrying about AI-generated content, E-E-A-T signals, or the latest algorithm update, ensure Google can actually find, crawl, and index your content.

Because the best content in the world is worthless if it’s trapped outside search engine indexes.

Why vibe coding is becoming an SEO advantage

SEO used to be constrained by one thing more than anything else: dependency.

Dependency on developers, roadmaps, and “maybe next quarter.”

If you wanted a new page template, a calculator, a comparison widget, or even a simple interactive component, you had to ask, wait, and compromise. That’s changing fast.

If you’re in SEO or GEO today and you’re not learning how to vibe code, you’re limiting your impact.

Vibe coding changed the power dynamics in SEO

A few years ago, building tools like calculators or interactive widgets meant tickets, specs, and dev cycles.

Today, with AI, I’ve personally built dozens of mini apps, tools, and UI components without involving a single developer.

Some of those tools are small. Some are relatively ugly but effective. Some now bring in thousands of organic sessions per month.

Entire pages built around a vibe-coded tool are now outperforming traditional text-heavy competitors.

Parents Hub “Back To School Countdown” vibe-coded tool

Even more importantly, I’ve introduced this mindset to my SEO team, and they’re now building tools on their own to achieve our search goals. That alone changes everything.

SEO teams can now move faster, test ideas immediately, and reserve developers for actual engineering work, including new templates, infrastructure, and scaling.

And yes, there’s something genuinely satisfying about building a tool yourself, publishing it, and watching it attract traffic month after month.

You don’t need to build fancy things. Just things that get the job done.

Dig deeper: Inspiring examples of responsible and realistic vibe coding for SEO

Stop talking about user personas. Start talking to them.

Everyone agrees on the user persona theory:

  • Identify user personas.
  • Understand their pain points.
  • Create content that addresses them.

What almost no one explains is how to actually present that information.

Historically, SEO handled personas with text:

  • “If you’re a parent…” 
  • “For families…” 
  • “Business travelers should consider…”

That approach is already outdated. Today, we can let users self-identify and surface only the information that matters to them.

One example from a brand I manage:

  • A vibe-coded tabbed component.
  • Each tab represents a different user persona.
  • Clicking a tab reveals persona-specific content.

For airport transfers in Majorca, a “family” persona doesn’t care about the same things as a solo traveler.

Example case of the “User Persona” component

They care about vehicle safety, child seats, family-friendly routes, vehicle size, and indicative pricing. That content appears only when the Family tab is selected.

From an SEO and GEO standpoint, persona pain points were sourced directly from Google Search Console and query fan-out analysis.

The component was then vibe-coded and placed where intent needed to be satisfied immediately.

This aligns with how AI platforms already structure answers: segmented, persona-aware, and intent-first.

Entire traffic categories can be built on tools alone

On one personal project, we launched a brand-new Tools category — ten pages with simple tools, such as:

  • Calculators.
  • Checklists.
  • Calendars.
  • Countdown timers.
  • AI generators.

Each page leads with the tool and uses supporting components to answer sub-intents.

The result? More than 5,000 incremental clicks in two months. Most of those pages were also out of season.

Dig deeper: How to vibe-code an SEO tool without losing control of your LLM

UI is now a ranking lever

SEOs have never been more capable. The only real limitation left is creativity.

One of the most underrated SEO advantages today is how information is visually presented.

Text is cheap. Everyone can produce it. UI that answers intent instantly isn’t.

I’ve seen:

  • Two calculator pages add 10,000 monthly organic sessions.
  • One tool page rank in the top three within days for a high-volume government query.
  • Multiple seasonal pages rank off-season purely because the UI was better.

When competitors list information, we let users interact with it.

  • Eligibility calculators. 
  • Countdown timers. 
  • Dynamic tables. 
  • Visual comparisons.

These pages still include text. But the text supports the tool, not the other way around.

‘SEO takes time’ — except when it doesn’t

One page we published targeted a Greek government school financial support program with a high-volume head term, dozens of long-tail queries, and extremely text-heavy competition.

We built:

  • A financial support eligibility tool.
  • A transparent explanation of the algorithm logic behind the tool for E-E-A-T.
  • Common rejection mistakes parents made when applying for support.
  • Historical program changes.
  • A step-by-step application flow.

Parents Hub Kindergarten Financial Support Eligibility Calculator

We tagged the tool as a WebApplication, implemented HowTo schema for the process, and properly marked up the FAQs.

Three days after publishing, the page was already ranking on the first page for the main term and generating about 100 clicks.

Sometimes SEO really doesn’t take that long if you solve the problem better than anyone else.

Tools are the ultimate SEO and PR assets

Some tools are built purely for traffic. Others are designed to become linkable digital assets.

A pregnancy due date calculator, a baby name generator, or a comparison table based on TripAdvisor data isn’t just a page. It’s a potential PR campaign.

When a digital asset solves a real pain point, looks modern, answers intent better than SERP features, and has clear PR angles, that’s where SEO, PR, and branding start to collide. That’s when things get really interesting.

Dig deeper: How vibe coding is changing search marketing workflows

Finding tool-page opportunities is easier than ever

With MCP servers from SEO tools, you can now surface tool ideas directly from search demand without leaving the chat, assess difficulty instantly, and launch faster than ever.

I’ve built and launched multiple tool pages this way, and the speed difference compared with traditional workflows is massive.

We’re entering a period where ideation, validation, and execution can all happen in days, not months.

The big shift

SEO is no longer about who can write the longest article, rephrase the same information better, or game templates. It’s about who answers intent fastest, removes friction, and builds search experiences instead of documents.

Vibe coding changed who gets to build. And right now, the people embracing it are pulling away fast. If you want to win in modern SEO and GEO, build tools, build components, and build search experiences. Text alone isn’t enough anymore. And honestly, that’s a very good thing.

Dig deeper: Build your own AI search visibility tracker for under $100/month

Why intent alignment matters more than perfect technical SEO

Improving technical SEO on your site may not be enough to move the needle these days. 

Once a site reaches technical parity with its competitors — the point at which a proper infrastructure no longer gives you an advantage — Google shifts its ranking criteria toward relevance. And relevance is determined by aligning with search intent. 

Let’s talk about how to make your site more relevant.

Why an intent mismatch may be suppressing your site’s performance

An intent mismatch occurs when the copy on a page doesn’t match what the user is expecting to find on it. This happens when pages aren’t relevant to a topic or have mismatched signals.

This generates poor behavior signals — users click through from a SERP, see that the page doesn’t answer their need, and leave. Google interprets these signals as evidence that the page doesn’t satisfy the query. 

This can lead to a decline in rankings, which means fewer users see the page, which means the behavioral signals worsen. It’s a feedback loop that technical SEO alone can’t resolve.

Technical SEO improvements may no longer make a difference

In the early stages of implementing an SEO strategy, the needle can move quickly. If a site is operating below the technical baseline needed for Google to properly evaluate it, simple fixes — such as resolving crawl errors, cleaning up duplicate content, improving page speed, and adding schema — can produce big gains.

However, after these changes, your site’s technical foundations are comparable to those of your main competitors — you hit a ceiling. At that point, Google isn’t ranking pages based on which ones it can access most easily, but on which ones best satisfy the user’s query. 

Your technical infrastructure, or lack thereof, no longer disadvantages you, but now the rules of the ranking game have changed.

This is where intent alignment becomes the primary lever for improvement. 

Signals that reinforce search intent

Elements that have an impact on a page’s intent, and how Google decides whether the intent matches the page, include: 

  • Click-through rate.
  • Engagement signals.
  • Core Web Vitals.
  • Schema type.
  • Internal linking anchor texts.
  • URL structure.

Click-through rate (CTR)

Click-through rate is influenced by your title tag, meta description, URL structure, and schema. It is also evaluated against the query’s intent. 

For example, if your title tag is optimized for a keyword but doesn’t match the user’s query, your CTR will drop. Google treats a low CTR as a relevance signal and adjusts rankings accordingly.
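
One way to find these mismatches at scale is to flag queries that earn plenty of impressions at a decent position but almost no clicks. A rough pandas sketch follows; the export name, column headers, and thresholds are all assumptions:

```python
# Flag high-impression, well-ranked queries with unusually low CTR.
# File name, columns, and thresholds are assumptions - adjust to your data.
import pandas as pd

df = pd.read_csv("gsc_queries.csv")  # columns: query, page, clicks, impressions, position
df["ctr"] = df["clicks"] / df["impressions"]

suspects = df[(df["impressions"] >= 1000) & (df["position"] <= 10) & (df["ctr"] < 0.01)]
print(suspects.sort_values("impressions", ascending=False).head(20))
```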

Engagement rate

Time-on-page, scroll depth, and interaction rates can suffer when intent doesn’t align with a page. 

If a user is searching to purchase something but lands on a how-to guide, they may exit that page within seconds. The same can be said of a user looking for an emergency plumber who lands on a page without a phone number. 

Engagement signals feed directly into how Google evaluates a page’s usefulness for a given query.

Core Web Vitals (CWV)

The three Core Web Vitals — Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS) — measure loading performance, responsiveness, and visual stability.

A transactional page that loads slowly suffers more than a slow-loading informational article. With the transactional page, the user is ready to buy and their patience is minimal, whereas a reader in research mode can tolerate a longer wait. 

CWV thresholds matter everywhere, but their impact on conversion and bounce behavior is greater on high-intent pages. 

Schema type

Schema markup tells Google explicitly what type of content is on a page. Generally:

  • Article/HowTo is informational.
  • Product is transactional.
  • FAQ is informational and commercial.
  • Local business/event is navigational.

When schema type contradicts the content on a page, Google gets a conflicting signal, resulting in a traffic drop.
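
As a concrete example, a transactional page would typically declare Product markup rather than Article. Here is a minimal sketch that emits the JSON-LD; all values are placeholders:

```python
# Emit Product JSON-LD for a transactional page so the declared schema type
# matches the page's intent. All values are placeholders.
import json

product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Financial Analysis Software",
    "description": "Cloud software that automates financial analysis and reporting.",
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

print(f'<script type="application/ld+json">\n{json.dumps(product, indent=2)}\n</script>')
```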

Internal linking anchor texts

The anchor text of internal links tells Google about the page that’s being linked to, including its intent. 

If a transactional landing page receives internal links with informational anchor text — “learn more about X,” rather than “get a quote for X” or “buy X” — the intent signal Google receives about that page’s purpose gets diluted.

URL structure

Google uses URL patterns to infer page type. 

For example, URLs sitting under /blog/ are treated with informational bias. A product or service page buried under a blog path fights against that structural expectation, regardless of its content, and it may not rank well. 

Cannibalization and canonicalization

If your site has multiple pages targeting the same keyword but with different intents, neither is likely to rank well. They compete against each other and dilute the signal Google receives. 

To fix, use canonical tags to clearly signal which page is the preferred one for a given keyword, consolidate or redirect competing pages where appropriate, and ensure your internal linking reinforces the canonical choice.
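
A quick way to surface candidates is to count how many URLs receive clicks for each query in a Search Console export. The sketch below assumes hypothetical column names; adjust them to your data:

```python
# Surface possible cannibalization: queries where more than one URL gets clicks.
# File name and column names are assumptions - adjust to your export.
import pandas as pd

df = pd.read_csv("gsc_query_page.csv")  # columns: query, page, clicks
pages_per_query = df[df["clicks"] > 0].groupby("query")["page"].nunique()
print(pages_per_query[pages_per_query > 1].sort_values(ascending=False).head(20))
```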

How to fix intent misalignment

Here’s an example of a common intent mismatch and some steps to audit your content and fix it. 

What an intent mismatch looks like

For example, if a user searches for “financial analysis software,” they’re looking to buy software. The keyword phrase is highly transactional. 

But if your site targets this keyword phrase for an informational blog post that explains how a person can complete a financial analysis report themselves, this creates a mismatch.

The user is looking for a product that does the analysis for them, which means they want to compare features, understand pricing, see integrations, or book a demo. 

The keyword phrase should be applied to a dedicated product or landing page that clearly outlines functionality, benefits, use cases, and pricing. This would align more with the user’s needs, resulting in more inquiries, leads, and conversions.

Identify the intent of your pages

To start fixing intent mismatches, compile a list of the top-performing keywords that best describe your business and manually check the Google results for each.

This initial research will tell you exactly what type of page and copy you should have for these keywords. For example:

  • Knowledge panels, AI Overviews, and People Also Ask boxes usually appear for informational searches.
  • Paid results usually suggest commercial intent.
  • Shopping feeds suggest a transactional keyword.

Next, add the keywords to a spreadsheet and add a column for intent. Work down the list, noting whether you think each keyword is informational, commercial, transactional, or navigational. 

You can then create another column that states the type of page that will rank well: 

  • Informational: Blog or resource content.
  • Commercial: Service or landing pages.
  • Transactional: Collection, category, or product pages.
  • Navigational: Brand, specific service, or specific location pages.
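
A first pass at the intent column can even be scripted. The heuristics below are deliberately crude and the hint lists are illustrative; the SERP checks described above should always override these guesses:

```python
# Crude first-pass intent labels for a keyword list. Manual SERP checks should
# always override these guesses; hint lists are illustrative only.
INTENT_HINTS = {
    "transactional": ["buy", "price", "pricing", "order", "software", "for sale"],
    "commercial": ["best", "top", "review", "vs", "compare", "alternative"],
    "navigational": ["login", "sign in", "contact", "near me"],
}

def guess_intent(keyword: str) -> str:
    kw = keyword.lower()
    for intent, hints in INTENT_HINTS.items():
        if any(hint in kw for hint in hints):
            return intent
    return "informational"  # default when nothing matches

for kw in ["financial analysis software", "how to do a financial analysis", "best crm for startups"]:
    print(f"{kw!r}: {guess_intent(kw)}")
```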

See what your competitors are doing

Research your competitors’ pages for the keywords you’re targeting. Analyze and note what they have that your pages don’t have.

They may have:

  • Tables.
  • Comparisons.
  • Calculators.
  • Tools.
  • FAQs.
  • Reviews.
  • Step-by-steps.
  • Images.
  • Videos.
  • And more. 

Consider how to improve your own pages to match theirs. 

Measure your page’s performance based on intent metrics

Once you’ve made changes to your pages, track their performance to see whether they helped. Look at:

  • Clicks and impressions for intent-aligned keywords.
  • Rankings for core target queries.
  • Time on page.
  • Conversion rates, particularly those of previously underperforming pages.

Technical SEO still plays a decisive role

Technical SEO is still important, especially for complex, enterprise-scale sites. Here are some areas where technical SEO work can still move the needle significantly in ways that content optimization alone can’t.

Crawl budget management

An ecommerce site with thousands of URLs can have its crawl budget consumed by low-value pages before Google ever reaches the high-intent category and product pages you want to rank. 

Cleaning up low-value pages is purely technical work and will ensure your crawl budget goes toward pages that count. 

International site architecture

Technical SEO is crucial when handling international sites that contain pages in multiple languages. 

A keyword that’s purely informational in one market may be transactional in another, reflecting different buyer behaviors and levels of market maturity. Hreflang implementation, regional subdomain or subdirectory structures, and URL strategies all affect whether the right page, with the right intent, reaches the right audience.
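
For illustration, the alternates for one page might be emitted like this; the domains, paths, and locale codes are placeholders:

```python
# Emit hreflang link tags for one page's market-specific versions.
# Domains, paths, and locale codes are placeholders.
ALTERNATES = {
    "en": "https://example.com/financial-analysis-software/",
    "de": "https://example.com/de/software-fuer-finanzanalyse/",
    "pt-br": "https://example.com/pt-br/software-de-analise-financeira/",
}

for lang, url in ALTERNATES.items():
    print(f'<link rel="alternate" hreflang="{lang}" href="{url}" />')
print(f'<link rel="alternate" hreflang="x-default" href="{ALTERNATES["en"]}" />')
```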

Log file analysis

A log file analysis will reveal which pages Google is successfully crawling and how frequently it crawls them. For sites with intent alignment problems, Google often spends a disproportionate amount of its attention on low-value or misaligned pages, while high-intent pages are visited infrequently. 
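
A minimal sketch of that kind of tally, assuming a combined (Apache/Nginx-style) access log; in a real audit you would also verify Googlebot hits via reverse DNS rather than trusting the user-agent string:

```python
# Tally Googlebot requests per top-level path segment from an access log.
# Assumes a combined log format; verify Googlebot by reverse DNS in practice.
import re
from collections import Counter

LINE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"')

hits = Counter()
with open("access.log", encoding="utf-8", errors="ignore") as f:
    for raw in f:
        m = LINE.search(raw)
        if m and "Googlebot" in m.group("ua"):
            first_segment = "/" + m.group("path").lstrip("/").split("?", 1)[0].split("/", 1)[0]
            hits[first_segment] += 1

for segment, count in hits.most_common(15):
    print(f"{count:8d}  {segment}")
```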

For small sites with a clean structure and limited number of URLs, technical SEO can reach parity quickly, so the need to shift to intent alignment happens sooner. For large, complex sites, technical and intent work often need to happen in parallel.

Technical SEO and intent need to work together

Technical SEO is still important today — think of it as a foundation that the rest of the site sits on. Pages that can’t be crawled, indexed, or rendered correctly will be unable to rank, regardless of how well their content matches user intent.

Think of intent alignment as the ceiling — it’s what determines how high a technically sound page can rank, and whether it converts the traffic it earns. 

Every page on a site should have a clearly defined intent, expressed in the right format, with the right content type. Each page should also be supported by technical signals, such as schema, URL structure, and relevant anchor text, so that its intent is consistently reinforced. 
