How to use schema markup to optimize for the agentic web

Schema markup has earned its place at the center of the SEO and GEO conversation. Google and Microsoft have both said that structured data helps their AI-powered search features understand content, and ChatGPT factors it into product recommendations.
Now, schema markup is becoming part of the infrastructure behind the agentic web, where AI systems increasingly interact directly with websites on behalf of users.
For AI agents, understanding content isn’t enough. They also need to interpret and act on it. Schema markup helps make that possible.
The role of schema markup in the agentic web
In traditional search, schema helps drive visibility by making content more eligible for SERP features and helping search engines better understand entities. That information supports the index and knowledge graph, influencing how results appear to users.
AI agents take this further. They use schema markup not only to identify entities, but also to understand relationships, relevance, and whether content is trustworthy and actionable enough to support recommendations or complete tasks.
Structured data also makes websites easier and cheaper for AI systems to process. Parsing unstructured HTML is computationally expensive compared to reading clean, structured data, especially because LLMs operate within finite context windows and every extra token adds inference cost.
As these systems scale, sites that make their content easier to interpret become the path of least resistance for AI agents.
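To make the processing gap concrete, here is a minimal sketch of how a crawler or agent might read a page's JSON-LD without interpreting any of the visible HTML. The class name, the function name, and the sample page are illustrative assumptions, not part of any particular agent's implementation.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD script tag on a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip a malformed block rather than fail the whole page


def extract_jsonld(html):
    """Returns a list of JSON-LD objects found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Kettle",
 "offers": {"@type": "Offer", "price": "49.00", "priceCurrency": "USD"}}
</script>
</head><body><h1>Example Kettle</h1><p>Only $49 today!</p></body></html>
"""

product = extract_jsonld(SAMPLE_HTML)[0]
print(product["name"], product["offers"]["price"])
```

The name and price come back as typed fields in a few lines of standard-library code. Recovering the same facts from the body text would require guessing which heading is the product name and which number is the price.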
NLWeb and the infrastructure of the agentic web
Schema markup is the foundation, and NLWeb is built on top of it. Understanding this connection is essential for anyone thinking ahead.
NLWeb, Microsoft’s open-source initiative, enables websites to easily add AI-powered conversational interfaces. It effectively turns any website into an AI app that lets users query content using natural language.
Think of it as the difference between a website a human browses and a website an AI agent can interrogate directly — asking questions, retrieving structured answers, and acting on them without any human in the loop.
To be truly agentic, a site must move beyond being “read” to being queryable. NLWeb is designed to help AI agents interact with websites through natural-language queries and structured responses.
While schema tells an agent what is on the page, NLWeb enables more direct interaction with that information in real time. It’s the difference between an agent reading a static menu and an agent asking, “Do you have a table for four at 7:00 PM tonight?” and receiving a deterministic, real-time answer.
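As a rough sketch of what "queryable" could look like from the agent's side, the snippet below builds a request for an NLWeb-style endpoint and reads a response. The `/ask` path, the `query`, `mode`, and `streaming` parameters, and the `results`, `name`, and `score` fields reflect one reading of the NLWeb project documentation and should be treated as assumptions to verify against the current repository. The domain and the sample response are made up for illustration.

```python
import json
from urllib.parse import urlencode


def build_ask_url(base_url, query, mode="list"):
    """Builds a GET URL for an NLWeb-style /ask endpoint (assumed parameter names)."""
    params = urlencode({"query": query, "mode": mode, "streaming": "false"})
    return f"{base_url.rstrip('/')}/ask?{params}"


def top_result_names(response_text, limit=3):
    """Returns the names of the highest-scoring results in a JSON response."""
    payload = json.loads(response_text)
    results = sorted(
        payload.get("results", []),
        key=lambda result: result.get("score", 0),
        reverse=True,
    )
    return [result.get("name") for result in results[:limit]]


# Hypothetical site; no network request is made in this sketch.
url = build_ask_url(
    "https://restaurant.example",
    "Do you have a table for four at 7:00 PM tonight?",
)

# Made-up response in the assumed shape, with Schema.org objects attached.
sample_response = json.dumps({
    "query_id": "demo-1",
    "results": [
        {"name": "Table for four, 8:30 PM", "score": 71,
         "schema_object": {"@type": "FoodEstablishmentReservation"}},
        {"name": "Table for four, 7:00 PM", "score": 92,
         "schema_object": {"@type": "FoodEstablishmentReservation"}},
    ],
})

print(url)
print(top_result_names(sample_response))
```

The point of the sketch is the shape of the exchange: a natural-language question goes in, and structured, Schema.org-typed objects come back for the agent to act on.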

NLWeb was conceived and developed by R.V. Guha, who recently joined Microsoft as CVP and technical fellow. Guha is the creator of widely used web standards, including RSS, RDF, and Schema.org.
The same person who built the vocabulary that defines structured data on the web is now building the protocol that lets AI agents use it. That’s a through-line, not a coincidence.
NLWeb leverages existing structured formats, such as Schema.org and RSS, and LLM-powered tools to create natural language interfaces usable by both humans and AI agents.
It isn’t asking you to rebuild your content infrastructure. It’s asking you to have your schema markup in order so it can take it from there.

5 tips for agentic schema optimization
As a search marketer, you’ve probably been implementing schema markup for years. Here are some new considerations as you optimize for the agentic web.
1. Prioritize completeness over coverage
It’s better to have fully populated schema markup on your most important pages than thin markup spread across your entire site. AI agents prioritize properties that help them answer user queries directly.
For a product page, that means price, availability, ratings, and specifications, not just a product name. Incomplete schema signals uncertainty to agents, while complete schema signals reliability.
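One way to put this into practice is a simple completeness check. The sketch below compares thin Product markup with a fuller version and reports which properties are missing. The list of required properties is an assumption chosen to mirror the ones named above, not an official requirement, and the product data is invented. The check also assumes a single `offers` object rather than a list of offers.

```python
# Properties this sketch treats as required; adjust to your own priorities.
REQUIRED_PRODUCT_PATHS = [
    ("name",),
    ("offers", "price"),
    ("offers", "priceCurrency"),
    ("offers", "availability"),
    ("aggregateRating", "ratingValue"),
    ("aggregateRating", "reviewCount"),
]


def missing_paths(item, required=REQUIRED_PRODUCT_PATHS):
    """Returns dotted property paths that are absent or empty in a JSON-LD item."""
    missing = []
    for path in required:
        node = item
        for key in path:
            if not isinstance(node, dict) or node.get(key) in (None, ""):
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


thin_product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Kettle",
}

complete_product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Kettle",
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "128",
    },
    # Specifications expressed as name/value pairs.
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "Capacity", "value": "1.7 L"},
    ],
}

print(missing_paths(thin_product))
print(missing_paths(complete_product))
```

Running a check like this across your top product pages gives you a prioritized list of gaps to fill before you expand markup to the rest of the site.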
2. Automate where you can
Manual schema management doesn’t scale, which is a challenge for teams without dedicated technical SEO resources. Some platforms can handle this automatically for key page types — like product pages, blog posts, events, bookings, and local business information — generating markup by default when content is created.
This baseline matters for both coverage and consistency. Stale or mismatched structured data actively works against you: If your schema says a product costs one price and your page displays another, agents will distrust both signals. Agents can also trust a signal more when it appears reliably across a site than when it appears sporadically.
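A mismatch like the one described above is straightforward to test for. The following sketch compares the price in a product's markup with the first dollar amount found in the page's visible text. The function names and sample strings are hypothetical, and the regular expression assumes US-style dollar formatting, so treat it as a starting point rather than a complete validator.

```python
import re
from decimal import Decimal


def displayed_price(visible_text):
    """Returns the first dollar amount in the text as a Decimal, or None."""
    match = re.search(r"\$\s*(\d+(?:,\d{3})*(?:\.\d{2})?)", visible_text)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


def price_matches(schema_item, visible_text):
    """True when the schema price equals the price shown on the page."""
    schema_price = Decimal(str(schema_item["offers"]["price"]))
    shown = displayed_price(visible_text)
    return shown is not None and shown == schema_price


# Hypothetical markup and page copy.
product = {"@type": "Product", "offers": {"@type": "Offer", "price": "49.00"}}

print(price_matches(product, "Only $49 today!"))
print(price_matches(product, "Now $54.99 while supplies last"))
```

Comparing values as `Decimal` rather than strings means "49" and "49.00" count as the same price, which avoids false alarms from formatting differences.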
3. Use AI to scale implementation
Platform automation handles the baseline — but AI can go further, analyzing your content to generate more specific and relevant markup. With AI, you can scale structured data generation, installation, and validation.
4. Use JSON-LD
This isn’t new advice, but it’s more important than ever. JSON-LD sits in its own script block, separate from the HTML that renders your page, which makes it far easier for agents to parse programmatically. Google’s structured data documentation recommends JSON-LD as the preferred format wherever your site’s setup allows it.
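For illustration, this is roughly how a template or build step might emit a JSON-LD block. The helper name and the article data are assumptions made for this sketch. The escaping step is a defensive choice, not something the JSON-LD format itself requires.

```python
import json


def jsonld_script_tag(data):
    """Serializes a dict as a JSON-LD script element ready to place in the page."""
    body = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value can never close the script element early.
    # "\/" is a valid JSON escape for "/", so parsers still read the same value.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical article data.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to descale a kettle",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "description": "Why a stray </script> in your copy should not break markup.",
}

print(jsonld_script_tag(article))
```

Because the markup lives in one self-contained block, you can generate, validate, and update it without touching the layout of the page around it.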
5. Think about your schema as a site-level graph
Agents benefit from understanding how your content connects across your entire site: how articles relate to authors, how products relate to categories, how services relate to locations. This means you should periodically audit your structured data at scale. Take note of:
- Which page types have markup and which don’t.
- Where entity definitions conflict across URLs.
- Whether your Organization or Person markup is consistent.
The goal is a coherent, connected picture of your site’s entities, one that an agent can trust regardless of which page it enters from.
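The audit points above can be sketched as a small script. It assumes you have already collected each URL's JSON-LD into a dictionary, and that shared entities are identified by `@id`. The function names, the compared properties, and the sample site are all illustrative. It only inspects top-level items, so nested entities would need extra traversal.

```python
from collections import defaultdict


def pages_without_markup(pages):
    """Returns URLs whose list of JSON-LD items is empty."""
    return [url for url, items in pages.items() if not items]


def find_conflicts(pages, properties=("name", "url")):
    """Finds entities (by @id) whose properties differ from page to page."""
    values = defaultdict(lambda: defaultdict(set))  # @id -> property -> values seen
    for items in pages.values():
        for item in items:
            entity_id = item.get("@id")
            if not entity_id:
                continue
            for prop in properties:
                if isinstance(item.get(prop), str):
                    values[entity_id][prop].add(item[prop])
    return {
        entity_id: {p: sorted(v) for p, v in props.items() if len(v) > 1}
        for entity_id, props in values.items()
        if any(len(v) > 1 for v in props.values())
    }


# Hypothetical crawl output: URL -> JSON-LD items found on that URL.
pages = {
    "https://shop.example/": [
        {"@type": "Organization", "@id": "https://shop.example/#org",
         "name": "Example Shop"},
    ],
    "https://shop.example/about": [
        {"@type": "Organization", "@id": "https://shop.example/#org",
         "name": "Example Shop Inc."},
    ],
    "https://shop.example/contact": [],
}

print(pages_without_markup(pages))
print(find_conflicts(pages))
```

In this sample, the contact page has no markup and the Organization entity carries two different names, which are exactly the kinds of gaps and conflicts worth resolving first.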
The window for early mover advantage
AI systems are likely to favor sources they have already indexed, validated, and found reliable in prior interactions, so early adoption matters for agentic optimization. Content that establishes itself as agent-friendly now can build compounding advantages as agents develop preference patterns.
Schema markup has always rewarded the teams that took it seriously. In the agentic web, the stakes of getting it right — and the cost of ignoring it — are substantially higher. The agents are already crawling. The question is what they find when they get to you.