Google’s LLM patent suggests a new goal for SEO: Teaching AI who you are
A 2023 Google patent describes how AI systems could build an understanding of businesses, brands, products, and other entities from websites and public data.
The filing outlines a process for extracting information, identifying relationships, and synthesizing what Google calls a “deep, holistic characterization” of an entity.
If systems like this become more influential in search, SEO may increasingly involve helping Google understand the entity behind your content, not just the content itself.
The shift from documents to entities
Google has spent more than two decades helping users find information published on webpages. Whether through traditional search results, featured snippets, or AI-generated answers, the process has generally started with understanding documents.
As Google’s search products become more conversational and recommendation-driven, understanding individual documents may no longer be enough.
Before an AI system can recommend a business, compare products, explain a brand, or suggest a service provider, it must first understand the entity behind the content.
That’s what makes Google’s “Data extraction using LLMs” patent interesting.
At first glance, the patent may seem like another content extraction system. Search engines have been extracting information from webpages for years. However, Google describes a broader objective.
According to the filing:
- “The techniques described throughout this specification enable artificial intelligence to generate and enhance a deep, holistic characterization of a particular entity.”
Google defines an entity broadly, including people, companies, businesses, places, objects, and concepts.
Rather than simply identifying facts or indexing content, the system is designed to interpret information, identify relationships, generate summaries, and develop an understanding of the entity represented by that information.

How Google’s patent creates an understanding of an entity
At a high level, the patent describes a system for collecting information from multiple sources, interpreting that information, and synthesizing an understanding of an entity.

Step 1: Identify the entity
The process begins by identifying a domain and an associated entity. The system then gathers information from webpages associated with that domain and processes it using an artificial intelligence system that includes a large language model (LLM).
Step 2: Interpret the information
Rather than simply extracting facts from individual pages, the system is designed to generate what the patent calls a characterization of the entity.
Google explains that this characterization is “an interpretation of the extracted first content and extracted second content rather than a verbatim duplication of the extracted content.”
In other words, the system goes beyond collecting information. It interprets that information and forms conclusions about the entity behind it.
Step 3: Extract attributes and relationships
The patent further explains that the AI system can analyze webpages to extract information such as an entity’s presence, age, principles, services, reputation, social media sentiment, and relationships between different elements associated with the organization.
These signals help the system move beyond understanding individual webpages toward understanding the entity itself.
Step 4: Supplement with third-party information
Importantly, the patent isn’t limited to information found on a company’s own website. Google notes:
- “The artificial intelligence systems may use online maps data, job listing data, business information, or other suitable third-party data as additional or augmenting input to provide context for generating the characterization that is output by the artificial intelligence system.”
Taken together, the goal appears to be to build a more complete understanding of the entity than could be obtained from any single webpage.
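The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patent's implementation: the `Characterization` type, the `characterize_entity` function, and the `fake_llm` stand-in are all hypothetical names, and a real system would call an actual crawler and language model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Characterization:
    """Hypothetical container for what the system concludes about an entity."""
    entity: str
    summary: str
    sources: list[str] = field(default_factory=list)


def characterize_entity(
    entity: str,
    pages: dict[str, str],
    third_party: dict[str, str],
    llm: Callable[[str], str],
) -> Characterization:
    # Step 1: the entity and the pages from its domain arrive as inputs.
    parts = [f"Entity: {entity}"]
    # Steps 2-3: page content is handed to the model for interpretation,
    # not copied verbatim into the output.
    parts += [f"[page {url}]\n{text}" for url, text in pages.items()]
    # Step 4: third-party data (maps, job listings, etc.) is added as context.
    parts += [f"[third-party {name}]\n{text}" for name, text in third_party.items()]
    prompt = "Characterize this entity.\n\n" + "\n\n".join(parts)
    return Characterization(
        entity=entity,
        summary=llm(prompt),
        sources=list(pages) + list(third_party),
    )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only reports how many inputs it saw."""
    blocks = prompt.count("[page ") + prompt.count("[third-party ")
    return f"Characterization synthesized from {blocks} inputs."


result = characterize_entity(
    entity="Example Search Co",
    pages={"https://example.com/about": "We build a simple search engine."},
    third_party={"job_listings": "Hiring engineers who value accessibility."},
    llm=fake_llm,
)
print(result.summary)
```

The point of the sketch is the shape of the flow: first-party pages and third-party data feed one interpretation step, and the output is a synthesized description rather than a list of extracted strings.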
How the patent represents entities
The system is designed to organize information about an entity into a format that can be interpreted, expanded, and used by other systems.
Entity summaries
After collecting information from webpages and other sources, the patent describes generating an entity summary. The examples provided in the filing aren’t page summaries. Instead, they read more like descriptions of a company’s identity, positioning, values, and characteristics.
One example included in the patent describes a hypothetical company’s brand identity, noting associations with simplicity, clarity, accessibility, trust, and reliability.
- “Example Search Co’s brand identity is one of simplicity, clarity, and accessibility. The company’s logo, a colorful, sans-serif E, is instantly recognizable and easy to remember. The color palette is also simple, with a focus on blue and green, which are associated with trust and reliability. Example Search Co’s typography is also clear and easy to read, even at small sizes. The overall tone of Example Search Co’s brand identity is friendly and approachable. The company’s marketing materials often feature simple, humorous illustrations that help to make Example Search Co’s products and services more relatable to users. Example Search Co. also emphasizes its commitment to making information accessible to everyone, regardless of their background or technical expertise.”
Another example presents the brand identity as a set of key attributes rather than a narrative summary.
“Here are some key aspects of Example Search Co’s brand identity:
– Trustworthiness: Example Search Co. is known for its reliable and trustworthy search engine. The company also has a strong commitment to privacy and security.
– Innovation: Example Search Co. is constantly innovating and releasing new products and services. The company is known for its ability to anticipate user needs and deliver innovative solutions.
– Accessibility: Example Search Co’s products and services are designed to be accessible to everyone, regardless of their background or technical expertise.
– Social responsibility: Example Search Co. is committed to using its technology to make a positive impact on the world. The company has a number of initiatives in place to promote sustainability, diversity, and inclusion.”
What’s important here is the overall pattern. The system takes information distributed across multiple sources, interprets it, and synthesizes a higher-level understanding of the entity.
Entity graphs
The patent describes organizing this understanding in hierarchical graph structures. According to the filing, the generated characterization can include:
- “[A] hierarchical graph structure that includes at least one parent node representing a first attribute of the characterization and at least one leaf node representing a second attribute of the characterization.”
The accompanying figures from the patent provide a better sense of what this means in practice.

The figure above shows an example graph generated for a service-based company.
The figure below provides a similar example for a product-based company. In both cases, the system organizes information into connected relationships rather than isolated facts.

Instead of just knowing that a business offers a service, the system associates that service with audiences, locations, reputation signals, differentiators, and other related attributes.
Instead of only identifying a product, the system can also connect it to features, categories, use cases, and related offerings.
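One way to picture a hierarchical graph with parent and leaf nodes is a simple tree of attributes. The sketch below is an assumption-laden illustration, not the structure defined in the filing: the `AttributeNode` class and every label in the example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeNode:
    """One attribute of an entity; a node without children is a leaf node."""
    label: str
    children: list["AttributeNode"] = field(default_factory=list)

    def add(self, label: str) -> "AttributeNode":
        """Attach a child attribute and return it so calls can be chained."""
        child = AttributeNode(label)
        self.children.append(child)
        return child

    def is_leaf(self) -> bool:
        return not self.children

    def leaf_paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return every root-to-leaf path, so each leaf keeps its context."""
        path = prefix + (self.label,)
        if self.is_leaf():
            return [path]
        paths = []
        for child in self.children:
            paths.extend(child.leaf_paths(path))
        return paths


# Hypothetical service-based business; the labels are illustrative only.
root = AttributeNode("Example Plumbing Co")
services = root.add("Services")
repair = services.add("Emergency repair")
repair.add("Audience: homeowners")
repair.add("Area: Springfield")
root.add("Reputation").add("Reviews mention fast response")

for path in root.leaf_paths():
    print(" > ".join(path))
```

Each printed path ties a leaf attribute back to its parents, which is the difference between knowing that a service exists and knowing who it is for and where it is offered.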
Entity models
The system the patent describes begins to resemble an entity modeling system more than a content extraction system.
- Extracting information answers one question: What information appears on this website?
- Entity modeling answers a different question: What do we understand about this business?
That difference becomes apparent when you look at the types of information Google says the system can analyze.
The patent specifically references extracting information related to an entity’s presence, age, principles, services, reputation, social media sentiment, and relationships between different elements associated with the business. It also discusses incorporating information from external sources such as maps data, user reviews, business information, and job listings.
Taken together, these aren’t just website attributes. They’re also signals that help define an entity’s identity.
The result is a model that appears capable of answering broader questions about an organization than traditional extraction systems were designed to address.
Rather than simply identifying products, services, or facts, the system develops a contextual understanding of who the entity is, what it does, how it’s perceived, and how it relates to other entities.
This is where the patent becomes particularly interesting for SEO.
Understanding information regardless of format
Google has spent years building systems that help machines understand information on the web. Structured data, schema markup, product feeds, business listings, and knowledge graphs all exist, in part, to make information easier to organize, interpret, and connect.
One aspect the patent emphasizes repeatedly is the ability to extract information that wasn’t specifically structured for machine consumption.
The patent explains that the AI system can extract content that has “not been structured for parsing by the artificial intelligence system” and can process information from webpages that haven’t been organized according to the requirements of traditional content extraction systems.
Google identifies this as one of the primary advantages of the approach.
According to the filing, existing content extractors are often limited to content that follows predefined structures, while the proposed system can extract and interpret information “irrespective of its format.” Rather than reproducing extracted text, the system can generate new content that interprets and synthesizes the information it finds.
The patent suggests Google is exploring ways to use this capability to build a more complete understanding of an entity. That understanding isn’t limited to information found on a company’s own website.
The patent explicitly discusses supplementing website content with information from maps data, business information, job listings, and other third-party sources.
Taken together, the process begins to resemble an entity analysis system rather than a webpage analysis system. The website remains vitally important, but it’s no longer the only source of truth. Instead, the website becomes one of several inputs used to construct an understanding of the entity behind it.
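The contrast between structure-dependent extraction and format-agnostic interpretation can be illustrated with a small sketch. It assumes a conventional extractor that only reads JSON-LD markup; the `PageReader` class, the `read` helper, and both sample pages are hypothetical, and the LLM step itself is left out. The code only shows which input each approach would have to work with.

```python
import json
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collects JSON-LD blocks and visible text from one HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: list[dict] = []
        self.text: list[str] = []
        self._mode = "text"  # one of "text", "json_ld", "skip"
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            is_json_ld = dict(attrs).get("type") == "application/ld+json"
            self._mode = "json_ld" if is_json_ld else "skip"
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            # Parse buffered JSON-LD once the script element closes.
            if self._mode == "json_ld" and self._buffer:
                self.json_ld.append(json.loads("".join(self._buffer)))
            self._buffer = []
            self._mode = "text"

    def handle_data(self, data):
        if self._mode == "json_ld":
            self._buffer.append(data)
        elif self._mode == "text" and data.strip():
            self.text.append(data.strip())


def read(html: str) -> PageReader:
    reader = PageReader()
    reader.feed(html)
    reader.close()
    return reader


MARKED_PAGE = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Plumber", "name": "Example Plumbing Co"}'
    "</script></head><body><p>We fix leaks.</p></body></html>"
)
PLAIN_PAGE = (
    "<html><body><h1>Example Plumbing Co</h1>"
    "<p>Family-run since 1998. We fix leaks fast.</p></body></html>"
)

marked = read(MARKED_PAGE)
plain = read(PLAIN_PAGE)
print(marked.json_ld)  # structured facts are available
print(plain.json_ld)   # nothing for a structure-dependent extractor
print(plain.text)      # but the visible text could still be interpreted
```

A structure-dependent extractor comes away empty-handed from the second page, while a system of the kind the patent describes could, in principle, still interpret the visible text.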
As AI-powered search experiences become more focused on answering questions, making recommendations, and helping users evaluate options, the quality of those outputs depends on the quality of the system’s understanding.
Before an AI system can recommend a business, summarize a brand, compare products, or explain why one option may be a better fit than another, it first needs a model of the entities involved. The patent describes one possible approach for creating that model.
From webpages to entities: What this means for SEO
Patents don’t tell us exactly how Google will use a technology. Many patents never become products, and even when they do, the implementation often looks different from what is described in the filing.
What patents can do is reveal how Google is thinking about a problem. In this case, the problem appears to be understanding entities.
That may sound familiar because entity understanding isn’t a new concept within Google Search. Google’s Knowledge Graph, introduced more than a decade ago, was built around connecting entities and relationships.
More recently, Google’s emphasis on E-E-A-T, product reviews, business information, and reputation signals has reflected a similar objective: understanding not just what a page says, but who is behind it and whether that source can be trusted.
LLMs expand Google’s ability to understand entities
What makes this patent worth examining is the role large language models now play in that process.
This patent describes a process in which an AI system can:
- Analyze websites and public information.
- Interpret the information it finds.
- Synthesize an understanding of an entity without requiring that information to be presented in a specific format.
That capability becomes increasingly important as Google’s search experiences move beyond document retrieval.
Consider what is required for a system like AI Overviews to answer a question about a company, product, or service. The system must first determine what that entity is, what it offers, who it serves, how it differs from alternatives, and whether it is relevant to the user’s query.
The same challenge exists in AI Mode, Gemini, and recommendation-driven experiences such as Ask Maps. Before an AI system can recommend an entity, it must first understand it.
That idea appears throughout the patent. Google repeatedly describes collecting information from multiple sources, generating summaries, organizing attributes into relationships, and developing an understanding of the entity as a whole.
The patent explains that the system can identify characteristics such as services, reputation, principles, social sentiment, and relationships between different elements associated with the entity.

Webpages become evidence
Through an SEO lens, this suggests a change in how webpages may function.
Traditionally, webpages have been optimized to rank for queries. A service page targets a service keyword. A category page targets a product category. A location page targets a geographic market. Those objectives remain important.
However, if systems like the one described in this patent become more influential, webpages may increasingly serve a second purpose. They become evidence used to construct an understanding of the entity behind them.
- A service page does more than target a keyword. It helps establish what services a business offers.
- A case study does more than attract traffic. It demonstrates experience and expertise.
- A team page helps identify the people behind the organization.
- Customer reviews contribute information about reputation.
- Press coverage, social media, and industry references provide additional signals that reinforce or challenge the system’s developing understanding.
This is one reason the patent’s emphasis on multiple data sources is so interesting. The filing doesn’t describe building an understanding from a single webpage. It describes combining information from websites, maps data, business information, job listings, and other public sources to create a more complete picture of the entity.
Visibility may increasingly depend on entity understanding
The implication is that visibility may increasingly depend on how effectively Google understands the entity behind a page, not only on the keywords that page targets. That becomes especially important in environments where users are no longer choosing from a list of 10 blue links.
When an AI system is summarizing options, making recommendations, or narrowing choices on behalf of a user, the quality of its understanding becomes a critical factor in determining which entities are surfaced and how they are described.
The challenge for SEO may no longer be limited to helping Google understand a page. It may increasingly involve helping Google understand who you are.
How brands can influence entity understanding
If Google’s goal is to synthesize an understanding of a business from its website and other public sources, the practical question becomes: What can organizations do to help shape that understanding?
The patent suggests that entity understanding emerges from the accumulation and interpretation of information across multiple sources rather than any single webpage, profile, or signal.
While the patent doesn’t provide optimization recommendations, it does point to several areas businesses should pay attention to.
Maintain consistency across sources
The patent repeatedly references using information from multiple sources to generate a characterization of an entity.
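Because that characterization is “an interpretation of the extracted first content and extracted second content rather than a verbatim duplication of the extracted content,” conflicting descriptions across sources can blur the picture the system forms, so consistency becomes increasingly important.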
Review how your business is described across:
- Your website.
- Business profiles and listings.
- Social media accounts.
- Press coverage.
- Recruiting and job postings.
- Industry directories.
The goal isn’t identical wording everywhere. The goal is to ensure AI systems encounter a consistent understanding of who you are, what you do, and who you serve.
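A rough first pass at this review can be automated. The sketch below compares how different sources describe a business using simple word overlap (Jaccard similarity). It is only a crude proxy: an LLM interprets meaning rather than matching words, the `0.2` threshold is an arbitrary assumption, and the function names and sample descriptions are hypothetical.

```python
import re
from itertools import combinations

STOPWORDS = {"a", "an", "and", "the", "for", "of", "in", "to", "we", "our", "is", "are"}


def tokens(text: str) -> set[str]:
    """Lowercased word set with a few common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def jaccard(a: str, b: str) -> float:
    """Share of distinct terms that two descriptions have in common."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def flag_inconsistent(descriptions: dict[str, str], threshold: float = 0.2):
    """Return source pairs whose descriptions share few terms."""
    flagged = []
    for s1, s2 in combinations(descriptions, 2):
        score = jaccard(descriptions[s1], descriptions[s2])
        if score < threshold:
            flagged.append((s1, s2, round(score, 2)))
    return flagged


# Hypothetical descriptions pulled from three sources.
descriptions = {
    "website": "Emergency plumbing repair for homeowners in Springfield",
    "business_profile": "Springfield emergency plumbing and leak repair",
    "job_posting": "Fast-growing logistics startup hiring drivers",
}

for s1, s2, score in flag_inconsistent(descriptions):
    print(f"{s1} vs {s2}: overlap {score}")
```

Here the website and business profile describe the same business in different words and pass, while the job posting shares no terms with either and is flagged for a human to review.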
Define the attributes you want associated with your brand
The patent’s example entity summaries focus on characteristics such as trustworthiness, innovation, accessibility, and social responsibility.
Ask yourself:
- What do we want to be known for?
- What differentiates us from competitors?
- What attributes should be associated with our brand?
Examples might include:
- Enterprise software: security, compliance, and scalability.
- Ecommerce: quality, value, and sustainability.
- Local services: expertise, responsiveness, and reputation.
The more clearly those differentiators are communicated, the easier they become for AI systems to identify and associate with the entity.
Support claims with evidence
The patent describes building an understanding of an entity from multiple sources. That means claims alone may carry less weight than evidence that reinforces those claims.
Examples of supporting evidence include:
- Customer reviews.
- Case studies.
- Testimonials.
- Press coverage.
- Industry citations.
- Awards and certifications.
- Author profiles and expertise signals.
The goal isn’t simply publishing more content. The goal is providing evidence that supports the attributes you want associated with your entity.
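One simple way to check this is to inventory which kinds of evidence support each attribute you want to be known for. The sketch below is a hypothetical bookkeeping exercise, not anything described in the patent: the evidence list, attribute names, and `coverage` function are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical evidence inventory: (source type, attribute it supports).
evidence = [
    ("customer_review", "responsiveness"),
    ("customer_review", "responsiveness"),
    ("case_study", "expertise"),
    ("certification", "expertise"),
    ("press_coverage", "responsiveness"),
]
target_attributes = ["expertise", "responsiveness", "sustainability"]


def coverage(evidence, targets):
    """Map each target attribute to the distinct source types that support it."""
    by_attr = defaultdict(set)
    for source_type, attribute in evidence:
        by_attr[attribute].add(source_type)
    return {attr: sorted(by_attr.get(attr, set())) for attr in targets}


report = coverage(evidence, target_attributes)
for attr, sources in report.items():
    status = ", ".join(sources) if sources else "no supporting evidence found"
    print(f"{attr}: {status}")
```

An attribute with no supporting sources, such as `sustainability` in this example, is a claim that currently rests on assertion alone.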
Strengthen entity relationships
One of the more interesting aspects of the patent is its use of hierarchical graphs to organize relationships between different attributes and concepts.
Businesses should make it easy for search engines and AI systems to understand relationships between:
- Products and services.
- Locations and service areas.
- Audiences and use cases.
- Brands and people.
- Organizations and industries.
The easier those relationships are to identify, the easier it becomes for AI systems to understand where an entity fits and when it should be recommended.
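Structured data is one existing way to state these relationships explicitly. The sketch below builds schema.org JSON-LD for a hypothetical business. The patent doesn't mention schema.org, and it describes a system that can work without structured input, so treat this as one supporting signal rather than a requirement. All names, URLs, and values are placeholders.

```python
import json

# All names, URLs, and values below are placeholders for a hypothetical business.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Plumbing Co",
    "url": "https://www.example.com",
    # Brand to profiles: other pages that represent the same entity.
    "sameAs": ["https://www.linkedin.com/company/example-plumbing-co"],
    # Organization to location.
    "areaServed": {"@type": "City", "name": "Springfield"},
    # Brand to people.
    "founder": {"@type": "Person", "name": "Jane Doe"},
    # Organization to service to audience.
    "makesOffer": [
        {
            "@type": "Offer",
            "itemOffered": {
                "@type": "Service",
                "name": "Emergency leak repair",
                "audience": {"@type": "Audience", "audienceType": "Homeowners"},
            },
        }
    ],
}

markup = json.dumps(organization, indent=2)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The same relationships should also be visible in the page copy itself, since markup that contradicts the visible content would undercut the consistency discussed earlier.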
Audit your entity footprint
A useful exercise is to ask:
- If an AI system had to describe our company using information from our website, reviews, profiles, listings, and third-party mentions, what would it say?
The answer may reveal gaps, inconsistencies, or missed opportunities that are difficult to identify when looking at individual pages in isolation.
As AI-powered search becomes increasingly focused on understanding and recommending entities, that broader view of your digital presence may become just as important as traditional page-level optimization.
What this means for enterprise, ecommerce, and local businesses
One of the strengths of this patent is that it isn’t limited to a particular type of entity. Google’s definition is intentionally broad, encompassing businesses, organizations, products, places, concepts, and people.
That breadth suggests the framework could be applied across many different search experiences and industries. The challenges of entity understanding, however, are likely to vary with the type of business being analyzed.
Enterprise and B2B organizations
Enterprise organizations often face a consistency challenge. Information about the business may be distributed across product pages, investor relations content, press releases, partner websites, recruiting materials, analyst reports, and social media channels. Different departments frequently describe the organization in different ways.
If AI systems are synthesizing an understanding of the entity from multiple sources, consider:
- Is our positioning consistent across channels?
- Would an AI system describe our company the same way regardless of the source it analyzed?
- Are our core differentiators clearly communicated and reinforced?
As AI systems increasingly interpret information across channels, maintaining a coherent entity identity may become just as important as maintaining a consistent brand identity.
Ecommerce and product-focused businesses
The patent’s product-related examples suggest that entity understanding may extend beyond organizations to individual products.
Users often ask questions that require evaluation rather than retrieval. Rather than just searching for a product, they’re asking which product is best for a specific use case, budget, audience, or situation.
For ecommerce brands, consider:
- Are product attributes clearly defined?
- Are category and product relationships easy to understand?
- Do reviews reinforce product strengths and use cases?
- Is supporting content helping explain who a product is for and when it should be recommended?
Product information architecture, reviews, category relationships, and supporting content may all contribute to how products are understood and surfaced in AI-driven experiences.
Local businesses
Local businesses often face a reputational and specialization challenge.
Many of the attributes referenced in the patent align closely with signals already used in local search, including services, reputation, social sentiment, and business information.
For local businesses, consider:
- Is your expertise clearly communicated?
- Do reviews reinforce the services and specialties you want to be known for?
- Are service areas consistently represented across sources?
- Do your website, Google Business Profile, and third-party presence tell the same story?
A local business is more than a collection of service pages. It is an entity associated with specific services, locations, expertise, reviews, and reputation signals gathered from across the web.
The common thread
Across enterprise, ecommerce, and local search, the challenges are similar. Before Google can recommend an entity, compare an entity, or explain an entity, it must first understand that entity. The patent provides one of the clearest examples yet of how that understanding might be built.
The next evolution of entity understanding
Patents aren’t product announcements. Google files thousands of patents, and many never become user-facing features.
The most useful way to view this patent isn’t as a roadmap for a future ranking algorithm, but as a window into how Google is approaching the challenge of understanding entities in the age of LLMs.
Throughout the filing, Google repeatedly returns to the same objective: using AI to collect information from websites and public sources, interpret that information, and synthesize an understanding of an entity.
In Google’s own words, the techniques described in the patent enable artificial intelligence to “extract content from a website or domain and other public sources to synthesize an understanding of a particular entity.”
That objective aligns closely with the direction of Google’s newer search experiences. AI Overviews, AI Mode, Ask Maps, and other AI-powered systems all depend on understanding the businesses, products, organizations, and concepts they reference. They evaluate, summarize, compare, and recommend entities.
For SEOs, that may be the most important takeaway. Historically, SEO has focused on helping Google understand webpages.
Patents like this suggest that the next challenge is helping Google understand the entity behind them. That understanding may influence who gets surfaced, who gets cited, and ultimately, who gets chosen.