How Google Discover qualifies, ranks, and filters content: Research
Google Discover runs on a structured, multi-stage pipeline with hard publisher blocks, strict image requirements, freshness decay, and heavy experimentation shaping what users see, according to new SDK-level research by Metehan Yesilyurt.
Why we care. Google Discover can drive massive traffic, but it often feels unpredictable. This research gives you a clearer view of how your content qualifies, gets ranked, or gets blocked — and where things can break before ranking even begins.
The details. Yesilyurt analyzed observable signals in Google’s Discover app framework and mapped a nine-stage flow. Google:
- Crawls and understands your content.
- Reads key meta tags like your image and title.
- Classifies your content type (e.g., breaking news or evergreen).
- Checks whether you’re blocked.
- Matches your content to user interests.
- Applies a server-side click-through rate prediction model.
- Builds the feed layout.
- Delivers your content.
- Records user feedback.
One key finding. The publisher-level block happens before interest matching and ranking. If a user blocks you, your content never reaches the ranking stage.
- Publisher blocking is powerful. One “Don’t show content from this site” action can suppress your entire domain for that user. There’s no equivalent sitewide “boost” mechanism.
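A sketch of the ordering. The snippet below illustrates why a block outranks any ranking signal: the filter runs first, so a blocked domain is never scored. It is a minimal illustration, not Google's implementation. `Candidate`, `domain_of`, `build_feed`, and the `score` callable are hypothetical names, and `score` stands in for the interest-matching and pCTR stages.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse


@dataclass(frozen=True)
class Candidate:
    """A piece of content competing for a feed slot (hypothetical type)."""
    url: str
    title: str


def domain_of(url: str) -> str:
    """Return the lowercase host of a URL, or an empty string if there is none."""
    return (urlparse(url).hostname or "").lower()


def build_feed(
    candidates: list[Candidate],
    blocked_domains: set[str],
    score: Callable[[Candidate], float],
) -> list[Candidate]:
    # Step 1: drop everything from domains this user has blocked.
    eligible = [c for c in candidates if domain_of(c.url) not in blocked_domains]
    # Step 2: only the survivors are scored and ordered.
    return sorted(eligible, key=score, reverse=True)


candidates = [
    Candidate("https://blocked.example/story", "Very long, very clickable headline"),
    Candidate("https://ok.example/story", "Short headline"),
]
# The stand-in score favors the blocked story, but it is filtered out first.
feed = build_feed(candidates, {"blocked.example"}, score=lambda c: len(c.title))
print([c.url for c in feed])  # ['https://ok.example/story']
```

In this sketch, no score can rescue a blocked candidate, which mirrors the finding that the block check sits ahead of ranking.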
The ranking model. Your title, image quality, and engagement history are part of the evaluation process. The system uses a predicted click-through rate (pCTR) model on Google’s servers to estimate how likely someone is to click. The model isn’t visible, but the app shows which signals are sent to Google before ranking decisions, including:
- Your page title (from og:title).
- Your image size and quality.
- How new your content is.
- Past click and impression data for your URL.
- Whether your images load successfully.
Freshness matters. Google Discover groups content into time windows:
- 1 to 7 days old: strongest boost.
- 8 to 14 days: moderate visibility.
- 15 to 30 days: limited visibility.
- More than 30 days: gradual decline.
There’s a separate classification for strong evergreen content, but by default, newer content has an advantage.
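The windows as a lookup. The function below maps a publish date to the windows listed above. It is a sketch under stated assumptions: the function name and return labels are made up for illustration, content less than a day old is treated as part of the first window, day 30 is counted as "limited visibility," and the evergreen flag is supplied by the caller because the research doesn't describe how that classification is made.

```python
from datetime import datetime, timedelta, timezone


def freshness_window(published: datetime, now: datetime, evergreen: bool = False) -> str:
    """Return the reported freshness window for a piece of content."""
    if evergreen:
        # Assumption: evergreen content bypasses the age-based windows.
        return "evergreen"
    age_days = (now - published).days
    if age_days <= 7:
        return "strongest boost"
    if age_days <= 14:
        return "moderate visibility"
    if age_days <= 30:
        return "limited visibility"
    return "gradual decline"


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
for days in (3, 10, 30, 45):
    print(days, freshness_window(now - timedelta(days=days), now))
```

The labels describe relative visibility only. The research doesn't report numeric weights for each window, so none are modeled here.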
Image and meta tag requirements. Google Discover reads six key page-level tags, including og:image and og:title. No image means no card.
- To qualify for large, prominent cards, your images must be at least 1200px wide. Smaller images typically appear as thumbnails and often earn fewer clicks.
- If certain tags are missing, Google Discover looks for backups — for example, it will try the Twitter title tag or the HTML title if og:title isn’t present.
- Two specific meta tags — “nopagereadaloud” and “notranslate” — can stop your page from entering Google Discover entirely.
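A self-check you can run. The sketch below approximates the title fallback and the image-size rule so you can audit your own pages. It uses only Python's standard `html.parser`. The class and function names are hypothetical, the fallback order (og:title, then twitter:title, then the HTML title) is an assumption based on the description above, and the image width is passed in by the caller rather than measured.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaCollector(HTMLParser):
    """Collect <meta> property/name values and the <title> text from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: dict[str, str] = {}
        self.html_title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            key = attributes.get("property") or attributes.get("name")
            content = attributes.get("content")
            if key and content is not None:
                # Keep the first value seen for each key.
                self.meta.setdefault(key.lower(), content)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.html_title += data


def resolve_title(html: str) -> Optional[str]:
    """Return the first non-empty title in the assumed fallback order."""
    collector = MetaCollector()
    collector.feed(html)
    for candidate in (
        collector.meta.get("og:title"),
        collector.meta.get("twitter:title"),
        collector.html_title,
    ):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def card_type(image_url: Optional[str], width_px: Optional[int]) -> str:
    """Classify the likely card format from the image rules described above."""
    if not image_url:
        return "no card"
    if width_px is not None and width_px >= 1200:
        return "large card"
    return "thumbnail"


page = '<head><title>HTML title</title><meta name="twitter:title" content="Twitter title"></head>'
print(resolve_title(page))                             # Twitter title
print(card_type("https://example.com/hero.jpg", 800))  # thumbnail
```

Treat the output as a checklist aid, not a prediction: the research says smaller images typically appear as thumbnails, so the 1200px rule is a threshold for eligibility rather than a guarantee of a large card.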
Personalization layers. Google Discover personalizes content using:
- Google’s broader interest data tied to user behavior.
- Publisher signals, including Publisher Center registration.
- Individual actions like follows, saves, and dismissals.
- Engagement signals, such as time spent reading.
If a user dismisses your story, the system stores that action permanently for that specific URL. The story won’t resurface for that user.
Experiments everywhere. During one observed session, about 150 server-side experiments were running simultaneously. Another 50+ feature controls affected how cards were displayed.
- That means two similar users could see noticeably different feeds simply because they’re in different experiment groups.
Real-time feed updates. Google Discover isn’t static. The system can add, remove, or reorder content while someone is browsing, without a refresh.
The big takeaways. Success in Google Discover depends less on tricks and more on eligibility, trust, strong visuals, and sustained engagement — in a system that can filter you out before ranking even starts.
- Publisher blocks happen before ranking.
- Freshness is built into the system.
- Strong images and clear titles are essential.
- User dismissals are permanent.
- Heavy experimentation makes volatility normal.
The research. Google Discover Architecture: Clusters, Classifiers, OG Tags, NAIADES – What SDK Telemetry Reveals