Google’s John Mueller Pushes Back on Trend of Creating LLM-Only Web Pages
Google’s John Mueller says publishers don’t need separate Markdown or JSON pages for LLMs, emphasizing that AI systems already parse standard HTML. Participants in the discussion note that structured data matters when platforms publish clear specifications.
Google Search Advocate John Mueller has questioned the growing practice of publishing separate Markdown or JSON pages intended exclusively for large language models, saying he sees no technical reason for LLMs to require alternate versions of web content.
The discussion began on Bluesky after SEO professional Lily Ray asked whether Google had a viewpoint on “creating separate markdown / JSON pages for LLMs and serving those URLs to bots.” The question reflects increasing experimentation among publishers producing simplified “shadow” versions of key pages in formats that may be easier for AI systems to interpret.
A broader discussion on the topic has also been unfolding on X, where some developers argue that LLM-focused formats could help improve how AI systems ingest content.
Mueller: LLM-Only Pages Unnecessary
In response, Mueller said he is unaware of any Google initiative that would require or benefit from a separate LLM-specific format. He emphasized that modern LLMs have long been trained on standard HTML pages and are capable of interpreting them effectively.
“LLMs have trained on — read & parsed — normal web pages since the beginning… Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?” Mueller wrote.
Ray followed up by asking whether alternate formats might help “expedite getting key points across.” Mueller replied that if file formats materially improved LLM outputs, AI companies would openly encourage their use.
“If those creating and running these systems knew they could create better responses from sites with specific file formats, I expect they would be very vocal about that. AI companies aren’t really known for being shy,” he added.
Mueller acknowledged that some pages may perform better with AI systems than others but said the difference is unlikely to be driven by file format—except in cases involving heavy JavaScript, which can remain difficult for some systems to process.
Taken together, Mueller’s comments suggest that Google sees no need for publishers to create duplicate Markdown or JSON versions of existing HTML pages solely for LLM consumption.
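The JavaScript caveat can be illustrated with a minimal sketch. It assumes a hypothetical crawler that reads raw HTML and does not execute scripts; it does not describe how any particular AI system fetches pages. The helper `visible_text` and both sample pages are invented for illustration, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text a non-rendering parser would see in the markup."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page whose content is present in the HTML itself.
STATIC_PAGE = """<html><body><article>
<h1>Pricing</h1><p>The basic plan costs 10 USD per month.</p>
</article></body></html>"""

# Hypothetical page whose content only exists after a script runs.
JS_ONLY_PAGE = """<html><body><div id="app"></div>
<script>document.getElementById("app").textContent = "The basic plan costs 10 USD per month.";</script>
</body></html>"""

print(visible_text(STATIC_PAGE))         # text is recoverable from the markup
print(repr(visible_text(JS_ONLY_PAGE)))  # nothing to read without running the script
```

In this sketch the static page yields its heading and sentence directly, while the script-driven page yields an empty string, which is the kind of gap a separate Markdown or JSON copy would not be needed to fix if the content were simply present in the HTML.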
Where Structured Data Fits In
The Bluesky exchange also raised the distinction between speculative LLM-targeted formats and situations where AI platforms publish explicit technical requirements.
Matt Wright, participating in the thread, pointed to OpenAI’s eCommerce product feed specifications—an example where JSON structures matter because the platform defines how product data should be formatted and ingested.
“OpenAI eCommerce product feeds are live: JSON schemas appear to have a key role in AI search already,” Wright wrote.
He also pointed to commentary from Chris Long on LinkedIn noting that editorial sites using product schema tend to appear more frequently in ChatGPT citations.
These examples highlight that structured data is valuable when tied to a documented integration or schema adopted by an AI system—not when created speculatively without platform guidance.
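As a rough illustration of documented structured data, the sketch below builds a schema.org `Product` snippet as JSON-LD for embedding in an ordinary HTML page. It assumes the "product schema" mentioned above refers to schema.org markup; it is not OpenAI's product feed format, whose fields are not described in this article. The function name `product_jsonld` and all sample values are hypothetical.

```python
import json


def product_jsonld(name, description, price, currency, url):
    """Build a <script> tag containing schema.org Product markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }
    # Escape "</" so a value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical product, used only to show the shape of the output.
snippet = product_jsonld(
    name="Example Kettle",
    description="A 1.7 litre electric kettle.",
    price=39.5,
    currency="USD",
    url="https://example.com/kettle",
)
print(snippet)
```

The markup lives inside the same HTML page that users see, so it adds machine-readable detail without creating a second, bot-only version of the content.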
Why the Debate Matters
As AI-driven search experiments accelerate, SEO teams are increasingly confronted with questions about “LLM-optimized content formats.” The conversation underscores a common tension: developers may attempt to anticipate future requirements before AI companies formally publish technical documentation.
Mueller’s response reiterates that, in the absence of explicit instructions, standard HTML remains sufficient for LLMs. For most websites, improving page speed, clarity, accessibility, and structure offers more reliable benefits than generating secondary content versions solely for bots.
At the same time, emerging AI-specific formats—such as OpenAI’s product feed schema—indicate that documented, use-case-specific integrations will play a growing role in how content is processed by AI systems.
What to Watch Going Forward
The exchange reflects a rapidly shifting landscape in which SEO and development teams are being asked to prepare for AI-driven search without stable technical guidelines. Until LLM providers publish more precise requirements, the practical guidance is straightforward:
- Maintain clean, well-structured HTML.
- Limit unnecessary JavaScript that may hinder parsing.
- Implement structured data where platforms provide established schemas.
The debate highlights that while AI-optimized formats may emerge in specific scenarios, there is no broad evidence that Markdown or JSON clones of existing pages are currently necessary—or beneficial—for general LLM visibility.