The llms.txt File Is Spreading Across the Web. Almost No AI Reads It.


A new technical SEO checkbox is spreading across the web.

It sits at yoursite.com/llms.txt. Yoast can generate it for WordPress sites. Wix has rolled it out for its users. SEO tools and consultants are already turning it into another thing site owners feel behind on.

The pitch goes like this. Add this file and AI reads your site better. Get cited more in ChatGPT. Show up in AI answers.

The data does not back that up. So before you spend time on this, here is what llms.txt actually does, what it does not do, and the one place it genuinely earns its keep.

This content was originally published on masilat.com


What it is

llms.txt is a plain markdown file that sits at the root of your site. It lists your important pages as clean text, no ads, no navigation, no popups. The idea is to hand an AI a tidy summary of your content so it can read the main thing without parsing your entire page structure.
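For reference, here is a minimal sketch of what such a file contains, assuming the layout from the llms.txt proposal: a title, a short summary, then sections of links. The function name, site name, and URLs are made up for illustration.

```javascript
// Build the text of an llms.txt file from a small site description.
// buildLlmsTxt and the example data are hypothetical, for illustration only.
function buildLlmsTxt(site) {
  const lines = [`# ${site.name}`, '', `> ${site.summary}`, ''];
  for (const section of site.sections) {
    lines.push(`## ${section.title}`, '');
    for (const page of section.pages) {
      // Each entry is a markdown link followed by an optional short note.
      const note = page.note ? `: ${page.note}` : '';
      lines.push(`- [${page.title}](${page.url})${note}`);
    }
    lines.push('');
  }
  return lines.join('\n').trimEnd() + '\n';
}

const exampleSite = {
  name: 'Example Docs',
  summary: 'Guides and API reference for the Example platform.',
  sections: [
    {
      title: 'Docs',
      pages: [
        { title: 'Quickstart', url: 'https://example.com/docs/quickstart.md', note: 'Setup in five minutes' },
        { title: 'API reference', url: 'https://example.com/docs/api.md' },
      ],
    },
  ],
};

console.log(buildLlmsTxt(exampleSite));
```

The output is plain text you would save as llms.txt at the site root. The value is in the curation: which pages you list matters more than how the file is generated.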

Jeremy Howard proposed it in 2024. Solid idea on paper. The confusion starts when people decide what it is supposed to be for.


It is not robots.txt. Stop calling it that.

The most common thing you will read is that llms.txt is “robots.txt for AI.” That framing is wrong, and it sends people in the wrong direction.

robots.txt controls bots. It tells crawlers where they can and cannot go. It blocks access.

llms.txt does the opposite. It invites AI in and points it toward your best pages. It surfaces content rather than restricting it.

One blocks. One includes. Google’s John Mueller said the robots.txt comparison does not hold. If you want to control which AI bots crawl your site, that is a robots.txt job. There is a setup for that further down.


Does AI actually read it?

Here is where it falls apart.

An SE Ranking study across around 300,000 domains found about 10% had an llms.txt file. Among the 50 most-cited domains in AI answers, one had it.

OtterlyAI ran an experiment on a live site and tracked over 62,000 AI bot visits across 90 days. Requests to the llms.txt file: 0.1% of total traffic. Limy went bigger, tracking over 500 million AI bot events. GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended. All of them skipped the file and crawled HTML directly.
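You can run a rough version of this check on your own server logs. The sketch below is not how those studies were done. It assumes the common "combined" access-log format, and the sample lines are invented. Google-Extended is left out because it is a robots.txt token, not a user agent string that shows up in logs.

```javascript
// Count AI-bot requests in access-log lines and how many of them hit /llms.txt.
// Assumes the "combined" log format; summarizeAiBotHits and sampleLog are illustrative.
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'PerplexityBot', 'OAI-SearchBot'];

function summarizeAiBotHits(logLines) {
  const summary = { total: 0, llmsTxt: 0, byBot: {} };
  for (const line of logLines) {
    // The request sits in a quoted field: "GET /path HTTP/1.1"
    const request = line.match(/"(?:GET|HEAD) (\S+) HTTP\/[\d.]+"/);
    // The user agent is the last quoted field on the line.
    const userAgent = line.match(/"([^"]*)"\s*$/);
    if (!request || !userAgent) continue;
    const bot = AI_BOTS.find((name) => userAgent[1].includes(name));
    if (!bot) continue;
    summary.total += 1;
    summary.byBot[bot] = (summary.byBot[bot] || 0) + 1;
    // Ignore any query string when checking for the llms.txt path.
    if (request[1].split('?')[0] === '/llms.txt') summary.llmsTxt += 1;
  }
  return summary;
}

const sampleLog = [
  '203.0.113.5 - - [01/Jan/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
  '203.0.113.5 - - [01/Jan/2026:10:00:02 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
  '198.51.100.7 - - [01/Jan/2026:10:01:00 +0000] "GET /blog/post HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
  '192.0.2.9 - - [01/Jan/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 3000 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
];

console.log(summarizeAiBotHits(sampleLog));
```

Divide llmsTxt by total to get the share of AI-bot requests that touch the file at all.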

Not one major provider has publicly committed to using llms.txt as a citation or ranking signal. Not OpenAI, not Anthropic, not Google, not Meta, not Mistral.

As an AI search play right now, the file does close to nothing. Anyone selling it as a GEO ranking lever is selling you a guess.


The Google situation

You may have seen people say Google supports llms.txt, then seen others say Google said no. Both camps are right. They are reading two different parts of Google.

Google Search has been clear. At Search Central in July 2025, Gary Illyes confirmed Google does not support llms.txt and has no plans to. John Mueller compared it to the old keywords meta tag, the self-declared signal Google killed years ago because it was too easy to game. Google’s own 2026 AI search guidance names llms.txt as a tactic that does not help.

So where does “but Google added it to their docs” come from?

In December 2025, an llms.txt file briefly appeared on Google’s own developer documentation. The SEO community noticed immediately. Google pulled it the same day. Mueller clarified: the Search team did not add it, did not use it, and does not endorse it. An internal content tool generated it automatically. Nobody caught it in time.

Crawling a file is not the same as using it. Googlebot fetches plenty of things it does not act on.

The second misread: Chrome is not Search. Google’s Chrome team added llms.txt as a check inside Lighthouse’s Agentic Browsing audit. That check is for browser-based agents reading a page, not for search rankings or AI citation frequency. Same company, two completely separate products with different jobs.

Once you see that split, the apparent contradiction disappears.


Where it actually works

Not search. Agents.

AI coding agents fetch llms.txt routinely. Point Cursor, Claude Code, GitHub Copilot, or Windsurf at a documentation site and they look for the file to find the right pages and load fewer tokens. This is why Stripe, Vercel, Cloudflare, and Anthropic all ship a clean one. Their users are building with these agents every day. A well-curated file means the agent writes working integration code. Without it, the agent guesses, hallucinates an endpoint, and the developer loses an hour debugging something that was never real.

The next wave is shopping agents. When an agent buys on behalf of a user, it needs a clean, machine-readable view of your catalog, pricing rules, and stock. Brands that hand agents a structured file win transactions. Brands with cluttered category HTML lose them to a competitor whose site is easier to parse.

The honest rule: if your readers include developers, or you run docs, or you build API products, a good llms.txt does real work today. If you run a service business and want more ChatGPT citations, it does close to nothing.


What actually moves AI visibility

Crawlable pages. Clean structure. Schema markup so machines understand what each page is. Strong, first-hand content. Real authority signals. None of this is new, and all of it works across every AI engine.

If you want to control how AI crawlers treat your site, robots.txt is the tool. OpenAI documents its own bots and tells site owners to manage them there. Here is a setup that blocks training and scraping bots while letting the on-demand citation fetchers through:

# Block training and scraping
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

# Allow on-demand citation fetches
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

Adjust it to fit what you want to allow. This file controls the bots. llms.txt never did, and it was never designed to.
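Before you publish a file like the one above, it helps to sanity-check which bots it lets through. The sketch below is a simplified checker, not a full robots.txt parser. It handles plain path prefixes only, with no * or $ wildcards, and the function names are made up for illustration.

```javascript
// Simplified robots.txt check: is a given bot allowed to fetch a given path?
// Plain prefixes only (no wildcards), so treat the result as a sanity check.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const raw of text.split('\n')) {
    const line = raw.split('#')[0].trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === 'allow' || field === 'disallow') && current) {
      current.rules.push({ allow: field === 'allow', path: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

function isAllowed(groups, botName, path) {
  const name = botName.toLowerCase();
  // Prefer the group that names the bot, then fall back to the * group.
  const group = groups.find((g) => g.agents.includes(name)) ||
    groups.find((g) => g.agents.includes('*'));
  if (!group) return true; // no matching group means no restrictions
  let best = null;
  for (const rule of group.rules) {
    if (rule.path === '' || !path.startsWith(rule.path)) continue;
    // Longest matching path wins; on a tie, Allow wins.
    if (!best || rule.path.length > best.path.length ||
        (rule.path.length === best.path.length && rule.allow)) {
      best = rule;
    }
  }
  return best ? best.allow : true;
}

const robots = [
  'User-agent: GPTBot',
  'Disallow: /',
  '',
  'User-agent: ChatGPT-User',
  'Allow: /',
].join('\n');

const groups = parseRobots(robots);
console.log(isAllowed(groups, 'GPTBot', '/pricing'));       // false
console.log(isAllowed(groups, 'ChatGPT-User', '/pricing')); // true
console.log(isAllowed(groups, 'SomeOtherBot', '/pricing')); // true
```

Note the last line: a bot that is not named and has no * group to fall back on is allowed by default. robots.txt is also a request, not enforcement. It only works for bots that choose to honor it.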


The call

If a plugin generates llms.txt for free, leave it on. Costs nothing. Cheap insurance if adoption shifts later. Do not build anything around it expecting search traffic.

If you build for agents or run documentation, put time into a proper one. It pays off in that context.

Everywhere else, your hours go further on content, schema, authority, and a clean technical setup. That is still the lever.

The file a lot of people are rushing to add is an agent tool wearing an SEO costume.



Use GTM as Your CRO Testing Hub

CRO gets a bad reputation because a lot of people do it wrong.

They look at a competitor’s landing page. They change their button colour. They run an A/B test for four days with 200 visitors and declare a winner. Then they wonder why nothing improved.

That is not CRO. That is guessing with extra steps.

Real conversion rate optimisation starts with one question: where exactly are people dropping off, and why?

The answer is not on your competitor’s website. It is in your own data. And a lot of teams are not collecting it correctly.

This post covers how to use Google Tag Manager as a proper CRO testing infrastructure — and why Microsoft Clarity running alongside it gives you something no heatmap tool alone can provide.


What CRO Actually Is

Conversion Rate Optimisation is the process of improving the percentage of users who complete a target action.

That target action depends on your business. For eCommerce it is a purchase. For fintech and service-based businesses it is a qualified lead. For SaaS it is a trial signup or an activation event.

The conversion rate is not your only metric. It is the output of a system. Every step before the conversion is an input. If the output is low, something in the inputs is broken.

Your job is to find which input and fix it. Not guess. Find.


How a Lot of Teams End Up Looking in the Wrong Place

SEO is running. Paid is running. Email click-through rates are decent. Traffic is coming in. But conversions are flat.

The team looks at the landing page. They rewrite the headline. They change the CTA button. They add a testimonial section. Nothing moves.

Then they look at what competitors are doing and copy the layout. Still nothing.

Here is the thing — the page might be the problem. Or it might not. You will not know until you look at the data first.

A lot of teams are looking at sessions and bounce rates. They are not looking at where exactly in the funnel the session terminates, which form field causes abandonment, which device has a completion rate near zero, or which traffic source has three times the scroll depth but half the conversion rate.

That data exists. Any decision to change something on the page should come after you have looked at it. Not before.


GTM as Infrastructure, Not Just a Tag Container

In 2026, a lot of teams are still using GTM for one thing: firing the basic Google tag. The container gets set up during the initial tracking configuration and nobody touches it again.

That leaves most of what GTM can do unused.

GTM is your event collection layer. And if you configure it properly, it can also function as a lightweight A/B testing tool — you can inject content, swap elements, and test page variations without touching the codebase.
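One piece of that testing setup is deciding which variation a visitor sees and keeping it stable across page loads. A minimal sketch, assuming you already have some visitor ID such as a first-party cookie value. The function names, experiment name, and event fields are illustrative, not a GTM feature.

```javascript
// Deterministic A/B assignment: the same visitor ID always lands in the same variant.
// assignVariant, the experiment name, and the event fields are illustrative names.
function hashString(input) {
  // FNV-1a 32-bit hash: small, fast, and stable across page loads.
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function assignVariant(visitorId, experiment, variants = ['control', 'variant_b']) {
  const bucket = hashString(`${experiment}:${visitorId}`) % variants.length;
  return variants[bucket];
}

function buildExposureEvent(visitorId, experiment) {
  return {
    event: 'experiment_exposure',
    experiment_name: experiment,
    experiment_variant: assignVariant(visitorId, experiment),
  };
}

// In a GTM Custom HTML tag this object would be pushed to window.dataLayer.
const dataLayer = [];
dataLayer.push(buildExposureEvent('visitor-123', 'headline_test'));
console.log(dataLayer[0]);
```

Hashing the experiment name together with the visitor ID means a visitor who lands in the control group for one test is not automatically in the control group for the next.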

But before testing, the tracking setup itself is where a lot of teams fall short. Here is what you should be collecting:

  • CTA button clicks — not just which button was clicked, but the sequence. Did the user click the primary CTA first, or did they scroll past it and come back?
  • Scroll depth — at exactly what percentage of the page does the volume drop?
  • Load time events — if a section takes more than three seconds to render on mobile, you want to know that before you blame the copy
  • Pop-up impressions and interactions — GTM lets you track whether a pop-up was seen, dismissed, or converted, across any pop-up type including exit intent, time-based, and scroll-triggered
  • Form field level tracking — not whether the form was submitted, but which specific field caused the abandonment

That last point matters. GTM can track any form including pop-up forms. You do not need a dedicated form analytics tool. You need triggers, variables, and a data layer push to GA4. The signal is already there — you just need to collect it.

If you are not using data layer pushes, you are missing the cleanest data in your stack.
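Field-level form tracking can be sketched in a few lines. This is one possible approach, not the only one: remember the last field the user touched, and push an event if they leave without submitting. The tracker function, the event names, and the #lead-form selector are all assumptions for this example.

```javascript
// Track the last form field a user touched and report it if they leave without submitting.
// createFormTracker and the event/field names are hypothetical; in GTM you would read
// them with Data Layer Variables and forward them to GA4.
function createFormTracker(formId, dataLayer) {
  let lastField = null;
  let submitted = false;
  let reported = false;
  const touched = [];
  return {
    fieldFocused(fieldName) {
      lastField = fieldName;
      if (!touched.includes(fieldName)) touched.push(fieldName);
    },
    formSubmitted() {
      submitted = true;
      dataLayer.push({ event: 'form_submit', form_id: formId });
    },
    pageLeft() {
      // Report abandonment once, and only if the form was started but never submitted.
      if (submitted || reported || lastField === null) return;
      reported = true;
      dataLayer.push({
        event: 'form_abandon',
        form_id: formId,
        last_field: lastField,
        fields_touched: touched.length,
      });
    },
  };
}

// Browser wiring (skipped when run outside a page, for example in Node).
if (typeof document !== 'undefined') {
  window.dataLayer = window.dataLayer || [];
  const form = document.querySelector('#lead-form'); // assumed form ID
  if (form) {
    const tracker = createFormTracker('lead-form', window.dataLayer);
    form.addEventListener('focusin', (e) => e.target.name && tracker.fieldFocused(e.target.name));
    form.addEventListener('submit', () => tracker.formSubmitted());
    window.addEventListener('pagehide', () => tracker.pageLeft());
  }
}

// Stand-alone demo of the logic with a plain array as the data layer.
const demoLayer = [];
const demo = createFormTracker('lead-form', demoLayer);
demo.fieldFocused('name');
demo.fieldFocused('phone');
demo.pageLeft();
console.log(demoLayer);
```

In GA4, last_field then becomes an event parameter. If most abandon events share the same last_field, that field is the one to look at first.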


Where Clarity Fits In

GA4 tells you what happened. Clarity shows you why.

When integrated through GTM, Clarity becomes more targeted because you control what it records. You do not want to record every session. You want to record sessions that match a specific behaviour — users who visited a key page but did not convert, users who triggered a form but abandoned it, users from a paid source who exited in under 30 seconds.

You build that targeting logic in GTM. Clarity records the sessions that qualify. Then instead of hypothesising, you are watching the drop-off happen.
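The targeting rules above can be written down as a small function. This sketch tags qualifying sessions with a Clarity custom tag so you can filter recordings by segment; it does not switch recording on or off. The session object shape and the cro_segment key are assumptions for this example.

```javascript
// Decide whether a session matches one of the targeting rules described above.
// The session object shape and the segment names are assumptions for this sketch.
function recordingReason(session) {
  if (session.visitedKeyPage && !session.converted) return 'key_page_no_conversion';
  if (session.formStarted && !session.formSubmitted) return 'form_abandoned';
  if (session.source === 'paid' && session.secondsOnSite < 30) return 'paid_quick_exit';
  return null;
}

function tagSessionForClarity(session, clarityFn) {
  const reason = recordingReason(session);
  if (!reason) return false;
  // Clarity custom tag: clarity("set", key, value) makes recordings filterable by segment.
  clarityFn('set', 'cro_segment', reason);
  return true;
}

// On a live page clarityFn would be window.clarity; here a stub records the calls.
const calls = [];
const stubClarity = (...args) => calls.push(args);
tagSessionForClarity({ source: 'paid', secondsOnSite: 12 }, stubClarity);
console.log(calls);
```

In GTM, each rule would map to a trigger condition. The Custom HTML tag that fires on that trigger makes the clarity call.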

Here is what that looks like in practice.

Mobile cart pages. I have seen floating cart widgets render directly over the primary CTA on mobile. The button is technically there. It is just blocked. Heatmap summaries show clicks in that area. Session recordings show users tapping repeatedly, getting nothing, and leaving. That is not a copy problem. That is a layout problem you would never find by looking at a competitor’s page.

eCommerce collection pages. A client had a long collection page with a lot of products. Using Clarity scroll and click data, we could see exactly which products were getting attention and which were being scrolled past. That changed the product sort order — not based on what the brand thought was their best item, but based on where actual attention was landing. Collection page conversions improved without changing a word of copy.

Blog to lead. A consultant client was getting solid blog traffic but no leads. Scroll data showed users were reading about half the post and dropping off. Instead of rewriting the blog, we added a scroll-triggered pop-up at 50% depth — name, contact number, email — with a simple offer. That pop-up is still generating leads. The content was not the problem. There was no capture mechanism at the point where attention peaked.

Small changes based on actual data have a disproportionate impact. That is not a principle. It is what the session recordings keep showing.


The Testing Loop

Once you have event data from GTM and session recordings from Clarity, you can run a real testing cycle.

Find the drop. Pick one funnel step where the volume falls. Quantify it — how many users enter, how many exit, what is the drop rate.
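Quantifying the drop is simple arithmetic once you have user counts per step. A minimal sketch with invented step names and numbers:

```javascript
// Compute the drop rate between consecutive funnel steps and find the worst one.
// Step names and counts are invented sample data.
function funnelDrops(steps) {
  const drops = [];
  for (let i = 1; i < steps.length; i++) {
    const entered = steps[i - 1].users;
    const continued = steps[i].users;
    drops.push({
      from: steps[i - 1].name,
      to: steps[i].name,
      entered,
      exited: entered - continued,
      dropRate: entered > 0 ? (entered - continued) / entered : 0,
    });
  }
  return drops;
}

// Expects at least two steps, otherwise there is no transition to compare.
function biggestDrop(steps) {
  return funnelDrops(steps).reduce((worst, d) => (d.dropRate > worst.dropRate ? d : worst));
}

const funnel = [
  { name: 'landing_view', users: 1000 },
  { name: 'form_start', users: 400 },
  { name: 'form_submit', users: 100 },
];

console.log(funnelDrops(funnel));
console.log(biggestDrop(funnel).from, '->', biggestDrop(funnel).to);
```

In this sample, the step from form_start to form_submit loses 75% of users, more than the 60% lost before the form. That is where the recordings should be watched first.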

Form a hypothesis. The Clarity recordings should show you a pattern. Users are scrolling past the form. The mobile CTA is obscured. The pricing section is too long and people exit before reaching the plan they actually want.

Make one change. Not three. One. If you change three things and conversion improves, you do not know which change did it.

Run it long enough. Four days is not a test. It is a sample.
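How long is long enough depends on traffic and on the size of the change you hope to detect. A rough sketch using the standard normal-approximation formula for comparing two conversion rates, assuming a two-sided test at 5% significance and 80% power. The traffic and conversion numbers are invented, and the result is an estimate for planning, not a guarantee.

```javascript
// Rough sample size per variant for comparing two conversion rates
// (two-sided test, 5% significance, 80% power, normal approximation).
function sampleSizePerVariant(baselineRate, expectedRate) {
  const zAlpha = 1.96;  // two-sided 5% significance
  const zBeta = 0.8416; // 80% power
  const variance = baselineRate * (1 - baselineRate) + expectedRate * (1 - expectedRate);
  const effect = expectedRate - baselineRate;
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / effect ** 2);
}

// Days needed when daily traffic is split evenly across the variants.
function daysNeeded(baselineRate, expectedRate, dailyVisitors, variants = 2) {
  const totalVisitors = sampleSizePerVariant(baselineRate, expectedRate) * variants;
  return Math.ceil(totalVisitors / dailyVisitors);
}

// Example: 2% baseline, hoping to reach 2.5%, with 500 visitors a day.
console.log(sampleSizePerVariant(0.02, 0.025));
console.log(daysNeeded(0.02, 0.025, 500));
```

With those example inputs the estimate comes out at several weeks, not days. The smaller the lift you want to detect, the longer the test has to run.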

Document it. The result is not just the uplift. The result is the learning — what you now know about your users that you can apply to the next test. That is what compounds over time.


What to Actually Set Up

Start with one page — your highest-traffic landing page or the one closest to your conversion goal.

Configure GTM with three things: scroll depth tracking, CTA click events, and form abandonment tracking including pop-up forms. Connect GA4. Install Clarity via GTM with audience targeting on.

Then leave the page alone.

If traffic is decent, two weeks gives you a pattern. If you are under a few hundred sessions a month, give it a full month. One week on low traffic is noise, not data.

How to Auto-Create Sub-Tabs in Google Docs Using Apps Script

I recently reorganized a content-heavy workflow inside Google Docs.

The problem was simple: one project, too many documents. I wanted a single document with a clean structure — one parent tab, and dozens of sub-tabs nested under it, one per content piece.

Google recently added tabs and sub-tabs to Docs. Useful feature. The issue is creating them in bulk. Doing it manually — for 50 or 80 items — is exactly the kind of repetitive clicking that should not exist in 2025.

So I tried to get an Apps Script written. That is where things got interesting.


Why AI Models Struggled With This

I tested this across multiple models. The results were all over the place.

Some said Google Docs does not support creating sub-tabs through the API at all. That is wrong.

Others generated the right idea but the wrong request format — payloads that looked correct but would silently fail or throw generic errors.

A few got close. Still failed.

The actual solution is straightforward once you know it. The Google Docs API supports creating tabs programmatically using addDocumentTab. To create a nested sub-tab, you pass the parent tab ID in the request. That is it.

The final error I hit was not even a code issue.

The API returned:

Docs are limited to 100 tabs.

First reaction: another API problem. It was not. Authentication was working. OAuth was working. The endpoint was correct. The payload was correct.

I had just hit Google’s hard limit of 100 tabs per document. The error message was telling the truth the whole time.


The Script

Here is the Apps Script that handles this automatically:

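function createLinkedInSubTabs() {
  const doc = DocumentApp.getActiveDocument();
  const docId = doc.getId();
  const token = ScriptApp.getOAuthToken();

  // ID of the tab that will hold the new sub-tabs. Change this to your parent tab.
  const parentTabId = 't.0';
  const maxTabsAllowed = 100;

  // getTabs() returns only top-level tabs, so walk the child tabs to count every tab.
  const countTabs = (tabs) =>
    tabs.reduce((total, tab) => total + 1 + countTabs(tab.getChildTabs()), 0);
  const existingTabs = countTabs(doc.getTabs());

  // Create up to 80 sub-tabs without going past the document limit.
  const remaining = maxTabsAllowed - existingTabs;
  const count = Math.min(remaining, 80);

  if (count <= 0) {
    Logger.log('No space left. Google Docs allows max 100 tabs.');
    return;
  }

  // One addDocumentTab request per sub-tab, each nested under the parent tab.
  const requests = [];

  for (let i = 1; i <= count; i++) {
    requests.push({
      addDocumentTab: {
        tabProperties: {
          title: 'Post ' + i,
          parentTabId: parentTabId
        }
      }
    });
  }

  // Send all requests in a single batchUpdate call to the Docs API.
  const response = UrlFetchApp.fetch(
    `https://docs.googleapis.com/v1/documents/${docId}:batchUpdate`,
    {
      method: 'post',
      contentType: 'application/json',
      headers: {
        Authorization: 'Bearer ' + token
      },
      payload: JSON.stringify({ requests }),
      muteHttpExceptions: true
    }
  );

  // With muteHttpExceptions on, API errors are logged here instead of thrown.
  Logger.log(response.getResponseCode());
  Logger.log(response.getContentText());
}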

How to use this script

  1. Open your Google Doc.
  2. Go to Extensions → Apps Script.
  3. Create a new project and paste the code.
  4. Change the parentTabId to the ID of the tab where you want the sub-tabs to be created. The ID appears in the document URL after ?tab= when that tab is selected.
  5. Click Run and authorize the script the first time.
  6. The script creates up to 80 numbered sub-tabs, or fewer if the document would otherwise pass the Google Docs limit of 100 tabs.
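If you want real titles instead of "Post 1", "Post 2", the request list can be built from an array of names. A minimal sketch: buildSubTabRequests is a hypothetical helper, and the request shape simply mirrors the one used in the script above.

```javascript
// Build addDocumentTab requests from a list of titles, respecting the 100-tab limit.
// buildSubTabRequests is a hypothetical helper; the request shape mirrors the script above.
function buildSubTabRequests(titles, parentTabId, existingTabCount, maxTabs = 100) {
  const remaining = Math.max(0, maxTabs - existingTabCount);
  // Titles that do not fit under the limit are dropped rather than sent and rejected.
  return titles.slice(0, remaining).map((title) => ({
    addDocumentTab: {
      tabProperties: { title, parentTabId },
    },
  }));
}

const titles = ['Technical Audit', 'Keyword Research', 'Content Strategy'];
const subTabRequests = buildSubTabRequests(titles, 't.0', 98);
console.log(JSON.stringify(subTabRequests, null, 2));
```

With 98 tabs already in the document, only the first two titles fit. In the script, the returned array would replace the loop that fills requests.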

Who This Is Actually Useful For

Anyone who uses Google Docs for documentation, knowledge management, or organized content creation.

A few examples:

  • SEO projects — Technical Audit, Keyword Research, Content Strategy, Internal Linking, Backlink Planning, Monthly Reports, all in one document.
  • Product managers — Features, Sprint Planning, User Feedback, Release Notes, Documentation in a single place.
  • Project managers — Requirements, Meeting Notes, QA, Development Updates, Handover Documents, no tab-switching between files.
  • Content creators — One document, one sub-tab per piece. Easier to navigate, easier to search, easier to manage than a folder full of separate files.

The Takeaway

A five-minute script replaced hundreds of manual clicks and cleaned up a workflow I use every day.

The more interesting observation is that a relatively new Google Workspace feature — one that has been available for less than a year — is still a blind spot for most current AI models. They either deny it exists or generate broken code. A bit of manual debugging was needed to get there.

Sometimes that is just how it goes with new APIs. The documentation exists. The feature works. The models have not caught up yet.

Shopify vs WooCommerce: What Google Sees on Day One

I managed 36 stores. Not consulted on. Not audited. Managed. End to end.

My own stores. Client stores. Niche dropshipping to full catalogue eCommerce. I know what a badly configured WooCommerce install costs — in speed, in crawl budget, in lost conversions.

So when I say Shopify is not the enemy, I mean it. It solves real problems for businesses that need to move fast.

But there is something nobody tells you when you are choosing between the two.

Google loves Shopify from day one. And it has nothing to do with the platform being better.

It is infrastructure. It is defaults. It is decisions Shopify made before you created your account.

Here is what that looks like across four areas.


Robots and Crawling

Check any Shopify store’s robots.txt before the owner has touched anything. You will find dozens of disallow rules already in place. Admin pages blocked. Checkout duplicates blocked. Internal search parameters blocked. Filter and tag combinations blocked.

Googlebot lands and gets clear instructions. It knows where to go and what to ignore. Crawl budget goes to the right pages.

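A new WooCommerce install has close to an empty robots.txt. WordPress serves a default virtual file that blocks /wp-admin/ and nothing store-specific.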

Googlebot lands and crawls everything. Your cart. Your account pages. Your internal search results. Your duplicate filter URLs. All of it. Crawl budget burned on pages that should never be indexed.

This happens on day one, before you have written a single product description.


Hosting and Caching

Shopify runs on Google Cloud. Global CDN. Fast response times out of the box.

Most WooCommerce stores launch on shared hosting. Overcrowded servers. No real CDN. Some hosting environments run automated background requests that generate noise Google reads as spam signals.

This is not a WordPress problem. It is a decision problem. Most store owners pick the cheapest option and move on.

A properly hosted WooCommerce store on cloud infrastructure with a CDN in front will outperform a mid-tier Shopify store on every speed metric. That configuration just does not happen by default.


Tracking

Shopify’s native integrations with Meta and Google handle basic pixel and conversion events out of the box. The pixel loads. Standard purchase events fire.

Enhanced Ecommerce tracking is a different story. Shopify hosts the cart and checkout outside your storefront. That means purchase event data does not flow through your theme. Tracking it requires paid apps or custom workarounds most store owners never set up correctly.

On WooCommerce you build all of it from scratch. Some store owners layer three or four plugins to handle what should be one clean setup. Each plugin adds scripts to the page. Scripts conflict. Page speed drops. Events fire in the wrong order or stop firing after an update.

Most store owners only discover the problem after they have spent on ads and the data does not match reality.


Apps and Plugins

Every app in the Shopify App Store is built for Shopify. It installs, it connects, it works. No compatibility testing. No conflict checking. The platform is consistent enough that app developers can build to a known standard.

WooCommerce has thousands of plugins. Choosing the right one requires research. Does it conflict with your theme? Does it slow the page? Does it break when WooCommerce updates? Does it do one thing well or ten things poorly?

Most store owners do not ask these questions. They search, install, test, uninstall, repeat. The site gets slower with every cycle.

The better approach on WooCommerce is to use plugins only where there is no other option. For everything else — adding fields, changing button text, redirecting after purchase, modifying the checkout flow — a single snippet of PHP in the right place does the job without the dependency.


How WooCommerce Closes the Gap

Same four areas. All manual.

Robots.txt — before you add a product:

User-agent: *
Disallow: /wp-admin/
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /?s=
Disallow: /shop/?orderby=
Disallow: /shop/?filter_
Disallow: /tag/

Adjust for your URL structure. The point is to give Googlebot instructions before it visits.

I built a free GPT that audits your existing robots.txt and builds one from scratch.

Hosting — cloud infrastructure with a CDN. Not expensive. Just not the default. Sub-200ms server response times and a real CDN solve most of the speed gap before you touch a single plugin.

Tracking — GTM, not plugins. Build the data layer before you run your first campaign. Meta Pixel with Purchase, Add to Cart, and Initiate Checkout events. GA4 with the events verified in DebugView. One GTM container managing all of it. Do not use plugins for tracking. They add scripts that conflict, miss events on cached pages, and break on updates.
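As a sketch of what "build the data layer" means for the purchase step: the order confirmation page pushes one structured event, and GTM hands it to GA4 and the Meta Pixel. The event and field names below follow GA4's ecommerce purchase schema. The order object and its fields are assumptions, so map them from whatever your store exposes.

```javascript
// Build a GA4-style purchase event for the data layer from an order object.
// The order shape (id, currency, items with sku/name/price/quantity) is assumed.
function buildPurchaseEvent(order) {
  return {
    event: 'purchase',
    ecommerce: {
      transaction_id: String(order.id),
      value: order.items.reduce((sum, item) => sum + item.price * item.quantity, 0),
      currency: order.currency,
      items: order.items.map((item) => ({
        item_id: item.sku,
        item_name: item.name,
        price: item.price,
        quantity: item.quantity,
      })),
    },
  };
}

const sampleOrder = {
  id: 1042,
  currency: 'USD',
  items: [
    { sku: 'TEE-01', name: 'Plain Tee', price: 20, quantity: 2 },
    { sku: 'CAP-07', name: 'Canvas Cap', price: 15, quantity: 1 },
  ],
};

const dataLayer = []; // on a live page this would be window.dataLayer
// Clear any previous ecommerce object before pushing a new one.
dataLayer.push({ ecommerce: null });
dataLayer.push(buildPurchaseEvent(sampleOrder));
console.log(JSON.stringify(dataLayer[1], null, 2));
```

One push like this feeds every tag in the container, which is what keeps GA4 and Meta numbers consistent with each other.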

Plugins — fewer is better. Every plugin you install is a dependency. Every dependency is a potential conflict. Before you install anything, ask whether a short PHP snippet in functions.php solves the same problem. Most of the time it does.


The Real Comparison

Shopify is managed infrastructure with locked doors. You pay for the simplicity and you pay again when you need to open a door that requires a paid app.

WooCommerce is unmanaged infrastructure with open doors. You pay with time and technical understanding. When you configure it correctly, nothing is locked.

Both platforms work. They work for different situations.

Small business that needs to move in a week: Shopify. The defaults keep you out of trouble while you figure out the business.

Building something long-term with custom workflows and no platform dependency: WooCommerce. Configure it like you mean it from day one.

Google does not love Shopify because Shopify is better.

Google loves Shopify because Shopify configured the basics before you arrived.

WooCommerce gives you the same result. You just have to do the work yourself.


Before You Build on WooCommerce

I am building a pre-launch WooCommerce checklist and a no-plugin snippet series at WPLifter: everything to configure before you go live, without the dependency bloat. Sign up here if you want either when they are ready.