We pointed our scanner at 2,704 public Shopify storefronts and graded each one the way an AI shopping agent would read it. The headline: the median store scored 89 out of 100, and 47.3% earned an A. Nobody scored an F. On paper, Shopify stores look ready for agentic commerce.
Then you look at the pillars, and the story flips. Three of the five things we measured were close to perfect across the board. One was a disaster. Stores are almost universally crawlable, well-marked-up, and easy for a machine to read. What they are not is trustworthy to an agent. The trust pillar (ratings, FAQ, shipping, and returns signals) had a median of 67, trailing every other pillar by 13 points or more.
That's the real finding. AI readiness on Shopify is no longer a crawlability problem. The platform solved that for you. It's a trust-signal problem, and it's wide open. This post walks through the full data, decomposes the trust gap, and shows you how to reproduce the scan on your own store in about two minutes.
What we measured, and why one number isn't enough
A single AI-readiness score is a useful headline and a terrible diagnosis. It tells you roughly how an agent sees you. It doesn't tell you what to fix, and on Shopify the "what" turns out to be lopsided in a way the headline number hides.
So we score five pillars, each out of 100, then roll them into the composite. The pillars map to the questions an agent asks while reading a storefront:
- Identity: can a machine tell who this brand is? Organization markup, a real name, consistent NAP-style (name, address, phone) signals.
- Products: is each product marked up with the attributes an agent needs to compare offers? Price, availability, identifiers, structured Product data.
- Discovery: can crawlers actually reach the content? robots.txt, rendered HTML, no walls between the agent and the facts.
- Answer: does the page state facts an agent can lift verbatim, instead of burying them in imagery or script?
- Trust: are there signals that make an agent comfortable recommending you? Ratings, FAQ data, shipping terms, return policy, all as machine-readable structure rather than prose a human reads and a parser ignores.
If you've read how the AI-readiness checker works, this is the public-scan version of the same philosophy: read only what an agent reads, score what an agent acts on. The composite score and letter grade are explained in more depth in your AI-readiness score, explained.
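As a rough illustration of the roll-up, here is a minimal sketch. The grade bands match the distribution table later in this post, but the weighting is not published here, so the equal weights below (`ASSUMED_WEIGHTS`) are purely an assumption, and the function names are hypothetical.

```python
from typing import Dict

# ASSUMPTION: equal weights. The real roll-up formula is not stated in this post.
ASSUMED_WEIGHTS: Dict[str, float] = {
    "identity": 0.2,
    "products": 0.2,
    "discovery": 0.2,
    "answer": 0.2,
    "trust": 0.2,
}


def composite_score(pillars: Dict[str, float],
                    weights: Dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Weighted mean of pillar scores, each on a 0-100 scale."""
    total_weight = sum(weights.values())
    return sum(pillars[name] * w for name, w in weights.items()) / total_weight


def letter_grade(score: float) -> str:
    """Map a composite score to the grade bands used in this study."""
    if score >= 90:
        return "A"
    if score >= 75:
        return "B"
    if score >= 55:
        return "C"
    if score >= 40:
        return "D"
    return "F"
```

Keeping the pillar scores separate from the roll-up is the point: the composite gives you the grade, while the individual pillars tell you what to fix.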
How we scanned 2,704 storefronts, stated honestly
This is a study of public storefronts, not of installed-app data. Every number here comes from fetching pages the same way an AI agent or crawler would, then parsing the HTML and structured data. We never logged into a store, never read an Admin, never touched private data. If an AI shopping agent couldn't see it, neither did we.
The sample is 2,704 live, public Shopify stores drawn to span verticals and sizes, deduplicated by domain, and filtered to storefronts that returned a real homepage and at least one product page at scan time. For each store we fetched the homepage and a representative product page, extracted JSON-LD and microdata, checked robots.txt and crawl reachability, and ran the five-pillar scoring.
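To make the extraction step concrete, here is a minimal sketch of pulling JSON-LD out of a fetched page using only the Python standard library. It is not the scanner's actual code; the class and function names are hypothetical, and it skips microdata entirely.

```python
import json
from html.parser import HTMLParser
from typing import Any, List


class JsonLdExtractor(HTMLParser):
    """Collects parsed payloads from <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: List[str] = []
        self.blocks: List[Any] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is unreadable to a parser, so skip it


def extract_jsonld(html: str) -> List[Any]:
    """Return every well-formed JSON-LD payload found in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Malformed blocks are dropped rather than repaired, on the reasoning that an agent that can't parse a block gets nothing from it either.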
The honest caveats, because data journalism without caveats is marketing:
- A snapshot, not a panel. We scanned each store once. Stores change; a store's score on the day we read it isn't a permanent grade.
- Public surfaces only. We can't see how complete a store's data is inside Shopify Admin, only what reaches the rendered page. A store could have rich Admin data that never surfaces, and we'd score the surface.
- Heuristic trust detection. We detect trust signals by their machine-readable footprint (review schema, FAQPage markup, structured policy data). A store that states its return policy only in a footer paragraph has the policy but not the signal, and an agent treats it the same way we do: as absent.
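The footprint-based detection described in the last caveat can be sketched roughly as follows. The function takes already-parsed JSON-LD objects, however you extracted them, and looks for schema.org types. The mapping in `TRUST_TYPES` is an assumption about which types count as each signal, not the scanner's actual rule set.

```python
from typing import Any, Dict, Iterable, Set

# ASSUMPTION: one schema.org type per trust signal. A real detector may accept more.
TRUST_TYPES: Dict[str, str] = {
    "ratings": "AggregateRating",
    "faq": "FAQPage",
    "shipping": "OfferShippingDetails",
    "returns": "MerchantReturnPolicy",
}


def collect_types(node: Any) -> Set[str]:
    """Recursively gather every @type value in a parsed JSON-LD structure."""
    found: Set[str] = set()
    if isinstance(node, dict):
        declared = node.get("@type", [])
        if isinstance(declared, str):
            declared = [declared]
        if not isinstance(declared, list):
            declared = []
        found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            found |= collect_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_types(item)
    return found


def detect_trust_signals(jsonld_blocks: Iterable[Any]) -> Dict[str, bool]:
    """Report which trust signals have a machine-readable footprint."""
    types = collect_types(list(jsonld_blocks))
    return {signal: schema_type in types
            for signal, schema_type in TRUST_TYPES.items()}
```

A policy that exists only as prose never produces a matching `@type`, so it reports as `False` here, which is the behavior the caveat describes.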
With that on the table, here's what we found.
The grade distribution: almost half are A's, nobody fails
| Grade | Score range | Share of stores |
|---|---|---|
| A | 90-100 | 47.3% |
| B | 75-89 | 33.5% |
| C | 55-74 | 13.6% |
| D | 40-54 | 5.5% |
| F | below 40 | 0.0% |
Read top to bottom and the curve is reassuring. Four out of five stores land at B or better. The median is 89, one point under the A line, which means the typical store is a single fix away from the top grade.
Read bottom to top and it gets interesting. Zero stores scored an F. That's not a rounding artifact. It's structural. An F requires being effectively unreadable: uncrawlable, no product markup, no machine-readable identity. Modern Shopify ships crawlable HTML and baseline Product schema out of the box, so the floor is high whether the merchant did anything or not. The platform dragged the entire population up off the bottom.
Which raises the obvious question. If everyone's crawlable and almost half are A's, what are the 52.7% who aren't A's missing? The pillar data answers it.
The five pillars: four solved, one open
Here's the median score for each pillar across all 2,704 stores:
| Pillar | Median score |
|---|---|
| Discovery | 100 |
| Answer | 100 |
| Products | 86 |
| Identity | 80 |
| Trust | 67 |
Discovery and answer are effectively solved. A median of 100 means the typical store is fully crawlable and states its facts in a readable way. This is Shopify doing its job. The agent can reach you and can read you.
Products and identity are good and getting better. A median of 86 on products says most stores carry the core Product attributes an agent needs, with room to tighten identifiers and attribute completeness. Identity at 80 reflects stores that have a name and basic markup but often skip full Organization structure. These are the pillars where careful merchants pull ahead, and they're worth real work, especially product identifiers like GTINs, which we covered in GTINs and barcodes for AI shopping.
Trust is the outlier, and it's not close. A median of 67 means the typical store is missing roughly a third of the signals that make an agent comfortable recommending it. This is the pillar that decides grades. The stores stuck at B and C aren't failing on crawlability or markup. They're failing on trust.
The headline: everyone's crawlable, almost nobody's trustworthy
Put the two findings together and you get the reframe this whole study is built around.
The old mental model treated "AI readiness" as a crawlability problem. Can the bot reach me? Can it parse my products? For years that's where the work went: robots files, rendered HTML, valid schema. On Shopify, that problem is done. Discovery and answer both median 100. The platform handles it.
The problem that replaced it is trust. An agent that can perfectly read your catalog still has to decide whether to put you in front of a shopper. That decision runs on signals you mostly have to add yourself: ratings it can verify, an FAQ it can quote, shipping and returns terms it can state with confidence. Those signals are optional, app-driven, and therefore the place where stores diverge. A median of 67 means the field has barely started.
That's a strange and lucky position to be in. The hard, infrastructural part of AI readiness is free and automatic. The part that's still wide open is the part you can close in an afternoon with the right markup and a couple of apps. For the strategic version of why these signals matter to an agent's decision, see the Shopify agentic commerce guide.
The trust gap, decomposed
"Trust scored 67" isn't actionable on its own. Here's what's actually missing inside that pillar, in rough order of how often it's the culprit.
Ratings as data, not pixels. Plenty of stores show star ratings on the page. Far fewer expose them as machine-readable AggregateRating markup an agent can verify. A human sees four-and-a-half stars; a parser sees a background image and a number it can't trust. This is the most common single gap, and it's the cleanest to fix because most review apps already emit the markup. You just have to make sure it's on the page and not stripped by the theme.
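For reference, this is roughly the shape of ratings-as-data. The sketch below builds a schema.org Product with a nested AggregateRating as JSON-LD. The helper name and the sample values are placeholders, and in practice your review app usually generates this for you.

```python
import json


def product_with_rating(name: str, rating_value: float, review_count: int) -> str:
    """Build Product JSON-LD with a machine-readable AggregateRating."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
print(product_with_rating("Example Mug", 4.5, 128))
```

The output belongs inside a `<script type="application/ld+json">` tag on the product page; if the theme strips that tag, the stars are back to being pixels.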
FAQ structured data, almost universally absent. In our State of AI Shopping Readiness study, 91.6% of stores were missing FAQ structured data. An FAQ is the highest-density trust artifact you can give an agent: a list of literal question-answer pairs it can lift verbatim to answer a shopper. Most stores have FAQ content somewhere. Almost none expose it as FAQPage schema. If you fix one thing after reading this, fix this. The Shopify FAQ schema guide walks through the implementation.
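As a minimal sketch of what exposing an FAQ looks like, the helper below turns question-answer pairs into schema.org FAQPage JSON-LD. The function name and the sample question are hypothetical; the linked guide covers how to wire this into a theme.

```python
import json
from typing import List, Tuple


def faq_jsonld(pairs: List[Tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder content for illustration only.
print(faq_jsonld([("Do you ship internationally?", "Yes, to most countries.")]))
```

Each pair becomes a self-contained Question with an accepted Answer, which is what lets an agent quote it verbatim.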
Shipping terms an agent can state. "When will it arrive, and what will it cost?" is one of the first questions a shopping agent resolves. When shipping terms live only in a policy page paragraph, the agent has to guess or hedge. Structured OfferShippingDetails turns the guess into a fact.
Returns, same story. A return policy written for humans is invisible to a parser. A return window an agent can read becomes a reason to recommend you over a competitor whose terms are a black box. Agents are risk-averse on the shopper's behalf; a clear, machine-readable returns signal directly reduces the risk of recommending you.
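Shipping and returns can both hang off the same offer. The sketch below builds a schema.org Offer with OfferShippingDetails and a MerchantReturnPolicy. Using MerchantReturnPolicy for the returns signal is an assumption on our part, the helper name is hypothetical, and every value shown is a placeholder rather than a recommendation.

```python
import json
from typing import Tuple


def offer_with_policies(price: str, currency: str, shipping_cost: str,
                        country: str, transit_days: Tuple[int, int],
                        return_days: int) -> str:
    """Build Offer JSON-LD carrying structured shipping and return terms."""
    min_days, max_days = transit_days
    offer = {
        "@context": "https://schema.org",
        "@type": "Offer",
        "price": price,
        "priceCurrency": currency,
        "shippingDetails": {
            "@type": "OfferShippingDetails",
            "shippingRate": {
                "@type": "MonetaryAmount",
                "value": shipping_cost,
                "currency": currency,
            },
            "shippingDestination": {
                "@type": "DefinedRegion",
                "addressCountry": country,
            },
            "deliveryTime": {
                "@type": "ShippingDeliveryTime",
                "transitTime": {
                    "@type": "QuantitativeValue",
                    "minValue": min_days,
                    "maxValue": max_days,
                    "unitCode": "DAY",
                },
            },
        },
        "hasMerchantReturnPolicy": {
            "@type": "MerchantReturnPolicy",
            "applicableCountry": country,
            "returnPolicyCategory": "https://schema.org/MerchantReturnFiniteReturnWindow",
            "merchantReturnDays": return_days,
        },
    }
    return json.dumps(offer, indent=2)


# Placeholder terms for illustration only.
print(offer_with_policies("24.00", "USD", "5.00", "US", (3, 5), 30))
```

With this in place, "when will it arrive, what will it cost, and can I send it back?" each resolve to a field an agent can read instead of a paragraph it has to interpret.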
Notice the pattern. In every case the content usually exists. What's missing is the structure that turns human-readable prose into an agent-readable signal. That's why the trust gap is so closable: you're rarely creating policy, you're exposing it.
What a top-quartile store does differently
We compared the median store (89) against the top quartile (median 92) to see where the extra points come from. It is almost entirely trust.
Top-quartile stores aren't meaningfully more crawlable, because those pillars are already maxed for nearly everyone. They aren't dramatically better on product markup either; the gap there is a few points. The separation is trust signals. The stores at the top carry verifiable ratings markup, real FAQPage data, and structured shipping or returns terms. The stores in the middle have one or two of those, or have them as prose instead of data.
The practical read: if you're sitting at a B or a high C, you don't have an infrastructure problem and you don't need a replatform or a developer sprint. You have three or four specific trust signals to add. That's the whole gap. The free AI-readiness checker will name exactly which ones you're missing.
Reproduce this on your own store in two minutes
Every number in this study came from a public scan, which means you can run the same scan on your own storefront, or a competitor's, right now.
- Open the free AI-readiness checker and enter your store URL. It reads only public pages, the same surfaces an agent sees.
- Read the five pillar scores, not just the composite. If you're like the typical store in this study, discovery and answer will be high and trust will be your weak point.
- Start with the lowest pillar relative to its max, which for most Shopify stores is trust. Add verifiable ratings markup, expose your FAQ as FAQPage schema, and structure your shipping and returns terms.
- Re-scan. Trust fixes move the composite fast because the pillar starts so low, so a couple of changes can lift you from a B to an A.
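The "lowest pillar relative to its max" step in the list above is simple enough to sketch. The function name is hypothetical, and the sample scores are the study's pillar medians, not any particular store.

```python
from typing import Dict, List, Tuple


def rank_pillars(scores: Dict[str, float],
                 max_score: float = 100.0) -> List[Tuple[str, float]]:
    """Return pillars sorted by points missing from the maximum, largest gap first."""
    gaps = [(name, max_score - value) for name, value in scores.items()]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Pillar medians from this study.
medians = {"discovery": 100, "answer": 100, "products": 86, "identity": 80, "trust": 67}
for pillar, gap in rank_pillars(medians):
    print(f"{pillar}: {gap:.0f} points below max")
```

For the median store, trust comes out first with a 33-point gap, followed by identity at 20 and products at 14.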
The benchmark says the floor is high and the ceiling is reachable. Almost half of Shopify stores already clear it, and the ones that don't are usually held back by a single pillar: trust. Run the free AI-readiness checker on your store, see where your trust signals stand against the field, and close the gap that's actually open.
