Ray Iyer
Co-founder, Anglera

Adding Product JSON-LD on a headless storefront — and keeping it in sync

How to add schema.org Product JSON-LD to a headless storefront, which fields matter most, and how to keep markup in sync with the rendered page.

Once your product data is enriched (accurate names, brand, identifiers, specs, and use-cases), the remaining problem is mechanical: getting that data onto the page as structured data that search crawlers and AI agents can both parse reliably. On a headless storefront (Next.js, Remix, Nuxt, Astro, Shopify Hydrogen, or a custom React/Vue front end talking to Shopify's Storefront API, commercetools, BigCommerce, or a similar API-first backend), there's no theme layer auto-injecting JSON-LD, so it has to be built and maintained deliberately. Here's how to do that and keep it from drifting out of sync with what shoppers actually see.

Where the JSON-LD should live

Google recommends JSON-LD over microdata or RDFa because it's easier to maintain at scale, and Googlebot does process structured data injected dynamically via JavaScript once it renders the page. But that rendering is a second, queued pass, not what happens on first crawl. Most AI crawlers (the ones behind shopping agents and answer engines, as opposed to Googlebot) don't execute JavaScript at all; they only ever see the initial HTML response. On a headless stack, the safest default is therefore to emit the JSON-LD during server-side rendering or static generation, not to inject it client-side after hydration. If your framework supports SSR or SSG for the product route (Next.js server components, Remix loaders, Nuxt server rendering, or Astro pages rather than deferred server islands, whose content arrives after the initial HTML), emit the JSON-LD there, from a small server-side function that takes your normalized product object and returns the payload. Call it from the same data-fetching path that renders the page body, not from a separate client-side call. Same source, same render pass: that's what prevents most sync bugs.
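A minimal sketch of that wiring in TypeScript. The `NormalizedProduct` shape and the `fetchProduct` stub are assumptions made for illustration, not part of any framework or commerce API; the only point is that the page props and the JSON-LD payload come from the same record in the same server-side pass.

```typescript
// Hypothetical normalized product shape; adapt field names to your data layer.
interface NormalizedProduct {
  name: string;
  images: string[];
  url: string;
  price: number;
  currency: string; // ISO 4217, e.g. "USD"
  inStock: boolean;
}

// Stand-in for a real commerce/PIM client (an assumption, not a real API).
async function fetchProduct(slug: string): Promise<NormalizedProduct> {
  return {
    name: "Trailhead 32L Daypack",
    images: ["https://example.com/images/trailhead-32l-1x1.jpg"],
    url: `https://example.com/products/${slug}`,
    price: 129,
    currency: "USD",
    inStock: true,
  };
}

// Pure function: normalized product in, JSON-LD payload out.
function toProductJsonLd(p: NormalizedProduct): Record<string, unknown> {
  return {
    "@context": "https://schema.org/",
    "@type": "Product",
    name: p.name,
    image: p.images,
    offers: {
      "@type": "Offer",
      url: p.url,
      price: p.price,
      priceCurrency: p.currency,
      availability: p.inStock
        ? "https://schema.org/InStock"
        : "https://schema.org/OutOfStock",
    },
  };
}

// Server-side loader: one fetch feeds both the page body and the markup.
async function loadProductPage(slug: string) {
  const product = await fetchProduct(slug);
  return { product, jsonLd: toProductJsonLd(product) };
}
```

In a real route, `loadProductPage` would be whatever your framework calls a loader or server component data function; the page component renders `product`, and the layout serializes `jsonLd` into the document head.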

Which fields actually matter

Google distinguishes two markup profiles under the same Product type: product snippets (pages where the product can't be bought directly, e.g., an editorial page) and merchant listing markup (actual purchase pages, which almost every retailer PDP is). For merchant listing eligibility, the required fields are name, image, and a nested offers object with price and priceCurrency. Everything else is "recommended," but in practice determines whether Google, and AI shopping agents parsing the same markup, can identify and rank your product:

  • name — should match the on-page H1 exactly, not a truncated or keyword-stuffed variant.
  • brand — a nested Brand object; this is one of the signals Google and shopping-focused AI agents use to match your listing to a known product entity rather than treating it as generic.
  • sku — your internal identifier. Useful for your own systems but not a cross-retailer identifier.
  • gtin (or gtin8/gtin12/gtin13/gtin14/isbn) — the actual global identifier (UPC/EAN/ISBN). This lets Google and AI agents match your product to the same item sold elsewhere, which matters for comparison-shopping surfaces and price-comparison answers. Omit it if you genuinely don't have one (private-label items often don't), but don't fabricate one — Google treats structured data that doesn't match reality as a policy violation.
  • mpn — manufacturer part number, useful alongside GTIN for durable goods and electronics.
  • offers — a single Offer with price, priceCurrency (ISO 4217, e.g., USD), availability (a schema.org ItemAvailability value like https://schema.org/InStock), and url. Google's merchant listing markup requires Offer specifically — AggregateOffer is only accepted on product-snippet pages, not on a page where the product is actually for sale. If your PDP defaults to one variant, offers.price should reflect that variant, not the catalog's lowest or first price. For a page listing multiple variants, model it as a ProductGroup with hasVariant, giving each variant its own Product and its own single Offer, per Google's product-variants documentation.
  • aggregateRating — only include this if you have real, on-page reviews. It requires ratingValue and reviewCount (or ratingCount), plus bestRating/worstRating if your scale isn't 1–5. This is the field most often flagged in Search Console because it's easiest to let drift out of sync (see below).
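The optional fields above are where blanks and guesses tend to creep in, so it can help to make omission explicit. The sketch below is one way to do that in TypeScript; the `Identifiers` and `ReviewSummary` shapes and the helper names are hypothetical. It picks the GTIN property by digit count, leaves out anything the source doesn't have, and emits aggregateRating only when reviews exist.

```typescript
// Hypothetical inputs; names are illustrative, not from a specific PIM.
interface Identifiers {
  sku?: string;
  mpn?: string;
  gtin?: string; // digits only, as stored at the source
}

interface ReviewSummary {
  average: number;
  count: number;
  scaleMin: number;
  scaleMax: number;
}

// Choose the schema.org GTIN property that matches the digit count.
// Anything that is not 8, 12, 13, or 14 digits is omitted rather than guessed.
function gtinNode(gtin?: string): Record<string, string> {
  if (!gtin || !/^\d+$/.test(gtin)) return {};
  switch (gtin.length) {
    case 8: return { gtin8: gtin };
    case 12: return { gtin12: gtin };
    case 13: return { gtin13: gtin };
    case 14: return { gtin14: gtin };
    default: return {};
  }
}

// Emit aggregateRating only when there are real reviews to back it.
function ratingNode(r?: ReviewSummary): Record<string, unknown> {
  if (!r || r.count <= 0) return {};
  const node: Record<string, unknown> = {
    "@type": "AggregateRating",
    ratingValue: r.average,
    reviewCount: r.count,
  };
  // bestRating/worstRating are only added for a scale other than 1 to 5.
  if (r.scaleMin !== 1 || r.scaleMax !== 5) {
    node["bestRating"] = r.scaleMax;
    node["worstRating"] = r.scaleMin;
  }
  return { aggregateRating: node };
}

// Collect only the optional properties that have real values behind them.
function optionalFields(ids: Identifiers, brand?: string, reviews?: ReviewSummary) {
  return {
    ...(ids.sku ? { sku: ids.sku } : {}),
    ...(ids.mpn ? { mpn: ids.mpn } : {}),
    ...gtinNode(ids.gtin),
    ...(brand ? { brand: { "@type": "Brand", name: brand } } : {}),
    ...ratingNode(reviews),
  };
}
```

The result is meant to be spread into the Product object next to the required name, image, and offers fields, so a missing GTIN or an empty review set simply produces no property at all.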

Populating these fields without hand-authoring per SKU

None of this should be hand-written per product page. The JSON-LD generator should read from the same normalized product record your page component renders from — typically whatever your PIM or commerce API returns after enrichment, mapped once into a Product shape at the data layer. If your PIM stores GTIN/UPC, brand, and structured attributes as first-class fields, the mapping is closer to a rename than a transform. If those fields are inconsistently populated, the JSON-LD will inconsistently reflect that — structured data can't invent identifiers or specs your PIM doesn't have.
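As a rough illustration of "closer to a rename than a transform," here is a sketch of that one-time mapping in TypeScript. The `PimRecord` field names are invented for the example; substitute whatever your PIM or commerce API actually returns.

```typescript
// Hypothetical raw record as a PIM or commerce API might return it.
interface PimRecord {
  title: string;
  brand_name?: string;
  upc?: string;
  part_number?: string;
  item_sku: string;
  image_urls: string[];
}

// Mapped shape that the page component and the JSON-LD generator both read.
// Identifier fields are undefined when the source has no value.
interface MappedProduct {
  name: string;
  brand: string | undefined;
  gtin: string | undefined;
  mpn: string | undefined;
  sku: string;
  images: string[];
}

// Treat empty or whitespace-only strings as missing so blanks never ship.
function present(value?: string): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Mostly a rename; nothing is invented for fields the PIM leaves empty.
function mapPimRecord(record: PimRecord): MappedProduct {
  return {
    name: record.title.trim(),
    brand: present(record.brand_name),
    gtin: present(record.upc),
    mpn: present(record.part_number),
    sku: record.item_sku,
    images: record.image_urls,
  };
}
```

Because missing values stay `undefined` instead of becoming empty strings, the JSON-LD generator downstream can omit those properties rather than emit blanks.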

Keeping JSON-LD in sync with the rendered page

This is where most headless implementations break, usually invisibly. The failure modes to guard against:

  • Divergent data sources. If price/availability come from a live inventory API for the visible page, but the JSON-LD was generated from a cached or stale feed, they'll disagree during flash sales or stockouts. Google's structured data policies require markup to be "a true representation of the page content" — a mismatch like this is exactly what gets flagged.
  • Rating drift. If a live review widget loads client-side from a reviews platform but the JSON-LD aggregateRating was baked in at build time, the two will disagree as new reviews come in. Regenerate aggregateRating from the same reviews API the widget reads, on the same render cycle.
  • Variant mismatches. If your PDP defaults to a specific variant (size/color) based on the URL or a query param, the JSON-LD should reflect that variant's price and SKU, not the catalog's cheapest or first variant.
  • Client-side-only injection. Avoid patterns that inject the JSON-LD script tag after the page has already painted — it works for a browser, but many AI crawlers and some Googlebot passes won't wait for it.

The simplest guard rail is architectural: write one function that both the visible price/name/availability UI and the JSON-LD generator call, so there is exactly one source of truth per field, not two implementations that can silently diverge.
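One way to sketch that guard rail in TypeScript is below. The `Product`, `Variant`, and `DisplayFacts` types and the function names are assumptions for illustration. A single resolver decides which variant the shopper is looking at, and both the visible price label and the JSON-LD Offer read from its result.

```typescript
interface Variant {
  id: string;
  sku: string;
  price: number;
  inStock: boolean;
}

interface Product {
  name: string;
  url: string;
  currency: string; // ISO 4217
  defaultVariantId: string;
  variants: Variant[];
}

// The single source of truth for what the page claims about the product.
interface DisplayFacts {
  name: string;
  sku: string;
  price: number;
  currency: string;
  inStock: boolean;
  url: string;
}

// Resolve the requested variant (e.g. from a query param); fall back to the default.
function resolveFacts(product: Product, requestedVariantId?: string): DisplayFacts {
  const variant =
    product.variants.find((v) => v.id === requestedVariantId) ??
    product.variants.find((v) => v.id === product.defaultVariantId);
  if (!variant) throw new Error(`No variant to display for ${product.name}`);
  return {
    name: product.name,
    sku: variant.sku,
    price: variant.price,
    currency: product.currency,
    inStock: variant.inStock,
    url: product.url,
  };
}

// Consumer 1: the visible price label.
function priceLabel(f: DisplayFacts): string {
  return `${f.price.toFixed(2)} ${f.currency}`;
}

// Consumer 2: the JSON-LD Offer, built from the same facts.
function offerJsonLd(f: DisplayFacts) {
  return {
    "@type": "Offer",
    url: f.url,
    price: f.price,
    priceCurrency: f.currency,
    availability: f.inStock
      ? "https://schema.org/InStock"
      : "https://schema.org/OutOfStock",
  };
}
```

With this shape, a variant or pricing change can only reach the page through `resolveFacts`, so the UI and the markup cannot disagree unless one of them bypasses it.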

A worked example

{
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Trailhead 32L Daypack",
  "image": [
    "https://example.com/images/trailhead-32l-1x1.jpg",
    "https://example.com/images/trailhead-32l-4x3.jpg",
    "https://example.com/images/trailhead-32l-16x9.jpg"
  ],
  "description": "32-liter daypack with a hydration sleeve, hip-belt pockets, and a rain cover, built for single-day hikes and light overnights.",
  "sku": "TH-32L-GRN",
  "mpn": "TH32-GRN-001",
  "gtin13": "0810055551234",
  "brand": {
    "@type": "Brand",
    "name": "Trailhead Gear"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://example.com/products/trailhead-32l-daypack",
    "priceCurrency": "USD",
    "price": 129.00,
    "availability": "https://schema.org/InStock",
    "itemCondition": "https://schema.org/NewCondition"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": 4.6,
    "reviewCount": 214
  }
}

Render this inside a <script type="application/ld+json"> tag in the page's server-rendered output.
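A small sketch of that serialization step in TypeScript, assuming you are building the tag as a string on the server. The function name `jsonLdScriptTag` is hypothetical. The one detail worth getting right is escaping "<", so that a product description containing "</script>" cannot close the tag early.

```typescript
// Serialize a JSON-LD payload into a script tag for server-rendered HTML.
// Escaping "<" as \u003c keeps a value such as "</script>" inside a field
// from terminating the tag; JSON parsers decode the escape back to "<".
function jsonLdScriptTag(payload: object): string {
  const json = JSON.stringify(payload).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

In React-based frameworks the same escaped string is typically passed to a script element through `dangerouslySetInnerHTML`, since JSX text children would otherwise be HTML-escaped.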

How to validate

  • View source vs. rendered DOM. Run curl -s https://example.com/products/trailhead-32l-daypack | grep -A 40 "application/ld+json" (or view your framework's SSR output directly) to confirm the JSON-LD ships in the initial HTML response, not just in the browser's rendered DOM. If it only shows in the dev tools' "Elements" panel but not in "View Source" or curl output, it's being injected client-side and needs to move server-side.
  • Rich Results Test. Paste the live URL into Google's Rich Results Test to see it fetched and rendered the way Googlebot does, or paste the raw JSON-LD snippet for a quick syntax and required-field check before deploying. The tool checks syntax and required properties — it does not check whether aggregateRating or price actually match what's rendered on the page, so a green check isn't proof of sync; that's a separate audit against the live DOM.
  • Search Console. Once indexed, the Merchant listings report surfaces field-level warnings across your catalog (missing GTIN, mismatched availability, etc.) at scale, which a one-off test on a single URL won't catch.
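Since none of these tools confirm that the markup matches the rendered page, a small sync audit can fill the gap. The TypeScript sketch below is one possible shape, under stated assumptions: the HTML is the raw response body (as curl would return it), the visible price is supplied by the caller from wherever you read the rendered page, and only a top-level Product node with a single Offer is handled (no @graph or arrays). The function names are hypothetical.

```typescript
// Pull every JSON-LD payload out of a raw HTML string.
function extractJsonLd(html: string): unknown[] {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const payloads: unknown[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    try {
      payloads.push(JSON.parse(match[1] ?? ""));
    } catch {
      // Skip blocks that are not valid JSON; a fuller audit would report them.
    }
  }
  return payloads;
}

// What the shopper sees, gathered separately by the caller (assumption).
interface VisibleFacts {
  price: number;
  currency: string;
}

// Compare the Product offer in the markup with what the page displays.
function findPriceMismatches(html: string, visible: VisibleFacts): string[] {
  const problems: string[] = [];
  for (const payload of extractJsonLd(html)) {
    if (typeof payload !== "object" || payload === null) continue;
    const node = payload as {
      "@type"?: string;
      offers?: { price?: number | string; priceCurrency?: string };
    };
    if (node["@type"] !== "Product" || !node.offers) continue;
    if (Number(node.offers.price) !== visible.price) {
      problems.push(`price: markup ${node.offers.price}, page ${visible.price}`);
    }
    if (node.offers.priceCurrency !== visible.currency) {
      problems.push(
        `currency: markup ${node.offers.priceCurrency}, page ${visible.currency}`
      );
    }
  }
  return problems;
}
```

An empty result means the offer in the initial HTML agrees with the visible price; the same pattern extends to availability and aggregateRating if you collect those from the rendered page as well.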

Verified as of July 2026 against Google's Search Central documentation for Product structured data. These field requirements are Google-specific rich-result rules, not schema.org requirements — schema.org itself doesn't enforce required properties, so recheck Google's pages directly for any given rich-result feature, since they evolve.

None of this JSON-LD is useful if the underlying fields — GTIN, brand, structured attributes, use-case descriptions — aren't populated and current in your PIM to begin with. That's the half of this problem Anglera handles: it continuously enriches product data at the source, so whatever mapping layer you build here has real, accurate values to render instead of blanks.

About the author

Ray Iyer, Co-founder, Anglera

Ray is a co-founder of Anglera, building the product-data infrastructure for agentic commerce — turning messy catalogs into structured, AI-readable data that buyers and answer engines can find. Previously product at Uber; Stanford CS.
