Matching intent to SKU: how attributes turn a search into the right product

Why "waterproof hiking boot size 10 wide" fails at the exact moment of intent, and the search metrics that show you where attributes are missing.

A buyer doesn't type "boot." They type "waterproof hiking boot, size 10, wide." Every clause in that query is a bid for an attribute — material, category, size, width — and your catalog either has a value sitting there ready to match it or it doesn't. When it doesn't, the buyer doesn't get "no results." They get the wrong result, or a result that looks right until they open the PDP and find out it isn't. That's the moment intent dies, and it's measurable, attribute by attribute, if you know where to look.

Queries are attribute bundles, not keywords

Modern site search and faceted navigation don't match strings; they match structured fields. A query like "stainless steel 30-inch gas range" is really four filters stacked in one box: material = stainless steel, width = 30in, fuel_type = gas, category = range. If any one of those attributes is blank, null, or stored under an inconsistent label on a given SKU, that SKU silently drops out of the candidate set, even if it's the exact product the buyer wants.
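To make the drop-out concrete, here is a minimal Python sketch of strict attribute matching. The catalog rows, SKU IDs, and the `matches` and `missing_attributes` helpers are hypothetical and only illustrate the mechanism; real search engines are more forgiving than an exact-equality filter, but the failure mode is the same.

```python
# Hypothetical parsed query: each clause becomes a required attribute value.
query_filters = {
    "material": "stainless steel",
    "width": "30in",
    "fuel_type": "gas",
    "category": "range",
}

# Hypothetical catalog rows. SKU-B is the right product, but its width is blank.
catalog = [
    {"sku": "SKU-A", "material": "stainless steel", "width": "30in",
     "fuel_type": "gas", "category": "range"},
    {"sku": "SKU-B", "material": "stainless steel", "width": None,
     "fuel_type": "gas", "category": "range"},
    {"sku": "SKU-C", "material": "stainless steel", "width": "30in",
     "fuel_type": "electric", "category": "range"},
]


def matches(product, filters):
    """A product matches only if every requested attribute is present and equal."""
    return all(product.get(attr) == value for attr, value in filters.items())


def missing_attributes(product, filters):
    """Requested attributes that are blank or absent on this product."""
    return [attr for attr in filters if not product.get(attr)]


candidates = [p["sku"] for p in catalog if matches(p, query_filters)]

# SKUs excluded because of missing data rather than a genuine mismatch.
dropped_for_missing_data = {
    p["sku"]: missing_attributes(p, query_filters)
    for p in catalog
    if missing_attributes(p, query_filters)
}

print(candidates)
print(dropped_for_missing_data)
```

SKU-C is excluded for a legitimate reason (it is electric). SKU-B is excluded only because nobody filled in its width, which is the case worth counting.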

Baymard Institute's research on filter UI makes the mechanism explicit: users "instinctively expect a corresponding filter for any key attribute displayed in a product listing," and category-specific attributes (RAM and processor speed for electronics, fit for apparel, delivery options for home improvement) are what separate a usable filter set from a generic one. Their audits also found that naming inconsistency ("Colour" vs. "Color" vs. "Shade") undermines buyer confidence in the filter itself, and that a meaningful share of sites bury or hide filters that should exist because the underlying attribute data isn't clean enough to expose reliably. (baymard.com)

This is the taxonomy problem hiding inside a "search" problem. Your search engine is not broken. Your attribute schema is incomplete, inconsistently populated, or inconsistently normalized across suppliers — and search just happens to be where the gap becomes visible to a buyer with money in hand.

Why the gap costs more than a bad search result

A missed attribute match doesn't just cost one session. Algolia's aggregated e-commerce data shows that visitors who use on-site search convert at a meaningfully higher rate than average site visitors (4.63% vs. 2.77% in one large sample), and that when search succeeds the overwhelming majority of shoppers buy the item they searched for, often adding more to the basket. The same data shows the flip side: roughly a fifth of searchers who don't find what they want refine their query, and a similar share exit the site from the search results page entirely. (algolia.com)

Refinement isn't neutral behavior — it's the buyer doing your data cleanup for you, in real time, on a page where they're one click from a competitor. Every refinement is a vote that the first result set didn't honor the attributes they specified. If your logs show refinement clustering around specific queries ("waterproof," "wide," "compatible with"), that's a pointer straight at which attribute is thin or missing across the catalog.
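One way to find those clusters is to count the terms buyers add when they refine. The sketch below assumes a hypothetical log shape (an ordered list of queries per session) and uses plain whitespace tokenization, so treat it as a starting point rather than a finished pipeline.

```python
from collections import Counter

# Hypothetical search log: ordered queries per session.
sessions = {
    "s1": ["hiking boot", "waterproof hiking boot"],
    "s2": ["hiking boot size 10", "hiking boot size 10 wide"],
    "s3": ["trail shoe", "waterproof trail shoe", "waterproof trail shoe wide"],
    "s4": ["gas range"],  # no refinement in this session
}


def added_terms(previous, current):
    """Terms present in the refined query but not in the one before it."""
    return set(current.lower().split()) - set(previous.lower().split())


refinement_terms = Counter()
for queries in sessions.values():
    # Compare each query with the one that immediately preceded it.
    for previous, current in zip(queries, queries[1:]):
        refinement_terms.update(added_terms(previous, current))

for term, count in refinement_terms.most_common():
    print(term, count)
```

Terms that keep getting added after the first search are the attributes the first result set failed to honor.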

The four numbers that show you where the match breaks

You don't need a new analytics stack to see this — most site search and analytics tools already log the raw events. The work is connecting them to specific attributes instead of treating "search" as one undifferentiated funnel stage.

Metric	What it shows	How to measure it
Search CTR (click-through on results)	Whether the top results actually match query intent	Clicks on result items ÷ searches, segmented by query pattern, in your search analytics or GA4 event stream
Filter usage rate	Whether buyers trust filters enough to narrow with them, and whether the right facets even exist	Sessions with ≥1 filter applied ÷ total search/category sessions, from your search platform's facet-click events
Query refinement rate	Where the first attempt failed to honor stated intent	Sequential searches within a session ÷ total searches; segment by query term and look for spikes to find the attribute gap
Result-to-PDP rate	Whether the result set is credible enough to open	PDP views from search results ÷ total search result impressions, by category or attribute filter combo
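As a rough illustration, the four ratios in the table can be computed from a flat event log. The event names (`search`, `filter_applied`, `result_click`, `pdp_view_from_search`) and the `results_shown` field are assumptions made for this sketch; map them to whatever your search platform or GA4 export actually emits.

```python
from collections import defaultdict

# Hypothetical event log; the field and event names are illustrative only.
events = [
    {"session": "s1", "type": "search", "results_shown": 10},
    {"session": "s1", "type": "filter_applied"},
    {"session": "s1", "type": "search", "results_shown": 5},
    {"session": "s1", "type": "result_click"},
    {"session": "s1", "type": "pdp_view_from_search"},
    {"session": "s2", "type": "search", "results_shown": 10},
    {"session": "s3", "type": "search", "results_shown": 15},
    {"session": "s3", "type": "result_click"},
    {"session": "s3", "type": "pdp_view_from_search"},
]


def ratio(numerator, denominator):
    """Safe division: return 0.0 when there is nothing to divide by."""
    return numerator / denominator if denominator else 0.0


def search_funnel_metrics(events):
    searches_per_session = defaultdict(int)
    filter_sessions = set()
    clicks = pdp_views = impressions = 0

    for event in events:
        kind = event["type"]
        if kind == "search":
            searches_per_session[event["session"]] += 1
            impressions += event.get("results_shown", 0)
        elif kind == "filter_applied":
            filter_sessions.add(event["session"])
        elif kind == "result_click":
            clicks += 1
        elif kind == "pdp_view_from_search":
            pdp_views += 1

    total_searches = sum(searches_per_session.values())
    # Every search after the first in a session counts as a refinement.
    refinements = sum(count - 1 for count in searches_per_session.values())
    # Simplification: only sessions with a search are in the denominator.
    filtering = sum(1 for s in searches_per_session if s in filter_sessions)

    return {
        "search_ctr": ratio(clicks, total_searches),
        "filter_usage_rate": ratio(filtering, len(searches_per_session)),
        "query_refinement_rate": ratio(refinements, total_searches),
        "result_to_pdp_rate": ratio(pdp_views, impressions),
    }


metrics = search_funnel_metrics(events)
print(metrics)
```

To tie the numbers to attributes, run the same function over subsets of events grouped by query pattern or category instead of over the whole log.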

Read these together, not in isolation. High filter usage with a high refinement rate on the same category is a strong signal that the filters exist in the UI but the underlying attribute values behind them are sparse — buyers are trying to narrow and getting punished for it. A low result-to-PDP rate on a specific query cluster (say, anything mentioning a compatibility spec) usually means that attribute isn't populated widely enough for the engine to surface confident matches, so it's either returning near-misses or padding results with loosely related SKUs.

Zero-result and near-zero-result queries deserve their own weekly pull. Baymard's broader filtering research and general site-search benchmarking both point to the same pattern: a meaningful share of "no results" and "poor match" outcomes trace back to attribute coverage gaps rather than search technology limits. Mapping each of your top zero- and low-result queries to a missing or inconsistent attribute turns a vague "search is bad" complaint into a prioritized backlog.
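A minimal sketch of that mapping, assuming a weekly export of (query, result count, search volume) rows and a hand-maintained term-to-attribute lookup. The lookup, the threshold, and the sample rows are all hypothetical.

```python
from collections import Counter

# Hypothetical lookup from query terms to the attribute they ask about.
TERM_TO_ATTRIBUTE = {
    "waterproof": "material",
    "wide": "width",
    "narrow": "width",
    "compatible": "compatibility",
}

# Hypothetical weekly export: (query, result_count, search_volume).
weekly_queries = [
    ("waterproof hiking boot wide", 0, 120),
    ("compatible with model x filter", 1, 80),
    ("hiking boot", 240, 900),
    ("wide trail shoe", 0, 60),
]

LOW_RESULT_THRESHOLD = 2  # assumption: this many results or fewer is "low"


def attribute_backlog(queries, threshold=LOW_RESULT_THRESHOLD):
    """Rank attributes by the search volume of the failing queries that mention them."""
    demand = Counter()
    for query, result_count, volume in queries:
        if result_count > threshold:
            continue  # the query returned enough results; not a gap signal
        attributes = {
            TERM_TO_ATTRIBUTE[term]
            for term in query.lower().split()
            if term in TERM_TO_ATTRIBUTE
        }
        for attribute in attributes:
            demand[attribute] += volume
    return demand.most_common()


backlog = attribute_backlog(weekly_queries)
print(backlog)
```

Weighting by search volume rather than by query count keeps one long-tail phrasing from outranking an attribute that hundreds of buyers ask for.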

Where the fix actually lives

None of this is a search engine problem to solve with better ranking weights alone. Ranking can only work with the attribute values it's given. If width is populated on 40% of your footwear SKUs and material is spelled three different ways across two supplier feeds, no amount of relevance tuning recovers the other 60% of matches. The fix is upstream: consistent attribute schemas, complete values extracted from source documents rather than left blank, and normalization across every naming variant a supplier or catalog import introduces.
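The two upstream checks, label normalization and coverage, can be sketched in a few lines of Python. The alias table and supplier rows below are made up for illustration, and this shows the general idea only, not how any particular PIM or enrichment tool implements it.

```python
# Hypothetical alias table: every supplier label maps to one canonical name.
LABEL_ALIASES = {
    "colour": "color",
    "shade": "color",
    "color": "color",
    "width": "width",
}

# Hypothetical rows from two supplier feeds with inconsistent labels and blanks.
supplier_rows = [
    {"sku": "B1", "attributes": {"Colour": "Brown", "Width": "Wide"}},
    {"sku": "B2", "attributes": {"Shade": "brown ", "Width": ""}},
    {"sku": "B3", "attributes": {"Color": "BROWN"}},
    {"sku": "B4", "attributes": {"color": "black", "width": "wide"}},
    {"sku": "B5", "attributes": {"Color": ""}},
]


def normalize_attributes(raw_attributes):
    """Map label variants to canonical names, clean values, and drop blanks."""
    normalized = {}
    for label, value in raw_attributes.items():
        key = label.strip().lower()
        key = LABEL_ALIASES.get(key, key)
        cleaned = value.strip().lower() if isinstance(value, str) else value
        if cleaned:  # blank strings and None do not count as populated
            normalized[key] = cleaned
    return normalized


def coverage(records, attribute):
    """Share of records that have a non-blank value for the attribute."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record["attributes"].get(attribute))
    return filled / len(records)


normalized_rows = [
    {"sku": row["sku"], "attributes": normalize_attributes(row["attributes"])}
    for row in supplier_rows
]

print("color coverage:", coverage(normalized_rows, "color"))
print("width coverage:", coverage(normalized_rows, "width"))
```

Normalization recovers matches that already exist under a different label. Coverage shows what is still blank and has to be filled from source documents.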

That's the layer Anglera works at. Your PIM stores the attribute — Anglera does the work of finding it in the source documentation, filling the gap, scoring the confidence, and keeping it consistent as new SKUs and suppliers arrive, without requiring a rip-and-replace of your PIM or search stack. Most teams can be live in about 30 days, starting from whatever flat file or feed they already have. Measuring the funnel above tells you exactly which attributes are worth fixing first — matching intent to SKU is a data problem before it's ever a search problem, and it shows up in the numbers long before anyone complains.
