Scoring product-data quality so it improves instead of decaying

How to score product-data quality across completeness, consistency, accuracy, and richness, set a real bar, and keep catalogs improving instead of decaying.

Most catalog cleanups follow the same arc: a big enrichment push, a temporary spike in quality, then a slow slide back toward chaos as new SKUs, supplier updates, and marketplace exports pile back on. The problem isn't that teams don't care about data quality. It's that they measure it once, as a project, instead of scoring it continuously, as a metric. Here's how to build a scoring system that actually holds the line.

Why catalogs decay by default

Product data isn't static, even when nobody touches it. Suppliers revise spec sheets, categories get restructured, new attributes become mandatory for a channel, and last quarter's "complete" record quietly falls behind. This mirrors the broader pattern of data decay across business systems, where records erode continuously unless something actively maintains them (Object Edge). A PIM doesn't stop this. A PIM stores whatever was true (or good enough) on the day someone entered it. It has no opinion about whether that record still meets the bar six months later, or whether the bar itself has moved because a retailer or an AI answer engine now expects more.

That's the core distinction worth internalizing: your PIM is a system of record, not a system of quality. Scoring has to sit on top of it.

The four dimensions worth scoring

Data quality literature converges on a consistent set of dimensions, and for product data specifically, four map cleanly to buyer and channel needs (Atlan, GS1):

Dimension	What it measures	Example failure
Completeness	Are required and channel-specific fields populated?	Marketplace requires 8 bullet points; record has 3
Consistency	Does the same attribute match across SKUs, categories, and systems?	"Voltage" stored as `24V`, `24 volts`, and `24-Volt` in the same category
Accuracy	Do values match the true spec, not just something plausible?	Weight copied from a similar SKU during a rushed import
Richness	Is there enough structured, buyer-relevant detail to answer real questions?	Dimensions listed, but no material, load rating, or compatibility data

Retail data-quality programs, including GS1's, treat physical attributes and net content as high-stakes fields precisely because errors there cascade into returns, compliance issues, and even GTIN reassignment requirements (GS1 US). Distributors should treat their highest-return, highest-search categories the same way: score them harder than the long tail.
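The consistency failure in the table above (one voltage written three ways) is the kind of defect a small normalizer can surface. Here is a minimal sketch in Python, assuming a canonical form of `<number> V`. The pattern, function names, and sample values are illustrative assumptions, not a standard or any particular tool's behavior.

```python
import re
from collections import defaultdict
from typing import Optional

# Assumed canonical form: "<number> V". The accepted spellings are illustrative, not exhaustive.
_VOLTAGE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*-?\s*(v|volt|volts)\s*$", re.IGNORECASE)


def normalize_voltage(raw: str) -> Optional[str]:
    """Return a canonical voltage string, or None if the value is not recognized."""
    match = _VOLTAGE.match(raw)
    if match is None:
        return None
    return f"{float(match.group(1)):g} V"


def find_variants(values: list) -> dict:
    """Group raw spellings by canonical value; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        canonical = normalize_voltage(value)
        if canonical is not None:
            groups[canonical].add(value)
    return {key: spellings for key, spellings in groups.items() if len(spellings) > 1}


# Hypothetical values pulled from one category
category_values = ["24V", "24 volts", "24-Volt", "12V"]
variants = find_variants(category_values)
```

With these sample values, `variants` holds a single group for `24 V` containing all three spellings, which is the list a reviewer or an automated fix would work from. Values the pattern does not recognize return `None` and are better routed to review than silently rewritten.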

What a score actually looks like

A useful score is not a single number pulled from a vibe. It's a weighted composite per SKU, rolled up by category, brand, and supplier, so you can see where the catalog is actually weak instead of guessing.

For example, a mid-tier scoring model might weight completeness and accuracy higher for categories with high return rates or high search volume, and weight richness higher for categories where buyers compare technical specs before purchase (industrial components, electronics, safety equipment). The output isn't "94% complete" as a vanity metric. It's a ranked list: these 400 SKUs are below the bar, here's why, here's the fastest fix.
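To make the weighting concrete, here is a rough sketch of a per-SKU composite with a category rollup and a ranked below-the-bar list. The weight profiles, the 0.85 bar, the SKU identifiers, and the dimension scores are all hypothetical values chosen for illustration. A real model would tune them per category.

```python
from dataclasses import dataclass

DIMENSIONS = ("completeness", "consistency", "accuracy", "richness")

# Hypothetical weight profiles; each sums to 1.0.
WEIGHTS = {
    "high_return": {"completeness": 0.35, "consistency": 0.15, "accuracy": 0.35, "richness": 0.15},
    "spec_compared": {"completeness": 0.25, "consistency": 0.15, "accuracy": 0.25, "richness": 0.35},
}


@dataclass
class SkuScore:
    sku: str
    category: str
    profile: str
    dims: dict  # dimension name -> score in [0, 1]

    def composite(self) -> float:
        weights = WEIGHTS[self.profile]
        return sum(weights[d] * self.dims[d] for d in DIMENSIONS)

    def weakest(self) -> str:
        return min(DIMENSIONS, key=lambda d: self.dims[d])


def below_bar(scores, bar=0.85):
    """Return (sku, composite, weakest dimension) for SKUs under the bar, worst first."""
    failing = [(s.sku, round(s.composite(), 3), s.weakest()) for s in scores if s.composite() < bar]
    return sorted(failing, key=lambda row: row[1])


def category_rollup(scores):
    """Average composite score per category."""
    by_category = {}
    for s in scores:
        by_category.setdefault(s.category, []).append(s.composite())
    return {cat: round(sum(vals) / len(vals), 3) for cat, vals in by_category.items()}


# Made-up sample records
scores = [
    SkuScore("TW-100", "hand-tools", "high_return",
             {"completeness": 0.6, "consistency": 1.0, "accuracy": 0.9, "richness": 0.4}),
    SkuScore("TW-200", "hand-tools", "high_return",
             {"completeness": 1.0, "consistency": 1.0, "accuracy": 1.0, "richness": 0.8}),
    SkuScore("RL-300", "relays", "spec_compared",
             {"completeness": 0.9, "consistency": 0.5, "accuracy": 0.9, "richness": 0.3}),
]
worklist = below_bar(scores)
```

The weakest dimension travels with each failing SKU so the ranked list says why a record is below the bar, not only that it is.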

Before and after: a torque wrench listing

Raw supplier feed description:

"Torque wrench 1/2 drive adjustable"

Enriched attribute table:

Attribute	Value
Drive size	`1/2 in`
Torque range	`10-150 ft-lb`
Accuracy rating	`±4%`
Handle type	Ergonomic, non-slip grip
Calibration	Factory-calibrated, ISO 6789 compliant
Case included	Yes, molded storage case
Use case	Automotive, HVAC, general maintenance

The raw feed has five words of information. The enriched version answers the questions a buyer, a distributor's search filter, and an AI answer engine all ask independently.

Ask an answer engine "what torque wrench works for automotive lug nuts and is ISO calibrated," and only the enriched record has the structured attributes to surface as a confident match. The raw description doesn't contain the words "calibration," "ISO," or "torque range" at all, so it's invisible to that query even if the product is the right one.
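A toy illustration of that gap: the sketch below checks which distinguishing terms from the example query appear in each version of the record. This is plain keyword overlap, a deliberately naive stand-in. It is not how any particular answer engine ranks results, and the choice of query terms is an assumption.

```python
import re


def tokens(text: str) -> set:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


raw_description = "Torque wrench 1/2 drive adjustable"
enriched = {
    "Drive size": "1/2 in",
    "Torque range": "10-150 ft-lb",
    "Accuracy rating": "±4%",
    "Handle type": "Ergonomic, non-slip grip",
    "Calibration": "Factory-calibrated, ISO 6789 compliant",
    "Case included": "Yes, molded storage case",
    "Use case": "Automotive, HVAC, general maintenance",
}

# Hypothetical distinguishing terms taken from the example query
query_terms = {"automotive", "iso", "calibrated"}


def missing_terms(record_text: str) -> set:
    """Return the query terms that never appear in the record text."""
    return query_terms - tokens(record_text)


enriched_text = " ".join(f"{name} {value}" for name, value in enriched.items())
raw_missing = missing_terms(raw_description)
enriched_missing = missing_terms(enriched_text)
```

Here `raw_missing` contains all three terms and `enriched_missing` is empty. Even under this crude matching, the raw record gives a retrieval system nothing to hold onto.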

Setting a real bar, then holding it

A bar only works if it's specific and enforced at the point of ingestion, not discovered in a quarterly audit. Practical thresholds worth adopting:

Completeness: 95%+ on required fields for products actively selling (Atlan cites similar thresholds as standard practice across product data programs).
Consistency: zero tolerance on unit-of-measure and naming variance within a category, since this is the cheapest defect to catch and the most damaging to search and filtering.
Accuracy: values traceable to a source document, not inferred by analogy to a similar SKU.
Richness: a defined minimum attribute count per category, set by what buyers and retail requirements actually ask for, not by what's easy to fill in.

A record that falls below the bar should trigger action automatically, not sit in a dashboard. A record that clears the bar should still be revalidated on a cadence, because "passed once" and "still true" are different claims.
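One way to express those thresholds as an ingestion gate is sketched below. The record fields, the per-category attribute minimum, and the 90-day revalidation window are assumptions made for the example. Only the 95% completeness figure and the zero-tolerance consistency rule come from the list above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RecordCheck:
    sku: str
    required_filled: int
    required_total: int
    unit_variants: int      # non-canonical unit or naming spellings found
    untraced_values: int    # values with no source document reference
    attribute_count: int
    last_validated: Optional[date] = None


MIN_ATTRIBUTES = {"hand-tools": 7}      # hypothetical richness minimum per category
REVALIDATE_AFTER = timedelta(days=90)   # assumed cadence


def gate(record: RecordCheck, category: str, today: date) -> list:
    """Return the actions a record needs; an empty list means it passes and is current."""
    actions = []
    if record.required_total and record.required_filled / record.required_total < 0.95:
        actions.append("completeness")
    if record.unit_variants > 0:
        actions.append("consistency")
    if record.untraced_values > 0:
        actions.append("accuracy")
    if record.attribute_count < MIN_ATTRIBUTES.get(category, 0):
        actions.append("richness")
    # A passing record still goes back in the queue once its validation is stale.
    if not actions and (record.last_validated is None
                        or today - record.last_validated > REVALIDATE_AFTER):
        actions.append("revalidation_due")
    return actions


today = date(2024, 6, 1)
failing = RecordCheck("TW-100", 18, 20, 0, 1, 7, date(2024, 1, 1))
stale = RecordCheck("TW-200", 20, 20, 0, 0, 7, date(2024, 1, 1))
fresh = RecordCheck("TW-300", 20, 20, 0, 0, 7, date(2024, 5, 1))
```

Calling `gate` on these three records returns `["completeness", "accuracy"]` for the failing one, `["revalidation_due"]` for the stale one, and an empty list for the fresh one. Returning named actions rather than a boolean gives a downstream workflow something to route on.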

Continuous scoring instead of periodic cleanup

The reason cleanups don't stick is that they treat quality as a project with an end date. A scoring system that runs continuously catches drift as new SKUs land, suppliers push updates, or a category's requirements change, and it flags what actually fell below the bar instead of forcing a full re-audit. That's the difference between a catalog that improves and one that just gets cleaned periodically while decaying in between.
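As a sketch of what "flags what actually fell below the bar" can mean in practice, the function below compares two scoring runs and reports only the SKUs whose status changed. The 0.85 bar, the SKU identifiers, and the scores are made up for illustration.

```python
def detect_drift(previous: dict, current: dict, bar: float = 0.85) -> dict:
    """Compare two scoring runs (sku -> composite score) and report status changes."""
    report = {"fell_below": [], "new_below": [], "recovered": []}
    for sku, score in current.items():
        before = previous.get(sku)
        if before is None:
            # SKU was not in the previous run: flag it only if it lands under the bar.
            if score < bar:
                report["new_below"].append(sku)
        elif before >= bar and score < bar:
            report["fell_below"].append(sku)
        elif before < bar and score >= bar:
            report["recovered"].append(sku)
    return report


# Hypothetical composite scores from two consecutive runs
previous_run = {"TW-100": 0.90, "TW-200": 0.97, "RL-300": 0.63}
current_run = {"TW-100": 0.80, "TW-200": 0.97, "RL-300": 0.88, "NEW-1": 0.50}
drift = detect_drift(previous_run, current_run)
```

With these inputs, `TW-100` is reported as having fallen below the bar, `NEW-1` as a new SKU that arrived below it, and `RL-300` as recovered. `TW-200` never appears, which is the point: unchanged records generate no work.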

This is the problem Anglera is built to sit on top of. Your PIM stores the data; Anglera scores it against completeness, consistency, accuracy, and richness continuously, gap-fills from real supplier and source documents rather than guessing, and keeps flagging what drifts below the bar as the catalog changes. It's additive to whatever PIM you already run, or to a flat file if you don't have one, and most teams see it working inside 30 days rather than committing to a multi-year systems overhaul.

