Understanding Data Quality
In conversations with data providers, the term "data quality" comes up often. Data quality is usually described in terms of Overall Match Rate, Elemental Match Rates, and Accuracy. These are often the only factors that some companies consider when purchasing data or formulating a test. This is a mistake. While these measurements are important, other factors may be just as important, or more so, in the final application of the purchased data. Before we discuss these, however, let's define the terms.
Overall Match Rate: This refers to the number of records[1] you receive from your data provider relative to the number you submitted for enhancement. Enhancement is defined as the addition of information to an individual consumer record. For example, if you sent in a list of 1,000 of your customer names and the data provider returned data on 800 of those, you would have an overall match rate of 80%. This measures only the number of records returned with data, not the amount of data appended to each record.
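The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration; the function name overall_match_rate and its parameter names are hypothetical and do not come from any provider's tooling.

```python
def overall_match_rate(records_submitted: int, records_returned: int) -> float:
    """Share of submitted records that came back with any appended data.

    Hypothetical helper for illustration only.
    """
    if records_submitted <= 0:
        raise ValueError("records_submitted must be positive")
    if not 0 <= records_returned <= records_submitted:
        raise ValueError("records_returned must be between 0 and records_submitted")
    return records_returned / records_submitted


# The example from the text: 1,000 names submitted, 800 returned with data.
rate = overall_match_rate(1000, 800)
print(f"Overall match rate: {rate:.0%}")  # Overall match rate: 80%
```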
When comparing data providers, many companies find match rates to be an extremely important variable, especially in modeling. Low match rates may mean that the data provider does not have a large enough representation of your customer base to give you the information you need. You should expect an overall match rate of at least 70-80% from most large providers.
Elemental Match Rates: This compares the number of elements[2] you requested for each record with the number actually appended to your file. Not all providers will be able to supply all of the elements you require. Conversely, some may carry every element but populate few of them, i.e., many fields may come back blank.
When comparing data providers, be sure that you are able to receive all of the data elements you request. A company providing a 100% match rate but returning only half of the elements you desire is probably not going to meet your needs.
It is also important to look at the average number of elements returned per record for the elements provided. A 100% overall match rate with a 50% elemental match rate for an element implies that half of the provider's database entries for that element are blank.
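One way to check both figures on a returned file is sketched below. It assumes the appended file has been loaded as a list of dictionaries and that a blank field appears as None or an empty string. The field names "age" and "income" and the helper names are hypothetical.

```python
def is_populated(value) -> bool:
    """Treat None and the empty string as blank fields (an assumption)."""
    return value is not None and value != ""


def element_fill_rates(records: list, requested: list) -> dict:
    """Share of records in which each requested element is populated."""
    if not records:
        raise ValueError("records must not be empty")
    return {
        name: sum(1 for r in records if is_populated(r.get(name))) / len(records)
        for name in requested
    }


def average_elements_per_record(records: list, requested: list) -> float:
    """Average count of populated requested elements per record."""
    if not records:
        raise ValueError("records must not be empty")
    populated = sum(
        1 for r in records for name in requested if is_populated(r.get(name))
    )
    return populated / len(records)


# Made-up sample: every record matched, but "income" is blank half the time.
sample = [
    {"age": 34, "income": "50-75K"},
    {"age": 51, "income": ""},
    {"age": 27, "income": "25-50K"},
    {"age": 45, "income": None},
]
print(element_fill_rates(sample, ["age", "income"]))
print(average_elements_per_record(sample, ["age", "income"]))
```

In this made-up sample every record is returned, yet "income" is filled in only half of them, which is the situation the paragraph above warns about.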
Be aware that some companies measure elemental match rates as the ratio of elements appended to matched records rather than to submitted records. In the 1,000-record example above, with 800 records matched (an 80% overall match rate), suppose a single ordered element is appended to 600 records. These companies would compute 600/800 and report a 75% elemental match rate. The correct measure is 600/1,000, or 60%. Don't be fooled by these inflated numbers.
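The two ways of counting can be put side by side in a short sketch. The function and key names here are hypothetical; the numbers are the ones used in the example above.

```python
def elemental_match_rates(submitted: int, matched: int, element_appended: int) -> dict:
    """Compute one element's match rate against both possible denominators.

    'per_submitted' divides by every record you sent in.
    'per_matched' divides only by records the provider matched,
    which yields the larger figure.
    """
    if submitted <= 0 or matched <= 0:
        raise ValueError("submitted and matched must be positive")
    if not element_appended <= matched <= submitted:
        raise ValueError("expected element_appended <= matched <= submitted")
    return {
        "per_submitted": element_appended / submitted,
        "per_matched": element_appended / matched,
    }


# 1,000 records submitted, 800 matched, element appended to 600 of them.
rates = elemental_match_rates(submitted=1000, matched=800, element_appended=600)
print(f"Against submitted records: {rates['per_submitted']:.0%}")  # 60%
print(f"Against matched records:   {rates['per_matched']:.0%}")    # 75%
```

When a provider quotes an elemental match rate, asking which denominator was used lets you convert the figure to the per-submitted basis before comparing vendors.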
Accuracy: At first glance, accuracy would seem to be a simple thing to measure. Just pick 10 records at random, then call those people and validate the information, right?
Unfortunately, it is not that easy. To be statistically valid, your sample of records should be chosen at random, be diverse, and be of sufficient size (10 is usually not sufficient). In addition, it is very important that the data be tested against a valid benchmark. See the data test section later in the paper for a more detailed explanation.
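To give a rough sense of why 10 records is too few, the sketch below draws a simple random sample and estimates the margin of error on a measured accuracy rate. It rests on assumptions that are not from this paper: a simple random sample, the normal approximation for a proportion, a 95% confidence level (z = 1.96), and a worst-case true accuracy of 50%. The function names are hypothetical.

```python
import math
import random


def draw_validation_sample(record_ids, sample_size: int, seed=None) -> list:
    """Pick records to validate, uniformly at random and without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(record_ids), sample_size)


def margin_of_error(sample_size: int, expected_accuracy: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion (normal approximation).

    expected_accuracy=0.5 is the worst case; z=1.96 corresponds to roughly
    95% confidence. Both defaults are assumptions of this sketch.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(expected_accuracy * (1 - expected_accuracy) / sample_size)


ids_to_call = draw_validation_sample(range(1, 1001), 10, seed=42)
print(len(ids_to_call))                 # 10
print(f"{margin_of_error(10):.1%}")     # 31.0%
print(f"{margin_of_error(400):.1%}")    # 4.9%
```

Under these assumptions, a 10-record check could be off by about 31 percentage points in either direction, while a 400-record check narrows that to about 5 points. Random selection alone does not address the need for a valid benchmark, which is covered in the data test section.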