It’s Time to Talk about Data Quality


Andrew Bullock, CTO/VP Data Services

We use data to communicate with our customers. But what happens when the data we intend to share shows up differently than we expected?

Data quality, the idea that the data in question appropriately serves the business function for which it is intended, is an important concept. It is also important to note that the business needs placing demands on that data can expand over time, outgrowing the original quality specifications and leading to unanticipated results.

For example, what happens when a newly implemented external marketing platform exposes data as dynamic content to your customer base? An organization may assume it has done its due diligence if the data is fit for the previously defined business purposes it was intended to serve. This is a dangerous assumption, however, and can lead to campaign development challenges, content issues, and ultimately a decline in customer perception of your brand.

Whether it's due to an internal data technology team not fully comprehending the complexity of a given platform, or data being offered up to the customer that has never been previously leveraged externally, or simple corruption due to mundane encoding issues ("UTF-huh?"), blunders are more common than you may think. Product descriptions looking like someone ran them through a virtual paper shredder, irrelevant customer service notes appearing in unexpected fields, and other such manifestations can negatively impact the customer’s experience. Regardless of the cause, the result is the same: your brilliant dynamic digital marketing effort ends up looking like it was put together by an inebriated robot.

Simply having data isn't enough. You must make data quality a core aspect of your marketing system implementations.

A good rule of thumb is that if data isn't suitable for website presentation, it's probably not suitable for presentation in any other digital media. An example of such data might be anything keyed in by a customer support representative. Ask Comcast how this can go wrong.

While in the Comcast case the issue showed up in automated paper billing, you get the idea. The people entering the data in a particular field (or creating the processes that do so) may not have any idea that it might be exposed to parties outside the organization.

Communication is key. Make sure the data team in your organization understands what you are doing with this data, beyond simply needing it. Data being leveraged for customer segmentation and business rules has a different set of quality considerations and implications than data being purposed for content. Take nothing for granted. Your data team may know something you don't. Something that could save you from embarrassing your brand in front of your entire base ... or worse.

Then, test. Then test again. Then test more. Some commonly observed issues are:

  1. Misattributed data such as address data showing up in name fields, price data showing up in product names, and other such mistakes.
  2. Encoding errors, where special characters, and sometimes not-so-special characters, show up as gibberish in your generated content. This is typically a result of character encoding inconsistencies between your internal data system and your marketing system. It can be avoided by making sure the same encoding scheme (e.g. UTF-8) is used consistently on both sides, or by developing a solution that replaces these characters with their HTML character references, either within your marketing platform or upstream in your data sources (this may require further cooperation from your data team).
  3. Case issues, where data isn't appearing with proper letter case. For example, names where the first letter is in lowercase, product names that are entirely in uppercase, and so forth. In robust marketing systems, it is often simplest to handle these with functions that convert the case appropriately within the system itself.
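
To make these three checks a little more concrete, here is a minimal Python sketch of what automated tests for them might look like. Everything in it is an illustrative assumption rather than a feature of any particular marketing platform: the function names are hypothetical, the mojibake check assumes the common case of UTF-8 text misread as Windows-1252, and the "misattributed" rule (digits or currency symbols in a name field) is a deliberately crude heuristic.

```python
import html
import re

# Hypothetical rule: digits, currency symbols, or "@" rarely belong in a name field.
SUSPICIOUS_IN_NAME = re.compile(r"[\d$@]")


def looks_like_mojibake(text: str) -> bool:
    """Heuristic: True if text looks like UTF-8 that was decoded as Windows-1252."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False  # the round trip failed, so this pattern does not apply
    return repaired != text


def to_html_references(text: str) -> str:
    """Escape markup characters and turn non-ASCII characters into numeric references."""
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def normalize_case(value: str) -> str:
    """Capitalize each word, but only when the value is all lowercase or all uppercase."""
    if value.islower() or value.isupper():
        return " ".join(word.capitalize() for word in value.split())
    return value  # mixed case (e.g. "McDonald") is assumed to be intentional


def audit_value(value: str, is_name: bool = False) -> list:
    """Return labels for the issues found in one field value."""
    issues = []
    if looks_like_mojibake(value):
        issues.append("encoding")
    if value.islower() or value.isupper():
        issues.append("case")
    if is_name and SUSPICIOUS_IN_NAME.search(value):
        issues.append("misattributed")
    return issues


if __name__ == "__main__":
    # An address that ended up in a name field, entered in lowercase.
    print(audit_value("123 main st", is_name=True))
    print(normalize_case("DELUXE COFFEE MAKER"))
    print(to_html_references("Caf\u00e9 & Cr\u00e8me"))
```

Checks like these will not catch everything (a simple word-by-word capitalization will still mishandle names such as "O'Brien"), so treat them as a supplement to testing with real data and real eyes, not a replacement for it.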

At the end of the day, it all comes down to communication and cooperation with your data resources, broad testing, and careful consideration of where the issue would be most efficiently addressed, be it upstream in the data source itself, within the marketing system’s functionality, or somewhere in between.

It is easy to overlook the importance of data quality, and the many points at which it can go wrong, when you'd rather focus on your message and creative. But taking the time to ensure that the data you are dynamically presenting is what you want it to be is a key element of a successful campaign.