I have long held that accountants have often been the determiners of good or bad data quality. My rationale is that the role of accounting is to collate and report business activity for the purposes of decision making.
You could treat the accounting function as purely “measurement, processing, and communication of financial and non-financial information about economic entities such as businesses and corporations”, but in reality it is often something that needs to be considered well ahead of the business or corporation itself.
Today, when we consider data quality, we often think of it in the context of technology, but of course technology is just one part of the equation. Accounting itself is an approach, not a technology, and is thought to have originated in Mesopotamia, perhaps as a precursor to other kinds of writing. Artifacts from those ancient times, clay tablets with indentations, principally carry record-keeping lists. So even something as mundane as list-based record keeping on clay tablets, a precursor to modern-day accounting, spawned further innovation and advances that we can still relate to today.
In the accounting world, one popular system is that of debits and credits, or double-entry accounting. Double-entry accounting is commonly accepted today as the correct way to report financials, one where one side of the accounting books balances, or is supposed to balance, with the other, but it is an approach that has only been around since the time of Christopher Columbus.
Through the use of this balancing system, accountants can determine how much was invested by the owner(s) and how it is allocated within the business across various kinds of assets: buildings, inventory, goodwill (the brand or business identity), cash and so on. These allocations are all enumerated in terms of monetary value. Were you to sell your business and all the elements that comprise it, you, as the owner or investor, would expect to receive compensation in money, goods or kind of an equivalent value.
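To make the balancing idea concrete, here is a minimal Python sketch. The journal entries, account names and amounts are invented purely for illustration, and `is_balanced` and `account_balances` are hypothetical helper names, not part of any accounting package.

```python
from decimal import Decimal

# Hypothetical journal: each entry is a list of (account, debit, credit) lines.
journal = [
    # The owner invests 10,000 in cash.
    [("Cash", Decimal("10000"), Decimal("0")),
     ("Owner's Equity", Decimal("0"), Decimal("10000"))],
    # 4,000 of that cash is used to buy inventory.
    [("Inventory", Decimal("4000"), Decimal("0")),
     ("Cash", Decimal("0"), Decimal("4000"))],
]

def is_balanced(entry):
    """An entry balances when its total debits equal its total credits."""
    debits = sum(debit for _, debit, _ in entry)
    credits = sum(credit for _, _, credit in entry)
    return debits == credits

def account_balances(entries):
    """Net balance per account, expressed as debits minus credits."""
    totals = {}
    for entry in entries:
        for account, debit, credit in entry:
            totals[account] = totals.get(account, Decimal("0")) + debit - credit
    return totals

if __name__ == "__main__":
    for entry in journal:
        print(is_balanced(entry))
    print(account_balances(journal))
```

If every entry balances, the net balances across all accounts sum to zero, which is what lets the owner's investment be traced to the assets it now sits in.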
It’s interesting, then, that data, an asset in its own right, is not similarly ascribed an accounting value. Why would that be? Is it perhaps because of the variability in the way that the data is gathered and can be assessed?
Take public census records, for example. As single records, they are probably only of value to a genealogist or someone with a particular interest in an individual, or for individual assessment. In aggregate, those same records carry value to town and city planners, economists, academics and historians. However, if those records are partial or incomplete, their value as an aggregate is diminished because they no longer paint a full picture of whatever you’re trying to describe or perform calculations on. At the individual level, missing census data means you have holes in your understanding of the individual or household and therefore cannot paint a picture over time.
Whether you’re looking at individual or collective records, at the very best you can perhaps make interpolations about the missing data, but depending on the level of precision that you need, that could be a very risky approach.
If we bring the context a little closer to businesses and how businesses leverage data to make decisions, the quality of the records varies according to the nature of the data.
An eCommerce customer record that contains a name, a delivery address, an email address and a link to a collection of eCommerce transactions tells you something about that individual up until the point that you last transacted with them.
You can perhaps infer something about their gender, their age, perhaps even their socio-economic status, from the transactional history. But if you last engaged with them for a transaction some months ago, you cannot tell if the contact information is still valid, and so you cannot actually tell whether you could sell more to them or target them with a very specific offering aligned with their past purchases. Such a record is not as valuable as one whose contact information you recently verified, or one for a customer who recently transacted with you. Working out the difference in value between these two kinds of record may be important if your cost of engagement with these customers is relatively high. Ultimately, you would want to choose your most valuable customers first in any targeted outreach campaign.
A second aspect to consider is whether the customer who buys toilet paper from you is as valuable as the customer who buys electronic goods. Perhaps not. So now the purchasing profile of the customer becomes another variable that you might want to consider.
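One way to picture how recency and purchasing profile might combine is a simple score. The Python sketch below is an assumption-laden illustration only: the function name `record_value`, the 180-day half-life and the category weights are all invented, and a real business would need to calibrate its own figures.

```python
from datetime import date, timedelta

def record_value(last_verified, as_of, category_weight, half_life_days=180):
    """Hypothetical score: a purchasing-profile weight discounted by staleness.

    The weight halves for every `half_life_days` that have passed since the
    contact details were last verified or the customer last transacted.
    """
    age_days = max((as_of - last_verified).days, 0)
    freshness = 0.5 ** (age_days / half_life_days)
    return category_weight * freshness

if __name__ == "__main__":
    today = date(2024, 7, 1)
    # Assumed weights: electronics buyers are weighted higher than
    # buyers of household staples.
    recent_electronics = record_value(today - timedelta(days=10), today, 3.0)
    stale_staples = record_value(today - timedelta(days=400), today, 1.0)
    print(recent_electronics, stale_staples)
```

Ranking records by a score like this is one way to decide which customers to approach first when the cost of engagement is high.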
Factor in social media, and the relationship between your customer record and their social media accounts, and you now potentially have a bonanza of opportunities to assess customer value. Using a combination of image analysis, location analysis and even sentiment analysis on the language of social media postings, you could draw up a view of a customer that pinpoints exactly the kind of persona that they are relative to the product lines or opportunities that you might want to present to them.
It is easy for those in the weeds to describe this sort of stuff as “big brother” technologies and potentially intrusive, but the reality is that there are an awful lot of people out there, consumers, who are either blissfully unaware of the opportunities that their behaviour and data present to sales and marketing, or who actively choose to feed these kinds of analyses so that they can be targeted with more customized offerings. Again, which kind of customer out of these is the more valuable?
If your business is not ascribing a value to these variables, then you can’t comment on whether the data you have is of high or low value, and if you are not actually using that data to make any decisions, then the chances are that you don’t feel it has much value at all. Worse, if you haven’t bothered to take stock of the data that you do have, and you’ve made no effort to profile and assess its completeness and consistency, then you’ve no notion of how you might leverage it to gain any kind of business advantage.
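Taking stock need not be elaborate. As a rough starting point, the Python sketch below profiles only completeness, that is, the share of records in which each field is actually filled in. The sample records, field names and the helper names `is_filled` and `completeness` are invented for illustration, and consistency checks would need rules specific to your own data.

```python
def is_filled(value):
    """Treat None and blank or whitespace-only strings as missing."""
    return value is not None and str(value).strip() != ""

def completeness(records, fields):
    """Fraction of records (0.0 to 1.0) with a usable value for each field."""
    total = len(records)
    profile = {}
    for field in fields:
        filled = sum(1 for record in records if is_filled(record.get(field)))
        profile[field] = filled / total if total else 0.0
    return profile

# Hypothetical customer records with some gaps.
sample_records = [
    {"name": "A. Customer", "email": "a@example.com", "address": "1 High St"},
    {"name": "B. Customer", "email": "", "address": None},
    {"name": "C. Customer", "email": "c@example.com"},  # no address key at all
    {"name": "D. Customer", "email": "d@example.com", "address": "4 Low Rd"},
]

if __name__ == "__main__":
    print(completeness(sample_records, ["name", "email", "address"]))
```

Even a crude profile like this tells you which fields you can rely on before you try to put a value on the records.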
Isn’t it time to consider assessing the value of your data quality today?
This article is a cross-post of an article I posted on LinkedIn