An interview with WhereScape and NOW Consulting co-founder Michael Whitehead was recently released on iStart.
It is a simple statement on the face of it, but one with considerable nuance that will affect the outcomes of a data warehousing and analytics project.
We’re not talking here about the age-old concept of ‘garbage in-garbage out’, familiar to IT and businesspeople since time immemorial (or, at the very least, since the first record-keeping and analysis of those records commenced). Instead, we’re talking about the assessment of the quality of data against its suitability for the task at hand.
This is an important consideration because IT people tend to think in the neat and absolute terms of binary. Black and white. One or zero. Yes or no.
For businesspeople, however, there are grey areas aplenty. Depending on the task at hand, ‘data perfection’ might not be a requirement. Fairly accurate data might be sufficient to establish viability or otherwise – and businesspeople will be keenly aware that data perfection can come at a not-insignificant cost.
Now, data quality has always existed as a problem in the analytics world. It is like a dirty family secret which everyone knows about, but daren’t discuss. The trouble is that when a data quality problem arises, it is considered an IT problem and therefore sheeted home to ‘those useless nerds in IT’.
Did I mention costs? While those IT nerds are saddled with the blame, they are rarely provided with the funds which are required to achieve data purity. See the paradox? In effect, the business, which should understand data quality as its own problem, shoots the messenger.
And in any event, the way the IT department – our bullet-riddled messenger – goes about fixing a data problem isn’t always appropriate. When mandated and (usually ill-) equipped to sort it out, they approach it in a thoroughly blunt manner. Binary. Black or white. One or zero. Yes or no.
Take an invoice with an incorrect product code. If you are in IT, that is easy to detect and the answer is obvious – it is wrong, so flag the record as an error. Black and white. But what if you are in sales? You sold something and the customer (hopefully!) paid for it. The resulting muddle means sales reports don’t reconcile and the business loses faith in what the IT people are doing. In this example, being 100 percent accurate wasn’t important to the business (but it was seen as the only answer by IT). There are, therefore, differing views of what is correct and what is accurate.
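The difference between the two views can be made concrete. The following is a minimal sketch, using entirely hypothetical invoice records and field names, of how the strict ‘flag it as an error’ approach and the tolerant ‘keep the sale, queue the cleanup’ approach produce different totals from the same data.

```python
# Hypothetical invoices; SKU-999 is an unknown (incorrect) product code.
invoices = [
    {"id": 1, "product_code": "SKU-100", "amount": 250.0},
    {"id": 2, "product_code": "SKU-999", "amount": 120.0},
]
known_codes = {"SKU-100", "SKU-200"}

# IT view: any record with an unknown product code is an error and is
# excluded outright, so the sales total no longer reconciles.
it_valid = [r for r in invoices if r["product_code"] in known_codes]
it_total = sum(r["amount"] for r in it_valid)

# Business view: flag the suspect code for later cleanup, but keep the
# record -- the sale happened and the customer paid.
flagged = [r["id"] for r in invoices if r["product_code"] not in known_codes]
business_total = sum(r["amount"] for r in invoices)

print(it_total)        # 250.0 -- under-reports sales
print(business_total)  # 370.0 -- matches what was actually paid
print(flagged)         # [2]   -- queued for data cleanup
```

Same records, two defensible answers – which is the whole point: neither total is ‘wrong’, they simply serve different purposes.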
The move to citizen data
There is another aspect to this which emerges with the move to ‘citizen data’, where responsibility for IT is placed within the business unit itself rather than provided as a service by an IT department. With that power comes responsibility: when it goes wrong, there’s no longer a door at which to lay the blame.
Businesspeople tend to be good at doing things quickly, and also good at prioritising what matters. An accountant will trade time for accuracy, a salesperson will trade accuracy for time. Both get an ideal (for them) outcome.
IT, on the other hand, applies a single standard, often taking the lowest common denominator approach (which actually pleases no one). We’re back to binary and ones and zeroes.
By contrast, businesspeople are not known for applying high levels of governance and rigour to complex technical challenges…which is precisely what IT is good at.
With the move to ‘citizen data’, an opportunity arises where astute businesspeople recognise when and where to invite IT in to help with the issue. When this synergy is achieved, IT become the heroes.
This is how data quality should be viewed: it is for the business, not IT, to sort out. The mechanics of how to gather, analyse or otherwise crunch the data are where precision (and IT expertise) is required. It is about picking the battles which are important to the business, rather than to IT – after all, businesses have thousands of problems, some more important than others. The IT team can’t differentiate between those problems; only the business can – and when the business is running the analytics, it knows when something is good enough and whether it is fit for purpose. After all, good enough is good enough, and in many instances perfect is the enemy of good.
A final word on data quality
While there are trade-offs which will satisfy some industries or departments, sometimes data quality is mission critical. If getting it wrong means the CEO will go to prison, for example, the binary approach is probably best. Similarly if life and death is involved. In these cases, bringing IT in to fix it once (properly) is likely a benefit for all. There are cases where unimpeachable data quality is a good idea.
Finally, always bear in mind that the outcome of the data quality exercise (along with its accumulation and analysis) has to be useful. All you’re doing with data, really, is creating stories. The art of a good story is in knowing your audience and being sure the story says something worth knowing for them.