The drivers of data quality

Engaging in data quality activities is not always possible, or even desirable. Data is an abstract entity, and generating the necessary corporate urgency can prove difficult if it is not approached in the right way.

When is the right time to start checking the quality of your data?

We have relied on data forever; the only difference now is that the scale is changing. Big or small, data has always formed the basis of our decisions. We run our businesses observing the dynamics of indicators we trust, we create invoices when we are sure they will reach the intended recipient correctly, we execute decisions when we have sufficient confidence, and our experience supports them.

For all these and more, we have been employing data with a specific goal in mind. Let us take the example of creating an invoice. If we had only three customers, it would not be that difficult to select the right one from the bunch. But when this number grows to, for example, one hundred thousand, and we issue thousands of invoices per day, this simple task quickly becomes unexpectedly difficult. We want to be sure that we execute it correctly, and one of the main enablers for this is trust in the underlying data. But how can we trust data? It is, as we mentioned, an abstract entity. Nowadays it is stored in a semi-fictional place called the Cloud, is driven by technology that most of us know little about, and originates from multiple sources.

This got us thinking. Yes, sooner or later, we want to check the data we use. But when is the right time? When is the optimal moment to stop enjoying the comfort of operations or decision-making based on trusted data and start spending time, effort, and resources on validating its credibility? In our opinion, there are three drivers that influence this choice: (1) Reactive drivers; (2) Resource drivers; and (3) Digital maturity drivers.

Reactive drivers

Reactive drivers are events that push a company to look inward and decide how to react. To continue with the example above, when a finance department notices an increase in the number of invoices that are not issued on time, an investigation is launched. Often, part of this investigation is a critical look at the data that generates these invoices. All such events trigger data quality checks, which we summarize in one word: “reaction”.

Resource drivers

The next driver that we have identified as a trigger for verifying company data is the availability of resources. The notion that data should be treated as an asset is not new and is widely accepted by organizations of all shapes and sizes. At the same time, data might not be the kind of traditional asset that firms have experience dealing with. Therefore, any additional activities related to data storage, maintenance, usage, and quality could be considered an investment rather than an operational expense.

We summarize data quality checks triggered by the availability of resources as a corporate “gamble”. On the one hand, mobilizing people and resources to look at the quality of the data in use could pay off. It will either uncover issues that can then be resolved, or it will prove that the data is clean and can be depended upon. On the other hand, there is a subtle danger: when quality checks are not performed at regular intervals, they can give rise to a set of biases. One such bias, the overconfidence bias, could be particularly risky. It leads to decisions being made on continuously changing data that was once labeled as clean and is now simply trusted without question.
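To make the overconfidence risk concrete, the following minimal sketch flags a dataset whose “clean” label is older than an agreed review interval. The function name `is_check_stale` and the 90-day default are illustrative assumptions, not part of the framework; the appropriate interval depends on how quickly the data in question changes.

```python
from datetime import date, timedelta


def is_check_stale(
    last_checked: date,
    today: date,
    max_age: timedelta = timedelta(days=90),  # assumed review interval
) -> bool:
    """Return True when the last quality check is older than max_age.

    A True result means the "clean" label should no longer be trusted
    without a new check.
    """
    return today - last_checked > max_age


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(is_check_stale(date(2024, 1, 1), date(2024, 6, 1)))
    print(is_check_stale(date(2024, 5, 1), date(2024, 6, 1)))
```

A check like this does not measure quality itself; it only signals when a past result has aged beyond the point where it can reasonably be relied upon.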

Digital maturity drivers

The last driver that triggers firms to engage in data quality activities is the digital maturity of the organization. This is the state in which the company knows that using data comes with both benefits and dangers. Such organizations have established processes, procedures, and methodologies to execute data quality checks as efficiently as possible and have firmly embedded them into their operations.

When the trigger is digital maturity, we say that companies engage in “preventative” activities, so that when data is needed, it is there, and everyone trusts it. The danger here is that data does not stand still. Changes in the data could force the company to start playing catch-up, adjusting its policies and processes in order to maintain the same level of quality. Admirable, yes, but in the long run this can become quite expensive, slowing down growth and resulting in missed opportunities due to the shift of focus.

Combining these three drivers

We have isolated these three drivers, with their most likely outcomes, to illustrate the three extremes that could trigger companies to check and verify the quality of their data. Of course, the real world is much more complex, and usually each driver interacts with at least one other in prompting data quality checks.

Reactive drivers when there are available Resources

These are circumstances in which companies must respond to a major “change” or, more precisely, in which they use the quality of corporate data as a trigger for company-wide transformations. One example could be the decision to change the enterprise resource planning (ERP) system. Granted, when the current ERP is outdated, replacing it is a logical business choice that has nothing to do with data quality. Often, though, the events that trigger such a decision are business disruptions caused by faulty data. Going for a fundamental shift might seem the only logical solution, but a much easier, and often safer, choice would be to investigate and repair the data that caused the trouble in the first place and bring it back to the standard expected by the business and the system landscape.

Reactive drivers that affect Mature companies

The combination of these drivers triggers a dedicated effort to continuously verify the trustworthiness of the data as a matter of “business as usual (BAU)”. The key word in this statement is: continuous. This is how this outcome differs from the situation in which the maturity of the organization is the sole driver. Continuous data quality monitoring, and the actions that follow from it, are what a data-driven organization should constantly aim for in order to be prepared for any change coming from within or from outside.

Mature company where Resources are abundant

This combination of drivers is the best setting for prompting “investment” with data at the center. A good example is another fashionable trend: the development of artificial intelligence (AI) capabilities. For an AI to work, it needs vast amounts of data. Data that is clean, organized, structured, and up to date. Feeding an AI system with good data is the primary factor for its success. Therefore, a company looking to embark on such a journey must be sure that all its data is ready to use. Such a company is digitally mature and has the resources necessary to go the distance. Measuring and assessing the state of the data becomes one of the first requirements, and this company would be not only willing but also looking to invest in data quality activities.

The bullseye – where all three drivers are in play

This is a state in which different factors are pushing the company to react, the company has sufficient means to respond to those factors, and it is mature enough to realize that the knowledge that comes from a high volume of quality data is its main asset. Companies that purposefully choose to test and monitor the quality of their data are the most likely to start generating and “enjoying” sustainable competitive advantages, leading to superior returns over an extended period of time.
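The seven situations described so far can be read as a simple lookup from the set of active drivers to the most likely outcome. The sketch below is one possible way to write that down; the driver keys (`"reactive"`, `"resource"`, `"maturity"`), the function name `expected_outcome`, and the fallback text are illustrative assumptions, while the outcome labels are the quoted words used in the sections above.

```python
# Map each combination of active drivers to the outcome named in the text.
# frozenset keys make the order of the drivers irrelevant.
OUTCOMES = {
    frozenset({"reactive"}): "reaction",
    frozenset({"resource"}): "gamble",
    frozenset({"maturity"}): "preventative",
    frozenset({"reactive", "resource"}): "change",
    frozenset({"reactive", "maturity"}): "business as usual",
    frozenset({"resource", "maturity"}): "investment",
    frozenset({"reactive", "resource", "maturity"}):
        "sustainable competitive advantage",
}


def expected_outcome(*drivers: str) -> str:
    """Return the outcome label for the given set of active drivers."""
    key = frozenset(d.lower() for d in drivers)
    # Fallback for an empty or unrecognised combination (an assumption).
    return OUTCOMES.get(key, "no trigger identified")


if __name__ == "__main__":
    print(expected_outcome("reactive", "resource"))
    print(expected_outcome("maturity", "resource"))
```

Treating the drivers as a set rather than an ordered list reflects the point that it is the combination of circumstances, not their sequence, that determines the response.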

It is not a question of when

In conclusion, the question of when a data quality check should be performed is not one of timing, but rather of circumstances. Each driver, or combination of drivers, triggers a specific response, and failing to align that response with the condition or the goals of the company can only magnify the possible damage. Therefore, we worked out the framework that this blogpost started with as a guide for organizations to judge what could trigger a review of the quality of their data and what business outcome they might expect as a consequence.