Data is complex. First, it needs to power the systems that the organization has built. Second, it needs to carry the specific meaning used for running the business and for decision making. Finally, it needs to connect well across a complicated system landscape. Data quality is therefore not a one-dimensional need that can be satisfied with a one-dimensional solution, which is why we developed our 4Cs approach.
The origin of the 4Cs approach
Ask different functions when data is of good quality and you will get different answers. We conducted multiple interviews with professionals from departments in small and large corporations to get a feeling for when data is of good quality. Depending on their area of expertise, we got different answers. The IT specialists' definition focuses on the ability of the data to serve the technical purposes of the system landscape. In simple words, when the data makes the systems break, it is not of good quality. This demand is understandable, as the data powers the IT systems. We also interviewed many functional professionals, such as finance, marketing, and business operations managers. For them, the data must enable the business processes, i.e. the business relies on the data to achieve its specific goals. Finally, data does not usually sit in one system or database. It travels across the organization through complex pipelines, which in technical language are called system integrations. It often happens that data sent from one system does not arrive at the target system as expected, and the people working in the target system cannot use it for the purposes they need it for. In other words, the data is not of good quality. These three main demands led us to develop the 4Cs approach.
The goal of the 4Cs
The 4Cs approach aims at detecting issues within the data that can make the life of different types of professionals more difficult. The first two Cs, “Data is Complete” and “Data is Consistent”, look at the data as an entity. This is how the IT specialists look at it. As an analogy, if we compare the data to a motor vehicle, we should expect it to have wheels, a steering system, and a chassis, for example. The same holds for the data. In these two sets of checks, we make sure that the records stored in the systems can be defined as data that is ready to serve a purpose. The next C is “Data is Correct”. Here, we test whether the data can be used for the ultimate purpose it is intended for. These are usually the needs that businesspeople have towards the data. To continue with the motor vehicle analogy, we first ask what purpose the vehicle should be used for. If the answer is “to do the groceries”, we check whether the vehicle has cargo storage, whether that storage is of appropriate size and, going deeper, whether the vehicle can be parked in a small parking lot, for example. The final C is “Data is Connected”. Here, we make sure that the data has travelled successfully across the system landscape. In other words, we check whether, for example, the product data that sits in the ERP system has arrived as expected in the marketing system, because the teams there need it for their advertising activities.
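To make the four sets of checks more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions and not our actual implementation: the field names (`sku`, `name`, `price`), the business rule (a positive price), and the two example systems are all hypothetical. The checks are staged, so a record is only tested for correctness once it has passed the completeness and consistency checks.

```python
from typing import Any

Record = dict[str, Any]

# Hypothetical required fields for a product record (an assumption for this sketch).
REQUIRED_FIELDS = ("sku", "name", "price")


def is_complete(record: Record) -> bool:
    """Complete: every required field is present and not empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_consistent(record: Record) -> bool:
    """Consistent: each field has the type the systems expect."""
    price = record.get("price")
    return (
        isinstance(record.get("sku"), str)
        and isinstance(record.get("name"), str)
        and isinstance(price, (int, float))
        and not isinstance(price, bool)
    )


def is_correct(record: Record) -> bool:
    """Correct: the record satisfies a (hypothetical) business rule."""
    return record["price"] > 0


def run_4c_checks(source: list[Record], target: list[Record]) -> dict[str, list[Any]]:
    """Return, per C, the keys of the source records that fail that check."""
    report: dict[str, list[Any]] = {
        "complete": [], "consistent": [], "correct": [], "connected": []
    }
    for record in source:
        key = record.get("sku")
        if not is_complete(record):
            report["complete"].append(key)
        elif not is_consistent(record):
            report["consistent"].append(key)
        elif not is_correct(record):
            report["correct"].append(key)

    # Connected: every source key must have arrived in the target system.
    source_keys = {r.get("sku") for r in source if r.get("sku") is not None}
    target_keys = {r.get("sku") for r in target}
    report["connected"] = sorted(source_keys - target_keys)
    return report


# Hypothetical product data in the ERP system.
erp_records = [
    {"sku": "A1", "name": "Kettle", "price": 25.0},
    {"sku": "A2", "name": "", "price": 10.0},         # name is empty
    {"sku": "A3", "name": "Toaster", "price": "12"},  # price stored as text
    {"sku": "A4", "name": "Mixer", "price": -5.0},    # negative price
]
# Hypothetical copy that arrived in the marketing system (A4 is missing).
marketing_records = erp_records[:3]

report = run_4c_checks(erp_records, marketing_records)
for check, failing_keys in report.items():
    print(f"{check}: {failing_keys}")
```

In this sketch, each record is reported under the first of the three record-level checks it fails, while the connected check compares the two systems as a whole. In practice, the rules behind each C would come from the IT specialists, the business functions, and the integration owners respectively.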
The 4Cs output
With the 4Cs approach, we are confident that we can cover the needs that the IT, business, and integration sides of the organization have towards the data. We make sure that the organization can use its data, end-to-end, for all the purposes it needs it for.