
Is Data Dialysis Killing Your Enterprises?

One of the major barriers to improving the maturity of Data Quality in enterprises around the globe is the mistaken belief, held by far too many DQ practitioners, that good data creates a good enterprise.

The reality is quite the reverse. It is a good enterprise that creates good data. Good data is the output of the effective execution of the Business Functions of the enterprise. Good data is a major indicator of the health of an enterprise – it is not a driver of it!

Data Dialysis Insanity

The current approach to Data Quality, finding and correcting data defects in an effort to make an enterprise ‘healthier’, is about as effective as trying to turn an unfit, unhealthy body into a fit, healthy one by removing its unhealthy blood, running it through a dialysis machine and returning it to the body. In the medical world this would be seen as an expensive (and insane) waste of time, blood and money. It would in no way improve the fitness of the body and would ultimately result in the degradation and loss of blood and, not long after that, the death of the body.

THIS ARTICLE HAS BEEN UPDATED AND MOVED TO http://jo-international.com/is-data-dialysis-killing-your-enterprises/





8 Responses to “Is Data Dialysis Killing Your Enterprises?”

  1. Peter Benson April 17, 2014 1:28 am #

    I like the analogy of data cleansing to dialysis and, although the analogy with manufacturing is useful, it does not mention that the biggest revolution in manufacturing is the development of managed supply chains. In a world of specialists, everybody is an integrator. Very little data is “created” within the enterprise; the great majority comes from outside the enterprise. The ultimate solution to data quality is the creation and maintenance of a reliable data supply chain. The keys to a data supply chain are contained in quality identifiers. A quality identifier is an identifier that can be resolved by the authoritative source (or an authorized agent for the authoritative source) to the data that the identifier represents. ECCMA is building an ISO 22745 quality identifier resolution server.
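    The idea of a quality identifier described above – one that the authoritative source can resolve back to the data it represents – can be sketched as a simple lookup. This is only an illustration: the registry contents and identifier format below are hypothetical, and ECCMA's actual ISO 22745 resolution server is not being modelled.

```python
# Minimal sketch of quality-identifier resolution: the authoritative
# source (or its authorized agent) maps each identifier back to the
# data it stands for. Registry contents here are hypothetical.
AUTHORITATIVE_REGISTRY = {
    "0161-1#01-123456#1": {"item": "hex bolt", "thread": "M8", "grade": "8.8"},
}

def resolve(identifier):
    """Return the data an identifier represents, or None if it cannot be
    resolved -- an unresolvable identifier is not a quality identifier."""
    return AUTHORITATIVE_REGISTRY.get(identifier)

print(resolve("0161-1#01-123456#1"))  # resolves to the item data
print(resolve("unknown-id"))          # None: fails the quality test
```

    An identifier that cannot be resolved by its authoritative source carries no guarantee about the data it claims to represent, which is the point of the quality test.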

    • John Owens April 17, 2014 11:21 pm #

      Hi Peter

      Thanks for the comments.

      I actually subscribe to the belief that all data is created within the enterprise! It is true, as you say, that much data might have an external source. However, when an enterprise chooses to allow this data to enter its internal systems, it is ‘creating’ it internally. The fact that it has accepted the data electronically, as opposed to entering it manually, makes no difference. This gives the enterprise both power and responsibility. It has the power to vet and then accept or refuse every piece of data presented to it.

      It has the responsibility to ensure that all of the upstream sources from which it might choose to accept electronic data input are fully trusted. This upstream quality assurance has been widely practiced in manufacturing for decades now.


  2. Richard Ordowich April 13, 2014 12:03 pm #

    There are instances when organizations look for excuses, but I think data quality is more complex than manufacturing quality. Physical product quality is bounded by the laws of science: physics, chemistry, biology, etc. Business data has no such bounds; there are no laws of nature to bound it. As a result, the quality metrics become increasingly subjective and lead to measures such as “fitness for use” or “good enough”.

    I have had great difficulty measuring “fitness for use” and “good enough” data quality.

    There is another underlying challenge with data: the data is the result of numerous business and system processes. The root cause is typically some flawed business process that in many cases goes back to the design of the data. The data was designed for one purpose and is being used for numerous other purposes. This is akin to using a plumbing pipe as an electrical conductor. Can it be done? Sure. Is it suitable? Not so much.

    Beyond checking for valid values, what other quantifiable metrics are there for data? And are these metrics suitable for all uses of data within an organization? Data that is suitable for use by the sales department may not be of any use to finance. Just ask an organization how many “customers” they have. The next question is “who’s asking?” How do you count a customer?
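    The “how many customers?” problem above can be made concrete with a small sketch. The records and the two departmental definitions are hypothetical illustrations, not taken from any real system: the same data yields a different count depending on who is asking.

```python
# Hypothetical customer records: the count of "customers" depends
# entirely on which department's definition is applied.
records = [
    {"id": 1, "orders_last_year": 3, "balance_due": 120.0},
    {"id": 2, "orders_last_year": 0, "balance_due": 0.0},
    {"id": 3, "orders_last_year": 1, "balance_due": 0.0},
]

# Sales definition: anyone who placed an order in the last year.
sales_customers = [r for r in records if r["orders_last_year"] > 0]

# Finance definition: anyone with an outstanding balance.
finance_customers = [r for r in records if r["balance_due"] > 0]

print(len(records), len(sales_customers), len(finance_customers))  # 3 2 1
```

    Three records, two sales customers, one finance customer: the metric is only meaningful once the definition behind it is agreed.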

    Over the past thirty years we have not designed quality data. The focus has been on programming software (of poor quality at that). It’s like having faulty manufacturing equipment and expecting the products to meet quality metrics. Until we begin to build quality software, we will continue to have defective data, and we are left with only the ability to fix, not prevent, data errors. This is not consistent with the fundamental principles of manufacturing quality practices.

    • John Owens April 13, 2014 11:42 pm #

      Hi Richard

      Without three key business models, it can seem like business data has no bounds, but this is definitely not the case. In any enterprise, no matter what size or in what sector, there are a finite number of forms that data can validly have. There are also a finite number of values that much of this data can have, dictated by domain and Master Entity values.

      However, for any enterprise operating without a Business Function Model (BFM) these forms and values remain a mystery; everything is merely guesswork. ‘Fitness for Purpose’ and ‘Conformance to Requirements’ are the essential drivers for both manufacturing and data quality. But again, without the BFM, the requirements for data remain a total mystery. By building the BFM, the requirement for every item of data in the enterprise can be clearly and unambiguously defined.

      The process is: 1) build the BFM; 2) from this, extract the Data Entities, Attributes and Relationships and build the Logical Data Model (LDM); and 3) link every Function in the BFM to the relevant Entities and Attributes in the LDM using the CRUD Matrix.
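      The linking step above can be sketched as a small CRUD Matrix. The Function and Entity names here are hypothetical examples, not taken from any real model; the point is that the Functions which Read or Update an Entity are the ones that define its quality requirements.

```python
# Sketch of a CRUD Matrix linking Business Functions (from the BFM) to
# Entities (from the LDM). C/R/U/D = Create/Read/Update/Delete.
# All Function and Entity names are hypothetical.
crud_matrix = {
    ("Take Customer Order", "Customer"): "R",
    ("Take Customer Order", "Order"):    "C",
    ("Invoice Customer",    "Order"):    "R",
    ("Invoice Customer",    "Invoice"):  "C",
    ("Ship Order",          "Order"):    "RU",
}

def consumers_of(entity):
    """Functions that Read or Update an Entity -- the source of the
    quality requirements its creating Function must satisfy."""
    return [f for (f, e), ops in crud_matrix.items()
            if e == entity and ({"R", "U"} & set(ops))]

print(consumers_of("Order"))  # ['Invoice Customer', 'Ship Order']
```

      Once the consuming Functions of each Entity are known, the Function that creates it can be designed to meet their requirements.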

      Contrary to most DQ practitioners’ beliefs, it is the Functions that read and update data Entities and Attributes (not the Functions that create them) that define the requirements that Data Quality has to fulfil. Once an enterprise knows these requirements, it can then make sure that the Functions that create the data do so in a manner that fully meets them.

      Without the BFM, LDM and CRUD Matrix, an enterprise is flying blind and Data Quality will be impossible to attain, no matter how much time, money and effort are thrown at it. Even the efforts of the most dedicated DQ staff will amount to nothing without these core models.

      Again, Richard, thank you for your comments.


  3. Gary Allemann April 11, 2014 1:43 pm #

    Hi John

    I agree with a lot of what you are saying – in particular that data quality premised on “fix the data now” adds very little value.

    Our approach is based on the premise that repeatable and consistent processes need to be put in place both to resolve existing data issues and to protect against these same issues occurring in the future.

    The reality is that most so-called data quality practitioners have no interest in solving the problem – they make their money out of placing bodies on site and have every reason to want to come back again next year.

    Until customers recognise the need for a sustainable solution, however, this is likely to continue. At the end of the day, if a client insists on paying for a project that is likely to be a waste of money, very few consultants will turn down the cash on the basis that the client is being shortsighted.

    • John Owens April 13, 2014 12:33 am #

      Hi Gary

      I can appreciate the commercial balancing act that consultants sometimes have to perform.

      However, I find it regrettable that so many of them are willing to let clients make really bad decisions and pay for work that in no way moves the enterprise forward. I have come across many consultancies (far too many) that actually celebrate a client’s shortsightedness in these areas and milk it for all it is worth.

      All good consultants will aim to make themselves ‘redundant’. Each time they are engaged to do a piece of work, they will do it in such a manner that that piece of work never needs to be done again. The problem is solved and the daily operations changed so that the problem does not recur.

      Thanks for the comments.


  4. Richard Ordowich April 10, 2014 1:25 pm #

    I agree with this point of view but I don’t think the analogy to manufacturing is suitable for data. There are two factors in data quality that are not present in manufacturing quality.

    1. In most business processes there are numerous exceptions and variations that have not been programmed into the system. These exceptions are processed outside the system and then the data is “adjusted” to conform to the reality of what occurred.

    Systems are designed to process only routine transactions. This becomes evident when an organization tries to create business process models from the actual workflows. There are numerous exceptions, human judgments and variances that occur that the system is unaware of. This creates the opportunity for data errors.

    Manufacturing is a more controlled environment; usually the number of variances is finite (ignoring human behavioral factors). Data errors (with the exception, perhaps, of repeatable patterns) are infinite.

    2. Data is a symbol, not an object like a physical product. How precisely this symbol represents “reality” is the basic data quality challenge. We have to address the semantics, syntax and context of data, which are quality measurement concepts not present in manufacturing quality. What are the measures for semantic accuracy? How precisely does the data represent the reality? These measures tend to be both subjective and boundless.

    Data quality should be viewed as a new domain and requires novel solutions. Some concepts from manufacturing can be adopted but many have to be created anew.

    • John Owens April 11, 2014 12:02 am #

      Hi Richard

      Thanks for the comments.

      The excuse “We don’t think the analogy to manufacturing is suitable for data” is used by people in data quality the world over to justify their persistent failure to deliver real business improvements! They are crying out “That can’t be done!” when what they really mean is “We don’t know how to do it!”

      The quality of data is an indicator of the quality of the execution of the Business Functions of the enterprise. When the data is wrong, do not try to fix the data; fix the way the Function is performed, and the data will be correct.

      You are right: a datum is simply a symbol and, as such, is extremely easy to create correctly first time, every time. It has none of the complexities of a physical product. The number of possibilities is far from infinite. A date in a specified format is simply a date in a specified format. It does not take a supercomputer or an IQ of 170 to get this right every time.
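      The date example above is easy to demonstrate. This is a minimal sketch, assuming an ISO-style `YYYY-MM-DD` format as the specified one; the point is that validity can be checked at the moment of creation, so no downstream cleansing is ever needed.

```python
# A datum in a specified format is trivially checkable at the point of
# entry, so it can be captured correctly first time, every time.
from datetime import datetime

def is_valid_date(value, fmt="%Y-%m-%d"):
    """Accept the datum only if it conforms to the specified format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

print(is_valid_date("2014-04-11"))  # True: conforms to the format
print(is_valid_date("11/04/2014"))  # False: reject at entry
```

      Rejecting the nonconforming value at the Function that creates it is the equivalent of fixing the process rather than dialysing the data afterwards.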

      The large number of data errors is not down to the complexity of data. Data is the simplest thing to create and control in any enterprise. The errors are created by people, and by extremely badly designed systems and processes, doing the wrong thing over and over again.

      The behaviour of the Data Quality world can definitely be classed as insanity as it persists in doing the same thing over and over again, expecting a different outcome. It also ignores the sage advice of Einstein who pointed out that you cannot solve a problem by using the same logic that created the problem in the first place.

