
Data Quality: Dead Crows Kill Customers!

Dead Crows also Kill Suppliers!

While recently doing a webinar with Dylan Jones of Dataqualitypro, I was describing the essential role the Logical Data Model plays in Data Quality.  During our discussion, Dylan asked me to give examples of how the Model helped and, of course, I had to tell him how Dead Crows kill Customers!

In a previous post I described the powerful modelling convention of “Dead Crows Fly East” and how this greatly added to the context and readability of an Entity Relationship Diagram (ERD).

Because this convention brings consistency to the positioning of elements on an ERD, it is a powerful means of highlighting patterns and, by this method, identifying and removing hidden duplicates.

A diagram will help.

In the diagram on the right, both Customer and Supplier appear as separate data entities.

The patterns formed by the data entities and the relationships running downwards from Bill of Materials are identical.

On any ERD, identical (even similar) patterns suggest identical data entities.
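The eyeballing the convention enables can even be mechanised. Here is a toy sketch (illustrative only; the relationship representation, the signature scheme and all names are assumptions, not part of any method or tool): record each relationship on the ERD, compute a relationship-shape signature for every entity, and group entities whose signatures match as candidate duplicates.

```typescript
// Toy duplicate-pattern detector for an ERD (illustrative sketch only).
// An entity's "signature" is the sorted set of relationship names it
// participates in, tagged by which end of the relationship it sits on.

interface Relationship {
  from: string; // child entity (the many end, where the crow's foot sits)
  to: string;   // parent entity
  name: string;
}

function signature(entity: string, rels: Relationship[]): string {
  return rels
    .filter(r => r.from === entity || r.to === entity)
    .map(r => (r.from === entity ? `child:${r.name}` : `parent:${r.name}`))
    .sort()
    .join("|");
}

// Group entities that share an identical relationship shape.
function duplicateCandidates(entities: string[], rels: Relationship[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const e of entities) {
    const sig = signature(e, rels);
    groups.set(sig, [...(groups.get(sig) ?? []), e]);
  }
  return [...groups.values()].filter(g => g.length > 1);
}

// In this post's example, Sale "is made by" Customer and Purchase
// "is made by" Supplier, so Customer and Supplier share a signature:
const rels: Relationship[] = [
  { from: "Sale", to: "Customer", name: "is made by" },
  { from: "Purchase", to: "Supplier", name: "is made by" },
];
console.log(duplicateCandidates(["Customer", "Supplier"], rels));
// -> [["Customer", "Supplier"]]
```

Customer and Supplier come back as a merge candidate, which is exactly the duplication the dead-crow layout makes visible by eye.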

Closer inspection of the structures in the diagram reveals that a) Product and Component are types of Stock Item, b) Sale and Purchase are types of Commercial Transaction and c) Customer and Supplier are types of Third Party.  We can then reflect all of these findings in a revised ERD, as shown below.
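To make the revised structure concrete, here is a minimal sketch of the converged model (illustrative only: the entity names follow the post, while the attributes and subtype discriminators are assumptions):

```typescript
// The two duplicated branches collapse into three generic entities,
// each with a subtype discriminator. Attribute names are assumptions.

interface StockItem {
  id: string;
  kind: "PRODUCT" | "COMPONENT"; // Product and Component as subtypes
  description: string;
}

interface ThirdParty {
  id: string;
  kind: "CUSTOMER" | "SUPPLIER"; // Customer and Supplier as subtypes
  name: string;
}

interface CommercialTransaction {
  id: string;
  kind: "SALE" | "PURCHASE";     // Sale and Purchase as subtypes
  thirdPartyId: string;          // one relationship where there were two
  stockItemId: string;           // likewise for Product/Component
}

// e.g. a sale of a stock item to a third party:
const t: CommercialTransaction = {
  id: "T-1",
  kind: "SALE",
  thirdPartyId: "TP-042",
  stockItemId: "SI-007",
};
```

One generic set of structures now serves both halves of the original diagram.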

If enterprises started out with the Logical Data Model they would avoid building inadequate (often flawed) data structures into databases and then spending more time and money trying to identify and remove them.

This is the principle of Quality Assurance, which is all about preventing defects, as opposed to Quality Control, which is about first allowing and then finding defects!

If you like this post then please Tweet or Share it – feel free to leave a comment too.

2 Responses to “Data Quality: Dead Crows Kill Customers!”

  1. Richard Ordowich July 15, 2011 12:00 pm

    Over the years I have seen numerous data models and data structures, and each one claimed to be the ideal solution. In practice, however, I have witnessed the results, which in hindsight were seldom ideal. When examining the data models and data structures against their intended uses, limitations are evident, yet these limitations are seldom exposed or articulated during the design phase.

    A data model should always be accompanied by a description of its intended uses, how effective and efficient it is at satisfying those uses, and what the limitations of this particular model are. There is no way to determine if a particular modeling approach is satisfactory without understanding the context in which it will be used. This important consideration is seldom addressed, but claims of the “perfect model” persist. The model may be technically satisfactory but limited in satisfying the varied and sometimes convoluted needs of the business. Business requirements are not architected, nor do they follow exacting patterns or logic. The best logical model may be limited in its ability to satisfy the frequent illogical demands of humans.

    To ensure the quality of the model, it is critical that the current and future use cases of the model be well articulated and presented before design begins, in a form that can be used to validate the data model. Testing these use cases while designing the model is one way of ensuring this aspect of a quality design. Publishing the limitations of the model in a way that business folks can understand is also critical. No model is ideal.

    • john July 18, 2011 11:32 pm

      @Richard

      In the Integrated Modelling Method the Logical Data Model is extracted directly from the Function Model, which gives full context to the model both now and in the future. No data model should ever be drawn without first having modelled the Business Functions, otherwise it is an abstract modelling exercise.

      When modelling (within the context of a Function Model), the approach should be convergent, as demonstrated in this example, thus removing all unnecessary duplication from the model and avoiding badly designed and built solutions.

      Standards are essential for all good data modelling and, as a minimum, these are:

      • Always model data in the context of the Business Functions
      • Always start with a Logical Data Model
      • All logical models should conform to all normal forms up to 5NF
      • Always strive for convergent, generic data structures (a brief sketch follows below)
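      To illustrate that last point, here is one possible reading in code (a sketch, not part of the original reply; carrying the customer/supplier role on the transaction rather than the party, and all names and values, are assumptions):

      ```typescript
      // Sketch: with a generic Third Party, the customer/supplier role can
      // live on the transaction, so one real-world party needs only one record.
      // All entity, attribute and value names are illustrative assumptions.

      interface ThirdParty {
        id: string;
        name: string;
      }

      interface CommercialTransaction {
        id: string;
        kind: "SALE" | "PURCHASE"; // the role is implied by the transaction
        thirdPartyId: string;
      }

      const acme: ThirdParty = { id: "TP-001", name: "Acme Ltd" };

      // Acme both buys from us and sells to us: one party record, two
      // transactions, and no duplicate Customer/Supplier rows to reconcile.
      const transactions: CommercialTransaction[] = [
        { id: "T-1", kind: "SALE", thirdPartyId: acme.id },
        { id: "T-2", kind: "PURCHASE", thirdPartyId: acme.id },
      ];
      ```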
