Warning: Creating default object from empty value in /home/johnnz/public_html/wp-content/themes/simplicity/functions/admin-hooks.php on line 160

Don’t Look in Existing Data for Master Data Entities

Why not? Because it is about as effective as looking in the dumpster outside a restaurant for the ingredients for a gourmet meal.

In my last articles on the major errors made in Master Data Management (MDM), I explained how most MDM projects start in the wrong place and then head off in the wrong direction.  They do this by looking for Master Data Entities in the existing data of the enterprise.

A bold assumption, you say.  However, in this and my next article, I will demonstrate just how badly existing data can mislead anyone using it without using Logical Data Model techniques.

An Example

Let us look at a typical example of data to demonstrate what can go wrong.

This example was taken from the entertaining “Three Shirts Video” by Jim Harris in which he talked about the various approaches people take to MDM.  My thanks to Jim for this.

At first glance the above interpretation of what each item represents seems sensible. However, if we take a more detailed look we will see that looking at items in the form of rows from a database in this way can easily lead to misinterpretations that result in structural errors in the Master Data Architecture of the enterprise.

The above interpretation would lead us to believe that the data structure they represent would look like this diagram.

This might seem to have captured all that that is needed. However, because we started by looking at data in an existing database, shortcomings and structural flaws have been built into this structure.  So what are these?

Sale / Product / Transaction

The key entity of Sale has been missed out altogether.  Transaction does appear, however, this is just a means by which Sales of Products can be grouped.  The Transaction should not have an attribute of Cost, as this is derivable from the Price of each Product (though this attribute of Price is not shown against Product in the example data).  Product and Sale have become confused.

How can we identify what the real structure ares?  By building a normalised Logical Data Model for the entities, like I have done below.

The above LDM shows us what the Sales and Transaction data structures should look like.

Customer & Vendor

The next error comes with Customer and Vendor.  Normally these two entities (we will assume for the moment that they really are entities) would never appear in a database or on an entity model together.  This is all based on knowing your ‘Worldview’  - being able to answer the question “In this transaction, who am I?”.  Are you the ‘Vendor’ or the ‘Customer’.  If you are the ‘Vendor’ then you will not have ‘Vendor’ on your data model or in your database, because it is always you. The same logic applies if you are the ‘Customer’.

However, you will avoid this confusion entirely by realising that there are no such entities as Customer, Vendor or Supplier! These in fact are all derivable Roles played by the entity Party – which is the key entity in MDM that is missed completely by so many people!! Derivable entities ought NEVER appear in your LDM.

I will return to the Role of Customer in my next article.


The example labels two items as ‘Location’.  These are actually Addresses, they are the addresses of two Locations, the first at which the a Party lives and the second at which a Party trades.  Address is another entity that very often badly structured in databases. This is because it has not previously been modelled at the logical level in the LDM.  A powerful Address structure is relatively easy to model.  One thing to remember is that Address is NOT an attribute of Party!

You Don’t Need ‘Customer’

A major paradox is the amount of emphasis that so many MDM practitioners place on ‘Customer’, considering that a) it is not actually a Master Data Entity and b) probably 50% of the business in the world never need to know who the ‘Customer’ is!

Think about it! How many purchases do you make in a week and how many times do you you have to provide you name, address, phone number, etc. before you can pick up your purchase?  Not Many. Most businesses that you deal with do not need to know who you, the Party acting in the role of Customer, are!  So why would it appear on their data model or in their databases?

The only exception to this is in the retail world is where the Sale is a credit Sale.  In this circumstance the business will need to know which Party played the Role of Customer so that they can be billed.

Let us now put all of this information into a single LDM.

The above LDM show us what the data structures represented by the data in the example really look like. This is significantly different to what the original table rows suggested.

Please Share

If you liked this post and think that it could be of value to a colleague or friend please feel free to share it by clicking on one of the social media links below.


2 Responses to “Don’t Look in Existing Data for Master Data Entities”

  1. Todd Everett February 23, 2012 2:01 pm #

    John – great post. Glad to see you doing more blogging recently on these topics as so few professions blog on logical data modeling topics. You can find 50 great blogging resources on physical database design but few on these. I appreciate your efforts to post and often forward these to my co-workers. I’m totally on board with what you are saying here. One question – what tool do you use to create your diagrams? I like the barker-ellis notation but have found only Oracle designer uses it (maybe its proprietary to oracle) and it implements it wrong in that when you make a relationship an identifier (like you did above and noted) not only does it put the line on the crows foot, but it removes the dashing from the other end of the line even though you need that to represent the relationship is “may be” not “must be”. Thanks -Todd

    • John Owens February 28, 2012 4:57 am #

      Hi Todd

      It is amazing how many words are written around Data Quality and MDM and not a single Logical data Model in sight! I think that this possibly explains why so much confusion surrounds these subjects.

      Thanks for your kind comments. The notation is also used in Corporate Modeller from Casewise.

      It is getting hard to find good modelling software. I will have to consider producing an IMM Modeller!


Leave a Reply