Warning: Creating default object from empty value in /home/johnnz/public_html/wp-content/themes/simplicity/functions/admin-hooks.php on line 160

Data Modelling: The Power of QUACKs and UIDs

Codes are Not Identifiers

In my previous post on how unique keys cause duplication in databases I demonstrated that codes are not identifiers for data entities and, by extension, they cannot be unique identifiers (UIDs).

Why Do People Use Codes?

If codes are not identifiers how come they are used so extensively?  Because, when used properly, they are very useful way of uniquely referring to real or abstract objects but, and this is vital to understand, they do not uniquely identify these objects.

A code is merely a word, letter, number, or other symbol used to uniquely refer to something.

In the warehouse example that I used in my previous post (see it here) I used a five digit numeric code. Some would argue that this is why identification failed.  If I had used a more sensible code, a structured code, that identification would have been easy.

For example, if I had asked you to pick up part number EP10KW then you would have had no problem.  Is this true?  Well you could have used all of your ingenuity and tried to work out what the code meant.  Is it perhaps an Electric Pump with 10 KiloWatt rating, EPoxy Resin in a 10 Kilo tub and White in colour or an Extension Pole 10 metres in length made by Kite & Ward?  There is no way of telling.  You would be merely guessing.

QUACKs

Structured codes, such as EP10KW can be very useful when used properly.  In the Integrated Modelling Method they are referred to as QUACKs (QUick Alternative Code or Key).  But they are only of use when you know what they mean.  They are merely a reference code – not an identifier.  In order to be of any use you need to have a means of knowing what the code refers to.  This is typically a list or a catalogue.

QUACK Part Power Source Rating
EP10KW Pump Electric 10 Kilowatt
EP20KW Pump Electric 20 Kilowatt
PP05HP Pump Petrol Engine 5 horsepower

..

Unique Identifier

What is the unique identifier (UID) for each item in the above table?  The QUACK or the other columns?

Let’s add a few more rows that will help clarify this.

QUACK Part Power Source Rating
EP10KW Pump Electric 10 Kilowatt
EP20KW Pump Electric 20 Kilowatt
EP10KWX Pump Electric 10 Kilowatt
EP20KWX Pump Electric 20 Kilowatt
PP05HP Pump Petrol Engine 5 horsepower

..

QUACKs are Not UIDs

Here we have two more rows that refer to electric pumps.  Is there an error? What is the difference between EP10KW and EP10KWX and between EP20KW and EP20KWX?  We don’t know and the code cannot tell us. What makes one uniquely different from the others? In other words, what is the unique identifier of each item?

It is clear that in order to know what the difference is that we need more information about each pump.  It could be that what makes one pump different from the other is the manufacturer, or that one is submersible and the other is not.

No code can give us this information!  It is attributes of the data entity that uniquely identify it, that are its UID. (Sometimes the UID also includes relationships to other entities).

Bigger QUACKs?

Although QUACKs are not UIDs they are still a very powerful business mechanism for uniquely referring to business items.

Would they be more useful if they were longer and we embedded more information in them?  For example we could have ETHIP199012 to stand for Ethylene plant Improvement Project Task 199 for cost centre 012.

This is a real life example from a chemical plant in South Wales. It seemed to work pretty well for a while and the Project Office used to produce lots of reports based on the structure of the QUACK.  Then they ran out of numbers.  Three digits were no longer enough for a task number, the cost codes were now centrally changing to 6 digits, a second ethylene plant was brought online and so ETH was no longer unique.

QUACKS and structured codes are useful mechanisms but when an enterprise tries to embed all of the attributes of an entity into the code we have moved from a useful mechanism to a dangerous time bomb.

For example, PI01 for “Personal Insurance product number 1″ and “CI07 for Corporate Insurance Product number 7″ are good, simple QUACKs.  PI89M1825LD for “Personal Insurance Product number 89 for males aged between 18 and 25 years who are Learner Drivers” is NUTS!

Guidelines for QUACKs

  • Quacks should be short and simple – containing no more than 3 elements.
  • All of the necessary attributes for an entity should be held as columns in the table that represent that entity – NOT embedded in the QUACK.
  • All analysis should be carried out using the attribute values held in the table and not on the elements of the QUACK.
  • QUACKs are not UIDs and every attempt to misuse them as such has and always will result in duplication of records in the database

Useful Links

Ebook with full definition for UIDs.

Rapid Guide to Data Modelling.

Blog Post on Keys and Duplication

Tags:

No comments yet.

Leave a Reply