One reason Master Data Management (MDM) was developed was to assist with the Data Quality areas of Data Consistency and External Integrity. I say “assist” as it is not designed to resolve all consistency or external integrity issues, but only resolve them for items that have been identified as ‘key’ informational items; items for which it is important to have a master record.
Since not every data item is addressed in an MDM solution, having MDM does not obviate the need for having an overall data quality, or data governance, program. For example, MDM would not address items such as a payment amount not having more than two decimal places (I paid $7.505 for a sandwich? How did I pay a half cent?). The problems of data structure and formatting are handled through data quality, whereas the problem of the data accuracy of key data items is handled through Master Data Management.
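To make the distinction concrete, here is a minimal sketch of the kind of formatting rule that belongs to data quality rather than MDM: checking that a payment amount has no more than two decimal places. The function name and the choice to pass amounts as strings are assumptions made for illustration only.

```python
from decimal import Decimal


def has_valid_currency_precision(amount: str, max_places: int = 2) -> bool:
    """Return True if the amount has at most `max_places` decimal places."""
    exponent = Decimal(amount).as_tuple().exponent
    # For finite numbers the exponent is an int: -3 for "7.505", -2 for "7.50".
    # Special values such as NaN have a string exponent and are rejected.
    return isinstance(exponent, int) and -exponent <= max_places


sandwich_ok = has_valid_currency_precision("7.50")
sandwich_bad = has_valid_currency_precision("7.505")
```

A rule like this applies to every payment record, whether or not the record involves a master data item, which is why it sits in the broader data quality program.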
Master Data Management seeks to fix some data quality issues by setting up a data governance process for key information items, sometimes referred to as “master” data items. Depending on how the MDM program is implemented, the process may, or may not, address the data quality problem at the root.
Master data items are generally considered to be items that do not change frequently, such as Customer Name or Product Code. By defining the important master data items for an organisation and putting processes in place to manage how that data is retrieved, you have started an MDM program.
For a simple, easy-to-understand view of Master Data Management, look at Figures 1 and 2. Figure 1 shows a system without MDM and Figure 2 shows one with MDM.
In Figure 1 we see that information about the same company is stored differently by the various systems. In a data quality sense, the data in the different systems lacks external integrity, which means that information merged between systems will suffer from a data consistency problem as well.
As users access company information from the different systems, they have to know what the company is called in each system. In many cases, users will need information from more than one system and if the data is not homogeneous, how does it match?
Each system may have reasons for the different naming of ABC. In some cases the system may only be looking at a part of the company such as ABC (Renton) as compared to Aut_ABC or other ABC locations.
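The matching problem can be sketched in a few lines of Python. The two system names, the amounts, and the cross-reference table below are all made up for illustration; only the labels "ABC (Renton)" and "Aut_ABC" come from the example above. Rolling both labels up to a single "ABC" is also an assumption, since a real program might keep locations separate.

```python
from collections import defaultdict

# Hypothetical revenue figures held by two systems under different names.
sales_system = {"ABC (Renton)": 1200.0}
support_system = {"Aut_ABC": 300.0}

# Hypothetical cross-reference from each local name to one master name.
MASTER_NAME = {"ABC (Renton)": "ABC", "Aut_ABC": "ABC"}


def merge_by_name(*systems):
    """Naive merge: totals are keyed by whatever name each system uses."""
    totals = defaultdict(float)
    for system in systems:
        for name, amount in system.items():
            totals[name] += amount
    return dict(totals)


def merge_by_master(*systems):
    """Merge after translating each local name to its master name."""
    totals = defaultdict(float)
    for system in systems:
        for name, amount in system.items():
            totals[MASTER_NAME.get(name, name)] += amount
    return dict(totals)


naive = merge_by_name(sales_system, support_system)
unified = merge_by_master(sales_system, support_system)
```

The naive merge reports two unrelated companies, while the merge through the cross-reference reports a single total for ABC.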
Regardless of the reasons for the data inconsistency, the fact that it exists will cause problems with accurate and complete data access and analysis. This is the kind of situation that can result in management making inaccurate decisions regarding account management, discounts, promotions, marketing, etc. Master Data Management addresses the data inconsistency issue. Figure 2 shows how inconsistent data in different systems can be unified using MDM.
In the instance shown in Figure 2, the data in the source systems remains in disparate formats but users access the key informational data items (such as company name here) through a central repository. This is an example of one method for implementing Master Data Management. There are several methods, and variations thereof, for implementing Master Data Management. At the highest level, there are three primary methods: a centralised transactional hub, a registry, and a hybrid approach.
A Centralised Transactional Hub maintains the master data (i.e., the ABC company information) in a central database and all transactional systems refer to that central hub to obtain their information in regard to the company. In this instance, the transactional information in each system is altered to be brought into compliance with the central hub. In this model the master database can be both queried and updated.
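As a rough sketch of the idea (not any particular product's design), the transactional systems below store only a master identifier and ask the hub for the company name. The class names, the identifier "C001", and the order amount are hypothetical.

```python
class CentralHub:
    """Holds the master records; can be both queried and updated."""

    def __init__(self):
        self._companies = {}

    def upsert(self, company_id, name):
        self._companies[company_id] = name

    def get_name(self, company_id):
        return self._companies[company_id]


class TransactionalSystem:
    """Stores transactions against a master id, never its own copy of the name."""

    def __init__(self, system_name, hub):
        self.system_name = system_name
        self._hub = hub
        self.orders = []  # list of (company_id, amount)

    def add_order(self, company_id, amount):
        self.orders.append((company_id, amount))

    def report(self):
        # The name is always resolved through the hub at read time.
        return [(self._hub.get_name(cid), amount) for cid, amount in self.orders]


hub = CentralHub()
hub.upsert("C001", "ABC")
billing = TransactionalSystem("billing", hub)
billing.add_order("C001", 250.0)
before = billing.report()

hub.upsert("C001", "ABC Corporation")  # update the master record once
after = billing.report()
```

Because every system resolves the name through the hub, one update to the master record is reflected everywhere. The cost is that each system has to be reworked to depend on the hub.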
A Registry Model leaves the data as-is in the various transactional systems but maintains a registry database which contains pointers to the master record for each data item. In such an instance, the master data for customers may be maintained in SAP whereas the master data for employees would be maintained in the HR database. In this model the master database can be queried but no information can be updated as the master data records are stored elsewhere.
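A minimal sketch of a registry, assuming the source systems can be read as simple dictionaries: the registry stores only pointers of the form (system, local key) and offers lookups but no update method. All identifiers and record contents here are hypothetical.

```python
# Hypothetical source systems, each the system of record for one kind of data.
SOURCE_SYSTEMS = {
    "SAP": {"cust-17": {"name": "ABC"}},
    "HR": {"emp-204": {"name": "Example Employee"}},
}


class Registry:
    """Maps a master id to the system and local key that hold the master record."""

    def __init__(self, sources):
        self._sources = sources
        self._pointers = {}  # (entity_type, master_id) -> (system, local_key)

    def register(self, entity_type, master_id, system, local_key):
        self._pointers[(entity_type, master_id)] = (system, local_key)

    def lookup(self, entity_type, master_id):
        system, local_key = self._pointers[(entity_type, master_id)]
        # Return a copy so callers cannot change the source record via the registry.
        return dict(self._sources[system][local_key])


registry = Registry(SOURCE_SYSTEMS)
registry.register("customer", "M-1", "SAP", "cust-17")
registry.register("employee", "M-2", "HR", "emp-204")

customer = registry.lookup("customer", "M-1")
customer["name"] = "changed locally"  # does not touch the SAP record
```

Any real change to the customer has to be made in the system of record, which the registry then reflects on the next lookup.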
A Hybrid system merges both the registry model and the central hub model. In this case, there is a centralised hub of data which has been cleansed and brought into compliance with the master data record but the data in the various transactional systems is not altered. In this model the master database can be queried directly (as with a Centralised Hub system) but the master database looks to the master records in the different systems (as in the Registry model) to get its information and is updated according to the reference records and not updated directly.
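One way to picture the hybrid approach is a hub that keeps its own cleansed copy but only ever fills it from the systems of record. In the sketch below, the `cleanse` rule (trimming whitespace), the class name, and the sample record are assumptions chosen for illustration.

```python
def cleanse(record):
    """Toy cleansing rule: trim surrounding whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


class HybridHub:
    """Queryable hub whose contents are refreshed from the systems of record."""

    def __init__(self, systems_of_record):
        # entity_type -> zero-argument callable returning {master_id: record}
        self._systems = systems_of_record
        self._cache = {}

    def refresh(self):
        for entity_type, fetch in self._systems.items():
            self._cache[entity_type] = {
                master_id: cleanse(record) for master_id, record in fetch().items()
            }

    def get(self, entity_type, master_id):
        # Queries are served from the hub's cleansed copy; there is no update method.
        return dict(self._cache[entity_type][master_id])


# Hypothetical system of record with an untidy value.
crm_customers = {"M-1": {"name": "  ABC "}}

hybrid = HybridHub({"customer": lambda: crm_customers})
hybrid.refresh()
first_read = hybrid.get("customer", "M-1")

crm_customers["M-1"]["name"] = "ABC Corporation"  # change made at the source
hybrid.refresh()
second_read = hybrid.get("customer", "M-1")
```

The source record keeps its untidy formatting, while users of the hub see the cleansed value, and changes reach the hub only through a refresh.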
As mentioned, these three models are described at a high level and many variations of each can be found. Each of the systems has both advantages and disadvantages.
At a high level, the centralised system provides the benefit of having the data within each source system consistent in regard to any defined master data items. However, it is also the most intrusive approach: the various systems must be reworked to refer to the centralised hub when obtaining any master data items, so this method generally requires a longer implementation time and a higher cost.
The registry system is less intrusive, as the source system for each master data item is defined and then pointers are used to refer to that information. It does not require rework of the transactional systems, but it means that when data is retrieved directly from the different transactional systems without using the registry to obtain the correct master data items, the data may not match between systems.
Now that you have a basic understanding of what MDM is and its purpose, what are some of the things you should consider as you think about implementing MDM in your organisation?
Implementing Master Data Management
If I asked you “How do you implement an MDM system?”, be honest: your first thought was “this is an IT function”, correct? If so, you are both right and wrong.
Without a doubt the IT department can implement and deploy an MDM system. However, without involving the business in the MDM process, it is an implementation that is likely to be fraught with problems. IT can implement the technical aspects of the system, but they are not the data owners. The people who know the data, the people who own the data, they are the business people.
In order to maintain the reliability and accuracy of the master data items, Data Stewards should be designated to watch over master data items. What is a Data Steward? This is someone who is responsible for one or more specific master data items. For example, the account manager for a customer ABC may be assigned to be the data steward over any changes to the master record in regard to the customer ABC. Likewise, a product manager would be the data steward over any data related to his product(s).
Every successful MDM implementation includes both business and IT and will involve setting policies, responsibilities, processes, and then constantly reviewing each of those items to make sure they work properly. Master Data Management is not a destination; it is a process.
An important component for any Data Quality or Master Data Management program is an evaluation of the business and data processes. Bad data does not happen randomly; it happens due to the processes and applications that are in place. By evaluating the processes and applications, you can find, and address, those areas which are prone to introducing data quality issues.
As I mentioned, it is not the purpose of a Master Data Management program to fix ALL data quality issues; it is meant to address the issue of important, slowly changing data items. To try to address every data issue through MDM would be hugely expensive, cumbersome and, for all practical purposes, impossible.
As I mentioned in my first blog, “Garbage in, garbage out: The importance of data quality”: “Master Data Management can be thought of as a component of an overall Data Quality program and Data Governance includes both Data Quality and Master Data Management.” Now that you understand both data quality and master data management at a high level, you may be wondering: “What is Data Governance?” I’m glad you asked! When you boil down all the things data governance covers, it can be summed up as follows: Data Governance is a process used to ensure that data within the organisation is consistent, timely and accurate. In the final instalment in this series I’ll talk about Data Governance so you can understand how everything (at least data-wise) can be managed. Thank you for staying with me and I’ll see you again for my final instalment.