Redundancy, caused by integrating multiple definitions of master data across different lines of business into a data warehouse can cause reports to depict inaccurate results. Master Data Management, introduced in SQL Server 2008 R2, overcomes this problem, allowing you to create, organize and manage master data centrally.
Real world example
We have to set up an enterprise data warehouse for a client, but the
problem is, this client has several applications running across different
functions of the business. Each of these applications has their own
definition of master data. Now, because of this duplication of master date,
if we integrate these master data into a data warehouse and generate reports,
it would not give accurate results. For example, consider a product called
"Mount Bike 6 Gear" it has a different definition with a different
name within different applications viz. Sales, Marketing, Inventory etc.
- Mount Bike 6 Gear
- Mount 6 Gear Bike
- Mount 6 Gr Bike
- Mt Bk 6 Gr
- Mount Bike 6 Gr etc.
Now we integrate these records in the data warehouse, even though each
of the above records represents the same product, the report result would be scattered
all over the report because of their different definitions and the interpretation
might be wrong if it is based on this incorrect report result.
Multiple definitions of master data across different lines of business would
not pose a problem as long as we do analytics or reporting on each of this
application data separately. However, when we need to integrate these data into
a data warehouse, this redundancy or duplicity of data would cause the report
to depict incorrect results. This problem can be overcome with the help of
Master Data Management, which allows you to create, organize and manage master
data centrally. SQL Server 2008 R2 introduces the Master Data Management platform,
which is built on SQL Server and WCF (Windows Communication Foundation). However,
before we dig into MDM in detail, that let me explain some of the terminologies related to it.
Defining different terminologies…
Master data represents business entities, for example people (customers,
vendors, employees, etc.), places (departments, locations, etc.), products
(products, parts, etc.), abstracts (Time, etc.) etc. Master data is
non-transactional data, which references the transactional data. If we talk of
analogy in OLAP, master data is contained by the dimensions of the cube.
Master Data Management
Master Data Management is a set of tools, processes and policies, which
allow an enterprise to create, organize and manage correct and consistent
master data centrally. You can learn more about Master Data
Management at Wikipedia.
Gearing up for Master Data Services…
Master Data Services (MDS) of SQL Server 2008 R2 provides an extensible
platform for Master Data Management, which is built on top of SQL Server 2008
R2 Database Engine and WCF (Windows Communication Foundation). It allows you to
create a centralized hub for master data and does transaction logging for all
the changes made to these master data, for example what changes have been done,
who made the changes, when the changes were made, what was the value before the
change and what is value after change. This transaction logging allows user to
revert to old values if changes are not required anymore. MDS also allows you
to define multiple versions of your master data. For example, you can have one
committed version that will be available for consumption by downstream
applications and at the same time, you can have an open version of the same
master data to do any changes to it. This ensures that downstream applications
get consistent data because no changes are permitted on the committed version
of master data and at the same time, you are able to do changes to the open
version of the master data. A version could be either in open (available for
changes), locked (locked for validating master data against defined business
rules) and committed (no changes are permitted and hence this version is
available for reporting and analytics) version.
Master Data Services allows role-based security for all of its objects as
well as for data that it contains. For example, you can have four different
roles and each role might have access to data belonging to either of south,
north, east and west regions, etc.
You can define different types of business/validation rules and can ensure
the master data that MDS contains is validated against these rules. For
example, the product about which I talked before, i.e. Mount Bike 6 Gear has
been recently launched and because of the high demand for this product, I want
to ensure that DaysToManufacture should not be more than 10. MDS allows
defining this kind of validation rules and flags the records if they violate
any of these defined rules. You can also set up to launch a SharePoint workflow
or send notification to the intended audience if these validations fail.
Understanding Master Data Services Architecture…
Master Data Services (MDS) architecture can be divided into three different
layer stacks. On the bottom of this architecture stack is DB Layer, which is
built on SQL Server 2008 R2 database, which stores all the versions of master
data, meta data, system wide setting, security related information and
transaction logging. The Service Layer sits on top of the DB Layer and has all of
the business logic of MDS. It provides WCF API to programmatically access all
of the features of MDS; it also provides an interface for importing master data
in MDS through the MDS staging area and for exporting master data to downstream
applications for consumption via subscription views. On top of this
architecture is the data stewardship portal called Master Data Manager, which
provides an ASP.NET UI to access all the features of MDS. Through this UI, you
can create different MDS objects, add data to it, define business rules,
specify workflow notification, email notification, etc.
Figure 1 - Master Data Services Layers
Master Data Services - Components...
When you start working with Master Data Services, the first object that you
need to create is a model. A model is kind of a container that contains all the
Master Data Services objects created. Talking of an analogy with a relational
database, it's very much like a database or a schema, whereas in an analogy
with SQL Server Analysis Services it's like a cube that contains the other
A model contains at least one entity. An entity contains zero or more members
with similar attributes. Talking of an analogy with a relational database it's
very much like a table whereas in an analogy with SQL Server Analysis Services
it's like a dimension; for example Employees, Locations, Customers, etc.
An individual data item or an instance of an entity is called a member. For
example, "Employees" is an entity and each employee (John, Steve,
Paula, etc.) belonging to this entity is an instance of this entity or member
of this entity.
As I said before all of the members of an entity share the same set of
attributes. Talking of an analogy with a relational database a member is very
much like a row of a table whereas in an analogy with SQL Server Analysis
Services it's like a member of a dimension.
A property or description of a member is called an attribute. For example,
in "Employees" entity EmpID, EmpName, Address, DOB, DOJ are some of
the attributes. Talking of an analogy with a relational database an attribute
is very much like a column of a table whereas in an analogy with SQL Server
Analysis Services it's like an attribute of a dimension.
An entity of an MDS might contain tens or even hundreds of different
attributes but not all of these attributes would be required by all of the
reporting or analytical applications. Hence, MDS allows you to create a group
of logical related attributes, which make more sense to a specific reporting or
analytical application. Because Attribute Group is a securable object, it
simplifies and makes the management and security administration of an entities'
MDS allows you to organize members in tree-based structure, which is called
Hierarchy, to view it in different ways. For example, each ProductCategory
contains several ProductSubCategory and each ProductSubCategory contains
several Products. If any member is modified, all of the related hierarchies are
updated to reflect the change. A hierarchy is called an explicit hierarchy if
all of the members belong to a single entity; it is called derived hierarchy if
members come from multiple related entities (domain-based attribute
Alternatively you can create a collection (a group of explicit hierarchies,
collection members and even collections) if you want different groupings of
your members for analytics and reporting purposes instead of creating a complete
MDS allows you to create business rules, which are nothing but declarative
conditions or actions on a member, which triggers if invalid data gets in. For
example, new products being launched that you don't want DaysToManufature to be
more than 10. If any record comes with greater than 10, you should be notified
and the record should be flagged invalid.
In this article, we saw what master data is and how Master Data Management helps
to better create, organize and manage master data. Then we had a look on Master
Data Services, which is available with SQL Server 2008 R2, a Master Data Management
platform, its architecture and different components of it. In the next article,
I am going to show you how to create different MDS objects.
What, Why, and How of Master Data Management
See All Articles by Columnist