The Silent Problem of Data

There is a silent problem in many organizations. That problem is data. More specifically, the problem is the lack of governance over data. Without governance, data is much like a two-year-old child left alone for too long near the temptations of a cookie jar. Once the child empties the jar, she may experience momentary happiness, not realizing that the inevitable price, a tummy ache, will arrive later.

In many respects, data is prone to suffer the same fate as that child. Data is received. It is housed in a database. It is foundational for running the business, so more data appears to be beneficial at first. Only later does the ultimate lesson become evident: without data governance, there is an eventual price to pay. The penalty for a lack of data governance, however, is more permanent and pervasive than a tummy ache. The negative consequences can run the full gamut from merely annoying to facilitating the downfall of an organization.

If you are involved with data (and, in one way or another, that is all of us), ask yourself if you have ever found an error in data (that will be almost everyone). Now weigh your single negative data experience against the overall potential for negative data experiences. Given the global volume of data out there, if even 10% of it is not being managed by a mature data governance approach, then how can data ever be trusted? In reality, that 10% figure is likely very low. Because so few organizations have effective data governance strategies, the overall percentage of data that is not being governed is probably high.

Even in organizations where a data governance approach is mature, there may be aspects of data that have been ignored. Some organizations, for example, ensure that their day-to-day transactional business data always meets quality standards, but fail to consider a governance approach for data that has been manipulated or aggregated for reporting purposes. This means that while the original data in the transactional systems is reliable, the data being used for reporting and analytics may suffer from inaccuracies. Since many organizations use analytics to determine future strategies, inaccuracies in reported data can cause serious and unexpected complications.
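One way to picture a governance check on reporting data is a simple reconciliation: recompute the totals from the transactional records and compare them with the figures that appear in the report. The sketch below is only an illustration. The regions, amounts, and the `reconcile` function are all invented for this example and do not come from any real system.

```python
from collections import defaultdict

# Hypothetical transactional records as (region, amount) pairs.
transactions = [
    ("East", 120.00),
    ("East", 80.00),
    ("West", 200.00),
    ("West", 50.00),
]

# Hypothetical totals as they appear in a downstream report.
reported_totals = {"East": 200.00, "West": 275.00}


def reconcile(transactions, reported_totals, tolerance=0.01):
    """Return the regions whose reported total differs from the source total."""
    source_totals = defaultdict(float)
    for region, amount in transactions:
        source_totals[region] += amount

    mismatches = {}
    # Check every region that appears in either the source or the report.
    for region in set(source_totals) | set(reported_totals):
        source = source_totals.get(region, 0.0)
        reported = reported_totals.get(region, 0.0)
        if abs(source - reported) > tolerance:
            mismatches[region] = (source, reported)
    return mismatches


mismatches = reconcile(transactions, reported_totals)
print(mismatches)  # {'West': (250.0, 275.0)}
```

Here the transactional data is internally consistent, yet the reported figure for one region has drifted from it. A check like this does not explain why the drift happened; it only makes the otherwise silent discrepancy visible.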

The question becomes “How can a voice be given to this silent data problem?” The answer to that question leads directly to the doors of a Data Governance methodology. Following the thought process, the next question is probably “What is Data Governance?” A search for definitions quickly shows that Data Governance has no single, succinct definition. According to Wikipedia (http://en.wikipedia.org/wiki/Data_governance), “Data governance is an emerging discipline with an evolving definition. The discipline embodies a convergence of data quality, data management, data policies, business process management, and risk management surrounding the handling of data in an organization. Through data governance, organizations are looking to exercise positive control over the processes and methods used by their data stewards and data custodians to handle data.” Gwen Thomas, President of DataGovernance.com, also provides several definitions (http://www.datagovernance.com/adg_data_governance_definition.html). My preferred choice among those definitions says that “Data Governance is the exercise of decision-making and authority for data-related matters.”

In considering these definitions, some very important concepts begin to surface. Obviously the asset, “data”, needs to be leveraged, secured and aligned to allow appropriate protections and decision-making. Adding to these concepts, I would suggest that by the completion of a successful data governance implementation, data will be completely identified, standardized across the full organizational structure, appropriately available and verified as accurate.

On some level, most businesses recognize the foundational needs for data governance, but recognition alone is not enough to solve the silent problem of data. Data Governance needs a champion, an enabler or perhaps, in the worst case, a crisis that makes the silence intolerable and allows the necessary effort to effect change. Without organizational ‘buy-in’, Data Governance approaches seldom succeed. If you consider that data governance initiatives appear to involve an expense with few monetary rewards in return, you can see that the challenge of championing a Data Governance program is not trivial.

Timelines for a Data Governance implementation will usually be lengthy, and the rewards achieved are not always evident. The protections that Data Governance programs provide involve the prevention of negative consequences. Since negative consequences that never materialize are invisible to most of the corporation, proving value in order to garner funds for a Data Governance approach may be challenging. When seeking funding, it is easy to explain to management the benefit of a new data warehouse, for example, but much more difficult to explain the benefits of preventing invalid conclusions that could arise from data that wasn’t effectively managed. Examples of data integrity problems are difficult to explain unless they have already been shown to be occurring, but data integrity issues may exist, unnoticed, long before they are discovered. The perception of the absence of data integrity issues may be inaccurate. The reality of the absence of data integrity issues is the goal of a mature Data Governance program.

Consider a scenario of using a database to store customer information.  Perhaps you have a table like this:

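First Name | Last Name | Number | Street      | City      | State    | Zip
-----------+-----------+--------+-------------+-----------+----------+------
Jane       | Doe       | 800    | Pine Street | Ashland   | Kentucky | 41163
J          | Doe       |        | Pine Street | Flatwoods | Kentucky | 41167
JD         | Marsh     | 800    | Pine St     | Flatwoods | KY       | 99999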

Could this represent a data integrity issue? It is possible that all three rows in this table represent the same individual, Jane Doe Marsh. Alternatively, the rows could represent three different individuals. If this organization had a mature data governance policy in effect, we would know the reality.
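To make the ambiguity concrete, here is a minimal sketch of how the three customer rows could be flagged as possible duplicates for a data steward to review. The matching rule (same first initial, same normalized street, same normalized state) and the small abbreviation tables are assumptions made purely for illustration. A real program would define its own standards, and a match here is only a candidate, not proof that the rows describe one person.

```python
from itertools import combinations

# The three rows from the customer table, keyed by hypothetical column names.
rows = [
    {"first": "Jane", "last": "Doe", "number": "800", "street": "Pine Street",
     "city": "Ashland", "state": "Kentucky", "zip": "41163"},
    {"first": "J", "last": "Doe", "number": "", "street": "Pine Street",
     "city": "Flatwoods", "state": "Kentucky", "zip": "41167"},
    {"first": "JD", "last": "Marsh", "number": "800", "street": "Pine St",
     "city": "Flatwoods", "state": "KY", "zip": "99999"},
]

# Tiny illustrative lookup tables; a real standard would be far more complete.
STREET_ABBREVIATIONS = {"st": "street"}
STATE_ABBREVIATIONS = {"ky": "kentucky"}


def normalize_street(street):
    """Lowercase the street name and expand known abbreviations."""
    words = street.lower().replace(".", "").split()
    return " ".join(STREET_ABBREVIATIONS.get(word, word) for word in words)


def normalize_state(state):
    """Lowercase the state and expand known abbreviations."""
    cleaned = state.strip().lower()
    return STATE_ABBREVIATIONS.get(cleaned, cleaned)


def match_key(row):
    """Assumed matching rule: first initial, normalized street, normalized state."""
    return (
        row["first"][:1].lower(),
        normalize_street(row["street"]),
        normalize_state(row["state"]),
    )


def candidate_duplicates(rows):
    """Return index pairs of rows that share the same match key."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if match_key(a) == match_key(b):
            pairs.append((i, j))
    return pairs


pairs = candidate_duplicates(rows)
print(pairs)  # [(0, 1), (0, 2), (1, 2)]
```

Under this deliberately loose rule, every pair of rows is flagged. The code cannot say whether the rows are one customer or three; that answer still depends on agreed standards, ownership and verification, which is exactly what a governance policy supplies.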

Perhaps you can visualize many considerations for Data Governance. Once you identify one data concern and start to think about solving it, you will likely begin to think of others. While there are standards, approaches and tools that can be used to launch and support a Data Governance initiative, the better the data is understood as it exists today, the easier it is to begin the effort.

Consider metadata. If your organization has a comprehensive data dictionary, listing all the metadata, with universally understood corporate meanings (a business glossary) for each piece of metadata, you have already achieved one step towards Data Governance. If you have a thorough inventory that identifies the location of all the data (including flat files and other quasi-database locations), then you have another piece of the data governance information available. Is there an organizational structure that oversees (owns) data? Is information about data shared throughout the organization in a timely manner? Is data tested for integrity? Are data relationships understood and documented? Are data audits being done to verify that the data is being used appropriately by all stakeholders? Is data change management effective and complete? All these are facets of a data governance program.
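As a small illustration of the first of these facets, the sketch below checks whether every column of a table has a data dictionary entry with both a business definition and an owner. The column names, definitions, owner, and the `undocumented_columns` function are hypothetical, chosen only to show the shape of such a check.

```python
# Hypothetical data dictionary: column name -> business definition and owner.
data_dictionary = {
    "first_name": {"definition": "Customer's given name", "owner": "Customer Services"},
    "last_name": {"definition": "Customer's family name", "owner": "Customer Services"},
    "zip": {"definition": "", "owner": "Customer Services"},  # entry exists, definition missing
}

# Hypothetical columns of the customer table.
table_columns = ["first_name", "last_name", "number", "street", "city", "state", "zip"]


def undocumented_columns(columns, dictionary):
    """Return the columns that lack an entry, a definition, or an owner."""
    gaps = []
    for column in columns:
        entry = dictionary.get(column)
        if (
            entry is None
            or not entry.get("definition", "").strip()
            or not entry.get("owner", "").strip()
        ):
            gaps.append(column)
    return gaps


gaps = undocumented_columns(table_columns, data_dictionary)
print(gaps)  # ['number', 'street', 'city', 'state', 'zip']
```

A list like this gives a concrete starting point: each undocumented column is a question that someone in the organization needs to own and answer.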

Data Governance has been complicated by mergers and acquisitions, new initiatives, regulations, and simple lack of interest. Without governance, however, each new piece of data that arrives may move the organization toward additional data integrity problems and concerns. Every report or analysis that involves data could be at risk.

The silent problem of data continues to exist. Data Governance can give voice to the problem and provide the foundation for a solution. Even initiating a Data Governance program moves the organization one step closer to ensuring that the valuable asset of data continues to be just that, a valuable asset. 

 


Keesa Bond
Keesa Bond describes her technical interests as being those of an investigative Data Scientist. Throughout her career in academia, Keesa has found creative ways to use technology to make data more meaningful for those it should serve. She knows that the stories that data can provide are there, but realizes that data inaccuracies often invalidate the story’s ending. Her search for data validity has led her to the realization that Data Governance is foundational to ensure accurate, reliable, actionable data.
