Applies to:
SAP Master Data Management (SAP MDM), Version 7.1
Summary
This article discusses the importance of data quality in an organization and the various data issues an organization faces, and explains how SAP MDM helps in dealing with those issues.
Author: Priya Zutshi
Company: Mahindra Satyam
Created on: 10 April 2012
Author Bio
Priya Zutshi has been associated with Mahindra Satyam for 12 months as part of the MDM practice. She is skilled in SAP MDM and holds a Bachelor's degree in Electronics and Communication Engineering.
Table of Contents
1. Introduction
2. Data Quality Dimensions
3. How Data Quality Affects Business Growth
4. The Price Organizations Pay for Poor Data Quality
5. Five Main Data Quality Issues That an Organization Faces
5.1. Data Duplication
5.2. Stale Data
5.3. Incomplete Data
5.4. Data Conflicts
5.5. Invalid Data
6. Features of Data Quality Management
6.1. Data Profiling
6.2. Data Quality
6.3. Data Integration
6.4. Data Augmentation
7. How SAP MDM (Master Data Management) Can Help in Maintaining Data Quality
7.1. Mapping & Conversion
7.1.1. Field Mapping
7.1.2. Value Mapping
7.1.3. Conversion
7.2. Validations & Assignments
7.2.1. Validations
7.2.2. Assignments
7.2.3. Expressions
7.3. Matching & Merging
7.3.1. Matching
7.3.2. Merging
7.4. Key Mapping
7.5. Enrichment Architecture
7.6. DB Views
8. Conclusion
9. Related Content
10. Copyright
1. Introduction

A simple definition of data quality is: an assessment of data's fitness for use. Data quality management (DQM) plays an important role in all kinds of organizations, whether private or public. Today, enterprises face the challenge of working in a volatile global business environment, with disparate, globally spread systems. Data is exchanged between various SAP and non-SAP systems worldwide, and because there are many data entry points, master data discrepancies arise. Poor data quality can have a critical effect on business processes, increasing costs and lowering customer satisfaction.
Nearly every IT system has erroneous data. Most system architects know why data quality is important. Still, data management is surprisingly underdeveloped in many companies.
2. Data Quality Dimensions

Accuracy: The extent to which data correctly represents an action or real-world object.
Completeness: The extent to which values are present in a data collection.
Update status: The extent to which data is updated at regular intervals.
Relevance: The extent to which data is applicable and helpful for a particular application.
Consistency: The extent to which data in one database corresponds to the data in a redundant or distributed database.
Presentation: The extent to which data is presented in an appropriate manner.
Accessibility: The extent to which data is available at a given point in time.
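Several of these dimensions can be quantified. As a minimal illustrative sketch (pure Python, not an MDM feature; the records and field names are hypothetical), completeness can be measured as the fraction of non-empty values:

```python
# Hypothetical customer records; None or "" marks a missing value.
records = [
    {"name": "Acme Corp", "city": "Pune", "phone": None},
    {"name": "Acme Corp", "city": "", "phone": "555-0100"},
]

def completeness(records, fields):
    """Fraction of field values that are present (non-empty)."""
    total = len(records) * len(fields)
    present = sum(
        1 for r in records for f in fields if r.get(f) not in (None, "")
    )
    return present / total if total else 1.0

score = completeness(records, ["name", "city", "phone"])
print(round(score, 2))  # 4 of 6 values present -> 0.67
```

Similar ratios could be computed for update status (share of records touched within a time window) or consistency (share of values agreeing across systems).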
3. How Data Quality Affects Business Growth

Data is of high quality if it is fit for its proposed use in operations, decision making, and planning. Within an organization, adequate data quality is crucial to operational and transactional processes and to the reliability of business analytics and business intelligence reporting.
Data quality is affected by the way data is entered, stored and managed.
Maintaining data quality requires going through the data periodically and cleansing it. Typically this involves updating it, standardizing it, and de-duplicating records to create a single version of the truth, even if the data is stored in multiple disparate systems.
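The standardize-then-de-duplicate step can be sketched in a few lines of Python (the company names and normalization rules are hypothetical illustrations, not MDM functionality):

```python
def standardize(name):
    """Normalize case, whitespace, and punctuation before comparison."""
    return " ".join(name.replace(".", " ").replace(",", " ").lower().split())

names = ["ACME Corp.", "Acme  Corp", "acme corp", "Globex Inc"]

# Keep the first occurrence of each standardized form: one version of truth.
seen, unique = set(), []
for n in names:
    key = standardize(n)
    if key not in seen:
        seen.add(key)
        unique.append(n)

print(unique)  # ['ACME Corp.', 'Globex Inc']
```

Without the standardization step, all four names would survive as "distinct" records; with it, trivial formatting differences no longer hide duplicates.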
Because master data comprises the core information of an organization and is used globally, data issues often center around it. This data must comply with the governance policies and business rules of the organization.
If data quality is poor, users lose confidence in the data and may make decisions that hurt the bottom line or expose the organization to unnecessary risk.
According to a survey conducted by Forbes Insights in association with SAP, 82% of respondents agreed that bad information leads to costly mistakes by business managers, and 61% agreed that business processes suffer from inconsistent or otherwise flawed information.
4. The Price Organizations Pay for Poor Data Quality

Companies of all sizes experience data quality problems sooner or later. Bad data can lead to wrong decisions, and data entry errors can keep you from knowing what your customers have bought from you or cause you to ship products to the wrong location. In the same Forbes Insights/SAP survey, few respondents thought data quality problems did not hit their enterprise's balance sheet. In fact, a majority said the yearly damage to the bottom line from data quality problems exceeded $5 million, and nearly one in five (18%) estimated the annual cost at more than $20 million.
5. Five Main Data Quality Issues That an Organization Faces

5.1. Data Duplication

Multiple copies of the same data exist.
Cause:
· Incorrect data entry
· Poor integration
· Faulty database design
Impact:
· Wasted storage space
· Ongoing problems with direct sales and/or marketing communications
Solution:
· Data quality tools
· Better integration
5.2. Stale Data

Stale data is old data, i.e. data that has not been updated.
Cause:
· Contacts changing position
· One-time integration with no ongoing delta import
· Data not being available fast enough from source systems
Impact:
· Problems with marketing correspondence, leading to lost sales and damaged customer relationships
Solution:
· Update the data at regular intervals
5.3. Incomplete Data

Cause:
· Lack of end-user diligence
· Required fields not made mandatory
· Poor user interface
Impact:
· Missing data can lead to productivity losses and wrong decision-making
Solution:
· Apply data validations
· Easy-to-use interfaces
· User Training
5.4. Data Conflicts

Data contained in one system is at odds with data contained in another.
Cause:
· No designated system of record
· Poor integration
· Lack of data interchange between systems
Impact:
· Data conflicts confuse users
· Wasted time and effort
· Threat of using incorrect data
Solution:
· Tighter system integration
· Data auditing
5.5. Invalid Data

Cause:
· Ineffective validation rules
· Data type mismatches between integrated systems
Impact:
· Creates integration exception reports
· Interferes with operational reporting
Solution:
· Strong data validation
· User training
6. Features of Data Quality Management

6.1. Data Profiling

This process involves looking at the actual data. Data profiling determines whether the data is complete and accurate.
Data profiling can be explained from the following table:
On the basis of data profiling, we can understand the cause of a problem. For example, data profiling can tell us that we have duplicate data, such as different representations of the same product. Once we identify the specific data problem, we can work on it.
6.2. Data Quality

Here, we can choose one of four basic options:
Exclude the data: If the problem with the data is severe, the best approach is to remove the data.
Accept the data: Even if we know that there are errors in the data, if the error is within our acceptance limits, the best approach is to accept the data with the error.
Correct the data: When we come across different variations of a customer name, we could select one to be the master so that the data can be consolidated.
Insert a default value: Sometimes it is important to have a value for a field even when we are unsure of the correct value. We can create a default value and insert that value in the field.
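These profiling-and-remedy steps can be sketched in Python (the product list and the default price are hypothetical illustrations, not MDM API calls):

```python
from collections import Counter

products = [
    {"id": "P1", "name": "Widget", "price": 9.99},
    {"id": "P2", "name": "Widget", "price": None},   # incomplete record
    {"id": "P1", "name": "Widget", "price": 9.99},   # duplicate id
]

# Profiling: find duplicate ids and records with missing prices.
dup_ids = [i for i, c in Counter(p["id"] for p in products).items() if c > 1]
missing = [p["id"] for p in products if p["price"] is None]
print(dup_ids, missing)  # ['P1'] ['P2']

# One remedy from the four options above: insert a default value.
for p in products:
    if p["price"] is None:
        p["price"] = 0.0
```

The other options (exclude, accept, correct) would be applied the same way once profiling has flagged the affected records.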
6.3. Data Integration

Data about the same item often exists in multiple databases, e.g. customer names, customer addresses, and product data.
For example, one company had two product files: a master product extract from the US and a product extract from Europe. The company sold the same products in both areas, but the products might be sold under different names, and the product, brand, and description patterns in each file depended on the data entry personnel.
The first challenge in data integration is to recognize that the same product exists in each of the two sources; this is known as linking. The second challenge is to combine the data into a single view of the product, which is known as consolidation.
With customer data, we often find a common field (e.g., PAN number) that can be used to identify commonality. When this occurs, then multiple records for the same customer can quickly be identified. With product data, this is often not the case.
Data integration can be explained from the following table:
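The linking and consolidation steps can be sketched as follows (the PAN-style key, field names, and values are hypothetical; a real integration would use fuzzy matching when no common key exists):

```python
# Two sources describing the same product under a shared identifier.
us = {"PAN123": {"name": "Blue Widget", "brand": "Acme"}}
eu = {"PAN123": {"name": "Widget, Blue", "desc": "Standard blue widget"}}

# Linking: the shared key identifies the same product in both sources.
linked_keys = set(us) & set(eu)

# Consolidation: combine both views into one record per key
# (for conflicting fields, the second source wins in this sketch).
consolidated = {k: {**us[k], **eu[k]} for k in linked_keys}
print(consolidated["PAN123"]["brand"])  # 'Acme'
```

With product data, where a common field is often missing, the linking step would have to compare names and descriptions instead of keys.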
6.4. Data Augmentation

Data augmentation is used to increase the value of data. It entails adding external data not directly related to the base data. With customer data, it is very common to combine internal data with data from third parties to deepen the understanding of the customer. We might also obtain data about the behavior of customers with certain attributes; by combining that data with data about our specific customers, we can segment customers more effectively and identify specific opportunities.
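A sketch of augmentation (the third-party attributes below are invented for illustration): internal customer records gain external attributes keyed on a shared identifier.

```python
# Internal data and hypothetical third-party data, keyed by customer id.
internal = {"C1": {"name": "Jane Rao", "purchases": 12}}
third_party = {"C1": {"segment": "frequent-buyer", "region": "West"}}

# Augment each internal record with any external attributes available.
augmented = {
    cid: {**rec, **third_party.get(cid, {})}
    for cid, rec in internal.items()
}
print(augmented["C1"]["segment"])  # 'frequent-buyer'
```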
7. How SAP MDM (Master Data Management) Can Help in Maintaining Data Quality

The goal of SAP MDM is to provide top-quality master data for sustained cross-system data consistency. It reduces the error rate, because integrating multiple systems raises the probability of identifying mistakes.
This capability includes the whole array of features used to ensure high quality standards for master data regarding accuracy, validity, completeness, consistency, and timeliness.
SAP MDM is inherently multi-data-domain via one platform, and it provides a single version of master data for supplier, product, and customer in heterogeneous environments.
SAP MDM accepts data in various formats and from various sources: Access, Delimited text, Excel, Fixed text, Oracle, Port, SQL Server, XML, XML Schema.
Data from various SAP and non-SAP systems enters SAP MDM, where it is filtered: unwanted data and duplicates are removed, business rules are applied, and so on. The data we get after all this processing is high-quality data. SAP MDM does not deal with transactional data; its main focus is on non-transactional (master) data.
We can increase data quality using the following methods:
1. Mapping & Conversion
2. Validations & Assignments
3. Matching & Merging
4. Key Mapping
5. Enrichment Architecture MDM
6. DB Views
7.1. Mapping & Conversion

Here we normalize and standardize data using the MDM Import Manager.
7.1.1. Field Mapping

Field mapping allows you to map a source field to the corresponding destination field.
7.1.2. Value Mapping

Source fields can be mapped at the value level against the corresponding destination values. This eliminates the need to pre-cleanse source data in an external application.
7.1.3. Conversion

The data type of a source value can be converted or reformatted, manually or automatically, into the data type of the mapped destination field.
7.2. Validations & Assignments

These ensure that data is defined according to specified criteria.
7.2.1. Validations

Validations are Excel-like formulas that return a Boolean success or failure result. They can be used to perform all sorts of tests against a group of records, for example making sure that required fields have a non-NULL value.
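Outside MDM, the idea of a validation can be sketched as a predicate that returns success or failure per record (plain Python, not the MDM expression syntax; field names are hypothetical):

```python
def validate_required(record, required_fields):
    """Return True only if every required field has a non-NULL value."""
    return all(record.get(f) not in (None, "") for f in required_fields)

rec_ok = {"Name": "Bolt M8", "Material": "Steel"}
rec_bad = {"Name": "Bolt M8", "Material": None}

print(validate_required(rec_ok, ["Name", "Material"]))   # True
print(validate_required(rec_bad, ["Name", "Material"]))  # False
```

Running such a predicate over a group of records yields the per-record success/failure results that a validation reports.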
7.2.2. Assignments

Like a validation, an assignment returns a Boolean success or failure result, but it also returns the result value of its expression. Each assignment has a single fixed field in which the result value of the expression is stored. For example, if we want to concatenate the two fields Created By and Updated By and see the value in the Description field, we use an assignment.
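The Created By / Updated By example can be sketched like this (plain Python standing in for the MDM expression; the separator is an arbitrary choice):

```python
record = {"Created by": "priya", "Updated by": "admin", "Description": ""}

# An assignment evaluates an expression and stores the result
# in its single fixed target field, here Description.
record["Description"] = record["Created by"] + " / " + record["Updated by"]
print(record["Description"])  # 'priya / admin'
```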
7.2.3. Expressions

MDM expressions are Excel-like formulas used within MDM in validations, assignments, and calculated fields; they are evaluated and return a distinct value for each record. An expression defines a complex formula based on the data values of a record and evaluates that formula against a group of one or more records.
7.3. Matching & Merging

Here, records are de-duplicated to produce consistent master data.
7.3.1. Matching

Matching and merging are performed in the Data Manager. Matching finds duplicate records for consolidation within an MDM repository. Selected records are compared, and the matching result is None (the records do not match), Low (the records partially match), or High (the records match exactly).
7.3.2. Merging

Merging uses the matching score to decide which of the potential duplicates to merge, and then merges two or more duplicates into a single record.
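A toy sketch of match-then-merge (the string-similarity measure and the 0.6 threshold are simplified stand-ins for MDM's matching strategies, and the records are hypothetical):

```python
from difflib import SequenceMatcher

def match_score(a, b):
    """Classify similarity of two records as None/Low/High."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    if ratio == 1.0:
        return "High"
    return "Low" if ratio >= 0.6 else "None"

def merge(a, b):
    """Combine two duplicates, preferring non-empty values from the first."""
    return {k: a.get(k) or b.get(k) for k in set(a) | set(b)}

r1 = {"name": "Acme Corp", "phone": None}
r2 = {"name": "acme corp", "phone": "555-0100"}

if match_score(r1, r2) == "High":
    golden = merge(r1, r2)
print(golden["phone"])  # '555-0100'
```

The merged "golden record" keeps the best available value for each field, which is exactly the goal of consolidation.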
7.4. Key Mapping

Key mapping provides cross-system identification to ensure enterprise-wide data quality. A remote system's objects are mapped to master data objects within MDM.

Key mapping maintains the relationship between the remote system's identifier (key) for an object and the corresponding master data object in MDM. It also provides cross-system identification for reliable, company-wide analytics and business operations, and it is the main deliverable of master data consolidation.
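Conceptually, a key mapping is a lookup from (remote system, remote key) to the MDM master record. The system names and identifiers below are hypothetical:

```python
# Each remote system knows the same supplier under its own identifier.
key_mapping = {
    ("ERP_US", "VEND-0042"): "MDM-1001",
    ("CRM_EU", "S-9177"): "MDM-1001",
}

def master_id(system, remote_key):
    """Resolve a remote system's key to the MDM master data object."""
    return key_mapping.get((system, remote_key))

print(master_id("ERP_US", "VEND-0042"))  # 'MDM-1001'
# Both remote keys resolve to the same master record:
print(master_id("CRM_EU", "S-9177") == master_id("ERP_US", "VEND-0042"))
```

Because both remote keys resolve to the same master record, analytics across the two systems can be reconciled reliably.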
7.5. Enrichment Architecture

The MDM Enrichment Architecture framework is used to integrate MDM with external data enrichment services to improve data quality. Existing MDM functionality is also used during the enrichment process to distribute data to and from remote systems.

During an enrichment process, MDM data elements are sent to an external data enrichment service, and a response is imported back into the MDM repository. The process is controlled by the Enrichment Controller, an Enterprise JavaBean (EJB).
7.6. DB Views

Using DB views, we can directly connect SAP NetWeaver MDM with SAP BusinessObjects Data Services (BODS). MDM generates a read-only database view of a repository's underlying database schema using join operations. Because there is no data replication, this saves space and resources.
8. Conclusion

Data is a vital resource of an organization. Ignoring data quality is costly, and it affects every organization that relies on accurate and consistent information. SAP MDM helps you deal with data quality issues and provides a single version of the truth.
9. Related Content

http://help.sap.com/saphelp_mdm71/helpdata/en/48/df9e2ead793698e10000000a42189b/frameset.htm
http://images.forbes.com/forbesinsights/StudyPDFs/SAP_InformationManagement_04_2010.pdf
http://en.wikipedia.org/wiki/Data_quality
10. Copyright

© Copyright 2012 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.
Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.
IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5, POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix are trademarks or registered trademarks of IBM Corporation.
Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.
Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries.
Oracle is a registered trademark of Oracle Corporation.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc.
HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology.
Java is a registered trademark of Oracle Corporation.
JavaScript is a registered trademark of Oracle Corporation, used under license for technology invented and implemented by Netscape.
SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.
Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects S.A. in the United States and in other countries. Business Objects is an SAP company.
All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.
These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.