Seven Misconceptions about Data Quality

Originally published in DM Review.

The narrow definition of data quality is that it's about bad data: data that is missing or incorrect. A broader definition is that data quality is achieved when a business uses data that is comprehensive, consistent, relevant and timely. If you focus only on the narrow definition, you may be lulled into a false sense of security when, in fact, your efforts fall short. We will address several more misconceptions about data quality.

In order to fix a problem, you have to recognize that you have a problem. According to recent Gartner research, 25 percent of Fortune 1000 companies are working with poor quality data. The Data Warehousing Institute (TDWI) estimated that data quality problems cost U.S. businesses $600 billion each year. Regulatory initiatives such as Sarbanes-Oxley and Basel II dictate that companies must provide transparent data. But even with the documented high costs of poor data quality and the tight regulatory environment, many companies are turning a blind eye to their data quality problems. Why? Perhaps it is because of their mistaken belief that bad data is the only data quality issue they need to worry about.

A corollary to the above: to fix a problem, you first have to take responsibility for it. That's the rub. Taking responsibility is the biggest roadblock to dealing with data quality. In order to achieve a high level of quality, data has to be viewed from an enterprise-wide, holistic perspective. Data may be correct within each data silo, but the information will not be consistent, relevant or timely when viewed across the enterprise. To make matters worse, each report or analysis interprets the data differently, so even when the numbers start off the same in each silo, the end results will not be consistent. Data is a corporate asset and has to be consistent across the entire corporation, not just within the business function or division where it originated.
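To make the point about differing interpretations concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the order records, the field names and the two revenue rules are assumptions chosen for illustration, not definitions taken from any particular system. It shows two reports reading identical rows and still arriving at different totals.

```python
from decimal import Decimal

# Hypothetical order records; both reports read exactly the same rows.
orders = [
    {"order_id": 1, "gross": Decimal("100.00"), "discount": Decimal("10.00"), "returned": False},
    {"order_id": 2, "gross": Decimal("250.00"), "discount": Decimal("0.00"), "returned": True},
    {"order_id": 3, "gross": Decimal("80.00"), "discount": Decimal("5.00"), "returned": False},
]

def sales_report_revenue(rows):
    """Assumed sales rule: gross bookings, returned orders still counted."""
    return sum((r["gross"] for r in rows), Decimal("0"))

def finance_report_revenue(rows):
    """Assumed finance rule: net of discounts, returned orders excluded."""
    return sum(
        (r["gross"] - r["discount"] for r in rows if not r["returned"]),
        Decimal("0"),
    )

def reconcile(rows):
    """Compute both interpretations and the gap between them."""
    sales = sales_report_revenue(rows)
    finance = finance_report_revenue(rows)
    return {"sales": sales, "finance": finance, "gap": sales - finance}

result = reconcile(orders)
print(result)
```

Neither figure is wrong on its own terms. The gap comes entirely from the business rules applied on top of the data, which is why checking the source records alone would not reveal it.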
Misconception #1: You Can Fix Data

Misconception #2: Data Quality is an IT Problem

Misconception #3: The Problem is in the Data Sources or Data Entry

The larger issue is that you need to manage data from its creation all the way to information consumption. You need to be able to trace its flow from data entry through transactional systems, the data warehouse, data marts and cubes, all the way to the report or spreadsheet used for the business analysis. Data quality requires tracking, checking and monitoring data throughout the entire information ecosystem. To make this happen you need data responsibility (people), data metrics (processes) and metadata management (technology). (We'll address how in a future column.)

Misconception #4: The Data Warehouse will Provide a Single Version of the Truth

However, two significant conditions lessen the likelihood that the data warehouse solves your data quality issues by itself.

First, people get data for their reports and analysis from a variety of data sources: the data warehouse (sometimes there are multiple data warehouses in an enterprise), and data marts and cubes (that you hope were sourced from the data warehouse). They also get data from systems such as ERP, CRM, and budgeting and planning systems that may themselves be sourced into the data warehouse. In these cases, ensuring data quality in the data warehouse alone is not enough. Multiple data silos mean multiple versions of the truth and multiple interpretations of the truth. Data quality has to be addressed across these data silos, not just in the data warehouse.

Second, data quality involves the source data and its transformation into information. That means that even if every report and analysis gets data from the same data warehouse, there are still significant data quality issues if the business transformations and interpretations in these reports differ. Data quality processes need to involve data creation; the staging of data in data warehouses, data marts, cubes and data shadow systems; and information consumption in the form of reports and business analysis. Applying data quality to the data itself, and not to its usage as information, is not sufficient.

Misconception #5: The ERP System will Provide a Single Version of the Truth

Misconception #6: The Corporate Performance Management (CPM) System will Provide a Single Version of the Truth

Misconception #7: BI Standardization will Eliminate the Problem of Different "Truths" Represented in the Reports or Analysis

Data quality is defined as comprehensive, consistent, relevant and timely data for use by the business. Don't shrug it off as an issue of bad data entry. Data needs to be addressed on an enterprise scale and in a holistic manner incorporating people, processes and technology.
© 2011 Athena IT Solutions