Data Quality

From SemWebQuality.org
(Difference between revisions)
Line 1: Line 1:
There are multiple different ways to define data quality and there is currently no commonly agreed definition on what data quality is. However,   
+
There are multiple different ways to define data quality and there is currently no commonly agreed definition of what data quality is. However, the following table provides an overview of popular data quality definitions.
  
A large part of the literature says that data is of high quality when it is fit for use. However, data must not only be fit for use but also be stored securely or meet legal requirements, which may even hinder its fitness for use. For example, some data might need to be encrypted and may only be understood after learning a decryption method. This is not very convenient but may be necessary to protect data from unwanted use.
 
 
{| class="wikitable sortable"
 
{| class="wikitable sortable"
 
|+ Popular Data Quality Definitions
 
|+ Popular Data Quality Definitions
Line 17: Line 16:
 
|ISO 8000 || Quality is the "degree to which a set of inherent characteristics fulfils requirements"<ref>ISO (2005) ISO8000-102:2009, Data quality — Part 102: Master data: Exchange of characteristic data: Vocabulary</ref>
 
|ISO 8000 || Quality is the "degree to which a set of inherent characteristics fulfils requirements"<ref>ISO (2005) ISO8000-102:2009, Data quality — Part 102: Master data: Exchange of characteristic data: Vocabulary</ref>
 
|}
 
|}
 +
 +
All of the above definitions have something in common: data quality is something that compares the '''status quo''' to a '''desired state'''. The desired state is called "fitness for use", "specification", "consumer expectations", "defect-free", "desired features", or simply "requirements". The desired state may be stated not only by data consumers, but also by data providers, administrators, legal authorities, and many other stakeholders. Thus, there are multiple perspectives on requirements, but all of the definitions basically agree that '''data quality is the degree to which requirements are fulfilled'''.
 +
 +
The requirements can be manifold due to different tastes, needs, and perspectives. Hence, data quality is also multi-dimensional. Wang and Strong
 +
 +
  
  
This project sees data quality as the '''degree to which data fulfills requirements''' based on the definition of quality by ISO 8000-102:2009 and ISO 9000:2005. Furthermore, we see data quality as a '''multi-perspective phenomenon'''. Hence, data requirements may be stated not only by data consumers, but also by data providers, administrators, legal authorities, and many other stakeholders.
 
 
----
 
----
 
<references />
 
<references />

Revision as of 12:35, 9 October 2011

There are multiple different ways to define data quality and there is currently no commonly agreed definition of what data quality is. However, the following table provides an overview of popular data quality definitions.

{| class="wikitable sortable"
|+ Popular Data Quality Definitions
! Authors !! Data Quality Definition
|-
| Wang and Strong (1996) || “[…] data that are fit for use by data consumers.”[1]
|-
| Kahn, Strong, and Wang (2002) || “conformance to specifications” and “meeting or exceeding consumer expectations”[2]
|-
| Redman (2001) || “Data are of high quality if they are fit for their intended uses in operations, decision making, and planning. Data are fit for use if they are free of defects and possess desired features.”[3]
|-
| Olson (2003) || “[…] data has quality if it satisfies the requirements of its intended use.”[4]
|-
| ISO 8000 || Quality is the "degree to which a set of inherent characteristics fulfils requirements"[5]
|}

All of the above definitions have something in common: data quality is something that compares the status quo to a desired state. The desired state is called "fitness for use", "specification", "consumer expectations", "defect-free", "desired features", or simply "requirements". The desired state may be stated not only by data consumers, but also by data providers, administrators, legal authorities, and many other stakeholders. Thus, there are multiple perspectives on requirements, but all of the definitions basically agree that data quality is the degree to which requirements are fulfilled.
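
The idea of data quality as a degree of requirement fulfilment can be illustrated with a small Python sketch. It is only one possible reading, not a method taken from the cited literature: the Requirement class, the degree_of_fulfilment function, the example record, and the individual checks are hypothetical, and the score assumes that every requirement carries equal weight.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Requirement:
    """One requirement stated by one stakeholder (hypothetical structure)."""
    stakeholder: str                         # who states the requirement
    description: str                         # human-readable desired state
    check: Callable[[Dict[str, Any]], bool]  # True if the record fulfils it


def degree_of_fulfilment(record: Dict[str, Any],
                         requirements: List[Requirement]) -> float:
    """Return the share of requirements the record fulfils (0.0 to 1.0).

    Assumption: all requirements are weighted equally.
    """
    if not requirements:
        raise ValueError("at least one requirement is needed")
    fulfilled = sum(1 for req in requirements if req.check(record))
    return fulfilled / len(requirements)


# Requirements from several perspectives (all invented for illustration).
requirements = [
    Requirement("data consumer", "price is present",
                lambda r: r.get("price") is not None),
    Requirement("data provider", "currency is a three-letter code",
                lambda r: isinstance(r.get("currency"), str)
                and len(r["currency"]) == 3),
    Requirement("legal authority", "country of origin is stated",
                lambda r: bool(r.get("origin"))),
    Requirement("administrator", "identifier is non-empty",
                lambda r: bool(r.get("id"))),
]

# The status quo: a record that lacks the country of origin.
record = {"id": "p-17", "price": 9.99, "currency": "EUR", "origin": None}

score = degree_of_fulfilment(record, requirements)
print(f"{score:.2f}")  # prints 0.75, since 3 of 4 requirements are met
```

Keeping the stakeholder on each requirement makes it possible to compute the same score per perspective, for example by filtering the list before calling the function.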

The requirements can be manifold due to different tastes, needs, and perspectives. Hence, data quality is also multi-dimensional. Wang and Strong




  1. Wang, R. Y., & Strong, D. M. (1996). Beyond accuracy: what data quality means to data consumers. Journal of Management Information Systems, 12(4), 5-33.
  2. Kahn, B. K., Strong, D. M., & Wang, R. Y. (2002). Information quality benchmarks: product and service performance. Communications of the ACM, 45(4), 184-192.
  3. Redman, T. C. (2001). Data quality: the field guide. Boston: Digital Press.
  4. Olson, J. (2003). Data quality: the accuracy dimension. San Francisco, USA: Morgan Kaufmann; Elsevier Science.
  5. ISO (2005) ISO8000-102:2009, Data quality — Part 102: Master data: Exchange of characteristic data: Vocabulary