Create Data Requirements

From SemWebQuality.org

Revision as of 13:11, 8 October 2011

What are Data Requirements?

Data requirements are prescribed directives or consensual agreements that define the content and/or structure of high-quality data instances and values. They can be stated by different individuals or groups and may also be based on laws, standards, or other directives. Different requirements may agree with or contradict each other.

Data requirements are a prerequisite for measuring data quality: they serve as the benchmark that defines the desired state of the data. In the following, we describe how you can express your data requirements with the DQM-Vocabulary.

Types of Data Requirements

Data requirements usually refer to different data items. A table typically has at least four types of data items: (1) columns, (2) rows, (3) schemata, and (4) the table/spreadsheet itself.

Table to illustrate used terminology

In Semantic Web environments, we can compare columns to properties, rows to instances, schemata to ontologies, and tables to classes. Data requirements can usually be related to one of these elements. In particular, there are

  1. data requirements related to the values of a single property (column)
  2. data requirements related to the values of multiple properties within an instance (multiple columns in a row)
  3. data requirements related to the instances of a whole class (table)
  4. data requirements related to the ontology elements (schema)
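To make the analogy concrete, here is a minimal Python sketch that turns one table into subject-predicate-object triples: the table becomes a class, each row an instance, and each column a property. The `ex:` prefix, the sample table, and the helper name `table_to_triples` are assumptions made for illustration and are not part of the DQM-Vocabulary.

```python
# Illustrative only: the "ex:" prefix, the sample data, and the function name
# are hypothetical and not defined by the DQM-Vocabulary.

def table_to_triples(table_name, columns, rows, prefix="ex:"):
    """Map a table to (subject, predicate, object) triples.

    table -> class, row -> instance, column -> property.
    """
    triples = []
    class_term = f"{prefix}{table_name}"
    for index, row in enumerate(rows, start=1):
        instance = f"{prefix}{table_name}_{index}"
        # Each row becomes an instance of the class that stands for the table.
        triples.append((instance, "rdf:type", class_term))
        for column, value in zip(columns, row):
            # Each column becomes a property; the cell is the property value.
            triples.append((instance, f"{prefix}{column}", value))
    return triples


triples = table_to_triples(
    "Customer",
    ["name", "city"],
    [["Alice", "Berlin"], ["Bob", "Munich"]],
)
for triple in triples:
    print(triple)
```

In this picture, a requirement on a single column is a requirement on the values of one predicate, and a requirement on a whole table is a requirement on all instances of one class.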

With the DQM-Vocabulary, you can model the first three types of requirements. Schema/ontology requirements are currently not part of the vocabulary, but may be added in future releases. In the following, we explain how property, multi-property, class, and custom requirements can be modelled with the current version of the DQM-Vocabulary.

Define Tested Elements

Before you can use your data with the DQM-Vocabulary, you have to declare the elements of your ontology that shall be used in the data requirements. You have two options to do this:

OWL Full Definition

You make the classes and properties that shall be tested for data quality problems direct instances of the classes dqm:TestedClass or dqm:TestedProperty.

foo:MyClass a dqm:TestedClass .
foo:MyProperty a dqm:TestedProperty .

Attention: This makes your knowledge base OWL Full, which may be undesirable if you plan to use reasoning.

OWL DL Definition

You map the classes and properties that shall be tested for data quality problems to new instances of the classes dqm:TestedClass and dqm:TestedProperty.

foo:Class_1 a dqm:TestedClass ;
                dqm:hasURI "http://www.example.org/MyClass"^^xsd:anyURI .
foo:Property_1 a dqm:TestedProperty ;
                dqm:hasURI "http://www.example.org/MyProperty"^^xsd:anyURI .

Type 1: Property Requirements

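Property requirements are data requirements that relate to the values of a single property. For this purpose, the DQM-Vocabulary provides rule classes including dqm:LegalValueRangeRule, dqm:LegalValueRule, and dqm:DataCompletenessRule, the latter with the sub-rules dqm:MissingPropertyRule and dqm:MissingValueRule (both marked for renaming in this revision).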


Type 2: Class Requirements

Type 3: Multi-Property Requirements

Type 4: Custom Requirements
