Create Data Requirements

From SemWebQuality.org
Revision as of 15:32, 10 October 2011 by Cfuerber

This site is currently under construction!


What are Data Requirements?

Data requirements are prescribed directives or consensual agreements that define the content and/or structure that constitute high-quality data instances and values. They can be stated by different individuals or groups, and they may also be derived from laws, standards, or other directives. Requirements from different sources may agree with or contradict each other.

Data requirements are a prerequisite for measuring data quality: they serve as the benchmark that defines the desired state of the data. In the following, we describe how you can express your data requirements via the DQM-Vocabulary.

Types of Data Requirements

Data requirements usually refer to different data items. In a table, there are usually at least four types of data items: (1) columns, (2) rows, (3) schemata, and (4) the table/spreadsheet itself.

Figure: Table to illustrate the terminology used

In Semantic Web environments, we can compare columns to properties, rows to instances, schemata to ontologies, and tables to classes. Data requirements can usually be related to one of these elements. In particular, there are

  1. data requirements related to the values of a single property (column)
  2. data requirements related to the values of multiple properties within an instance (multiple columns in a row)
  3. data requirements related to the instances of a whole class (table)
  4. data requirements related to the ontology elements (schema)

With the DQM-Vocabulary, you can model the first three types of requirements. Schema/ontology requirements are currently not part of the vocabulary, but may be added in future releases. In the following, we explain how Property-, Multi-Property-, Class-, and Custom-Requirements can be modelled with the current version of the DQM-Vocabulary.
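The analogy between tables and Semantic Web elements described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name, the rows, the `ex:` prefix, and the helper `table_to_triples` are hypothetical and not part of the DQM-Vocabulary.

```python
# Hypothetical table "Person" with two rows; all names are illustrative only.
TABLE_NAME = "Person"
ROWS = [
    {"id": "p1", "name": "Alice", "city": "Berlin"},
    {"id": "p2", "name": "Bob", "city": None},  # empty cell
]


def table_to_triples(table_name, rows, key="id"):
    """Map a table to (subject, property, value) triples.

    table  -> class      (every row is typed with the table name)
    row    -> instance   (identified by the value of the key column)
    column -> property   (one triple per non-empty cell)
    """
    triples = []
    for row in rows:
        subject = f"ex:{row[key]}"
        triples.append((subject, "rdf:type", f"ex:{table_name}"))
        for column, value in row.items():
            if column != key and value is not None:
                triples.append((subject, f"ex:{column}", value))
    return triples


TRIPLES = table_to_triples(TABLE_NAME, ROWS)
```

In this sketch, a requirement on the `city` column corresponds to a requirement on the property `ex:city`, and a requirement on the whole table corresponds to a requirement on the instances of the class `ex:Person`.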

Define Tested Elements

Before you can use your data with the DQM-Vocabulary, you have to declare which elements of your ontology shall be used with it. There are two options for doing this, and they differ in their impact on the decidability of reasoning over your knowledge base:

Option 1: Classes and Properties as Instances (OWL Full)

Classes and properties that shall be tested for data requirement violations are defined as direct instances of the classes dqm:TestedClass or dqm:TestedProperty.

foo:MyClass a dqm:TestedClass .
foo:MyProperty a dqm:TestedProperty .

Attention: This makes your knowledge base OWL Full, for which reasoning is undecidable. This option may therefore be unsuitable if you plan to use a reasoner.

Option 2: Mapping of Classes and Properties to new URIs (OWL DL)

Classes and properties that shall be tested for data requirement violations are mapped to new instances of the classes dqm:TestedClass and dqm:TestedProperty.

foo:Class_1 a dqm:TestedClass ;
                dqm:hasURI "http://www.example.org/MyClass"^^xsd:anyURI .
foo:Property_1 a dqm:TestedProperty ;
                dqm:hasURI "http://www.example.org/MyProperty"^^xsd:anyURI .
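To illustrate what the mapping in Option 2 provides, the following sketch resolves the mapped instances back to the URIs of the original ontology elements. It assumes the statements above are available as plain (subject, predicate, object) tuples; the tuple representation and the helper `resolve_tested_elements` are hypothetical and only serve the illustration.

```python
# The Option 2 statements from above, written as plain tuples (an assumption
# made for this sketch; a real setup would load them from an RDF store).
DQM_HAS_URI = "dqm:hasURI"
MAPPING_TRIPLES = [
    ("foo:Class_1", "rdf:type", "dqm:TestedClass"),
    ("foo:Class_1", DQM_HAS_URI, "http://www.example.org/MyClass"),
    ("foo:Property_1", "rdf:type", "dqm:TestedProperty"),
    ("foo:Property_1", DQM_HAS_URI, "http://www.example.org/MyProperty"),
]


def resolve_tested_elements(triples):
    """Return a dict from each mapped instance to the URI it stands for."""
    return {
        subject: obj
        for subject, predicate, obj in triples
        if predicate == DQM_HAS_URI
    }


TESTED_ELEMENTS = resolve_tested_elements(MAPPING_TRIPLES)
```

Because the original class and property appear only as literal values of dqm:hasURI, they are never used as instances themselves, which is what keeps the knowledge base in OWL DL.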

Type 1: Property Requirements

Property requirements are data requirements that are related to values of a single property. The DQM-Vocabulary provides the following property requirements:


Example 1: PropertyCompletenessRule (Minimal Input)

A property completeness rule is a data requirement that specifies that a certain property and/or its value must exist in all instances of a certain class.

If you mapped your own ontology elements to new URIs (Option 2, OWL DL), then the following example will help you to define a Property Completeness Rule:

Definition in OWL-DL

foo:PropertyCompletenessRule_1
      a       dqm:PropertyCompletenessRule ;
      dqm:testedClass foo:Class_1 ;
      dqm:testedProperty1 foo:Property_1 .

If you defined your data elements in OWL Full (Option 1), then you can simply use the URIs of your ontology in the definition of the Property Completeness Rule as follows:

Definition in OWL Full

foo:PropertyCompletenessRule_1
      a       dqm:PropertyCompletenessRule ;
      dqm:testedClass <http://www.example.org/MyClass> ;
      dqm:testedProperty1 <http://www.example.org/MyProperty> .

Congratulations, you can now use generic SPARQL queries to test the completeness of "foo:Property_1" or "MyProperty" in instances of "foo:Class_1" or "MyClass".
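The generic SPARQL queries themselves are not reproduced here. As a rough illustration of what a property completeness check computes, the following sketch works on plain (subject, predicate, object) tuples in Python. The sample data, the `ex:` instances, and the helper `find_incomplete_instances` are hypothetical, and treating an empty string as a missing value is an assumption of this sketch.

```python
MY_CLASS = "http://www.example.org/MyClass"
MY_PROPERTY = "http://www.example.org/MyProperty"

# Hypothetical instance data: ex:a is complete, ex:b lacks the property,
# ex:c has the property but with an empty value.
DATA = [
    ("ex:a", "rdf:type", MY_CLASS),
    ("ex:a", MY_PROPERTY, "some value"),
    ("ex:b", "rdf:type", MY_CLASS),
    ("ex:c", "rdf:type", MY_CLASS),
    ("ex:c", MY_PROPERTY, ""),
]


def find_incomplete_instances(triples, tested_class, tested_property):
    """Return instances of tested_class without a non-empty tested_property."""
    instances = {
        s for s, p, o in triples if p == "rdf:type" and o == tested_class
    }
    complete = {
        s for s, p, o in triples
        if p == tested_property and o not in (None, "")
    }
    return sorted(instances - complete)


VIOLATIONS = find_incomplete_instances(DATA, MY_CLASS, MY_PROPERTY)
```

With the OWL DL variant, the tested class and property would first have to be resolved from "foo:Class_1" and "foo:Property_1" via their dqm:hasURI values before running such a check.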

Example 2: PropertyCompletenessRule (Maximum Input)

Type 2: Class Requirements

Type 3: Multi-Property Requirements

Type 4: Custom Requirements
