Create Data Requirements

From SemWebQuality.org

Revision as of 18:12, 10 October 2011

This site is currently under construction!


What are Data Requirements?

Data requirements are prescribed directives or consensual agreements that define the content and/or structure of high-quality data instances and values. They can be stated by different individuals or groups, and they may also be based on laws, standards, or other directives. Requirements from different sources may agree with or contradict each other.

Data requirements are a prerequisite for measuring data quality: they serve as the benchmark that defines the desired state of the data. In the following, we describe how you can express your data requirements via the DQM-Vocabulary.

Types of Data Requirements

Data requirements usually refer to different kinds of data items. In a table, there are at least four: (1) columns, (2) rows, (3) schemata, and (4) the table/spreadsheet itself.

Table to illustrate used terminology

In Semantic Web environments, we can compare columns to properties, rows to instances, schemata to ontologies, and tables to classes. Data requirements can usually be related to one of these elements. In particular, there are

  1. data requirements related to the values of a single property (column)
  2. data requirements related to the values of multiple properties within an instance (multiple columns in a row)
  3. data requirements related to the instances of a whole class (table)
  4. data requirements related to the ontology elements (schema)

With the DQM-Vocabulary, you can model the first three types of requirements. Schema/ontology requirements are currently not part of the vocabulary, but may be added in future releases. In the following, we explain how Property-, Multi-Property-, Class-, and Custom-Requirements can be modelled with the current version of the DQM-Vocabulary.
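To make the table analogy from this section concrete, here is a minimal Python sketch that turns a small table into triples. The table name, the column names, the base URI, and the function `table_to_triples` are all made up for illustration; they are not part of the DQM-Vocabulary.

```python
# Hypothetical table: the table plays the role of a class,
# each column a property, and each row an instance.
TABLE_NAME = "Location"
COLUMNS = ["City", "Country"]
ROWS = [
    ["Munich", "Germany"],
    ["Vienna", None],  # missing cell
]

def table_to_triples(table_name, columns, rows, base="http://www.example.org/"):
    """Convert a table into (subject, predicate, object) triples."""
    triples = []
    for index, row in enumerate(rows, start=1):
        instance = f"{base}{table_name}_{index}"
        # The row becomes an instance of the class that stands for the table.
        triples.append((instance, "rdf:type", f"{base}{table_name}"))
        for column, cell in zip(columns, row):
            if cell is not None:  # an empty cell yields no triple at all
                triples.append((instance, f"{base}{column}", cell))
    return triples

triples = table_to_triples(TABLE_NAME, COLUMNS, ROWS)
for triple in triples:
    print(triple)
```

Note that the missing "Country" cell of the second row simply produces no triple. This is the kind of gap that a property requirement is meant to detect.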

Define Tested Elements

Before you can use your data with the DQM-Vocabulary, you have to declare the elements of your ontology that shall be used with the DQM-Vocabulary. There are two options for doing this, and they differ in their impact on the decidability of reasoning over your knowledge base:

Option 1: Classes and Properties as Instances (OWL Full)

Classes and properties that shall be tested for data requirement violations are defined as direct instances of the classes dqm:TestedClass or dqm:TestedProperty.

foo:MyClass a dqm:TestedClass .
foo:MyProperty a dqm:TestedProperty .

Attention: This makes your knowledge base OWL Full, which may be a problem if you plan to use reasoning, because reasoning over OWL Full is undecidable.

Option 2: Mapping of Classes and Properties to new URIs (OWL DL)

Classes and properties that shall be tested for data requirement violations are mapped to new instances of the classes dqm:TestedClass and dqm:TestedProperty.

foo:Class_1 a dqm:TestedClass ;
                dqm:hasURI "http://www.example.org/MyClass"^^xsd:anyURI .
foo:Property_1 a dqm:TestedProperty ;
                dqm:hasURI "http://www.example.org/MyProperty"^^xsd:anyURI .
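The practical difference between the two options is one extra lookup step: with Option 2, a tool first has to follow dqm:hasURI to find the ontology element that is actually tested. The following Python sketch illustrates that step with a plain dictionary. The dictionary `HAS_URI` and the function `resolve` are hypothetical helpers for illustration, not part of the DQM-Vocabulary or its tooling.

```python
# Mapping as in Option 2: a stand-in resource carries the real URI as a literal.
HAS_URI = {
    "foo:Class_1": "http://www.example.org/MyClass",
    "foo:Property_1": "http://www.example.org/MyProperty",
}

def resolve(tested_element, has_uri=HAS_URI):
    """Return the ontology URI that a tested element stands for.

    In the OWL Full design (Option 1) the tested element already is the
    ontology URI, so elements without a mapping are returned unchanged.
    """
    return has_uri.get(tested_element, tested_element)

print(resolve("foo:Class_1"))
print(resolve("http://www.example.org/MyClass"))
```

Both calls lead to the same ontology class, which is why the rule definitions for the two options can look so similar.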

Type 1: Property Requirements

Property requirements are data requirements that are related to values of a single property. The DQM-Vocabulary provides the following property requirements:


Example 1: PropertyCompletenessRule (Minimal Input)

A property completeness rule is a data requirement that specifies that a certain property and/or its value must exist in all instances of a certain class.

If you defined your data elements in OWL Full (Option 1), then you can simply use the URIs of your ontology in the definition of the Property Completeness Rule as follows:


Definition in OWL Full

foo:PropertyCompletenessRule_1
      a       dqm:PropertyCompletenessRule ;
      dqm:testedClass <http://www.example.org/MyClass> ;
      dqm:testedProperty1 <http://www.example.org/MyProperty> ;
      dqm:requiredProperty "true"^^xsd:boolean ;
      dqm:requiredValue "true"^^xsd:boolean .

To generate a problem report from this rule, see "Example 2: Property Completeness Violations (OWL Full Design)" on the page Generate Problem Reports.


The property dqm:requiredProperty specifies that the property "MyProperty" must exist in each instance. The property dqm:requiredValue specifies that a value must exist for property "MyProperty".

If you mapped your own ontology elements to new URIs (Option 2, OWL DL), then the following example will help you to define a Property Completeness Rule:


Definition in OWL-DL

foo:PropertyCompletenessRule_1
      a       dqm:PropertyCompletenessRule ;
      dqm:testedClass foo:Class_1 ;
      dqm:testedProperty1 foo:Property_1 ;
      dqm:requiredProperty "true"^^xsd:boolean ;
      dqm:requiredValue "true"^^xsd:boolean .

See the page Generate Problem Reports for how to generate a problem report from this rule.


The property dqm:requiredProperty specifies that the property "MyProperty", which is mapped to "foo:Property_1", must exist in each instance of the class "MyClass", which is mapped to "foo:Class_1". The property dqm:requiredValue specifies that a value must also exist for the property "foo:Property_1".

Congratulations, you can now use generic SPARQL queries to test the completeness of "MyProperty" / "foo:Property_1" in instances of "MyClass" / "foo:Class_1".
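To illustrate what such a completeness test checks, here is a minimal Python sketch over a plain list of triples. It is not the generic SPARQL query used on this site. The function name `completeness_violations`, the `ex:` sample data, and the decision to treat an empty string as "no value" are assumptions made for this example.

```python
RDF_TYPE = "rdf:type"

def completeness_violations(triples, tested_class, tested_property,
                            required_property=True, required_value=True):
    """Return the instances of tested_class that violate the completeness rule."""
    instances = sorted({s for s, p, o in triples
                        if p == RDF_TYPE and o == tested_class})
    violations = []
    for instance in instances:
        values = [o for s, p, o in triples
                  if s == instance and p == tested_property]
        # requiredProperty: the property must be present on the instance.
        missing_property = required_property and not values
        # requiredValue: assumption, an empty or whitespace-only literal
        # counts as "no value".
        missing_value = required_value and any(str(v).strip() == "" for v in values)
        if missing_property or missing_value:
            violations.append(instance)
    return violations

data = [
    ("ex:a", RDF_TYPE, "ex:MyClass"), ("ex:a", "ex:MyProperty", "42"),
    ("ex:b", RDF_TYPE, "ex:MyClass"),                                  # property missing
    ("ex:c", RDF_TYPE, "ex:MyClass"), ("ex:c", "ex:MyProperty", ""),   # empty value
]
print(completeness_violations(data, "ex:MyClass", "ex:MyProperty"))
# prints ['ex:b', 'ex:c']
```

With the mapped design (Option 2), the tested class and property would first have to be resolved to their ontology URIs via dqm:hasURI before running such a check.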

Example 2: PropertyCompletenessRule with Rule-Metadata

You can annotate your data requirements with several kinds of metadata, such as their provenance, their task dependency, a natural language description, and how the requirement shall be used. The example below makes extensive use of the metadata properties that the DQM-Vocabulary offers for specifying data requirements.

foo:PropertyCompletenessRule_1
      a       dqm:PropertyCompletenessRule ;
      dqm:testedClass foo:Class_Location ;
      dqm:testedProperty1 foo:Prop_Location_Country ;
      dqm:requiredProperty "true"^^xsd:boolean ;
      dqm:requiredValue "true"^^xsd:boolean ;
      dqm:reqName "Country Completeness in Class Location"^^xsd:string ;
      dqm:reqDescription "Each instance of the class \"Location\" must have a property value for the property \"Country\""^^xsd:string ;
      dqm:reqSource "Christian Fürber"^^xsd:string ;
      dqm:taskDependent "false"^^xsd:boolean ;
      dqm:assessment "true"^^xsd:boolean ;
      dqm:confidence "80"^^rdfs:Literal ;
      dqm:filtering "true"^^xsd:boolean ;
      dqm:validation "true"^^xsd:boolean ;
      dqm:importance "3" ;
      dqm:lastModified "2011-10-10T18:20:55.106+01:00"^^xsd:dateTime ;
      dqm:validFrom "2011-10-10T18:19:32.917"^^xsd:dateTime ;
      dqm:validUntil "2012-10-10T18:19:57.191+01:00"^^xsd:dateTime .
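Metadata such as dqm:validFrom and dqm:validUntil can be used to decide whether a rule should be applied at a given point in time. The Python sketch below shows one possible way to do this. The helper names are hypothetical, and treating a dateTime without an offset (like the dqm:validFrom value above) as UTC is an assumption of this example, not something the DQM-Vocabulary prescribes.

```python
from datetime import datetime, timezone

def parse_xsd_datetime(text):
    """Parse an xsd:dateTime string; assume UTC when no offset is given."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption for this sketch only: no offset means UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed

def is_valid_at(valid_from, valid_until, moment):
    """Return True if moment lies within the rule's validity period."""
    return parse_xsd_datetime(valid_from) <= moment <= parse_xsd_datetime(valid_until)

# Values taken from the rule above.
VALID_FROM = "2011-10-10T18:19:32.917"
VALID_UNTIL = "2012-10-10T18:19:57.191+01:00"

print(is_valid_at(VALID_FROM, VALID_UNTIL, datetime(2012, 1, 1, tzinfo=timezone.utc)))
print(is_valid_at(VALID_FROM, VALID_UNTIL, datetime(2013, 1, 1, tzinfo=timezone.utc)))
```

The first call falls inside the validity period and the second falls after it, so a tool could skip the rule in the second case.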

Type 2: Class Requirements

Type 3: Multi-Property Requirements

Type 4: Custom Requirements
