StaticRdfCatalog

From OPeNDAP Documentation


Overview

The StaticRDFCatalog uses semantic web technologies to create mappings between DAP data sets and WCS Coverages. A WCS Coverage is a very specific data type and is much more constrained than the more general DAP data model. Thus, only certain DAP data sets can be represented as WCS Coverages. Determining which ones qualify requires semantic analysis of the metadata associated with each data set. Since the DAP imposes no semantic requirements on metadata, it is necessary (at least for the moment) to consider only DAP data sets whose metadata conforms to some well-known metadata convention or standard. Because the semantics of the convention are known, it is then possible to write inferencing rules that relate pieces of information in one convention/standard to the equivalent information in another convention/standard.

Even when a DAP data set has metadata that conforms to a well-known metadata convention (such as CF-1.0), the existing metadata in the DAP framework may be inadequate to make a complete 1:1 mapping into the representation of a WCS Coverage. In that case we rely on the Hyrax NcML Module to add metadata components to the DAP metadata so that the semantic engine can complete the construction of a wcs:Coverage metadata element.

Semantic Generation Of WCS Catalogs: A more detailed discussion of the semantic processing may be found here.

What makes a DAP data set a coverage?

A DAP data set is a WCS Coverage when it has a Grid variable that can be geolocated. Elaborate: Discuss the syntactic metadata connections between the DAP and WCS (Grid variables become wcs:Field elements, etc.) and how incomplete metadata can be augmented (using NcML...)

Supported Conventions

DAP data sets that use these metadata conventions and that have the required data structures can (potentially) be served as WCS Coverages.

CF-1.0 Convention

The Climate and Forecast convention is heavily used at UCAR and NOAA and many DAP data sets are available that conform to this convention.

Augmenting Dataset Metadata with NcML

As previously mentioned, existing DAP metadata, even that which follows the CF-1.0 convention, will most likely be missing information required to form a complete wcs:CoverageDescription of the data set. We will use the new Hyrax NcML Module to add metadata to the data set so that the semantic engine can operate on it and produce a complete result. The most direct way of doing this is to add the missing WCS information to the data set components by placing the WCS XML content in DAP Attributes of type OtherXML. Elaborate: What is missing from CF? Geographical extents? How are they added? Supply example.
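
As a rough illustration of the approach described above, the following NcML sketch attaches a block of XML to a data set as a DAP Attribute of type OtherXML. The data set location, the attribute name, the choice of an ows:WGS84BoundingBox element, and the coordinate values are all assumptions made for this example; the WCS content actually required by the semantic engine is not specified on this page.

   <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" location="/data/nc/example.nc">
       <!-- Hypothetical attribute name; its XML content is carried as type OtherXML -->
       <attribute name="wcs_extent" type="OtherXML">
           <!-- Placeholder extent values, not taken from any real data set -->
           <ows:WGS84BoundingBox xmlns:ows="http://www.opengis.net/ows/1.1">
               <ows:LowerCorner>-98.0 21.0</ows:LowerCorner>
               <ows:UpperCorner>-57.0 47.0</ows:UpperCorner>
           </ows:WGS84BoundingBox>
       </attribute>
   </netcdf>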

Read about the NcML Module here:

Configuration

As part of the WCS DispatchHandler, the configuration of the StaticRDFCatalog is held in the olfs.xml file used to configure the OLFS. StaticRDFCatalog reads its configuration from the content of the WcsCatalog element in the DispatchHandler declaration for the WCS DispatchHandler. In this section we discuss the various content elements used to configure the catalog.

Catalog Implementation Class Name
opendap.semantics.IRISail.StaticRDFCatalog
Usage
<WcsCatalog className="opendap.semantics.IRISail.StaticRDFCatalog"> ... </WcsCatalog>


Adding Data Sets To The Catalog

In order for the WCS service to work it must know which DAP data sets are to be served as WCS coverages.

Currently we cannot import data set metadata directly into the semantic repository, as the current implementation does not support the complete set of inferencing rules. In order to add data sets to a catalog the following must happen:

  • Source data must be available from a server running Hyrax 1.6 or newer.
  • A list of data set URLs must be emailed to Benno.
  • Benno will take these URLs and run them through the complete inferencing engine to build a single RDF file holding the result.
  • Benno makes the resulting file available on the network.
  • The user adds the URL of the result file to an RdfImport element (see below) in the configuration.

RdfImport

An RdfImport element identifies a single RDF file to load directly into the semantic repository. This is a mechanism for loading additional OWL ontologies and inference rules into the system at start-up. The RdfImport element must contain the fully qualified URL for an RDF file.

Usage
<RdfImport>http://iri.columbia.edu/~benno/opendaptest/daptestAll.owl</RdfImport>

Coverage (Not yet supported)

A Coverage element identifies a single DAP data set that is to be served as a WCS coverage. The Coverage element must contain the fully qualified data access URL for that data set. The software will examine this URL and attempt to get the RDF version of the data set's DDX from the DAP server.

Usage
<Coverage>http://localhost:8080/opendap/data/nc/examples/200803061600_HFRadar_USEGC_6km_rtv_SIO.nc.ddx</Coverage>

ThreddsCatalog (Not yet supported)

A ThreddsCatalog element identifies a THREDDS catalog containing DAP data access URLs that point to DAP data sets to be served as WCS coverages. Each DAP data set in the catalog will be served as a separate wcs:Coverage. The recurse attribute determines whether the software will follow catalogRef links in the THREDDS catalog to ingest the entire catalog hierarchy starting at the node identified by the URL.

Usage
<ThreddsCatalog recurse="false" >http://test.opendap.org:8080/opendap/coverage/catalog.xml</ThreddsCatalog>
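
For comparison, here is a sketch of the same element with recursion enabled, so that catalogRef links would be followed from the starting catalog. Since the ThreddsCatalog element is not yet supported, this only illustrates the intended syntax; the URL is reused from the usage line above, and placing the element directly inside WcsCatalog is assumed from the other content elements described in this section.

   <WcsCatalog className="opendap.semantics.IRISail.StaticRDFCatalog">
       <!-- recurse="true": follow catalogRef links to ingest the hierarchy below this catalog -->
       <ThreddsCatalog recurse="true">http://test.opendap.org:8080/opendap/coverage/catalog.xml</ThreddsCatalog>
   </WcsCatalog>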

Overriding Default Paths

StaticRDFCatalog relies on two local paths for its operation. Normally these are determined at runtime by the WCS DispatchHandler and passed to the catalog. However, there may be times when it is useful/necessary to override the defaults encoded into the software.

PeristentContentPath

The PeristentContentPath element is an optional element used to inform the software where it can write things to the local disk, either as a scratch space or as a way to persist state. If omitted it defaults to: $CATALINA_HOME/content/opendap/<prefix>/StaticRDFCatalog where <prefix> is specified in the WCS DispatchHandler configuration. This is a debugging option and should be omitted for normal operations.

Usage
<PeristentContentPath>/Users/ndp/OPeNDAP/Projects/Hyrax/swdev/ioos/apache-tomcat-6.0.14/content/opendap/WCS/StaticRDFCatalog</PeristentContentPath>

ResourcePath

The ResourcePath element is an optional element used to inform the software where it can find documents (such as XSLT files) that it relies on to function. If omitted it defaults to: $CATALINA_HOME/webapps/opendap/<prefix> where <prefix> is specified in the WCS DispatchHandler configuration. This is a debugging option and should be omitted for normal operations.

Usage
<ResourcePath>/Users/ndp/OPeNDAP/Projects/Hyrax/swdev/ioos/apache-tomcat-6.0.14/webapps/opendap/WCS/</ResourcePath>
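
To show how the two overrides might sit together in a debugging setup, the sketch below places both inside a WcsCatalog element alongside an RdfImport. The /tmp/olfs-debug paths are hypothetical placeholders, and the assumption that element order does not matter is not confirmed by this page.

   <WcsCatalog className="opendap.semantics.IRISail.StaticRDFCatalog">
       <!-- Hypothetical scratch/state directory; must be writable by the servlet container -->
       <PeristentContentPath>/tmp/olfs-debug/content/WCS/StaticRDFCatalog</PeristentContentPath>
       <!-- Hypothetical directory holding the XSLT and other resource files -->
       <ResourcePath>/tmp/olfs-debug/resources/WCS/</ResourcePath>
       <RdfImport>http://iri.columbia.edu/~benno/opendaptest/daptestAll.owl</RdfImport>
   </WcsCatalog>

For normal operations both path elements would simply be left out, so that the defaults determined by the WCS DispatchHandler apply.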

Controlling Catalog Update Behavior

The useUpdateCatalogThread element is used to control the way in which the StaticRDFCatalog updates its semantic repository and Coverage catalog. Using this element will cause StaticRDFCatalog to spawn a worker thread that will update the semantic repository (and thus the WCS catalog holdings) in the background. If the useUpdateCatalogThread element is omitted, StaticRDFCatalog will not spawn a worker thread and will attempt to update its holdings at startup. This attempt may fail, in which case the update will not be made.

The firstUpdateDelay attribute controls how long (in seconds) the worker thread will wait before making the first update. The updateInterval attribute is used to specify how frequently (in seconds) the catalog should be updated.

Usage
<useUpdateCatalogThread updateInterval="3600" firstUpdateDelay="60"/>
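
Because both attributes are expressed in seconds, longer periods have to be converted by hand. As an illustrative variation (the values are arbitrary and not a recommendation), the following would wait five minutes after startup and then update every six hours:

   <!-- firstUpdateDelay: 5 min x 60 s = 300 s; updateInterval: 6 h x 3600 s = 21600 s -->
   <useUpdateCatalogThread updateInterval="21600" firstUpdateDelay="300"/>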

Example Configuration

   <Handler className="opendap.wcs.v1_1_2.DispatchHandler">
       <prefix>WCS</prefix>
       <ServiceIdentification>/absolute/path/to/the/document/ServiceIdentification.xml</ServiceIdentification>
       <ServiceProvider>/absolute/path/to/the/document/ServiceProvider.xml</ServiceProvider>
       <OperationsMetadata>/absolute/path/to/the/document/OperationsMetadata.xml</OperationsMetadata>

       <WcsCatalog className="opendap.semantics.IRISail.StaticRDFCatalog">
           <useUpdateCatalogThread updateInterval="3600" firstUpdateDelay="60" />
           <RdfImport>http://iri.columbia.edu/~benno/opendaptest/daptestAll.owl</RdfImport>
       </WcsCatalog>

   </Handler>