2009 IPT Meeting

From OPeNDAP Documentation
Revision as of 16:13, 7 August 2009

Challenges

WCS

GetCapabilities: Semantic Mapping and Aggregation

The OGC GetCapabilities response returns (among other things) a listing of the Coverages available from the service. Providing this listing for an OPeNDAP server presents a couple of challenges:


Identifying which DAP data sets represent wcs:Coverages and mapping their metadata into wcs:CoverageDescription and wcs:CoverageSummary elements
In our prototype service we have isolated this activity behind a programming API and temporarily implemented it using user-developed wcs:CoverageDescriptions stored on the local filesystem. These are connected to data sets available on the server via their IDs. We are developing a semantic web engine that will ingest RDF representations of the DAP data sets that conform to the CF-1.0 convention and use OWL inferencing and semantic query languages to map these representations into wcs:CoverageDescriptions for the data sets. This will allow the server to "discover" and generate its own list of WCS Coverage holdings.
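As an illustration of the interim arrangement described above, the following is a minimal Python sketch of an API that hides where coverage descriptions come from, with a filesystem-backed implementation keyed by data set ID. The class names, the `describe` method, and the one-XML-file-per-ID layout are assumptions made for this example; they are not the prototype's actual interface.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class CoverageDescriptionSource(ABC):
    """Hypothetical API boundary: callers ask for a description by data set ID."""

    @abstractmethod
    def describe(self, dataset_id: str) -> str:
        """Return the wcs:CoverageDescription XML for the given data set ID."""


class FileCoverageDescriptionSource(CoverageDescriptionSource):
    """Interim implementation: one hand-written XML file per data set ID.

    Assumes IDs are flat names (no path separators) and that the file for
    ID "sst" is stored as "sst.xml" in the configured directory.
    """

    def __init__(self, directory):
        self.directory = Path(directory)

    def describe(self, dataset_id: str) -> str:
        path = self.directory / f"{dataset_id}.xml"
        if not path.is_file():
            raise KeyError(f"no coverage description for {dataset_id!r}")
        return path.read_text(encoding="utf-8")
```

Because callers depend only on the abstract class, a later implementation backed by the semantic web engine could replace the filesystem version without changes to the code that builds the GetCapabilities response.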


Aggregation: Providing a concise view of the available holdings
OPeNDAP servers often contain hundreds, if not thousands, of data sets that could be mapped to a wcs:Coverage. Generating an ows:Capabilities document with an ows:Contents section that details each one is problematic. The resulting document would be hundreds of megabytes in size and could exhaust the available memory of clients attempting to retrieve and parse it. However, in many circumstances these large groups of data sets could be aggregated on a common instance variable such as time, elevation, or depth. Providing the mechanism for such an aggregation in the OPeNDAP servers would provide a more useful view of the available data and also decrease the sheer size of the ows:Capabilities document. In addition, an aggregation feature has been one of the enhancements most requested by users of the OPeNDAP servers, which enhances the value of this activity even further.
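To make the size reduction concrete, here is a minimal Python sketch that collapses data sets differing only in a date stamp into one aggregated coverage each. The data set IDs and the `name_YYYYMMDD.nc` naming pattern are assumptions invented for the example; a real server would have to derive the aggregation variable from the data sets' metadata rather than from file names.

```python
import re
from collections import defaultdict

# Hypothetical data set IDs; the naming pattern is an assumption.
DATASET_IDS = [
    "sst_20090801.nc",
    "sst_20090802.nc",
    "sst_20090803.nc",
    "wind_20090801.nc",
    "wind_20090802.nc",
    "bathymetry.nc",
]

_TIME_STAMP = re.compile(r"^(?P<name>.+)_(?P<date>\d{8})\.nc$")


def aggregate_by_time(dataset_ids):
    """Group data set IDs that differ only in their date stamp."""
    groups = defaultdict(lambda: {"members": [], "time_steps": []})
    for dataset_id in dataset_ids:
        match = _TIME_STAMP.match(dataset_id)
        # IDs that do not fit the pattern remain single-member coverages.
        key = match.group("name") if match else dataset_id
        groups[key]["members"].append(dataset_id)
        if match:
            groups[key]["time_steps"].append(match.group("date"))
    for group in groups.values():
        group["time_steps"].sort()
    return dict(groups)


coverages = aggregate_by_time(DATASET_IDS)
```

In this toy case six data sets reduce to three coverage entries; with thousands of time steps per variable the listing would shrink correspondingly.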

WCS schemas

WCS protocol issues (version negotiation)

Missing metadata in CF-1.0/DAP provided via NcML

SOS

Most data served by DAP is gridded and is semantically closest to the OGC coverage data model. While the SOS specification indicates that it can (and should?) be used to serve all types of data from all types of sensors, the available examples focus solely on in situ style observations. NOAA-IOOS have indicated that they intend to use SOS for these in situ type measurements, and have developed the required extensions to the SOS O&M schemas. However, both the SOS specification and NOAA-IOOS's implementation of it create serious challenges for OPeNDAP. We feel that while it may be possible to write a custom OPeNDAP server to work directly with the data held in a particular framework, such as the NDBC, the resulting software would be so site-specific that it would be of little use to other partners.

Next Steps

GeoDAP

As the scientific community moves towards making data available through a number of standards that explicitly provide temporal and spatial (meta)data, OPeNDAP has been repeatedly challenged to find mechanisms to accommodate this drive. The basic problem is that the DAP protocol and data model do not contain explicit semantics for either spatial or temporal information. The DAP can represent and carry the information, but the explicit meaning is not embedded in the system and thus must be inferred.

Extending the DAP data model (and programming API) to create spatially/temporally "aware" data types would promote a much higher degree of interoperability between DAP data sources and other spatially/temporally explicit data representations.

This extension can be done in such a way that older DAP clients that do not have the facility to process the newer data types will not be broken, as the new data types will "decay" to usable DAP2 versions where the additional semantic information for spatial/temporal use will be placed in the DAP2 Attribute frameworks.

Newer clients will be able to take direct advantage of the explicit spatial/temporal information by constraining responses using the values of the time and location information.
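The difference between constraining by value and constraining by index can be sketched in Python as follows. A DAP2 client must already know the coordinate values in order to work out an index hyperslab such as `sst[1:3][1:2]`; a spatially aware server could perform this translation itself. The function names, the variable name `sst`, and the coordinate values are assumptions for illustration, and the sketch assumes each coordinate axis is sorted in ascending order.

```python
from bisect import bisect_left, bisect_right


def index_range(coords, low, high):
    """Return inclusive (start, stop) indices of the values within [low, high].

    Assumes coords is sorted in ascending order.
    """
    start = bisect_left(coords, low)
    stop = bisect_right(coords, high) - 1
    if start > stop:
        raise ValueError("no coordinate values fall inside the requested range")
    return start, stop


def dap2_constraint(variable, axes, bounds):
    """Build a DAP2 array hyperslab, one [start:stop] pair per axis."""
    parts = []
    for coords, (low, high) in zip(axes, bounds):
        start, stop = index_range(coords, low, high)
        parts.append(f"[{start}:{stop}]")
    return variable + "".join(parts)


# Hypothetical coordinate axes for a small latitude/longitude grid.
lat = [-10.0, -5.0, 0.0, 5.0, 10.0]
lon = [100.0, 105.0, 110.0, 115.0]

# Request latitudes -5..5 and longitudes 104..111 by value.
expr = dap2_constraint("sst", [lat, lon], [(-5.0, 5.0), (104.0, 111.0)])
```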

This work would entail:

  • Data model development.
  • API extension.
  • Client software changes.
  • Data handler changes.

Changing the data handlers for the OPeNDAP Hyrax server is a significant undertaking. The handlers will have to be rewritten to evaluate their respective input types and determine whether the data being served can be promoted to GeoDAP data types. Initially the netCDF handler would be modified to identify files adhering to CF-1.0 and promote the contained types to GeoDAP types. Clearly other conventions will need to be supported, and we think that leveraging the work we have done with WCS and the semantic web might make real sense in this area.
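A rough Python sketch of the kind of check such a handler might perform is shown below. It relies on two CF conventions: the global `Conventions` attribute names the convention, and coordinate variables are recognizable from their `units` attribute. The function names, the plain-dictionary representation of attributes, and the role labels are assumptions for illustration (the unit sets list only some of the spellings CF accepts), and no actual GeoDAP type is constructed here.

```python
# A subset of the unit spellings CF accepts for latitude and longitude.
LATITUDE_UNITS = {"degrees_north", "degree_north", "degrees_N", "degree_N"}
LONGITUDE_UNITS = {"degrees_east", "degree_east", "degrees_E", "degree_E"}


def is_cf_1_0(global_attrs):
    """True when the global Conventions attribute lists CF-1.0."""
    tokens = global_attrs.get("Conventions", "").replace(",", " ").split()
    return "CF-1.0" in tokens


def classify_axis(var_attrs):
    """Return 'latitude', 'longitude', 'time', or None based on units."""
    units = var_attrs.get("units", "")
    if units in LATITUDE_UNITS:
        return "latitude"
    if units in LONGITUDE_UNITS:
        return "longitude"
    # CF time units take the form "<unit> since <reference date>".
    if " since " in units:
        return "time"
    return None


def promotable_axes(global_attrs, variables):
    """Map variable names to axis roles, or return {} for non-CF files."""
    if not is_cf_1_0(global_attrs):
        return {}
    roles = {}
    for name, attrs in variables.items():
        role = classify_axis(attrs)
        if role is not None:
            roles[name] = role
    return roles
```

A file whose variables yield latitude, longitude, and time roles would be a candidate for promotion; anything else would continue to be served as plain DAP2 types.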