DAP Service Terminus

From OPeNDAP Documentation

Revision as of 17:40, 26 January 2012

Overview

Currently, DAP servers provide a number of dataset-level responses (services). These include DAP services such as the DDS, DAS, and DDX, plus other services that have been added over time, such as the RDF and ISO responses. There is currently no way for a user (or software agent) to discover which of these services are available for a particular dataset. THREDDS catalogs do provide a mechanism for determining which services are associated with which datasets, but they do not provide a mechanism for discovering the details of the various service URLs. A THREDDS catalog might identify a DAP service, but the URL that can be assembled from the catalog for that service is limited to the base dataset URL. Dereferencing this URL typically yields one of two things: either the underlying data file (for example, a NetCDF file) or an HTTP 403 error if direct access to the underlying files has been disabled in the server configuration.

Example:

This URL points to a dataset on a DAP server:

http://test.opendap.org:8080/opendap/data/nc/fnoc1.nc

Dereferencing the link will not connect you to a DAP service or DAP response; it will only return the underlying NetCDF file. The only way to know that DAP services exist for this dataset is to read the URL path and recognize, from the position of the 'opendap' string, that it is likely served by a DAP server.

There is no guarantee that dereferencing this URL will even yield the underlying file, as that behavior is a configuration option for the server. If that access is not allowed in the configuration, then dereferencing the URL will simply return an HTTP 403 (forbidden) error.
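The two outcomes described above can be sketched in a few lines of Python. This is only an illustration: the function name `probe_base_url` and its `opener` parameter are hypothetical (the parameter exists so the behavior can be exercised without contacting a server), and the result for any real URL depends on that server's configuration.

```python
import urllib.error
import urllib.request


def probe_base_url(url, opener=urllib.request.urlopen):
    """Dereference a base dataset URL and report which outcome occurred."""
    try:
        with opener(url) as response:
            # Direct file access is enabled: the server hands back the file itself.
            content_type = response.headers.get("Content-Type", "unknown")
            return f"file returned ({content_type})"
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # Direct file access is disabled in the server configuration.
            return "forbidden: direct file access is disabled"
        return f"HTTP error {err.code}"
```

Neither outcome tells the client anything about the DDS, DAS, DDX, or other services that exist for the dataset.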

Possible Technologies

I looked at Web Services Description Language (WSDL), Web Application Description Language (WADL), and OGC GetCapabilities as possible syntaxes to leverage. All three make assumptions about the structure of the URL and the constraint expression (aka query string), so supporting our existing syntax would require writing an extension to an existing standard.

Proposal

I propose that base DAP URLs should return an XML document describing the DAP services available for the dataset.

At minimum this document should provide:

  • The name of the service (DDX, DDS, etc.)
  • A link that can be dereferenced to get the service response for the dataset in question.
  • A brief description of the service
  • A link to a complete description of the service response and its semantics
  • A reference to an XSLT that a browser would use to render the description into HTML for a presentation view.
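One way to picture the minimum content listed above is as a small data model. The class and field names below are hypothetical, chosen only to mirror the bullet points; they are not part of any existing OPeNDAP library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ServiceEntry:
    name: str              # name of the service, e.g. "DDX"
    href: str              # link that returns the service response for this dataset
    description: str       # brief description of the service
    description_href: str  # link to the complete description of the response and its semantics


@dataclass(frozen=True)
class DatasetServices:
    base: str              # base dataset URL the document describes
    stylesheet_href: str   # XSLT a browser would use to render an HTML presentation view
    services: Tuple[ServiceEntry, ...] = ()


# Illustrative values only.
ddx = ServiceEntry(
    name="DDX",
    href="http://test.opendap.org:8080/opendap/data/nc/fnoc1.nc.ddx",
    description="OPeNDAP Data Description and Attribute XML Document",
    description_href="http://services.opendap.org/ddx.html",
)
document = DatasetServices(
    base="http://test.opendap.org:8080/opendap/data/nc/fnoc1.nc",
    stylesheet_href="/opendap/xsl/serviceDescription.xsl",
    services=(ddx,),
)
```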

Additionally this dataset services description document might contain:

  • Descriptions and syntax of server side functions available for the dataset
  • Please add things here that I have missed!


As the DAP servers evolve to support user authentication and sessions, the content of the dataset services description document should become dynamic. By this I mean that some services and server functions might be available only to certain users. An authenticated user who asks for a particular dataset services description document would see all the services available to them, which might differ significantly from those available to a different user. As powerful server side functions such as re-gridding and re-projection become available, we anticipate that data providers may not wish to allow just anyone to use them because of the potential burden they may place on the data center's computing resources.

It may be that some services are not available for all datasets. This argues (along with the previous comment about users and roles) that the service description needs to be generated dynamically and that it should be context sensitive w.r.t. user, role, and dataset.

  • Any thoughts about this?
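To make the idea of a context-sensitive description concrete, here is a minimal sketch of how a server might filter the advertised services by role and dataset. Everything in it is an assumption for illustration: the function name, the role model (a set of role names per service, with an empty set meaning open to everyone), and the per-dataset exclusion table.

```python
def visible_services(all_services, user_roles, dataset_id, unavailable=None):
    """Return the names of services a user may see for one dataset.

    all_services: dict mapping service name -> set of roles allowed to use it
                  (an empty set means the service is open to everyone).
    unavailable:  dict mapping dataset_id -> set of service names that do not
                  apply to that dataset.
    """
    blocked = (unavailable or {}).get(dataset_id, set())
    roles = set(user_roles)
    result = []
    for name, allowed_roles in all_services.items():
        if name in blocked:
            continue  # service does not exist for this dataset
        if not allowed_roles or allowed_roles & roles:
            result.append(name)
    return result
```

With `{"DDX": set(), "regrid": {"staff"}}` as the service table, an anonymous user would be shown only `DDX`, while a user holding the `staff` role would also be shown the (hypothetical) `regrid` function.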

Current Responses

  • DAP 4 Data (dataset_id + .dap)
  • DAP 2 Data (dataset_id + .dods)
  • DDX (dataset_id + .ddx)
  • DDS (dataset_id + .dds)
  • DAS (dataset_id + .das)
  • Information in HTML (dataset_id + .info)
  • HTML Data Access Form (dataset_id + .html)
  • NetCDF file out (dataset_id + .nc)
  • ISO 19115 in XML (dataset_id + .iso)
  • ISO Conformance rubric (dataset_id + .rubric)
  • RDF representation of the DDX (dataset_id + .rdf)
  • Server version information in XML (dataset_id + .ver)
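The suffix convention in the list above amounts to simple string concatenation. The sketch below builds every response URL for a dataset; the suffixes come from the list, while the dictionary keys are illustrative labels only.

```python
# Suffixes taken from the list of current responses; the keys are illustrative labels.
RESPONSE_SUFFIXES = {
    "DAP4 Data": ".dap",
    "DAP2 Data": ".dods",
    "DDX": ".ddx",
    "DDS": ".dds",
    "DAS": ".das",
    "INFO": ".info",
    "HTML Data Access Form": ".html",
    "NetCDF file out": ".nc",
    "ISO 19115": ".iso",
    "ISO rubric": ".rubric",
    "RDF": ".rdf",
    "Version": ".ver",
}


def response_urls(dataset_url):
    """Map each response label to dataset_url + suffix."""
    return {name: dataset_url + suffix for name, suffix in RESPONSE_SUFFIXES.items()}


urls = response_urls("http://test.opendap.org:8080/opendap/data/nc/fnoc1.nc")
print(urls["DDX"])  # http://test.opendap.org:8080/opendap/data/nc/fnoc1.nc.ddx
```

A client can only do this if it already knows the suffix table, which is exactly the knowledge the proposed service description would publish.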

Proposed Responses

I want to change the behavior so that there is a new service response to just the dataset_id and direct file access gets its own suffix:

  • Raw file access (dataset_id + .file)
  • Service (dataset_id)
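A rough sketch of how a server might route requests under this proposal follows. The function name and the set of known dataset IDs are hypothetical. Checking the known IDs first is a design choice in this sketch, not something the proposal specifies: a dataset ID such as fnoc1.nc itself ends in a recognized suffix, so stripping suffixes blindly would misread it as a NetCDF file-out request for "fnoc1".

```python
KNOWN_SUFFIXES = {
    ".dap", ".dods", ".ddx", ".dds", ".das", ".info", ".html",
    ".nc", ".iso", ".rubric", ".rdf", ".ver",
    ".file",  # proposed suffix for raw file access
}


def resolve(path, known_datasets):
    """Return (dataset_id, response_kind) or None if the path is not recognized."""
    if path in known_datasets:
        # Bare dataset_id: return the new service description response.
        return path, "service"
    base, dot, ext = path.rpartition(".")
    suffix = dot + ext
    if base in known_datasets and suffix in KNOWN_SUFFIXES:
        return base, suffix
    return None


datasets = {"data/nc/fnoc1.nc"}
print(resolve("data/nc/fnoc1.nc", datasets))       # ('data/nc/fnoc1.nc', 'service')
print(resolve("data/nc/fnoc1.nc.file", datasets))  # ('data/nc/fnoc1.nc', '.file')
```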

Prototype Service Response Document

I implemented a prototype response for Hyrax. It is not dynamic: all datasets and all users get the same basic response, differing only in the embedded URLs that return the various service responses.

It looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/opendap/xsl/serviceDescription.xsl"?>
<DatasetServices xml:base="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc">
 <Service name="HTML Data Request Form" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.html" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/dataRequestForm.html" xlink:type="simple">OPeNDAP HTML Data Request Form for data constraints and access</Description>
 </Service>
 <Service name="DAP4 Data" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.dap" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/dap4_data.html" xlink:type="simple">DAP4 Data Object</Description>
 </Service>
 <Service name="DAP2 Data" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.dods" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/dap2_data.html" xlink:type="simple">DAP2 Data Object</Description>
 </Service>
 <Service name="DDX" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.ddx" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/ddx.html" xlink:type="simple">OPeNDAP Data Description and Attribute XML Document</Description>
 </Service>
 <Service name="DDS" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.dds" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/dds.html" xlink:type="simple">OPeNDAP Dataset Description Structure</Description>
 </Service>
 <Service name="DAS" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.das" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/das.html" xlink:type="simple">OPeNDAP Dataset Attribute Structure (DAS)</Description>
 </Service>
 <Service name="INFO" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.info" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/info.html" xlink:type="simple">OPeNDAP Dataset Information Page</Description>
 </Service>
 <Service name="RDF" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.rdf" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/rdf.html" xlink:type="simple">An RDF representation of the DDX document.</Description>
 </Service>
 <Service name="NetCDF-File" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.nc" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/netcdf_fileout.html" xlink:type="simple">NetCDF file-out response.</Description>
 </Service>
 <Service name="ISO-19115" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.iso" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/iso_metedata.html" xlink:type="simple">ISO 19115 Metadata Representation of the DDX.</Description>
 </Service>
 <Service name="ISO-19115-Score" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.rubric" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/iso_rubric.html" xlink:type="simple">ISO 19115 Metadata Representation conformance score for this dataset.</Description>
 </Service>
 <Service name="FileAccess" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.file" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/dataset_file_access.html" xlink:type="simple">Access to dataset file.</Description>
 </Service>
 <Service name="ServiceDescription" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/service_description.html" xlink:type="simple">Service Description response.</Description>
 </Service>
 <ServerSideFunctions>
   <Function name="geogrid" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://docs.opendap.org/index.php/Server_Side_Processing_Functions#geogrid">
     <Description>Allows a DAP Grid variable to be sub-sampled using georeferenced values.</Description>
   </Function>
   <Function name="grid" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://docs.opendap.org/index.php/Server_Side_Processing_Functions#grid">
     <Description>Allows a DAP Grid variable to be sub-sampled using the values of the coordinate axes.</Description>
   </Function>
   <Function name="linear_scale" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://docs.opendap.org/index.php/Server_Side_Processing_Functions#linear_scale">
     <Description>Applies a linear scale transform to the named variable.</Description>
   </Function>
   <Function name="version" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://docs.opendap.org/index.php/Server_Side_Processing_Functions#version">
     <Description>Returns version information for each server side function.</Description>
   </Function>
 </ServerSideFunctions>
</DatasetServices>
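As an illustration of how a software agent might consume a document like the one above, the following sketch uses Python's standard xml.etree.ElementTree to look up a service link by name and to list the server side functions. The helper names are hypothetical, and `SAMPLE` is a shortened copy of the prototype response.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes as "{namespace-uri}localname".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

SAMPLE = """<DatasetServices xml:base="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc">
 <Service name="DDX" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.ddx" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/ddx.html" xlink:type="simple">OPeNDAP Data Description and Attribute XML Document</Description>
 </Service>
 <ServerSideFunctions>
   <Function name="geogrid" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://docs.opendap.org/index.php/Server_Side_Processing_Functions#geogrid">
     <Description>Allows a DAP Grid variable to be sub-sampled using georeferenced values.</Description>
   </Function>
 </ServerSideFunctions>
</DatasetServices>"""


def find_service_href(xml_text, service_name):
    """Return the link for the named service, or None if it is not advertised."""
    root = ET.fromstring(xml_text)
    for service in root.findall("Service"):
        if service.get("name") == service_name:
            return service.get(XLINK_HREF)
    return None


def function_names(xml_text):
    """Return the names of the advertised server side functions."""
    root = ET.fromstring(xml_text)
    return [f.get("name") for f in root.findall("ServerSideFunctions/Function")]


print(find_service_href(SAMPLE, "DDX"))
print(function_names(SAMPLE))
```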

I also put together a simple XSLT to render this as human-friendly HTML, and added a reference to it in the XML document. When you point a browser at the Service Description response, the XML file (above) comes back; the browser fetches the XSLT, converts the XML to HTML, and renders it like this:

[Image: ServiceDescriptionPrototype-01.png]


Jimg 16:22, 14 November 2011 (PST) We should think about making this a JSON response, or think about having http://server/file.nc return the XML and http://server/file.nc.json return JSON using an XSLT transform.

Jimg 16:22, 14 November 2011 (PST) Nathan and I both conclude that the XML response should include the full URL and not do some weird hack where URLs are 'built'.
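To give a feel for what a JSON form of the response could look like, here is a sketch that converts the XML to JSON. Note that it uses Python's ElementTree and json modules rather than the XSLT transform suggested above, and the JSON layout (keys `base`, `services`, `name`, `href`, `description`) is purely an assumption. In keeping with the second comment, the full URLs are copied through verbatim rather than being rebuilt from parts.

```python
import json
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def service_description_to_json(xml_text):
    """Convert a DatasetServices XML document to a JSON string (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    document = {
        "base": root.get(XML_BASE),
        "services": [
            {
                "name": service.get("name"),
                "href": service.get(XLINK_HREF),  # full URL, passed through unchanged
                "description": service.findtext("Description", default="").strip(),
            }
            for service in root.findall("Service")
        ],
    }
    return json.dumps(document, indent=2)


example = """<DatasetServices xml:base="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc">
 <Service name="DAS" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://localhost:8080/opendap/hyrax/NWAtlanticDec_1km.nc.das" xlink:type="simple">
   <Description xlink:href="http://services.opendap.org/das.html" xlink:type="simple">OPeNDAP Dataset Attribute Structure (DAS)</Description>
 </Service>
</DatasetServices>"""
print(service_description_to_json(example))
```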

Issues

  • This is a change in server behavior that will change the response content of file access URLs. These URLs are not part of the regular DAP URL pattern and their current behavior is a function of server configuration.
  • How might we describe the way to elicit DAP3.x and DAP4 versions of the responses in this description given that we currently rely on the XDAP-Accept HTTP header to inform the server of the version being requested? Jimg 16:18, 14 November 2011 (PST) We can define a new set of extensions for these responses, especially since there are only two (ddx4 and dodsx4, e.g.). There are several other syntaxes that put this information in the URL, but if we use unique URLs to reference these new responses, then we don't break REST. This would mean that the XDAP-Accept: header would be dropped, but that's a separate discussion.
  • I know we discussed additional issues, but I can't recall them. Please add them if you do!
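The versioning issue above contrasts two ways of asking for a DAP4 response. The sketch below shows both as (URL, headers) pairs without sending anything. The function names are hypothetical, the header value format is an assumption, and the `.ddx4` and `.dodsx4` suffixes are only the examples floated in the comment, not settled names.

```python
def header_based_request(dataset_url, dap_version):
    """Current approach: same URL for every version, version carried in XDAP-Accept."""
    return dataset_url + ".ddx", {"XDAP-Accept": dap_version}


# Example suffixes from the discussion; not a settled naming scheme.
VERSIONED_SUFFIXES = {"ddx": ".ddx4", "data": ".dodsx4"}


def url_based_request(dataset_url, response):
    """Alternative: each versioned response gets its own URL, no special header."""
    return dataset_url + VERSIONED_SUFFIXES[response], {}


print(header_based_request("http://server/file.nc", "4.0"))
print(url_based_request("http://server/file.nc", "ddx"))
```

Only the second form can be listed in a service description as a plain link, since a URL alone cannot express a required request header.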