AIS Using NcML: Difference between revisions

From OPeNDAP Documentation
Revision as of 22:58, 3 February 2009


Introduction

This and the BES Aggregation using NcML page go hand-in-hand. The essential idea is to use NcML as a syntax to describe both aggregations of data sets (e.g., HDF4 files) and ancillary information that should be added to a data set. The motivation for using NcML is to not invent a new syntax and instead build on an accepted one, maybe adding new features where we need them.

Straw man Design

We will build a new handler that uses DDS/DAS/DDX and/or DataDDS/DataDDX objects from other handlers, along with information in NcML files, to return a response that describes the contents of a virtual data set. This design covers only using the NcML handler to build an AIS for Hyrax, but the same handler software can be used as part of other designs, such as an aggregation handler.

Simple example: Suppose you wanted to add the string attribute "color" with value "red" to a variable "temperature" in some data set. In NcML, this would look like:

<netcdf xmlns:nc="..."
        location="file:/.../data.nc">
    <variable name="temperature" type="Int16">
        <attribute name="color" type="string" value="red"/>
    </variable>
</netcdf>

Let's say this is stored in new_data.ncml.

When the BES is asked to return the DAS for /.../new_data.ncml:

  1. The NcML handler will open this file, parse it, and see that it contains new information for the data file file:/.../data.nc.
  2. It will use the BES to get the DAS for file:/.../data.nc.
  3. It will then apply the changes in the new_data.ncml file:
    1. First, parse the variable element and find the named variable in the DAS initially returned by the BES.
    2. See that the attribute color is to be added (or overwritten, following NcML's rules for applying such changes).
  4. It will return the resulting DAS.

This means that the NcML file effectively defines a new data set. The data.nc data set is still available as before.
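
The steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the handler itself (which will run in the BES). It assumes the DAS can be modeled as a plain dictionary of variables to attributes, and that the variable in the NcML is named temperature, as in the prose. The names apply_ncml and fetch_das are hypothetical; fetch_das stands in for "ask the BES for the DAS of the wrapped data set".

```python
import xml.etree.ElementTree as ET

# The NcML from the example, with the variable named "temperature".
# The elided namespace and path ("...") are kept exactly as in the text.
NCML = """<netcdf xmlns:nc="..."
        location="file:/.../data.nc">
    <variable name="temperature" type="Int16">
        <attribute name="color" type="string" value="red"/>
    </variable>
</netcdf>"""


def apply_ncml(ncml_text, fetch_das):
    """Return a new DAS-like dict with the NcML attribute changes applied.

    fetch_das is a hypothetical callable standing in for the BES: given
    the 'location' string, it returns {variable: {attr: (type, value)}}.
    """
    # Step 1: parse the NcML and find the data set it modifies.
    root = ET.fromstring(ncml_text)
    location = root.get("location")

    # Step 2: get the original DAS, copying it so the original data set's
    # response is left untouched.
    original = fetch_das(location)
    das = {var: dict(attrs) for var, attrs in original.items()}

    # Step 3: find each named variable, then add or overwrite attributes.
    for var in root.findall("variable"):
        name = var.get("name")
        if name not in das:
            raise KeyError("variable %r not found in DAS" % name)
        for attr in var.findall("attribute"):
            das[name][attr.get("name")] = (attr.get("type"), attr.get("value"))

    # Step 4: return the resulting DAS.
    return das
```

Because the sketch copies the DAS before changing it, the response for data.nc stays as it was, and only the response for new_data.ncml carries the new attribute.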

We will scrap the existing AIS code in libdap - it's just too far from anything we want.

Notes on adopting NcML

There are some aspects to NcML, which is based on the Common Data Model, that don't exactly match one-to-one with DAP syntax, even though the two models do match up semantically in most respects.

Syntax

  • NcML considers everything to have rank and thus does not name a special constructor type Array. Instead, scalars have rank zero and arrays have rank greater than zero. A rank of one or more is denoted using variable@shape, which lists the dimension names or index sizes. Because we don't need to define variables using NcML, only identify them so they can be found, we can list the sizes in shape and be done with it.
  • NcML does not have a variable@type for unsigned types or for 'Grid' or 'Sequence'.
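
As a rough illustration of the first point, the rank of a variable can be read directly off its shape string. The sketch below assumes shape is a whitespace-separated list of dimension names or sizes and that a missing or empty shape means a scalar. The helper names parse_shape and rank are hypothetical.

```python
def parse_shape(shape):
    """Split a shape string into its entries.

    Entries made only of digits are returned as ints (index sizes);
    anything else is kept as a string (a dimension name).
    """
    dims = []
    for token in (shape or "").split():
        dims.append(int(token) if token.isdigit() else token)
    return dims


def rank(shape):
    """Rank is the number of shape entries; zero means a scalar."""
    return len(parse_shape(shape))
```

With these helpers, rank(None) and rank("") are both 0 (a scalar), while rank("time 180 360") is 3, which is what DAP would describe as an Array with three dimensions.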

Alternative Design

We could use the existing Ancillary DAS code instead. This can be used to achieve the same end as shown here. However, it's far more limited in that it:

  1. Does not provide a separately addressable thing (the original data set's responses are modified).
  2. Can only be used with attributes (that doesn't matter for these use cases, but other uses would be precluded).
  3. Does not have a mode where the modifications can be applied to a set of data sets described using some syntax.

Yet Another Alternative Design

Replace the DAS in Ancillary DAS with DDX. Then add elements from NcML's syntax so the souped-up DDX could specify an aggregation, name patterns of things, et cetera.

Use Cases

  1. Add the NcML handler to the BES
  2. Add attributes to a single data set so that it can be served using the WCS front-end
  3. Adding one or more attributes to a group of data sets
  4. Using the NcML Handler to get information

Definitions

Data set
Anything that can be referenced by a DAP URL and will return the DAP responses when requested.
NcML
Syntax for ancillary data (attributes and variables) and aggregations used by the TDS

Background

This new handler will be used to introduce new attributes into data sets for the IOOS/WCS project and likely for the REAP project. In the first case, the augmented DDX response generated by the handler will be filtered through XSLT to produce a WCS response of one form or another. In the second case, the DDX will be filtered to produce an EML document. So, this handler and the collection(s) of XML/NcML/? documents will be an important part of several projects we're working on.

NcML Information

Here are links that describe NcML 2.2:

Notes:

  1. NcML 2.2 is based on the CDM and thus includes Groups and shared dimensions, which DAP 3.2 does not support. We will want to elide that feature until DAP 4 is done and well supported.

Deliverables

  1. The NcML handler. It will run in the BES.
  2. Instructions on how to use said handler.

Period of use