AIS Using NcML: Difference between revisions

From OPeNDAP Documentation



Revision as of 23:43, 3 February 2009


Introduction

This page and the BES Aggregation using NcML page go hand in hand. The essential idea is to use NcML as a syntax to describe both aggregations of data sets (e.g., HDF4 files) and ancillary information that should be added to a data set. The motivation for using NcML is to build on an accepted syntax rather than invent a new one, adding new features only where we need them.

Design

We will build a new handler that uses DDS/DAS/DDX and/or DataDDS/DataDDX objects from other handlers, along with information in NcML files, to return a response that describes the contents of a virtual data set. This design covers only using the NcML handler to build an AIS for Hyrax, but the same handler software can be used as part of other designs, such as an aggregation handler.

Simple example: Suppose you wanted to add the string attribute "color" with value "red" to a variable "temperature" in some data set. In NcML, this would look like:

<netcdf xmlns:nc="..."
        location="file:/.../data.nc">
    <variable name="temperature" type="Int16">
        <attribute name="color" type="string" value="red"/>
    </variable>
</netcdf>

Suppose this is stored in new_data.ncml.

When the BES is asked to return the DAS for /.../new_data.ncml:

  1. The NcML handler opens this file, parses it, and sees that it contains new information for the data file file:/.../data.nc.
  2. It uses the BES to get the DAS for file:/.../data.nc.
  3. It then applies the changes in the new_data.ncml file:
    1. Parse the variable element and find the named variable in the DAS initially returned by the BES.
    2. See that the attribute color is to be added (or overwritten, following NcML's rules for applying such changes).
  4. It returns the resulting DAS.
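
The steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the handler itself: the DAS is modeled as a plain dictionary, fetch_das is a hypothetical stand-in for asking the BES for a DAS, the existing attribute values are made up, and the NcML text is assumed to carry no default XML namespace.

```python
import xml.etree.ElementTree as ET

# NcML text as it might appear in new_data.ncml (namespace omitted for brevity).
NCML = """\
<netcdf location="file:/.../data.nc">
    <variable name="temperature" type="Int16">
        <attribute name="color" type="string" value="red"/>
    </variable>
</netcdf>
"""


def fetch_das(location):
    """Hypothetical stand-in for asking the BES for the DAS of `location`.

    The DAS is modeled as {variable name: {attribute name: value}};
    the contents here are invented for the example.
    """
    return {"temperature": {"units": "degC"}, "salinity": {"units": "psu"}}


def apply_ncml(ncml_text, get_das=fetch_das):
    """Return the DAS of the wrapped data set with the NcML changes applied."""
    root = ET.fromstring(ncml_text)                # step 1: parse the NcML
    das = get_das(root.get("location"))            # step 2: DAS of the original
    for var in root.findall("variable"):           # step 3.1: find each variable
        name = var.get("name")
        if name not in das:
            raise KeyError(f"variable {name!r} not found in the original DAS")
        for attr in var.findall("attribute"):      # step 3.2: add or overwrite
            das[name][attr.get("name")] = attr.get("value")
    return das                                     # step 4: the resulting DAS


if __name__ == "__main__":
    print(apply_ncml(NCML))
```

Because get_das builds a fresh response on each call, the original data set's DAS is never modified; only the response for the NcML file carries the new attribute.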

This means that the NcML file effectively defines a new data set. The data.nc data set is still available as before.

We will scrap the existing AIS code in libdap - it's just too far from anything we want.

Notes on adopting NcML

There are some aspects to NcML, which is based on the Common Data Model, that don't exactly match one-to-one with DAP syntax, even though the two models do match up semantically in most respects.

Syntax

  1. NcML considers everything to have rank and thus does not name a special constructor type Array. Instead, scalars have rank zero and arrays have rank greater than zero. A rank of one or more is denoted using variable@shape, which lists the dimension names or index sizes. Because we don't need to define variables using NcML and only have to specify them so they can be found, we can either list the sizes in @shape or omit the required @shape attribute, as the examples in the NcML 2.2 tutorial show.
  2. NcML does not have a variable@type for unsigned types or for 'Grid' or 'Sequence'.
  3. NcML uses one DataType for both variables and attributes and we can use the Structure data type value for attribute containers even though the names are not the same.
  4. NcML does not have an otherXML attribute@type so we'll have to add that. Maybe we can overload the attribute@shape attribute so that it has the special dimension name otherXML?
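
As an illustration of item 1, the rank implied by a variable@shape value could be derived as follows. This is a sketch under stated assumptions: the function name is hypothetical, the shape is taken to be a whitespace-separated list of dimension names or sizes, an empty shape is taken to mean a scalar, and an omitted shape is treated as "not stated", so the rank would have to come from the underlying data set.

```python
from typing import Optional


def rank_from_shape(shape: Optional[str]) -> Optional[int]:
    """Return the rank implied by an NcML variable@shape value.

    None  -> the attribute was omitted; the NcML says nothing about rank.
    ""    -> a scalar (rank zero).
    other -> one dimension per whitespace-separated name or size.
    """
    if shape is None:
        return None
    return len(shape.split())


if __name__ == "__main__":
    for shape in (None, "", "time lat lon", "10 20"):
        print(repr(shape), "->", rank_from_shape(shape))
```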

Semantics

  1. Although not captured by the schema, it appears that an NcML file that modifies the attributes of an existing variable does not have to specify either the variable@type or variable@shape attributes. This might make item 2 under Syntax above moot. In that case, the variable@type and @shape attributes might only come into play when/if we use NcML/AIS to add new variables to the data set.
  2. The dimension element is in line with CDM's notion of a dimension and more closely related to DAP's Grid Maps.

Design Alternatives

jimg 15:22, 3 February 2009 (PST) I'm leaving these in even though we have pretty much settled on using NcML as the AIS's control file syntax for at least the first version.

Alternative Design 1

We could use the existing Ancillary DAS code instead. This can be used to achieve the same end as shown here. However, it is far more limited:

  1. It does not provide a separately addressable thing (the original data set's responses are modified).
  2. It can only be used with Attributes (that doesn't matter for these use cases, but other uses would be precluded).
  3. It does not have a mode where the modifications can be applied to a set of data sets described using some syntax.

Yet Another Alternative Design

Replace the DAS in Ancillary DAS with DDX. Then add elements from NcML's syntax so the souped-up DDX could specify an aggregation, name patterns of things, et cetera.

Use Cases

  1. Add the NcML handler to the BES
  2. Add attributes to a single data set so that it can be served using the WCS front-end
  3. Adding one or more attributes to a group of data sets. This use case is not complete, since the scan element is not defined outside of an aggregation element.
  4. Using the NcML Handler to get information

Definitions

Data set
Anything that can be referenced by a DAP URL and will return the DAP responses when requested.
NcML
Syntax for ancillary data (attributes and variables) and aggregations used by the TDS

Background

This new handler will be used to introduce new attributes into data sets for the IOOS/WCS project and likely for the REAP project. In the first case, the augmented DDX response generated by the handler will be filtered through XSLT to produce a WCS response of one form or another. In the second case, the DDX will be filtered to produce an EML document. So, this handler and the collection(s) of XML/NcML/? documents will be an important part of several projects we're working on.

NcML Information

Here are links that describe NcML 2.2:

Notes:

  1. NcML 2.2 is based on the CDM and thus includes Groups and shared dimensions, which DAP 3.2 does not support. We will want to elide those features until DAP 4 is done and well supported.

Deliverables

  1. The NcML handler. It will run in the BES.
  2. Instructions on how to use said handler.

Period of use