DAP4: DAP4 Paths

From OPeNDAP Documentation
Revision as of 19:53, 30 March 2012 by DennisHeimbigner

Background

Consider the following example, written in approximately CDL syntax.

group g1 {
  dimensions: x=10;
  group g2 {
    dimensions: x=5, y=6;
  }
}
dimensions: y=100;
float32 V[x,y];

The question arises as to which dimension declarations the declaration of V refers to. In HDF5 or the netCDF CDL syntax, the ambiguity is resolved by replacing the short name, x, with a full path name such as /g1/g2/x and, for y, /y. A path specifies, starting at the root group, the sequence of subgroups to traverse to reach the declaration of interest, x or y in this case.
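To make the path traversal concrete, here is a minimal Python sketch that resolves an absolute path against the example above. The nested-dictionary model (`ROOT`, with `dims` and `groups` keys) and the function `resolve_dimension` are hypothetical names invented for illustration; they are not part of HDF5, netCDF, or DAP4.

```python
# Hypothetical in-memory model of the example: each group holds its own
# dimension declarations ("dims") and its subgroups ("groups").
ROOT = {
    "dims": {"y": 100},
    "groups": {
        "g1": {
            "dims": {"x": 10},
            "groups": {
                "g2": {"dims": {"x": 5, "y": 6}, "groups": {}},
            },
        },
    },
}


def resolve_dimension(root, path):
    """Return the size of the dimension named by an absolute path."""
    if not path.startswith("/"):
        raise ValueError(f"not an absolute path: {path!r}")
    # Everything before the last component names a group; the last
    # component names the dimension itself.
    *group_names, dim_name = path.strip("/").split("/")
    group = root
    for name in group_names:
        group = group["groups"][name]  # KeyError if the subgroup is missing
    return group["dims"][dim_name]
```

With this model, `resolve_dimension(ROOT, "/g1/g2/x")` selects the x declared in g2, `resolve_dimension(ROOT, "/g1/x")` selects the one declared in g1, and `resolve_dimension(ROOT, "/y")` selects the root-level y.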

Problem Addressed

The same ambiguous reference problem will also occur in DAP4 because, like HDF5/CDM/netcdf, it has a lexical group structure.

Proposal

The specific proposal is that, wherever a DAP4 DDX contains a reference to some item declared elsewhere in the DDX tree, it be possible to specify a path that identifies exactly which object is being referenced.

Without this proposal, V in the above example would be declared as follows.

<Float32 name="V">
  <Dimension name="x"/>
  <Dimension name="y"/>
</Float32>

Using this proposal, the dimension references would be rewritten as follows, here selecting the x declared in group g1.

<Dimension name="x">
  <path group="g1"/>
</Dimension>

and

<Dimension name="y">
  <path group=""/> <!-- alternatively <path group="/"/> -->
</Dimension>

The latter case shows how to reference an object in the top-level group using either of two equivalent notations.
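As an illustration of how a client might read such a reference, the following Python sketch uses the standard-library `xml.etree.ElementTree` to turn a `<Dimension>` element into a fully qualified name. The function `qualified_name` is a hypothetical helper, and the sketch assumes that the `group` attribute holds a slash-separated group path starting at the root, with `""` and `"/"` both denoting the root group; the proposal itself only shows the single-group and root cases.

```python
import xml.etree.ElementTree as ET


def qualified_name(dimension_xml):
    """Return '/group/.../name' for a <Dimension> element given as a string.

    Assumption: the group attribute is a path from the root group, and
    both "" and "/" mean the root group itself.
    """
    elem = ET.fromstring(dimension_xml)
    name = elem.get("name")
    path = elem.find("path")
    if path is None:
        # No <path> child: the reference is unqualified, so return the
        # short name unchanged.
        return name
    group = path.get("group", "").strip("/")
    return f"/{group}/{name}" if group else f"/{name}"
```

Because the group is carried in its own element, the only string handling needed here is normalising the two spellings of the root group.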

Rationale

The obvious alternative to the <path> construct is to embed the full path name directly in the name attribute, for example:

<Dimension name="/g1/g2/x"/>
<Dimension name="/y"/>

I dislike this solution because we are, in effect, embedding XML structural information in a string. The string must then be parsed to extract that structural information, and no other lexical element has this requirement.

Discussion

An open question is this: should it be possible to leave off the path information and use some algorithm to infer the path?

It turns out that CDM does this for dimensions, and in fact disallows the use of path names for dimensions. I am not sure what it does for coordinate variable references.

In any case, for CDM, the inference rule is as follows.

  1. If a dimension of the same name is declared in the immediately enclosing group, then use that declaration.
  2. Otherwise, recurse up the sequence of enclosing groups to locate the first occurrence of a dimension declaration with the same name.
  3. If no such dimension is found, then declare an error.
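The three steps above can be sketched in Python as a walk up the chain of enclosing groups. The `Group` class and `find_dimension` function are hypothetical names used only for this illustration; this is not CDM's actual implementation or API.

```python
class Group:
    """Hypothetical group node: a name, a parent link, and local dimensions."""

    def __init__(self, name, parent=None, dims=None):
        self.name = name
        self.parent = parent
        self.dims = dict(dims or {})


def find_dimension(group, dim_name):
    """Search the enclosing groups, innermost first, for dim_name."""
    current = group
    while current is not None:
        if dim_name in current.dims:      # steps 1 and 2: nearest match wins
            return current, current.dims[dim_name]
        current = current.parent          # move to the next enclosing group
    # Step 3: no enclosing group declares the dimension.
    raise LookupError(f"no dimension named {dim_name!r} in any enclosing group")


# The example from the Background section.
root = Group("", dims={"y": 100})
g1 = Group("g1", parent=root, dims={"x": 10})
g2 = Group("g2", parent=g1, dims={"x": 5, "y": 6})
```

In this model a reference to x from inside g2 finds the local x=5, and a reference to y from inside g1 finds the root-level y=100. A reference to x from the root group fails, because both declarations of x live in subgroups, which the upward search never visits.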

It should be noted that this CDM rule means that certain DAP4 DDXs could not be converted to CDM, because this proposal would allow references to dimension declarations that violate the CDM "up-the-group-tree" search rule.