DAP4: DAP4 Grids Proposal

From OPeNDAP Documentation

Grids Delenda Est

(with apologies to Cato the Elder)

The grid construct as originally established in the DAP2 protocol has been a source of problems from its inception. The evolution of the notion of coordinate variables makes its use in its current form (or even closely similar forms) untenable.

1. Problem: Grid as scoping/lexical container

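In DAP2, a grid acts as a scoping (lexical) container for its array and its coordinate variables. This means that coordinate variables cannot be properly shared among grids without duplication, which is highly undesirable.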

Consider the following situation.

    arrays: D1(x,y), D2(y,z), D3(x,z)
    coordinate vars: x(x), y(y), z(z)

No grid as currently defined can represent this, because the three coordinate variables x(x), y(y), and z(z) cannot be distributed across the three needed grids without duplication. The only way this can work is if all the arrays and all the coordinate variables reside in a single grid, which is not, I maintain, a useful solution. Further, that grid must change whenever a new array is defined that uses any of the coordinate variables, D4(x,w), for example.
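The duplication can be made concrete with a small sketch. It assumes, as in DAP2, that each grid holds one array plus its own copy of every coordinate variable that array uses; the names `arrays`, `grids`, and `copies` are illustrative only and come from no DAP library.

```python
from collections import Counter

# Arrays from the example, each with the dimensions it is defined over.
arrays = {"D1": ("x", "y"), "D2": ("y", "z"), "D3": ("x", "z")}

# Assumption: one DAP2-style grid per array, each containing its own
# copy of every coordinate variable (map) that its array needs.
grids = {name: {"array": name, "maps": list(dims)} for name, dims in arrays.items()}

# Count how many grids end up holding a copy of each coordinate variable.
copies = Counter(m for g in grids.values() for m in g["maps"])

for coord, n in sorted(copies.items()):
    print(f"{coord}: stored in {n} grids")
```

Under that assumption every coordinate variable is stored twice, and adding an array such as D4(x,w) would add yet another copy of x.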

2. Problem: Grid projections

When a projection is applied to a grid, the result cannot be a grid. This has been an ongoing source of problems in DAP2, where projecting the array component of a grid results in a structure. From the point of view of semantics, this is a really bad idea.

3. Problem: Multi-dimensional coordinate variables

When representing point data, it is desirable to have coordinate variables distinguished using more than a single dimension. Consider the following:

    array: temp(x,y,z)
    coordinate vars: lat(x,y,z), lon(x,y,z), and depth(x,y,z).

Here we are trying to represent point data where each point is defined by three dimensions: lat, lon, and depth. Grids are not capable of properly representing this case. I should note that neither, for example, is netcdf-3 or netcdf-4. CDM can do it, but only by encoding the proper relationships as attributes with complex internal structure.
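As a rough illustration of what multi-dimensional coordinate variables mean here, the sketch below gives lat, lon, and depth the same shape as temp, so that every index (i,j,k) carries its own coordinate triple. The shape and all numeric values are made up for the example.

```python
from itertools import product

# Sizes of the x, y, and z dimensions (made-up values for illustration).
shape = (2, 2, 1)
indices = list(product(*(range(n) for n in shape)))

# Each coordinate variable has the same shape as temp: one value per
# (i, j, k) index rather than one value per position along a single dimension.
lat = {idx: 10.0 + idx[0] for idx in indices}
lon = {idx: 20.0 + idx[1] for idx in indices}
depth = {idx: 5.0 * idx[2] for idx in indices}
temp = {idx: 15.0 - idx[0] - idx[1] for idx in indices}

def point(idx):
    """Return (lat, lon, depth, temp) for one index into the arrays."""
    return (lat[idx], lon[idx], depth[idx], temp[idx])

print(point((1, 0, 0)))
```

A DAP2 grid expects one one-dimensional map per dimension, so coordinate variables shaped like these have no place in it.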

4. Problem: Coordinate variable duplication

In examining a large number of DAP2 DDSs, I note that coordinate variables inside grids are almost always duplicated outside the grid. My hypothesis is that this is a result of problem (1) above. In any case, the proposal below would obviate the need for duplication.

Proposal: Grid as mapping

Rather than making grids be scope containers, grids need to be simple relationship instances between an array and its coordinate variables. For example, the first case above (D1,D2,D3) might be represented as:

<map array="D1" dimensions="x" coordinates="x"/>
<map array="D1" dimensions="y" coordinates="y"/>
<map array="D2" dimensions="y" coordinates="y"/>
<map array="D2" dimensions="z" coordinates="z"/>
<map array="D3" dimensions="x" coordinates="x"/>
<map array="D3" dimensions="z" coordinates="z"/>

The case of point data would be represented as follows:

<map array="temp" dimensions="x y z" coordinates="lat lon depth"/>
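To show how a client might consume such map elements, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The wrapping `<maps>` element and the helper names `load_maps` and `arrays_using` are assumptions made for this example; the proposal does not define them.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <maps>; the proposal does not name one.
doc = """
<maps>
  <map array="D1" dimensions="x" coordinates="x"/>
  <map array="D1" dimensions="y" coordinates="y"/>
  <map array="D2" dimensions="y" coordinates="y"/>
  <map array="D2" dimensions="z" coordinates="z"/>
  <map array="D3" dimensions="x" coordinates="x"/>
  <map array="D3" dimensions="z" coordinates="z"/>
  <map array="temp" dimensions="x y z" coordinates="lat lon depth"/>
</maps>
"""

def load_maps(xml_text):
    """Return {array: [(dimensions, coordinates), ...]} from <map> elements."""
    relations = {}
    for m in ET.fromstring(xml_text).iter("map"):
        dims = tuple(m.get("dimensions").split())
        coords = tuple(m.get("coordinates").split())
        relations.setdefault(m.get("array"), []).append((dims, coords))
    return relations

def arrays_using(relations, coord):
    """List the arrays that reference a given coordinate variable."""
    return sorted(
        array for array, maps in relations.items()
        if any(coord in coords for _, coords in maps)
    )

relations = load_maps(doc)
print(arrays_using(relations, "x"))
print(relations["temp"])
```

Because each map only names its array and coordinate variables, x is declared once and referenced by both D1 and D3, with no copy of it stored inside either array.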

Notes:

1. The only scope containers would be structures and sequences, both of which properly handle projections.

2. The above is probably more verbose than necessary and the maps might alternatively be associated with the array variables:

<variable name="D1"...>
  <map dimensions="x" coordinates="x"/>
  <map dimensions="y" coordinates="y"/>
</variable>
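The nested form carries the same information as the flat one. The sketch below, again using `xml.etree.ElementTree`, expands a variable element into flat map records. It assumes the `<variable>` element has only a `name` attribute, since the other attributes are elided above, and the helper name `flatten` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Only the name attribute is used; the proposal elides the rest of <variable>.
nested = """
<variable name="D1">
  <map dimensions="x" coordinates="x"/>
  <map dimensions="y" coordinates="y"/>
</variable>
"""

def flatten(variable_xml):
    """Expand nested <map> children into flat records that name their array."""
    var = ET.fromstring(variable_xml)
    return [
        {
            "array": var.get("name"),
            "dimensions": m.get("dimensions"),
            "coordinates": m.get("coordinates"),
        }
        for m in var.findall("map")
    ]

flat = flatten(nested)
for row in flat:
    print(row)
```

Each record corresponds to one flat `<map array="D1" .../>` element, so the choice between the two encodings is one of verbosity rather than expressiveness.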

-Dennis Heimbigner