DAP4: Subsetting Arrays and Grids By Value

From OPeNDAP Documentation
Dennis(4/28/2012) The proposal also loses dimensional information that might be both useful and performance enhancing. For instance, it might be useful to be able to say something like

?sal[0:22][0:9][]<3.2

and get something back like this:

<Sequence name="sal">
    <Float64 name="sal">
      <Dimension size="23" />
      <Dimension size="10" />
    </Float64>
    <Float32 name="lat"/>
    <Float32 name="lon"/>
    <Float32 name="depth"/>
</Sequence>

Revision as of 17:46, 29 April 2012

ndp

Background

DAP2 did not support subsetting of Arrays and Grids using relational operators. Many of our support questions over the years have come from frustrated users who tried to apply relational subsetting to these objects and couldn't understand why it wasn't working.


Problem addressed

Allow users to subset Arrays (and "Grids") using relational operators.

Proposed solution

Allow users to apply relational constraints to the values of Arrays (and "Grids"). The server should return a Sequence which holds the matching Array values, along with the values of all of the associated Maps.

For example, let's consider this data object:

<Dimension name="x" size="1024"/>
<Dimension name="y" size="1024"/>
<Dimension name="z" size="12"/>

<!-- The dimensions of a Coordinate MUST be SharedDimensions -->
<Map name="lon" type="Float32">
    <dimension ref="x"/>
    <dimension ref="y"/>
</Map>

<Map name="lat" type="Float32">
    <dimension ref="x"/>
    <dimension ref="y"/>
</Map>

<Map name="depth" type="Float32">
    <Attribute name="unit" type="String"><value>meters</value></Attribute>
    <dimension ref="x"/>
    <dimension ref="y"/>
    <dimension ref="z"/>
</Map>

<Float64 name="sal">
    <dimension ref="x"/>
    <dimension ref="y"/>
    <dimension ref="z"/>
    <map name="lat" />
    <map name="lon" />
    <map name="depth" />
</Float64>

Applying the constraint "?sal<3.2" would return this:

<Sequence name="sal">
    <Float64 name="sal" />
    <Float32 name="lat"/>
    <Float32 name="lon"/>
    <Float32 name="depth"/>
</Sequence>
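
One possible reading of this behavior can be sketched in Python. This is an illustration only, not the server's algorithm: it assumes that each Map value is looked up at the same indices as the matching Array element, restricted to the dimensions that the Map itself declares. The function names `get_at` and `subset_by_value` and the tiny 2x2x2 data set are hypothetical and do not come from any OPeNDAP library.

```python
from itertools import product


def get_at(nested, index):
    """Index a nested list with a tuple of integer indices."""
    for i in index:
        nested = nested[i]
    return nested


def subset_by_value(name, array, dims, maps, predicate):
    """Return one row (dict) per Array element that satisfies `predicate`.

    name      -- name of the Array variable, used as the first column
    array     -- nested lists holding the Array values
    dims      -- list of (dimension name, size) pairs, in Array order
    maps      -- dict: map name -> (tuple of dimension names, nested lists)
    predicate -- function applied to each Array value
    """
    dim_names = [dim_name for dim_name, _ in dims]
    rows = []
    for index in product(*(range(size) for _, size in dims)):
        value = get_at(array, index)
        if not predicate(value):
            continue
        row = {name: value}
        for map_name, (map_dims, map_data) in maps.items():
            # ASSUMPTION: a Map is indexed by the subset of the Array's
            # indices that corresponds to the Map's own dimensions.
            map_index = tuple(index[dim_names.index(d)] for d in map_dims)
            row[map_name] = get_at(map_data, map_index)
        rows.append(row)
    return rows


# Hypothetical 2x2x2 stand-in for the 1024x1024x12 object above.
dims = [("x", 2), ("y", 2), ("z", 2)]
sal = [[[3.0, 3.5], [3.1, 4.0]], [[3.3, 2.9], [3.6, 3.7]]]
maps = {
    "lat": (("x", "y"), [[10.0, 10.0], [11.0, 11.0]]),
    "lon": (("x", "y"), [[-70.0, -69.0], [-70.0, -69.0]]),
    "depth": (("x", "y", "z"),
              [[[5.0, 10.0], [5.0, 10.0]], [[5.0, 10.0], [5.0, 10.0]]]),
}

# Equivalent of the constraint "?sal<3.2".
rows = subset_by_value("sal", sal, dims, maps, lambda v: v < 3.2)
for row in rows:
    print(row)
```

Each row carries the same four columns as the Sequence above (sal, lat, lon, depth). How the Map values should actually be chosen is the open question raised in the Discussion; the indexing rule used here is just one candidate.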

Rationale for the solution

Subsetting arrays and "grids" using relational expressions is unlikely to yield a set of matching items that can still be viewed as the same kind of object in the data model. By returning the result as a Sequence, we can still give a reasonable representation of the outcome of the constraint using a type that already exists in the data model.
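
To make the first point concrete, here is a small Python sketch that tests whether a set of matching indices still forms a rectangular block, which is what an Array result would need. The helper `is_hyperslab` is hypothetical, and it only considers contiguous (stride 1) blocks, which is a simplifying assumption.

```python
from itertools import product


def is_hyperslab(indices):
    """True if the index tuples exactly fill a rectangular, stride-1 block."""
    indices = set(indices)
    if not indices:
        return True
    rank = len(next(iter(indices)))
    # Bounding box of the matches along each dimension.
    ranges = [
        range(min(i[d] for i in indices), max(i[d] for i in indices) + 1)
        for d in range(rank)
    ]
    # The matches form a block only if they fill the bounding box completely.
    return indices == set(product(*ranges))


# Matches scattered through a 3-D array, as a value constraint typically gives.
scattered = [(0, 0, 0), (0, 1, 0), (1, 0, 1)]
# Matches that happen to fill a 2x2 block.
block = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(is_hyperslab(scattered))
print(is_hyperslab(block))
```

The scattered case fills only 3 of the 8 cells in its bounding box, so it cannot be returned as an Array without padding or variable-length dimensions, whereas a Sequence can list the matches row by row.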

Discussion

Dennis(4/28/2012) More to the point, I like this proposal. It avoids the problem of using variable length dimensions and provides an additional use for sequences.

Dennis(4/29/2012)

Dennis(4/28/2012) I am not sure, however, that I see the algorithm for choosing the values for the map columns. Nathan, can you elaborate on the rule for doing that? Never mind; this issue requires some detailed discussion.

Dennis(4/28/2012) Slightly off topic: I thought we had agreed that map variables are just declared using ordinary variable declarations. The example above uses the Map element keyword to define coordinate variables. E.g.

<Map name="lon" type="Float32">
    <dimension ref="x"/>
    <dimension ref="y"/>
</Map>

should instead be:

<Float32 name="lon">
    <dimension ref="x"/>
    <dimension ref="y"/>
</Float32>

ndp That's probably the case - I was just doing a copy-pasta from somewhere else on the wiki for the example. And I'm sure that somewhere hasn't been brought into line with our current discussion. I suspect that once we iron out an agreement on the various bits of the data model we'll need to go back and rework the wiki content to reflect our thinking.