DAP4: Constraint and Query

From OPeNDAP Documentation
Revision as of 13:26, 26 September 2012 by Ndp (talk | contribs) (→‎Background)

ndp 16:25, 24 September 2012 (PDT)

Background

In DAP2 the constraint expression (CE) was defined as a projection and a selection. The projection is a list of dataset variables to be returned, and the selection is the set of conditions that must be met for them to be returned. Projected variables that are arrays could be subset using a square bracket notation (rigorously described elsewhere). The selection was written as a list of "clauses" (my word), separated from the projection and from each other by an "&" character. The selection was applied to all of the requested variables, and in practice it was rarely used because its only legitimate application was to constrain a Sequence object.

The DAP2 CE consumed the entire URL query string; in other words, everything after the "?" in the URL was considered to be the DAP2 CE string.

Terms

DAP4 Constraint Expression (CE)
The constraint expression that encapsulates the various kinds of subsetting of variables in a DAP4 dataset, and possibly the application of server-side functions to them.
Query String (QS)
Everything after the "?" character in a URL.

Problems addressed

This proposal puts forth a framework for a DAP4 Query String syntax, which the DAP4 protocol requires.

Throughout the web the predominant interpretation of the query string is to view it as a collection of key-value pairs (KVP), where each pair is separated by an "&" character.

?key=value&key=value&key=value ...

Many web services use this pattern, including our friends at OGC. Because the DAP2 CE subsumed the entire query string, it doesn't fit into this model. Tomcat (and other web server frameworks) provide specific API methods for collecting the KVPs from the query string, but again DAP2 doesn't play well with them.

DAP4 has many needs (problems to be addressed) that bear on the syntax and positioning of the CE. This is a place to begin addressing them.
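To make the mismatch concrete, here is a minimal sketch using Python's standard urllib.parse as a stand-in for the KVP helpers a web framework provides. The helper name kvp_view and the DAP2-style query (a Sequence k with members x and y) are made up for illustration; they are not taken from any server code.

```python
from urllib.parse import parse_qsl

def kvp_view(query_string):
    """Return the query string as a generic KVP parser sees it."""
    return parse_qsl(query_string)

# A conventional query string splits cleanly into pairs.
conventional = kvp_view("key1=a&key2=b")

# Hypothetical DAP2-style CE: projection "k.x,k.y", selection "k.x<=5".
# The projection has no "=" and is dropped; the selection is split at
# the "=" inside "<=", which mangles the clause.
dap2_style = kvp_view("k.x,k.y&k.x<=5")

print(conventional)
print(dap2_style)
```

The second result loses the projection entirely and turns the relational operator into part of a key, which is why a DAP2 server has to handle the raw query string itself.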

Proposed solution

Constraint Expression

The DAP4 CE should be held in a single key-value pair. This means that the "&" character must not be used in the content of the CE. Combined with a riff on Dennis' Filter Proposal, we could get to something like this:

?dap4=n,a[3:1:100],k|x<=y|z=y

Here we are asking for the variable n, elements 3 through 100 (with a stride of 1) of the array a, and all members of the rows of sequence k for which k.x<=k.y and k.z=k.y.
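As a rough sketch of how such a CE value might be pulled apart, the Python below splits on top-level commas and then on "|". The grammar is only an assumption drawn from the single example above (commas separate requested variables, "|" introduces filters on the variable to its left); the function names and the dictionary layout are hypothetical.

```python
def split_top_level(text, sep=","):
    """Split on sep, ignoring separators inside [] or ()."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts

def parse_ce(ce):
    """Return one entry per requested variable, with its filters."""
    requests = []
    for item in split_top_level(ce):
        # Assumed rule: the first "|"-separated piece is the variable
        # (with any subset), the remaining pieces are its filters.
        target, *filters = item.split("|")
        requests.append({"target": target, "filters": filters})
    return requests

for request in parse_ce("n,a[3:1:100],k|x<=y|z=y"):
    print(request)
```

Tracking bracket depth keeps the comma inside a function call such as build_data(place,u[100:200]) from being treated as a variable separator.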

Additional Server Controls

By removing the "&" character from the CE we open up the rest of the query string to be used to pass server controls and options. Consider what happens if we take my corollary to Dennis' filter proposal and allow filters to be applied to arrays. The resulting data object could be expressed as a Sequence, or as an array with a mask applied. A server could support both, and the type of result might be controlled by a KVP. You might even specify the mask value:

?dap4=u[25:1:1000][75:1:1000]|u>4.0|u<12.7&arraySelection=mask(0x00)

Or, ask for a Sequence:

?dap4=u[25:1:1000][75:1:1000]|u>4.0|u<12.7&arraySelection=Sequence
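A minimal sketch of how a server might separate the CE from the other controls and interpret arraySelection, using Python's standard urllib.parse. The function names, the "unspecified" result for a missing key, and the decision to reject unknown values are all assumptions for illustration; nothing here is settled by the proposal.

```python
import re
from urllib.parse import parse_qs

def split_query(query_string):
    """Return (ce, controls): the dap4 value and the remaining KVPs."""
    pairs = parse_qs(query_string, keep_blank_values=True)
    ce = pairs.pop("dap4", [""])[0]
    # If a key is repeated, keep its last value (an arbitrary choice).
    controls = {key: values[-1] for key, values in pairs.items()}
    return ce, controls

def array_selection(controls):
    """Interpret arraySelection as ("mask", value) or ("sequence", None)."""
    value = controls.get("arraySelection")
    if value is None:
        return ("unspecified", None)  # the default behaviour is undecided
    match = re.fullmatch(r"mask\((0x[0-9A-Fa-f]+)\)", value)
    if match:
        return ("mask", int(match.group(1), 16))
    if value == "Sequence":
        return ("sequence", None)
    raise ValueError("unsupported arraySelection: " + value)

ce, controls = split_query(
    "dap4=u[25:1:1000][75:1:1000]|u>4.0|u<12.7&arraySelection=mask(0x00)"
)
print(ce)
print(array_selection(controls))
```

Because each pair is split only at its first "=", a filter such as z=y inside the dap4 value survives intact; only "&" has to be kept out of the CE.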

Examples

Example:

?dap4=CE&checksums=off&arraySelection=mask(0x00)&saveAs=newgranule&async=true


Example:

?dap4=build_data(place,u[100:200])&checksums=off&arraySelection=Sequence&saveAs=newgranule&async=true
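From the client side, the same query string can be assembled from a plain dictionary. The sketch below uses Python's urllib.parse.urlencode, which percent-encodes characters such as "[", "(" and "," in the values, and then checks that a KVP parser recovers the original pairs. The helper names are hypothetical, and whether DAP4 clients should encode the CE this way is an open question, not something this proposal decides.

```python
from urllib.parse import urlencode, parse_qs

# The controls from the example above, one entry per KVP.
controls = {
    "dap4": "build_data(place,u[100:200])",
    "checksums": "off",
    "arraySelection": "Sequence",
    "saveAs": "newgranule",
    "async": "true",
}

def build_query(pairs):
    """Encode the pairs as a query string (without the leading "?")."""
    return urlencode(pairs)

def read_query(query_string):
    """Decode a query string back into a flat dict, first value per key."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}

query = build_query(controls)
print("?" + query)
print(read_query(query) == controls)
```

The round trip works only because the CE sits in a single value; every "&" in the encoded string is a pair separator and nothing else.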

Rationale for the solution

Not a solution, just a starting point.

Discussion

This is just a draft to get my ideas down so that we can talk about them.