DAP4: Specification Volume 1 Deltas

From OPeNDAP Documentation
Revision as of 23:11, 15 November 2013 by Jimg (talk | contribs)

basic testing support

We might make a collection of DMR files available for debugging and testing.

FQNs in a DMR/Data response

Gallagher James wrote: Are the name attributes of the Dim, Map and Enum elements always fully qualified names (FQNs)? That is, is <Dim name="x"/> never valid, and must the name always be an FQN such as <Dim name="/x"/>?
Dennis: That is what I put in the spec. The argument is that FQNs are easier for machines to process while still being reasonably readable by a person.
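
As a rough illustration of the rule discussed above, the following sketch scans a DMR-like XML fragment and reports Dim, Map and Enum elements whose name attribute is not an FQN. The sample document, the helper name non_fqn_names and the choice to skip elements that have no name attribute at all are assumptions made for this example; they are not taken from the spec.

```python
import xml.etree.ElementTree as ET

CHECKED_TAGS = ("Dim", "Map", "Enum")

def non_fqn_names(dmr_xml):
    """Return (tag, name) pairs for checked elements whose name is not an FQN.

    Hypothetical helper: an FQN is taken to be any name starting with "/".
    Elements without a name attribute are skipped (an assumption).
    """
    root = ET.fromstring(dmr_xml)
    offenders = []
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # drop an XML namespace, if present
        name = elem.get("name")
        if tag in CHECKED_TAGS and name is not None and not name.startswith("/"):
            offenders.append((tag, name))
    return offenders

# Made-up fragment for illustration only; not a complete, valid DMR.
SAMPLE = """<Dataset name="d">
  <Dimension name="x" size="10"/>
  <Int32 name="a"><Dim name="/x"/></Int32>
  <Int32 name="b"><Dim name="x"/></Int32>
</Dataset>"""

print(non_fqn_names(SAMPLE))  # [('Dim', 'x')]
```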

augment the vol1 spec with details about CE effects on a DMR

It occurs to me that we need to augment our proposed constraint expression grammars with a description of what kinds of DMRs will result from our proposed constraints.

That is, given a constraint and a DMR for the unconstrained dataset, describe the DMR that corresponds to the result of applying the constraint.

To that end, I have put up a new proposal based on describing rules for constructing the DMR that results from a constraint. http://docs.opendap.org/index.php/DAP4:_Alternate_Proposal_for_a_Constraint_Expression_Syntax

sequence serialization: how to serialize a zero-length sequence

Gallagher James wrote: Dennis, Looking at Sequence and thinking about CE evaluation: When there's a nested Sequence like this:

Seq {
 Int32 i;
 Int32 j;
 Seq {
    Int32 k;
    Int32 l;
 } inner;
} outer;

And a CE requests all of outer (which means inner too) such that k > 10, what should be sent when k is not > 10? Should i and j still be sent and an empty inner (so the count would be 0)? This would be my preference since the alternative is very tricky to code. Is that your understanding?
Dennis: Yes, a count of zero should be allowed.
Also: My recollection is that we decided to allow filtering only on the outermost variables (i and j in this case), so filtering on k would be illegal. Maybe we should revisit this decision. The best alternative I can think of in this case is that all records in outer are kept and the records in inner are filtered separately for each record in outer.

nb: This is a big deal for nested sequences because if you were to require that zero-length child sequences suppress the serialization of their parent sequence (as was the case with DAP2), the code to handle sequence serialization would become very complex. We made the correct decision here to allow child sequences to be zero length.
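
A minimal sketch of the behavior discussed above: every record of outer is kept, the records of inner are filtered per outer record, and an inner sequence may end up with a count of zero. The dictionary layout, the function names and the flat token stream are assumptions for illustration only; the stream is a toy and not the DAP4 wire format.

```python
def filter_inner(outer_rows, keep_inner):
    """Keep every outer row; filter each row's inner sequence independently."""
    result = []
    for row in outer_rows:
        kept = [rec for rec in row["inner"] if keep_inner(rec)]
        result.append({"i": row["i"], "j": row["j"], "inner": kept})
    return result

def serialize(rows):
    """Toy serialization: outer count, then i, j, inner count and inner records."""
    out = [len(rows)]
    for row in rows:
        out += [row["i"], row["j"], len(row["inner"])]  # inner count may be 0
        for rec in row["inner"]:
            out += [rec["k"], rec["l"]]
    return out

outer = [
    {"i": 1, "j": 2, "inner": [{"k": 5, "l": 0}, {"k": 12, "l": 1}]},
    {"i": 3, "j": 4, "inner": [{"k": 7, "l": 2}]},  # no record has k > 10
]

filtered = filter_inner(outer, lambda rec: rec["k"] > 10)
print(serialize(filtered))  # [2, 1, 2, 1, 12, 1, 3, 4, 0]
```

The second outer record is still written, followed by an inner count of 0, so the serializer never has to look ahead into a child sequence before deciding whether to emit its parent.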

IEEE 754 for real numbers

Gallagher James wrote: Dennis, When we adopted 'reader makes right' we mostly talked about byte order; do you handle the case where one of the two hosts does not use IEEE 754 for either 32- or 64-bit reals?
'Reader makes right' was intended to apply only to byte order. We should indeed enforce use of IEEE 754 as the only acceptable format. =Dennis
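
To illustrate the split between the two concerns, here is a small sketch using Python's struct module, whose standard-size 'f' and 'd' codes always produce IEEE 754 binary32 and binary64. The bit layout is fixed and only the byte order differs between sender and reader. The helper read_float64 and its parameter name are hypothetical.

```python
import struct

value = 1.5

# Same IEEE 754 binary64 bit pattern, two byte orders.
big = struct.pack(">d", value)
little = struct.pack("<d", value)
assert big == little[::-1]
assert big.hex() == "3ff8000000000000"

def read_float64(buf, sender_is_little_endian):
    """'Reader makes right': the reader picks the byte order the sender used."""
    fmt = "<d" if sender_is_little_endian else ">d"
    return struct.unpack(fmt, buf)[0]

print(read_float64(little, True), read_float64(big, False))  # 1.5 1.5
```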

how big are arrays? 64 bits? 63 bits? 61 bits? signed? unsigned?

Gallagher James wrote: I'm wondering what type should be used to hold the number of elements in an array. I can't find where in the spec it says how big an array can be. Is it an unsigned 64-bit number of elements? Or unsigned 32 bits?
Dennis: It is a signed 64-bit integer.
Yes, signed. The argument is that interpretive languages (Java, Python...) are not good at handling unsigned 64-bit numbers, so I chose to stick to a signed 64-bit integer, which is effectively a 63-bit int. We need to fix the text. Here's what the doc says: … The total number of elements in an Array MUST NOT exceed (2^64)-1.

In the telecon we decided that the maximum number of elements is 2^61-1. We also decided that the size of an array should never exceed 2^64-1 bytes, because the size needs to be signed (Java and Python don't grok unsigned ints) and because C code will need to malloc these arrays and malloc won't take more than a 64-bit int.
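
The two limits from the telecon can be checked with a little arithmetic. The sketch below assumes that the widest atomic element is 8 bytes (for example a 64-bit integer or real); under that assumption, 2^61-1 elements occupy 2^64-8 bytes, which stays under the 2^64-1 byte limit, and the element count itself fits comfortably in a signed 64-bit integer. The constant and function names are made up for this example.

```python
MAX_ELEMENTS = 2**61 - 1   # limit on element count, from the telecon
MAX_BYTES = 2**64 - 1      # limit on array size in bytes, from the telecon
WIDEST_ELEMENT = 8         # bytes; an assumption for this sketch

# The largest allowed array of the widest element stays within the byte limit.
assert MAX_ELEMENTS * WIDEST_ELEMENT == 2**64 - 8
assert MAX_ELEMENTS * WIDEST_ELEMENT <= MAX_BYTES
# The element count also fits in a signed 64-bit integer.
assert MAX_ELEMENTS <= 2**63 - 1

def array_fits(dim_sizes, element_size):
    """Check a shape against both limits (hypothetical helper)."""
    count = 1
    for n in dim_sizes:
        count *= n
    return count <= MAX_ELEMENTS and count * element_size <= MAX_BYTES

print(array_fits([2**30, 2**30], 8))  # True: 2^60 elements, 2^63 bytes
print(array_fits([2**61], 1))         # False: one element over the count limit
```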

About the end chunk in a chunked response

Dennis, I'm tweaking my code to process the chunked responses and thinking about what the end chunk means. Does the end chunk mean that once any data it contains has been consumed, EOF has been reached? Or is it possible to have more data chunks (or an error chunk) after an end chunk? James
I believe the rule is that any sequence of chunks must stop at the first end chunk or error chunk. The end chunk may or may not contain data. =Dennis
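
The rule stated above can be sketched as a reader loop that stops at the first end chunk or error chunk. Chunks are modeled here as (kind, data) tuples with kind being "data", "end" or "error"; that representation, the function name and the exception choices are assumptions for illustration and say nothing about the actual chunk encoding.

```python
def read_payload(chunks):
    """Concatenate chunk data, stopping at the first end or error chunk."""
    payload = bytearray()
    for kind, data in chunks:
        if kind == "error":
            # An error chunk terminates the response; surface its message.
            raise RuntimeError(data.decode("utf-8", errors="replace"))
        payload += data  # an end chunk may or may not carry data
        if kind == "end":
            return bytes(payload)  # anything after this point is ignored
    raise EOFError("stream ended without an end chunk")

stream = [("data", b"ab"), ("end", b"c"), ("data", b"ignored")]
print(read_payload(stream))  # b'abc'
```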

where we encode the byte order for a DAP4 Data response

On Sep 10, 2013, at 10:06 AM, Dennis Heimbigner <dmh@unidata.ucar.edu> wrote: I think we agreed to put the serialization byte order in the HTTP headers. Do you want to revisit that decision? =Dennis
Yes. There's another part of the spec that says the response document/body is all you need in order to read the document (of course, you need the spec too). The HTTP headers are generally lost by the time a client gets the response body, so I think the byte order should go in the response body somewhere. James
OK with me. One possibility is a new flag in the chunk headers. =Dennis
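
One way the chunk-header idea might look is sketched below. Everything about the layout is hypothetical: a 4-byte header in network byte order with 8 flag bits and a 24-bit size, and the flag values FLAG_END, FLAG_ERROR and FLAG_LITTLE_ENDIAN are invented for this example. The point is only that a reader can recover the byte order from the response body alone, without the HTTP headers.

```python
import struct

# Hypothetical flag bits; not taken from the spec.
FLAG_END = 0x01
FLAG_ERROR = 0x02
FLAG_LITTLE_ENDIAN = 0x04

def pack_header(size, flags):
    """Build an assumed 4-byte header: 8 flag bits, then a 24-bit size."""
    if not 0 <= size < 2**24:
        raise ValueError("size must fit in 24 bits")
    if not 0 <= flags < 2**8:
        raise ValueError("flags must fit in 8 bits")
    return struct.pack(">I", (flags << 24) | size)

def unpack_header(header):
    """Return (flags, size) from the assumed header layout."""
    (word,) = struct.unpack(">I", header)
    return word >> 24, word & 0xFFFFFF

def decode_int32s(header, body):
    """Decode a chunk body of 32-bit ints using the header's byte-order flag."""
    flags, size = unpack_header(header)
    order = "<" if flags & FLAG_LITTLE_ENDIAN else ">"
    return list(struct.unpack(f"{order}{size // 4}i", body[:size]))

body = struct.pack("<2i", 1, -2)  # sender wrote little-endian values
header = pack_header(len(body), FLAG_LITTLE_ENDIAN | FLAG_END)
print(decode_int32s(header, body))  # [1, -2]
```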