DAP4: Specification Volume 1 Deltas

From OPeNDAP Documentation
Revision as of 23:41, 15 November 2013 by Jimg (talk | contribs)


sequences with structures and the resulting CEs

James Gallagher wrote: On Aug 2, 2013, at 7:04 PM, Dennis Heimbigner wrote: Yes I did intend to allow nesting of Sequences and Structures. I suspect that in the implementation I will come to regret it, but until then.... My guess is that's not so bad, but I think we should limit our filters to 'id op constant' (and the positional variants) and not support 'id op id'. Or limit the latter case based on the scope of the ids (but I'd rather not support it at all). James
That sounds right to me; we can always extend it later if there is sufficient reason. Dennis


basic testing support

We might make a collection of DMR files available for debugging and testing.

FQNs in a DMR/Data response

Gallagher James wrote: Are the Dim, Map and Enum elements' name attributes always FQNs? So <Dim name="x"/> is never valid and should always be a FQN like: <Dim name="/x"/> ?
Dennis: That is what I put in the spec. The argument is that it is easier for machines while still making it reasonably readable by a person.
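
A minimal sketch of how a test tool might flag names that are not FQNs. It assumes an FQN can be recognized simply by a leading "/" and it ignores XML namespaces and any escaping rules; the function name non_fqn_names and the sample fragment are made up for illustration and are not taken from the spec.

```python
import xml.etree.ElementTree as ET


def non_fqn_names(dmr_fragment, tags=("Dim", "Map", "Enum")):
    """Return (tag, name) pairs whose name attribute lacks a leading '/'.

    Assumption: a leading '/' is enough to recognize an FQN. A real DMR
    uses an XML namespace, so the tag names would need adjusting.
    """
    root = ET.fromstring(dmr_fragment)
    bad = []
    for elem in root.iter():
        if elem.tag in tags:
            name = elem.get("name")
            if name is not None and not name.startswith("/"):
                bad.append((elem.tag, name))
    return bad


# Hypothetical fragment: the second Dim uses a bare name instead of an FQN.
FRAGMENT = """
<Float32 name="temp">
  <Dim name="/x"/>
  <Dim name="y"/>
  <Map name="/x"/>
</Float32>
"""

print(non_fqn_names(FRAGMENT))
```

Only the Dim named "y" is reported here; the variable's own name is not checked because Float32 is not in the tags list.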

augment the vol1 spec with details about CE effects on a DMR

It occurs to me that we need to augment our proposed constraint expression grammars with a description of what kinds of DMRs will result from our proposed constraints.

That is, given a constraint and a DMR for the unconstrained dataset, describe the DMR that corresponds to the result of applying the constraint.

To that end, I have put up a new proposal based on describing rules for constructing the DMR that results from a constraint. http://docs.opendap.org/index.php/DAP4:_Alternate_Proposal_for_a_Constraint_Expression_Syntax

sequence serialization: how to serialize a zero-length sequence

Gallagher James wrote: Dennis, Looking at Sequence and thinking about CE evaluation: When there's a nested Sequence like this:

Seq {
 Int32 i;
 Int32 j;
 Seq {
    Int32 k;
    Int32 l;
 } inner;
} outer;

And a CE requests all of outer (which means inner too) such that k > 10, what should be sent when k is not > 10? Should i and j still be sent and an empty inner (so the count would be 0)? This would be my preference since the alternative is very tricky to code. Is that your understanding?
Dennis: Yes, a count of zero should be allowed.
also: My recollection is that we decided to only allow filtering based on the outermost variables (i and j in this case) and that filtering based on k would be illegal. Maybe we should revisit this decision. The best alternative I can think of in this case is that all records in outer are kept and all records in inner are filtered (for each record in outer).

nb: This is a big deal for nested sequences because if you were to require that zero-length child sequences suppress the serialization of their parent sequence (as was the case with DAP2), the code to handle the sequence serialization becomes very complex. We made the correct decision here to allow child sequences to be zero length.
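
The behavior discussed above (keep every record of outer, filter inner per outer record, allow a count of zero) can be sketched as follows. This is only an illustration using plain Python lists and dicts to stand in for sequence rows; filter_inner and the sample data are hypothetical, and no serialization is modeled.

```python
def filter_inner(outer_rows, predicate):
    """Keep every outer row; keep only the inner rows matching predicate.

    An outer row whose inner rows all fail the predicate is still
    returned, with an empty inner list (a count of zero).
    """
    result = []
    for row in outer_rows:
        kept = [inner_row for inner_row in row["inner"] if predicate(inner_row)]
        result.append({"i": row["i"], "j": row["j"], "inner": kept})
    return result


# Hypothetical data for the outer/inner sequence shown above.
OUTER = [
    {"i": 1, "j": 2, "inner": [{"k": 5, "l": 0}, {"k": 12, "l": 1}]},
    {"i": 3, "j": 4, "inner": [{"k": 7, "l": 2}]},
]

filtered = filter_inner(OUTER, lambda r: r["k"] > 10)
for row in filtered:
    # Print i, j and the inner count that would precede the inner rows.
    print(row["i"], row["j"], len(row["inner"]))
```

The second outer row survives with an inner count of 0, which is the case Dennis confirms should be allowed.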

IEEE 754 for real numbers

Gallagher James wrote: Dennis, When we adopted 'reader makes right' we mostly talked about byte order; do you handle the case where one of the two hosts does not use IEEE 754 for either 32- or 64-bit reals?
Reader makes it right was intended to apply only to byte order. We should indeed enforce use of 754 as the only acceptable format. =Dennis
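
A small sketch of what "reader makes it right" looks like for a single 32-bit real, assuming the sender always writes IEEE 754 and the reader somehow knows the sender's byte order. The function decode_float32 and its sender_little_endian flag are hypothetical; Python's struct module is used because its "<" and ">" modes always use the IEEE 754 binary32 layout for "f".

```python
import struct


def decode_float32(raw, sender_little_endian):
    """Decode 4 bytes of IEEE 754 binary32 using the sender's byte order."""
    fmt = "<f" if sender_little_endian else ">f"
    return struct.unpack(fmt, raw)[0]


# The same value as written by a big-endian and a little-endian sender.
big = struct.pack(">f", 1.5)
little = struct.pack("<f", 1.5)

print(big.hex(), little.hex())
print(decode_float32(big, False), decode_float32(little, True))
```

The byte patterns differ, but the reader recovers 1.5 in both cases because only the byte order varies, never the floating-point format.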

how big are arrays? 64 bits? 63 bits? 61 bits? signed? unsigned?

Gallagher James wrote: I'm wondering what type should be used to hold the number of elements in an array. I can't find where in the spec it says how big an array can be - is it an unsigned 64-bit number of elements? Or unsigned 32 bits?
Dennis: it is a signed 64-bit integer.
Yes, signed. The argument is that interpretive languages (Java, Python...) are not good at handling unsigned 64-bit numbers, so I chose to stick to a signed 64-bit integer, which is effectively a 63-bit int. We need to fix the text. Here's what the doc says: … The total number of elements in an Array MUST NOT exceed (2^64)-1.

In the telecon we decided that the maximum number of elements was 2^61-1. We also decided that the number of bytes for an array should never be more than 2^64-1, because the size needs to be signed (Java and Python don't grok unsigned ints) and because C code will need to malloc these arrays and malloc won't take more than a 64-bit int.
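
As a rough check of the telecon numbers, the sketch below tests a proposed array shape against both limits. The constants come straight from the note above; the function array_fits and its parameters are hypothetical and do not correspond to anything in the spec or in an existing library.

```python
from math import prod

# Limits as recorded in the telecon note above.
MAX_ELEMENTS = 2**61 - 1
MAX_BYTES = 2**64 - 1


def array_fits(dim_sizes, element_bytes):
    """Return True if the array stays within both telecon limits."""
    count = prod(dim_sizes)  # Python ints do not overflow
    return count <= MAX_ELEMENTS and count * element_bytes <= MAX_BYTES


print(array_fits([1024, 1024], 8))   # small Float64 array
print(array_fits([2**61 - 1], 8))    # largest element count, 8-byte elements
print(array_fits([2**61], 1))        # one element too many
```

With 8-byte elements the two limits line up: 2^61-1 elements occupy 2^64-8 bytes, just under the byte limit.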

About the end chunk in a chunked response

Dennis, I'm tweaking my code to process the chunked responses and thinking about what the end chunk means. Does the end chunk mean that once any data it contains has been consumed, EOF has been reached? Or is it possible to have more data chunks (or an error chunk) after an end chunk? James
I believe the rule is that any sequence of chunks must stop at the first end chunk or error chunk. The end chunk may or may not contain data. =Dennis
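
The rule Dennis states can be sketched as a reader loop. The chunk representation here, a (kind, payload) pair with kind one of "data", "end" or "error", is purely an assumption for illustration; the real chunk header layout is not modeled. Raising an exception when the stream runs out without an end chunk is also an assumption, not something stated above.

```python
class ChunkError(Exception):
    """Raised for an error chunk or (by assumption) a missing end chunk."""


def collect_payload(chunks):
    """Concatenate payloads, stopping at the first end or error chunk."""
    body = bytearray()
    for kind, payload in chunks:
        if kind == "error":
            raise ChunkError(payload.decode("utf-8", errors="replace"))
        body.extend(payload)  # an end chunk may or may not carry data
        if kind == "end":
            return bytes(body)  # anything after the end chunk is ignored
    raise ChunkError("stream ended without an end chunk")


# Hypothetical stream: the trailing chunk after "end" is never read.
stream = [("data", b"ab"), ("end", b"cd"), ("data", b"zz")]
print(collect_payload(stream))
```

The reader returns the four bytes from the data and end chunks and never looks at the chunk that follows the end chunk.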

where we encode the byte order for a DAP4 Data response

On Sep 10, 2013, at 10:06 AM, Dennis Heimbigner <dmh@unidata.ucar.edu> wrote: I think we agreed to put the serialization byte order in the http headers. Do you want to revisit that decision? =Dennis
Yes. There's another part of the spec that says the response document/body is all you need to read the document (of course, you need the spec too). The HTTP headers are generally lost by the time a client gets the response body, so I think the byte order should go in the response body somewhere. James
ok with me. One possibility is a new flag in the chunk headers. =Dennis

Order of stuff in the DMR

Gallagher James wrote: On Sep 5, 2013, at 9:30 AM, Dennis Heimbigner wrote: I recall that we placed limits on which dimensions and maps and enums could be referenced to be something in the same group or an enclosing group. Given that, we could do this order: dimensions, enums, variables, groups because the above rule would guarantee no forward reference.
OK, let's adopt this and change the grammar to reflect it. I'll change the rng file and check it in. As for group attributes, currently I print them at the very end of the group, but this is easily changed. I would suggest, however, that they be either at the very beginning or the very end of the group. My vote is to put them at the end. =Dennis
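
A sketch of the declaration order proposed above, for a printer that holds a group's declarations in arbitrary order. The (kind, name) tuples and the function order_decls are hypothetical; group attributes are left out because their placement is only voted on above, not settled.

```python
# Order proposed above: dimensions, enums, variables, groups.
ORDER = ("dimension", "enumeration", "variable", "group")


def order_decls(decls):
    """Sort (kind, name) pairs into the proposed order.

    sorted() is stable, so declarations of the same kind keep their
    original relative order.
    """
    return sorted(decls, key=lambda decl: ORDER.index(decl[0]))


# Hypothetical contents of one group, in arbitrary order.
decls = [
    ("variable", "temp"),
    ("group", "sub"),
    ("dimension", "x"),
    ("enumeration", "colors"),
    ("variable", "salt"),
]

for kind, name in order_decls(decls):
    print(kind, name)
```

Because a variable can only reference dimensions and enums from the same or an enclosing group, printing in this order means every reference points at something already seen.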

limitations on the DMR Dim element

Gallagher James wrote: Dennis, In an array in DAP4, can a Dim element have both a name and a size? Is the name limited to only names of shared Dimension elements (previously defined)? James
Limited to previously defined dimensions. You can only specify a size within an array. =Dennis

names

Gallagher James wrote: Dennis, You might have guessed that my previous question relates to looking up things (groups, variables, dimensions) based on their FQN. Are we allowing a Group and a variable, for example, to have the same name at the same lexical level (I hope not)? James
I have assumed that yes, two decls of different kinds can have the same name. This is obviously true for dimensions and variables, so I assumed it held generally. Why is this causing a problem? =Dennis