DAP4: Encoding for the Data Response: Difference between revisions

From OPeNDAP Documentation
⧼opendap2-jumptonavigation⧽
Line 8: Line 8:


==Problem addressed==
==Problem addressed==
There is a need to move information from the server to a client. The way this is done should facilitate as many different architectures for both server and client components.
There is a need to move information from the server to a client. The way this is done should facilitate many different designs for both server and client components.


Assumptions:
Assumptions:

Revision as of 15:55, 6 June 2012

<< Back to OPULS Development

Background

There are two different approaches to deserializing the data received by a DAP client: The client may process the data as it is received (i.e., eager evaluation) or it may write those data to a store and process them after the fact (lazy evaluation). A variant of these techniques is to process the data and write it to a store, presumably because the initial processing steps are useful while having the data stored for later processing enables still other uses. However, in this document I'm not going to look at the latter case because experience so far with DAP2 has not provided any indication that would present any performance benefits. We do have example clients that use both eager and lazy evaluation.

HTTP/1.1 HTTP/1.1 defines a chunked transport scheme. In the past we have spent a fair amount of time on the notion of chunking as a way to achieve reliable transmission of errors when those errors are encountered during response formulation, that won't be addressed in this document. Instead, this document will assume that the entire response document described here is chunked in a way that enables reliable transmission of errors. The details of that transfer encoding will be described elsewhere.

Problem addressed

There is a need to move information from the server to a client. The way this is done should facilitate many different designs for both server and client components.

Assumptions:

  1. Processing on either the client or the server is order of magnitude faster than network transmission
  2. Server memory should be conserved with favor given to a design that does not require storage of large parts of a response before it is transmitted (but large is a relative term).
  3. Clients are hard to write and the existence of a plentiful supply of high-quality clients is important (of course, servers are hard to write, too, there are between 5 and 10 times the number of DAP2 clients as servers).
  4. DAP4 is designed to work over transports that are inherently serial in nature.
  5. The response does not explicitly support a real-time stream of data (e.g., from a temperature sensor which is a data structure of essentially infinite size). It may be the case that the response can encode that kind of information.

Proposed solution

The DataDDX response document will use the multipart-mime standard. A DataDDX response is the server's answer to a request for data from a client. Each such request must either include a Constraint Expression enumerating the variables requested or a null CE that is taken to mean 'return the entire dataset.' A response will consist of two parts:

  1. A DDX that has no attribute information and contains (only) the variables requested; and
  2. A binary part that contains the data for those variables

Structure of the binary part

Data in the 'binary part' will be serialized in the order of the variables listed in the DDX part. Essentially this is the serialization form of DAP2.

Serialization of varying-sized variables

There are several kinds of varying data:

  1. Strings
    • String s;
  2. Array variables that vary in size
    • Int32 i[*];
    • Float64 j[10][*];
  3. Structure variables with varying dimensions and Sequence variables
    • Structure { int32 i; int32 j[10]; } thing[*];
    • Sequence { int32 i; int32 j[10]; } thing2;
  4. Structure variables that have a varying dimension and one or more fields that vary
    • Structure { int32 i[*]; int32 j[10][*]; } thing[*];

Note that there is no practical difference between a (character) String and an integer or floating point array with varying size except that the type of elements differ. Thus the issues associated with encoding Int32 i[*] are really no different than encoding the String type. This same logic can be extended to a varying array of Structure (with fixed size fields). A varying array of Structure that contains one or more varying fields requires that the fields and the enclosing Structure be specially encoded. Note that a Sequence and a Structure with a varying dimension present essentially the same encoding issues.

Assumptions:

  1. String: it is assumed that the server will know ( or can determine w/o undue cost) the length of the String at the time serialization begins.
  2. Sequence and Structure { ... } [*]; It is assumed that the total size may be considerable and not known at the time serialization begins.

General serialization rules

Narrative form:

  1. Fixed size types: Serialized by their data, then a CRLF pair
  2. Strings (and equivalent types): Serialized by writing their size as a N-bit integer, then a CRLF pair
  3. Structures that have no varying dimensions themselves (but which may have String or varying-dimension fields):
  4. Sequences are serialized row by row: First a Start of Instance marker is written, then each of the fields of the row are serialized, until the last row of the Sequence is serialized, then a CRLF pair

Notes:

  1. How to encode the checksum information? - By field for Structures but for an entire Sequence
  2. The CRLF pairs simplify ensuring that implementations are correctly encoding the binary data
  3. This will use receiver-makes-right and thus needs a header (big-endian - reference the IETF std) to convey that

STOP Jimg 15:32, 5 June 2012 (PDT)


Since we have previously agreed on the use of multipart-mime, the incoming data is presumed to be sequence of variable length parts with a unique id for each part and (optionally) a known length for each part. In order to accommodate streaming, the length is allowed to have the value -1, which indicates that the length was unknown at the time the part was sent out by the server and must be computed by the client when received.

Under these assumptions, I propose the format described in this grammar.

Notes:

  • The grammar has a number of semantic (context sensitive) constraints not representable in the given context-free grammar.
  • In order to disambiguate the grammar, I had to add some extra tokens, BOA,EOA,BOS,EOS, to delimit certain unbounded lists. In a real implementation, the equivalent of these tokens would be handled by the enforcement of the semantic constraints.
  • The concept of group does not appear in the grammar. This is because a group is a lexical notion and thus only affects names, not the structure of the on-the-wire format.
  • The format presented here supports a limited form of self-description in that various tags and counts are included in the format sufficient to allow for parsing an instance of the format.

<< Back to OPULS Development

Grammar Overview

A narrative description of the grammar is as follows. The names enclosed in {...} are non-terminals in the grammar.

Note: the following overview uses the term "sequence" (e.g {sequencepart}). However, vis-a-vis the sequence versus vlen discussion, this term also encompasses variable length dimensions as well.

{request}
The complete on-the-wire {request} consists of a {mainpart} followed by zero or more {sequencepart} instances.
{mainpart}
The {mainpart} consists of a part header followed by a {structure} followed by a {stringannex} part. The {mainpart} instance is assumed to be a multi-part mime part and The {structure} represents a single, top-level dataset.

The {mainpart} is of known computable length. That is, its size can be computed solely knowing the DDX for the incoming data. This means that strings and sequences are not represented inline, but instead are represented by "pointers" into subsequent {sequencepart} and {stringannex} parts that contain the sequence records and/or string data. Note that the "pointers" are actually the unique id's of those parts.
{partheader}
The {partheader} is at the beginning of every multipart-mime part. It includes a unique identifier. With respect to the[[DAP4: DAP4 Multipart Mime Format | multipart mime proposal], this unique identifier is encoded as the Content-Id: header for the each part. This is different than the boundary marker. The {partheader} also contains the a header, Content-Length:, recording the length in bytes of the data part of the mime part. If this is -1, then the length is unknown and must be computed by the client. The {partheader} also contains a {parttype} (Content-Type:) that indicates the type of the part.
{length}
The {length} is a 64 bit integer representing a length in bytes.
{parttype}
The {parttype} is a 64 bit integer that defines the type of part and is equivalent to this enumeration.
enum parttype {mainpart=1, sequencepart=2, stringannex=3}
{structure}
A {structure} consists of a {tag} and a {fieldlist}. The {tag}'s {name} gives the name attribute value of the DDX element for this structure. The {tag}'s {typecode} indicates that this is a {structure} and the {tag's} count gives the number of fields in the structure. The {tag} is part of the self description feature.
{tag}
A {tag} consists of a {name}, a {typecode} and a {count}. The {name} gives the name attribute of some DDX element. The {typecode} gives the type of the following component of the part. The count gives the number of items in that component.

It should be noted that the tag could be removed from the on-the-wire format because it is reconstructable from the DDX. Removing it, however, would mean that the format is not parseable at all without knowing the DDX.
{name}
A {name} is a character array of size 16. The purpose of the name is soley to aid in self-description, so it is an inlined fixed size string that is presumed to represent the 16 character prefix of the the name attribute of some DDX element.
{count}
A {count} is a 64-bit integer.
{typecode}
A {typecode} is a 64-bit integer encoding the equivalent of the following enumeration listing all the defined DAP4 primitive data types plus some structural types (structure,sequence,record).
enum typecode {char=1,
int8=2,int16=3,int32=4,int64=5,
uint8=6,uint16=7,uint32=8,uint64=9,
float32=10,float64=11,
opaque=12,string=13,
structure=14,sequence=15,record=16}
{fieldlist}
A {fieldlist} is a sequence of {field}s. The number of fields was defined by the count in the tag for the enclosing {structure}.
{field}
A {field} consists of a {tag} giving the field name, a {typecode} indicating the type of the field and a {count}, and finally, an {array}, where the count specifies the number of elements in the {array}.
{array}
An {array}, generically, consists of the concatenation of one or more elements, where the elements all have the same {typecode} value, and whose length is defined by the {count} in the {tag} of the defining {field}.
  • For most of the simple types, the {array} is a simple concatenation of instances of the specified type.
  • For three of the simple types (char, int16, and uint16), the {array} is a simple concatenation of instances of the specified type that is then padded with ASCII NUL characters until the length of the array in bytes is a multiple of four; this is an accomodation to XDR.
  • For an array of opaque, the array is preceded by a count of the size of each opaque instance (this is separate from the count in the field of the total number of opaque instances). Each instance is an array of 8 bit bytes (uint8).
  • For an array of strings, the array is a concatenation of {stringref}s (see below).
  • For {structure} arrays, the array is a concatenation of the structure instances.
  • For {sequence} arrays, the array is a concatenation of {sequenceref}s.
{stringref}
A {stringref} is a "pointer" to a string in the {stringannex}. It consists of an {offset}, which indicates the relative start of the string in the annex, and a {count} to indicate the length of the string in utf-8 bytes. Note that this count is technically redundant with respect to the same count in the string in the annex. Note also that the unique id of the annex is not needed because the string annex part is presumed to immediately follow the part containing the {stringref}.
{sequenceref}
A {sequenceref} is a concatenation of "pointers", where each "pointer" gives the unique id of some corresponding {sequencepart} instance in some following multi-part mime part.
{stringannex}
A {stringannex} is a part and consists of a {partheader} followed by a {stringlist}. Each {mainpart} or each {sequencepart} part is presumed to be followed immediately with an associated {stringannex} part. The annex holds the content of all the strings referenced in that preceding part. If the annex is empty, then it may be elided.
{stringlist}
A {stringlist} is a concatenation of {string} instances.
{string}
A {string} is a {count} followed by a {chararray} whose length is specified by that {count}. The string content is followed by padding of ASCII NUL characters sufficient to bring the total size of the {string} instance up to a multiple of four (an accomodation to XDR).
{sequencepartlist}
A {sequencepartlist} represents the content of the sequences referenced in the {mainpart} followed by the transitive closure of the content of all nested sequence instances referenced in all the sequence parts.
{sequencepart}
A {sequencepart} is similar in form to a {mainpart} in that it consists of a {partheader} followed by data and followed by an optional {stringannex}. It differs from a {mainpart} in that its size is a variable number of {record} instances (i.e. a {sequence}). The number of records can be computed knowing the length of the {partheader} and knowing the size of the {record}. Note that the record's size is computable from the DDX because, like a {mainpart}, all nested sequences and strings have been moved to subsequence parts.
{sequence}
A {sequence} is a tag indicating the {name} of the sequence from the DDX, the {typecode}, which is always the sequence typecode, followed by a {count} of the number of records, and a {recordarray}.
{recordarray}
A {recordarray} is a concatenation of {record} instances.
{record}
A {record} is essentially identical in format to a single {structure}.

Discussion

The format described above assumes that the on-the-wire encoding is loosely based on the XDR encoding. However, some variations are assumed.

  • The encoding assumes "receiver makes it right", which means that the byte order on the wire is that of the sender, and that order may differ from that of the receiver. It is the duty of the receiver to determine if byte re-ordering is necessary and perform it as needed.
  • The above format includes short and unsigned short types. Normally, in XDR, all short/ushort instances are promoted to int/uint values. This encoding does not assume that, but in order to be consistent with the XDR four-byte rule, arrays of shorts are treated like arrays of bytes and are extended with ASCII NUL characters to a multiple of four bytes in length.

Rationale

The above representation makes lazy evaluation very simple and a given item in a packet can be reached in o(1) time. Even with the case of nested sequences, the proper item can be reached in o(log n) time where n is the depth of the nesting.

One cost is that a hash map is needed to map unique id's to offsets in the file or heap memory.

The lazy versus eager cases also apply on the server side. Currently, for example, the opendap code on the thredds server takes the underlying data source (a .nc file for example), converts it to DAP2 and annotates the DDS with the data. Then as a second pass, the annotations are converted as need and send out over the wire.

A lazy version would associate elements of the underlying source with the DDS. Tranfer of the data to the wire would then occur directly from the original source to the wire format a needed. Note that the ability to specify the part length as -1 (unknown) is useful in this situation.

It can be hypothesized that the proposed encoding will also simplify the use of lazy evaluation on the server side.

The above encoding has a number of pieces of extra data to support usable parsing of the format without needing the DDX. This extra information is primarily included in the {tag}. It is possible to remove that {tag} and still be able to parse the format as long as the DDX is available. Given that the {tag} overhead seems small, it seems reasonable to keep it.

Dennis Heimbigner