DAP4: Responses: Difference between revisions

From OPeNDAP Documentation
No edit summary
 
(11 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Development|Development]] [[Category:DAP4|DAP4]]
<font size="+1" color="red">This is an old document that captures the starting point of the OPULS design work. It's out of date and should be referenced only as a baseline for the work.</font>
[[OPULS_Development | <-- back to OPULS Development]]
Author: [[User:Jimg|Jimg]], NDP, ?
== Overview ==
== Overview ==


Line 23: Line 30:
;Data: The ''Data'' response is requested by appending the suffix '''''.dap''''' to the ''file part'' of the dataset's referent (aka base) URL. The ''Data'' response is a multipart MIME document that contains a N+1 parts for a response with N variables.<!-- The document starts with a preamble (i.e., 'part' in MIME parlance) that contains a link to the ''Dataset'' response (which contains the dataset's metadata) followed by N parts which contain the name and type of the data, the data, encoded in XDR, and a checksum. Each of these three pieces of information is separated by a newline. All character data is assumed to be UTF8 encoded. [a little to much detail here, I think ~~~] -->
;Data: The ''Data'' response is requested by appending the suffix '''''.dap''''' to the ''file part'' of the dataset's referent (aka base) URL. The ''Data'' response is a multipart MIME document that contains a N+1 parts for a response with N variables.<!-- The document starts with a preamble (i.e., 'part' in MIME parlance) that contains a link to the ''Dataset'' response (which contains the dataset's metadata) followed by N parts which contain the name and type of the data, the data, encoded in XDR, and a checksum. Each of these three pieces of information is separated by a newline. All character data is assumed to be UTF8 encoded. [a little to much detail here, I think ~~~] -->


=== Dataset Response ===
=== Dataset Metadata Response ===


In DAP2, there existed important information was present only in the HTTP headers. In DAP4, all of the information specified by the protocol will be present in the Dataset document. Some of that information may also be present in HTTP headers when it's appropriate, because doing so simplifies processing the response.
In DAP2, there existed important information was present only in the HTTP headers. In DAP4, all of the information specified by the protocol will be present in the Dataset Metadata Response (DMR) document. Some of that information may also be present in HTTP headers when it's appropriate, because doing so simplifies processing the response.


==== Document Organization ====
==== Document Organization ====
Line 89: Line 96:
=== Data Response ===
=== Data Response ===


A Data response is the way DAP4 returns data to a client. Each Data response is returned over the wire as a multipart MIME document where the first MIME part contains the ''constrained'' Dataset response describing the data requested and the following MIME parts contain the data values, encoded using XDR, for each variable in the dataset.
A Data response is the way DAP4 returns data to a client. Each Data response is returned over the wire as a multipart MIME document where the first MIME part contains the ''constrained'' Dataset Metadata response describing the data requested and the following MIME part contains the binary encoded data values for each variable in the dataset. MIME headers are included in the binary part the identify the endianness of the binary content.


Some aspects of this design have been borrowed from the W3C's "[http://www.w3.org/TR/SOAP-attachments SOAP Messages with Attachments]" and the OGC's "WCS Version 1.1 Corrigendum 2" specifications. See also [http://www.ietf.org/rfc/rfc2387.txt The MIME Multipart/Related Content-type (rfc 2387)] and [http://www.ietf.or/rfc/rfc1521.txt MIME part one].
Some aspects of this design have been borrowed from the W3C's "[http://www.w3.org/TR/SOAP-attachments SOAP Messages with Attachments]" and the OGC's "WCS Version 1.1 Corrigendum 2" specifications. See also [http://www.ietf.org/rfc/rfc2387.txt The MIME Multipart/Related Content-type (rfc 2387)] and [http://www.ietf.or/rfc/rfc1521.txt MIME part one].
Line 100: Line 107:


The Data response follows the basic design of DAP2's DataDDS response closely. The Dataset document included describes the number, type, shape and order of each variable with values in the binary part of the response. However, while the DAP2 response used a simple ''application/octet-stream'' document, DAP4 uses a multipart MIME document. The design of this document/response can accommodate including including several different data requests in one document, a feature useful for implementations of DAP that do  not use HTTP for transport.
The Data response follows the basic design of DAP2's DataDDS response closely. The Dataset document included describes the number, type, shape and order of each variable with values in the binary part of the response. However, while the DAP2 response used a simple ''application/octet-stream'' document, DAP4 uses a multipart MIME document. The design of this document/response can accommodate including including several different data requests in one document, a feature useful for implementations of DAP that do  not use HTTP for transport.
==== Response Chunking Of Binary Data Part ====
The binary part of the DAP4 Data Response will be chunked, independently of any chunking utilized by underlying protocols such as HTTP. This chunking is essentially is a message based communications scheme. Messages are sent in chunks and handled by the recipient. Out-of-band information can be passed via the extension chunks. The information contained in the extension chunks MAY be used to change the meaning/context of subsequent data chunks. For now we will only be sending Errors in the extension chunks.
[[DAP4 Chunking | The details of DAP4 Chunking are here.]]
This chunking schema provides the following desired outcomes:
# It provides a way for the server to send errors to the client in an out-of-band manner.
# It allows clients to know exactly how many bytes they can expect to read without a server (as opposed to a connection) error.
# It allows the client software to be in a position to deal with error messages as they arise, and easily locate the error content in the input stream.
# Because all messages are chunked, even errors generated by, say, the parseing of the constraint expression can be returned to the client in an error chunk (as the only chunk in the stream).
# Does not preclude a client from reading data from a partially completed response.


==== Transmitting Attributes in constrained Dataset documents  ====
==== Transmitting Attributes in constrained Dataset documents  ====
Line 117: Line 138:
<font size="2">
<font size="2">
<pre>
<pre>
   Content-Type: multipart/related; type="text/xml"; start="<<start id>>";  boundary="<<boundary>>"
   Content-Type: multipart/related; type="application/vnd.org.opendap.dap4.data"; start="<<start id>>";  boundary="<<boundary>>"
   
   
   --<<boundary>>
   --<<boundary>>
   Content-Type: text/xml; charset=UTF-8
   Content-Type: application/vnd.org.opendap.dap4.dataset-metadata+xml; charset=UTF-8
   Content-Transfer-Encoding: binary
   Content-Transfer-Encoding: binary
   Content-Id: <<start id>>
   Content-Id: <<start id>>
Line 128: Line 149:


   --<<boundary>>
   --<<boundary>>
   Content-Type: application/octet-stream
   Content-Type: application/vnd.org.opendap.dap4.data.big-endian
   Content-Transfer-Encoding: binary
   Content-Transfer-Encoding: binary
   Content-Id: <<cid:variable-fqn>>
   Content-Id: <<data id>>
   Content-Description: data
   Content-Description: data
    
    
   <<XDR encoded binary data for the 1st variable>>
   <<Binary data>>
 
     
        .
        .
        .
 
  --<<boundary>>
  Content-Type: application/octet-stream
  Content-Transfer-Encoding: binary
  Content-Id: <<cid:variable-fqn>>
  Content-Description: data
 
  <<XDR encoded binary data for the nth variable>>
 
   --<<boundary>>
   --<<boundary>>
</pre>
</pre>
</font>
</font>


The example shows four sets of MIME headers separated by three <font size="2"><code>--<<boundary>></code></font> lines; a third boundary line terminates the document. The first group of headers (in a real response, there would be other headers here like Date, XDAP, and others) provide information need to recognize the boundary separators and to find the first part of the document by matching the value of ''start'' to a Content-Id of one of the parts. The payload of that first part contains references to the related parts using the values of their Content-Id headers.  
The example shows three sets of MIME headers separated by three <font size="2"><code>--<<boundary>></code></font> lines; the third boundary line terminates the document. The first group of headers (in a real response, there would be other headers here like Date, XDAP, and others) provide information need to recognize the boundary separators and to find the first part of the document by matching the value of ''start'' to a Content-Id of one of the parts. The payload of that first part contains references to the related parts using the values of their Content-Id headers.  


The Dataset document in the first part is unlike the one sent as it's own response in that it  
The Dataset document in the first part is unlike the one sent as it's own response in that it  
Line 168: Line 177:
# Sequences are encoded in a way that's optimal but which requires fairly complex Constraint expression evaluation. We can reduce the likelihood that servers fail to implement the Selection sub-expression evaluation by simplifying it a bit.  
# Sequences are encoded in a way that's optimal but which requires fairly complex Constraint expression evaluation. We can reduce the likelihood that servers fail to implement the Selection sub-expression evaluation by simplifying it a bit.  
# We can embed tags in the binary data to make it easier to read.
# We can embed tags in the binary data to make it easier to read.


=== Error Response ===
=== Error Response ===
Line 229: Line 237:
</font>
</font>


== Asynchronous Responses ==
== [[DAP4: Asynchronous Responses | Asynchronous Responses]] ==
 
The base use case for this is data that are stored 'near-line.' The client has no idea that it'll take 10 minutes to get the data staged - longer than the typical tcp time out. Other examples include server side functions that re-project lots of data, and thus take lots of time. The key thing is that the server can figure it out, but clients don't get much choice.
 
 
The "story" goes something like this:
* Server gets a Request
* Something in the server evaluates the request and determines if it has to be an async response. A pseudo-code example might look like:
 
<font size="2">
<source lang="java">
if ((fileSize>ITS_A_REALLY_BIG_FILE  &&  fileCompressed==true) || fileOnTape)
      isAsyncResponse = true;
</source>
</font>
: Whatever test makes sense… 
 
* If the response is going to be asynchronous, the server returns a specialization of the Dataset response like this:
<font size="2">
<source lang="xml">
<Dataset xml:base="balahblahblah">
  <async
      xlink:href="http://host.server.gov/opendap/tmp/file.h5.xml"
      xlink:role="http://services.opendap.org/dap4/dataset#">
      <dc:date>2012-11-05T13:15:30Z</dc:date>
  </async>
</Dataset>
</source>
</font>
: And similarly for the Data response:
<font size="2">
<source lang="xml">
<Dataset xml:base="balahblahblah">
  <async
      xlink:href="http://host.server.gov/opendap/tmp/file.h5.dap"
      xlink:role="http://services.opendap.org/dap4/data#">
      <dc:date>2012-11-05T13:15:30Z</dc:date>
  </async>
</Dataset>
</source>
</font>
 
* '''Notes:'''
** The Dublin Core specification supports both simple dates and ranges, so the <dc:date/> element can be use to indicate when the response might/will be available as well as when an ephemeral response might be removed.
** The use of <font size="2"><code>xlink:role</code></font> in this response should be exactly as is used in the Dataset Services response.
** In the DAP4 Dataset and Data responses returned from the <font size="2"><code>xlink:href</code></font> in these examples, the value of <font size="2"><code>xml:base</code></font>  should be the same as the original request: <font size="2"><code>xml:base="balahblahblah"</code></font>. In other words, the value of the xml:base should be the URL that references the original dataset and not the URL to the response object.
** The staged dataset must maintain the same authentication and authorization controls of the original data source.
 
=== Regarding HTTP headers ===
 
When the transport is HTTP the server should return a 202 ''Accepted'' (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3) response with a ''Location:'' header pointing to the new URL (also stored as the value of the <font size="2"><code>xlink:href</code> attribute of the <font size="2"><code><async /></code> response element. ). Accessing the new URL should return either a '''404 Not Found''' if the response is not ready; '''200 OK''' if it is; and '''410 Gone''' if it has been generated and deleted after some time.
 
[[User:Jimg|Jimg]] 09:56, 4 April 2012 (PDT) Random thoughts about the 410-Gone response:
* How will servers know that a purged 'ephemeral' response is 'Gone?' To implement this, the server will need to store some information about each response for a while beyond the response's lifetime. That's doable, but I think to keep server implementors from feeling like they should store information about each ephemeral response forever, we can say that the 410 response is only supported for a day (to be arbitrary) and after the ephemeral response is gone for more than 24 hours, trying to access it returns a 404. Servers can stretch that time out longer if they want to, but the purged asynchronous response has to return 410 for at least 24 hours.
* I can see the allure of the 410 response, but I wonder how many clients will really make use of it and, if they do, how? The big benefit seems to be that a client can realized that either it or the person using it has missed the window of opportunity for the response. Contrast this with using 404, the 404 just tells you it's not there - not that it was but now it's gone. So a smart client can say, you need to make the request again.
* As an implementation note, this 404, 200, 410 behavior could be coded using a lookup scheme. If the URL references something that's there, return it (200), if not look in a data store of some sort and if a 'gone' record is present, return 410, else return 404. Not too hard, but is does mean that the server can't just write the response to a file and let Apache/Tomcat just serve the file. That's what I'd want as a server-writer. That way there's no web service to hassle with, no data store to maintain and I can let cron purge my 'cache' of asynchronous responses every hour using whatever Perl/Python/sh program I or the data provider want.
* I think simplicity argues for just using the 404 in both cases and betting that a smart client will use the time ranges in the response.
 
; [[User:Ndp|ndp]] 08:42, 5 April 2012 (PDT)
: I agree with James, the 410 is a tough response to code, although I code see implementing server to use a file based semaphore: When the file is purged from the cache the server "touches" a file of the same name and a prepended  ". Then if a client asks for "x" and the server finds ".x"  then the client gets a 410.  If the server stages again "x" it just removes ".x" once "x" is staged. But oh god - the concurrency nightmare that this creates. Like I said: I agree with James, the 410 response might be unreasonably tough to code.


(This scheme for applying HTTP status and HTTP headers to asynchronous responses was suggested by Roberto De Almeida (roberto at dealmeida.net))
Rather than duplicating content (and maintaining multiple copies) I have simply moved the content of [[DAP4: Asynchronous Responses  | this section to it's own DAP4 proposal page]]. When it's sorted out and adopted I'll move it back. [[User:Ndp|ndp]] 13:26, 5 April 2012 (PDT)

Latest revision as of 19:15, 31 August 2012

This is an old document that captures the starting point of the OPULS design work. It's out of date and should be referenced only as a baseline for the work.


Author: Jimg, NDP, ?

Overview

In response to DAP4 requests, a DAP4 system returns a chunked, multi-part MIME document containing the appropriate DAP4 response. This document describes the DAP4 responses, the manner in which they are bundled into MIME-Documents, and the chunked response structure.

Response Chunking

All DAP4 responses from a DAP4 server will be chunked, independently of any chunking utilized by protocols such as HTTP. This chunking is essentially a message-based communications scheme. Messages are sent in chunks and handled by the recipient. Out-of-band information can be passed via the extension chunks. The information contained in the extension chunks MAY be used to change the meaning/context of subsequent data chunks. For now we will only be sending Errors in the extension chunks.

The details of DAP4 Chunking are here.

This chunking schema provides the following desired outcomes:

  1. It provides a way for the server to send errors to the client in an out-of-band manner.
  2. It allows clients to know exactly how many bytes they can expect to read without a server (as opposed to a connection) error.
  3. It allows the client software to be in a position to deal with error messages as they arise, and easily locate the error content in the input stream.
  4. Because all messages are chunked, even errors generated by, say, the parsing of the constraint expression can be returned to the client in an error chunk (as the only chunk in the stream).
  5. It does not preclude a client from reading data from a partially completed response.

Persistent representations

DAP4 defines only two core responses that represent all of the information in a dataset: The Dataset and Data. (See the DAP4 Web Services document for a complete list of response objects - both required and suggested.)

Dataset
The Dataset response is requested by appending the suffix .xml to the file part of the dataset's referent (aka base) URL. The Dataset response is an XML document that contains all of the metadata included in the original dataset.
Data
The Data response is requested by appending the suffix .dap to the file part of the dataset's referent (aka base) URL. The Data response is a multipart MIME document that contains N+1 parts for a response with N variables.
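The suffix rule above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the base URL carries no query string, so the suffix can simply be appended to the end of the path.

```python
def response_urls(base_url: str) -> dict:
    """Return the Dataset and Data request URLs for a dataset's base URL.

    Assumes base_url has no query string or fragment.
    """
    return {
        "dataset": base_url + ".xml",  # Dataset (metadata) response
        "data": base_url + ".dap",     # Data response
    }


urls = response_urls("http://test.opendap.org/opendap/data/nc/fnoc1.nc")
print(urls["dataset"])
print(urls["data"])
```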

Dataset Metadata Response

In DAP2, some important information was present only in the HTTP headers. In DAP4, all of the information specified by the protocol will be present in the Dataset Metadata Response (DMR) document. Some of that information may also be present in HTTP headers when it's appropriate, because doing so simplifies processing the response.

Document Organization

In DAP4 the DAP2 data model has been extended to include many new concepts and components. Groups, Shared dimensions and user-defined types are just a few of the new additions. For a more complete discussion see the new data model.

A rough syntax which describes how these additions will fit into the DAP and the existing Dataset notation is:

Dataset :== Groups
Groups :== null | Group Groups
Group :== SharedDimensions Attributes Groups Variables 
SharedDimensions :== null | SharedDimension SharedDimensions
Attributes :== null | Attribute Attributes
Variables :== null | Variable Variables

This pseudo-grammar does not capture what can be produced for a Group, et cetera. Instead it shows how these sections of the <Dataset/> document must be organized.

An XML schema for the Dataset response object may be found here: http://scm.opendap.org/trac/browser/trunk/xml/dap/dap4.xsd

NB: If a <Dataset/> document describes a dataset that has been constrained, attributes will not be included. It is not possible to know if attributes correctly describe the data once it has been constrained.

The Dataset Element

The Dataset element is the root element of the Dataset response.

The Dataset element has the following attributes:

name
The name of the dataset. This can be any name the server chooses. This should probably be the name of the file or database table/token.
version
The version of DAP used by the server to form this Dataset. This must be in int dot int form (e.g., "3.2", "4.11").
xml:base
The value of the xml:base attribute is the URL which was dereferenced to get this Dataset. The xml namespace should also be declared in the Dataset element.

NB: Because the <Dataset/> element, as defined by the schema, uses the Dublin Core, XLink and XML namespaces, those namespaces must be declared in the element or elsewhere in the document. You are not required to use the prefixes dc, xlink and xml, but please do use them, and please do declare the namespaces in the <Dataset/> element. As with any XML document, you can define other namespaces anywhere they are needed.

Here's an example of the Dataset element declaration:

<Dataset name="fnoc1.nc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.opendap.org/ns/DAP/4.0#  http://xml.opendap.org/dap/dap4.xsd"
    xmlns="http://xml.opendap.org/ns/DAP/4.0#"
    xmlns:dap="http://xml.opendap.org/ns/DAP/4.0#"
    dapVersion="4.0"
    xmlns:xml="http://www.w3.org/XML/1998/namespace"
    xml:base="http://test.opendap.org/opendap/data/nc/fnoc1.nc.ddx"
    xmlns:xlink="..."
    xmlns:dc="..."
>
.
.
.
</Dataset>
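A client might read the root element's attributes roughly as follows. This is a minimal sketch using Python's standard library; the function name and the trimmed-down sample document are assumptions made for illustration, and the attribute names are taken from the example above.

```python
import xml.etree.ElementTree as ET

DAP4_NS = "http://xml.opendap.org/ns/DAP/4.0#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Trimmed-down version of the example declaration above.
SAMPLE = """<Dataset name="fnoc1.nc"
    xmlns="http://xml.opendap.org/ns/DAP/4.0#"
    dapVersion="4.0"
    xml:base="http://test.opendap.org/opendap/data/nc/fnoc1.nc.ddx">
</Dataset>"""


def read_dataset_header(document: str) -> dict:
    """Extract the name, DAP version and xml:base from a Dataset document."""
    root = ET.fromstring(document)
    if root.tag != "{%s}Dataset" % DAP4_NS:
        raise ValueError("root element is not a DAP4 Dataset")
    return {
        "name": root.get("name"),
        "dapVersion": root.get("dapVersion"),
        # Namespaced attributes are keyed as {namespace}localname.
        "base": root.get("{%s}base" % XML_NS),
    }


info = read_dataset_header(SAMPLE)
print(info)
```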

Data Response

A Data response is the way DAP4 returns data to a client. Each Data response is returned over the wire as a multipart MIME document where the first MIME part contains the constrained Dataset Metadata response describing the data requested and the following MIME part contains the binary encoded data values for each variable in the dataset. MIME headers are included in the binary part that identify the endianness of the binary content.

Some aspects of this design have been borrowed from the W3C's "SOAP Messages with Attachments" and the OGC's "WCS Version 1.1 Corrigendum 2" specifications. See also The MIME Multipart/Related Content-type (rfc 2387) and MIME part one.

In DAP2 the 'data' or 'DataDDS' response is a MIME document with Content-Type 'application/octet-stream', which means essentially that the contents of the MIME document are binary and application specific, in this case specific to applications that understand DAP2. Within that document, the DDS is used to provide the syntax needed to decode the binary information. Following the DDS is a separator and following that are data values written to the document using XDR.
The use of XDR is solely to ensure that the data values can be read on both little- and big-endian machines and that floating-point values do not suffer from the many different representations commonly found. In addition, XDR is used to include information about the size of arrays, strings and URLs, the latter two of which are really special-case array types. Thus XDR provides a common encoding for the bits and bytes to be transferred. It does not, however, represent any of the more complex structural information such as the organization of relational data.
The DDS sent with the DataDDS response is used to describe the organization of the data not covered by XDR. For example, if the response calls for values from three variables to be returned, the DDS in the DataDDS response will list those three variables and, furthermore, do so in the order that their values appear in the response. The variables described in the DDS response match exactly in number, type, shape and order with the data in the 'data part' of the response.

The Data response follows the basic design of DAP2's DataDDS response closely. The Dataset document included describes the number, type, shape and order of each variable with values in the binary part of the response. However, while the DAP2 response used a simple application/octet-stream document, DAP4 uses a multipart MIME document. The design of this document/response can accommodate including several different data requests in one document, a feature useful for implementations of DAP that do not use HTTP for transport.

Response Chunking Of Binary Data Part

The binary part of the DAP4 Data Response will be chunked, independently of any chunking utilized by underlying protocols such as HTTP. This chunking is essentially a message-based communications scheme. Messages are sent in chunks and handled by the recipient. Out-of-band information can be passed via the extension chunks. The information contained in the extension chunks MAY be used to change the meaning/context of subsequent data chunks. For now we will only be sending Errors in the extension chunks.

The details of DAP4 Chunking are here.

This chunking schema provides the following desired outcomes:

  1. It provides a way for the server to send errors to the client in an out-of-band manner.
  2. It allows clients to know exactly how many bytes they can expect to read without a server (as opposed to a connection) error.
  3. It allows the client software to be in a position to deal with error messages as they arise, and easily locate the error content in the input stream.
  4. Because all messages are chunked, even errors generated by, say, the parsing of the constraint expression can be returned to the client in an error chunk (as the only chunk in the stream).
  5. It does not preclude a client from reading data from a partially completed response.

Transmitting Attributes in constrained Dataset documents

The Dataset document contained in a constrained Dataset or in the Data response will not contain any Attribute nodes. (The Dataset document in the Data response is always 'constrained'.)

Since the contents of the Data response are the result of access to the data subject to a constraint, various aspects of any of the variables in the response may have been changed. To make these changes the DAP must take into account the semantics of each of the variables' data types. It can do this because the semantics for the types are well defined and known a priori. However, this is not the case for attributes, where the semantics are intentionally not part of the DAP. The DAP is merely an 'envelope' for the name-type-value tuples of the attributes.

To understand why this restriction is placed on the Dataset document returned in the Data response, let's examine a common example. Suppose an image has some extent and has attributes that name that extent. A geographical image might have attributes that provide the latitude and longitude of two opposite corners and a medical image might have attributes that provide the height and width in millimeters. Now suppose the image is constrained in one or more dimensions. How should the attribute values be treated? If they are left alone they are likely no longer correct, but modifying them requires detailed information about how they map to the image. While this information might be known to a client that has an understanding of a particular subject area, expecting the server to handle them correctly would require it to know about every subject area for all of the data to be served.

An alternative to 'universal knowledge' is to allow servers to return attributes that have 'well known' semantics and drop other attributes. While this is appealing at first, it presents a complex situation to clients because to make use of the attributes in the returned DataDDX response they must know to test for them and, if they are not present, fall back to some default behavior. In our opinion, it is easier to present clients with fewer 'optional behaviors', especially when the fallback is likely to compute the needed value anyway.
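The rule adopted here (drop every Attribute node from a constrained Dataset document) could be implemented along the following lines. This is a sketch only: the Variable element and the attribute names in the sample are hypothetical, and element names are compared by local name so the code works with or without a default namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Return an element's tag without any {namespace} prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_attributes(element: ET.Element) -> None:
    """Remove every Attribute element below 'element', at any depth."""
    for child in list(element):  # copy the list: we mutate while iterating
        if local_name(child.tag) == "Attribute":
            element.remove(child)
        else:
            strip_attributes(child)


# Hypothetical document: one global attribute, one attribute on a variable.
SAMPLE = """<Dataset name="image.nc">
  <Attribute name="title" type="String"/>
  <Variable name="img">
    <Attribute name="corner_lat" type="Float64"/>
  </Variable>
</Dataset>"""

root = ET.fromstring(SAMPLE)
strip_attributes(root)
remaining = [local_name(e.tag) for e in root.iter()]
print(remaining)
```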

Organization of the multipart MIME document

Here's what the shell of the document looks like:

   Content-Type: multipart/related; type="application/vnd.org.opendap.dap4.data"; start="<<start id>>";  boundary="<<boundary>>"
 
   --<<boundary>>
   Content-Type: application/vnd.org.opendap.dap4.dataset-metadata+xml; charset=UTF-8
   Content-Transfer-Encoding: binary
   Content-Id: <<start id>>
   Content-Description: ddx

   <<Dataset document here. This includes a reference to <<data id>> >>

   --<<boundary>>
   Content-Type: application/vnd.org.opendap.dap4.data.big-endian
   Content-Transfer-Encoding: binary
   Content-Id: <<data id>>
   Content-Description: data
   
   <<Binary data>>
      
   --<<boundary>>

The example shows three sets of MIME headers separated by three --<<boundary>> lines; the third boundary line terminates the document. The first group of headers (in a real response, there would be other headers here like Date, XDAP, and others) provides the information needed to recognize the boundary separators and to find the first part of the document by matching the value of start to a Content-Id of one of the parts. The payload of that first part contains references to the related parts using the values of their Content-Id headers.
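As a rough illustration of how a client could walk this layout, the sketch below splits a body on the boundary line and locates the part whose Content-Id matches the start value. The boundary string, the Content-Id values and the function names are made up for the example, CRLF line endings are assumed, and a production client would more likely rely on an existing MIME library.

```python
def split_parts(body: bytes, boundary: str) -> list:
    """Split a multipart body into (headers, payload) pairs."""
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for raw in body.split(delimiter)[1:]:
        # Skip whatever follows the final boundary line.
        if not raw.strip() or raw.startswith(b"--"):
            continue
        if raw.startswith(b"\r\n"):
            raw = raw[2:]
        head, _, payload = raw.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n"):
            name, _, value = line.decode("ascii").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Drop only the single CRLF that precedes the next boundary.
        if payload.endswith(b"\r\n"):
            payload = payload[:-2]
        parts.append((headers, payload))
    return parts


def find_part(parts: list, content_id: str):
    """Return the (headers, payload) pair with the given Content-Id."""
    for headers, payload in parts:
        if headers.get("content-id") == content_id:
            return headers, payload
    raise KeyError(content_id)


BOUNDARY = "example-boundary"  # hypothetical value
BODY = (
    b"--example-boundary\r\n"
    b"Content-Type: application/vnd.org.opendap.dap4.dataset-metadata+xml; charset=UTF-8\r\n"
    b"Content-Id: <start-id>\r\n"
    b"\r\n"
    b"<Dataset/>\r\n"
    b"--example-boundary\r\n"
    b"Content-Type: application/vnd.org.opendap.dap4.data.big-endian\r\n"
    b"Content-Id: <data-id>\r\n"
    b"\r\n"
    b"\x00\x00\x00\x01\r\n"
    b"--example-boundary\r\n"
)

parts = split_parts(BODY, BOUNDARY)
start_headers, start_payload = find_part(parts, "<start-id>")
print(len(parts), start_payload)
```

Splitting on the boundary only works because the boundary string must not occur inside any part, which is one reason to choose boundaries that are effectively unique.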

The Dataset document in the first part is unlike the one sent as its own response in that it

  1. Contains no Attribute objects.
  2. Each DAP4 variable declaration will contain an xlink:href whose value is the value of the Content-Id of the MIME part containing the XDR encoded binary data for the variable.

Choosing values for the Data document Content-Ids and Boundaries

We would like the software that builds these Data responses to be compatible with as many different transport protocols as possible, so long as the cost to the implementations we know we must support is low. One thing that some transport protocols may do is combine several Data responses into a single document and, while the specifics of that will vary between protocols, one choice we can make now that will facilitate that is to ensure that the values of the Content-Ids and <<boundary>>s are unique within and across systems. This will free software that combines Data responses from having to process the Dataset document and Content-Id header to ensure that no name collisions are present. While using UUIDs, for example, makes the resulting values 'ugly', it adds virtually nothing to the time needed to build or process the responses. Other schemes that combine a URI with some system-generated token could also be employed. The important point is to ensure that these symbols are unique not only within a system, but across systems.
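One way to obtain such values is to derive them from random UUIDs, as in the sketch below. The naming pattern and the example.org host are placeholders chosen for the illustration, not part of the design.

```python
import uuid


def new_mime_ids() -> dict:
    """Generate a boundary and two Content-Id values that are unique in practice.

    The 'boundary-', 'start-' and 'data-' prefixes and the example.org
    host are hypothetical; only the use of UUIDs comes from the text.
    """
    return {
        "boundary": "boundary-" + uuid.uuid4().hex,
        "start_id": "<start-%s@example.org>" % uuid.uuid4(),
        "data_id": "<data-%s@example.org>" % uuid.uuid4(),
    }


ids = new_mime_ids()
print(ids["boundary"])
```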

Changes to the encoding of data

There are some issues with the way data values are encoded in DAP2 that we can address now.

  1. Arrays are prefixed with their sizes, the total number of elements, twice in DAP2 because of an initial misuse of the XDR library. Now is the time to fix that and have just one copy of the Array size in DAP4.
  2. Sequences are encoded in a way that's optimal but which requires fairly complex Constraint expression evaluation. We can reduce the likelihood that servers fail to implement the Selection sub-expression evaluation by simplifying it a bit.
  3. We can embed tags in the binary data to make it easier to read.
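The first item can be made concrete with a small encoder. The sketch below assumes big-endian 32-bit integers (the XDR convention) and uses a hypothetical function name; it shows only the doubled versus single length prefix, not the full DAP2 or DAP4 encoding.

```python
import struct


def encode_int32_array(values, dap2_style=False):
    """Encode a list of 32-bit integers with a big-endian length prefix.

    With dap2_style=True the element count is written twice, as DAP2 does;
    otherwise it is written once, as proposed for DAP4.
    """
    count = struct.pack(">I", len(values))
    prefix = count * 2 if dap2_style else count
    return prefix + struct.pack(">%di" % len(values), *values)


dap2 = encode_int32_array([1, 2, 3], dap2_style=True)
dap4 = encode_int32_array([1, 2, 3])
print(len(dap2), len(dap4))  # 20 16
```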

Error Response

An unsuccessful DAP4 request will cause the server to return a DAP4 error response. The error response may be returned in lieu of the Dataset response, or as part of the Data response. The XML used in the Error response is detailed in the DAP4 schema.

DAP4 Data responses are chunked and DAP4 errors always appear in an error chunk. As the client processes a DAP4 response it reads the (fixed length) chunk header prior to reading the chunk. The chunk header will signal to the client that the following chunk contains a DAP4 error. This enables the client to transition to an error processing state prior to ingesting the error. This is true even if the response contains only an error chunk.
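The client-side behaviour described above might look like the following sketch. The header layout used here (a 1-byte chunk type followed by a 4-byte big-endian payload length) is purely an assumption for illustration, as are the type codes and names; the actual layout is defined on the DAP4 Chunking page.

```python
import io
import struct

# Hypothetical fixed-length header: 1-byte type, 4-byte payload length.
HEADER = struct.Struct(">BI")
DATA_CHUNK, ERROR_CHUNK = 0, 1


class Dap4Error(Exception):
    """Raised when the server sends an error chunk."""


def write_chunk(kind: int, payload: bytes) -> bytes:
    return HEADER.pack(kind, len(payload)) + payload


def read_chunks(stream):
    """Yield data payloads; raise Dap4Error when an error chunk arrives."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise EOFError("truncated chunk header")
        kind, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated chunk")
        if kind == ERROR_CHUNK:
            # The header told us an error follows before we read it.
            raise Dap4Error(payload.decode("utf-8"))
        yield payload


stream = io.BytesIO(
    write_chunk(DATA_CHUNK, b"abc")
    + write_chunk(ERROR_CHUNK, b'<Error type="Internal"/>')
)
received, error_text = [], None
try:
    for chunk in read_chunks(stream):
        received.append(chunk)
except Dap4Error as exc:
    error_text = str(exc)
print(received, error_text)
```

In this run the client keeps the data chunk it read before the error arrived, which matches the goal of not precluding reads from a partially completed response.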

Internal Error

The error is internal to the Server, most likely a programming bug/issue.

Example
<Error type="Internal">
    <Message>The server encountered a null pointer. Ouch.</Message>
    <Administrator>admin.email.address@your.domain.name</Administrator>
</Error>

User Syntax Error

The request contains a syntax error in the selection or the projection clause.

Example
<Error type="Syntax">
    <Message>Relational constraints may not be applied to DAP Structures.</Message>
    <Administrator>admin.email.address@your.domain.name</Administrator>
</Error>

Forbidden Error

The requestor is not allowed to access the resource.

Example
<Error type="Forbidden">
    <Message>The requested resource may not be accessed.</Message>
    <Administrator>admin.email.address@your.domain.name</Administrator>
</Error>

Not Found Error

The requested resource cannot be found.

Example
<Error type="NotFound">
    <Message>Unable to locate resource /data/nc/fnoc10.nc</Message>
    <Administrator>admin.email.address@your.domain.name</Administrator>
</Error>
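A client could turn any of these documents into a simple record as sketched below. The function name is hypothetical, and the code assumes the Error element carries no namespace, as in the examples above; a namespaced document would need the tag names adjusted.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Error type="NotFound">
    <Message>Unable to locate resource /data/nc/fnoc10.nc</Message>
    <Administrator>admin.email.address@your.domain.name</Administrator>
</Error>"""


def parse_error(document: str) -> dict:
    """Extract the type, message and administrator from an Error document."""
    root = ET.fromstring(document)
    if root.tag != "Error":
        raise ValueError("not a DAP4 Error document")
    return {
        "type": root.get("type"),
        "message": root.findtext("Message"),
        "administrator": root.findtext("Administrator"),
    }


error = parse_error(SAMPLE)
print(error["type"], "-", error["message"])
```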

Asynchronous Responses

Rather than duplicating content (and maintaining multiple copies) I have simply moved the content of this section to its own DAP4 proposal page. When it's sorted out and adopted I'll move it back. ndp 13:26, 5 April 2012 (PDT)