Character Arrays in the netCDF 3 handler for Hyrax: Difference between revisions

From OPeNDAP Documentation
Revision as of 19:13, 9 January 2009

The problem facing the netCDF 3 handler is that it needs to represent character arrays in a way that common clients can easily use. One logical choice is to encode a netCDF 3 character array as a DAP String. However, a client that models data as arrays, where the size of all (or most) data objects is known before their values are accessed, will have a hard time with the DAP String because its declaration does not convey its length.

The solution to this problem taken in the netCDF 3 handler is to represent a char array as an array of Strings, each of which holds only one character. This does not really solve the problem, because a client still has no way to know that the individual Strings in the array hold only one character. If a client sees the NC_GLOBAL attribute container, that is likely the case, but to be certain that the String array really holds just one character per element, the client must access each element and look at its value.
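The check described above can be sketched as follows. This is only an illustration in Python, not code from the handler or from any client; the function names are hypothetical, and the String array is modeled as a plain list of Python strings.

```python
def is_one_char_per_element(strings):
    """Return True only if every String element holds exactly one character.

    The DAP declaration does not carry this information, so the client
    has to read every value to find out.
    """
    return all(len(s) == 1 for s in strings)


def join_char_elements(strings):
    """Rebuild the original char array as one string (hypothetical helper)."""
    if not is_one_char_per_element(strings):
        raise ValueError("array is not a one-character-per-element char array")
    return "".join(strings)
```

The cost is the point: the client cannot decide how to treat the array until it has fetched and inspected all of the data.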

Another solution, used by the TDS, is to package an array of char in a single String and bind an attribute named string_length to the variable. The attribute contains the length of the string (the number of characters, not including the terminating null that a C program would need to store the string).
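A minimal sketch of that packaging, in Python. It is not TDS code: the function name and the use of a dictionary for the attributes are assumptions made for illustration, and stripping trailing NUL padding is likewise an assumption about how the source char array is filled. Only the attribute name string_length comes from the description above.

```python
def pack_char_array(chars):
    """Pack a sequence of single characters into one String plus attributes.

    Assumption: trailing NUL characters are padding and are not counted
    in string_length.
    """
    value = "".join(chars).rstrip("\0")
    attributes = {"string_length": len(value)}
    return value, attributes
```

With this representation a client can size its buffers from the attribute alone, without reading the String value first.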

John: How does the TDS handle a variable like char[10][128], which might well be an array of ten Strings? I guess you could say that they all have a string_length of 128. But a client that thinks any DAP String[] is really an array of one-character strings will probably do something very wrong with those data.

Dennis suggested that we use an array of Byte for the character array. This means that we would lose the notion of string-ness, but there would be no size ambiguity. We could add an attribute stating that the Byte array contains data from the char type.
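The Byte-array idea might look like the sketch below. This is Python for illustration only; the attribute name original_type and both function names are hypothetical, since no attribute name was proposed, and ASCII is assumed as the character encoding.

```python
def chars_to_byte_array(text):
    """Encode char data as a list of byte values plus a marker attribute.

    'original_type' is a hypothetical attribute name; ASCII is assumed.
    """
    data = list(text.encode("ascii"))
    attributes = {"original_type": "char"}
    return data, attributes


def byte_array_to_text(data, attributes):
    """Recover the text, but only if the marker attribute is present."""
    if attributes.get("original_type") != "char":
        raise ValueError("Byte array is not marked as char data")
    return bytes(data).decode("ascii")
```

The array length is part of the declaration, so the size is known up front; string-ness survives only through the attribute.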

jimg 16:08, 8 January 2009 (PST) My suggestion is that we adopt the solution John has since there's code (client code in particular) already out there that recognizes it. I will look into the changes that need to be made in the netCDF handler and also how those will affect the more prominent clients. I just looked at the code and the three files we will need to modify are NCArray.cc, NCStr.cc and ncdds.cc. We might look at the whole attribute thing, too.

Problem Summary

This is from an email from John Caron

The semantic mismatch between OpenDAP and netCDF has led to several possible conventions in mapping from netCDF to OpenDAP on the server, and from OpenDAP to netCDF on the client. One design goal is that netCDF files on the server should be semantically equivalent as seen by a client using the netCDF API. Another goal is to make the common case of a rank 1 char array in netCDF map to an OpenDAP String.

Older versions of the netCDF - OpenDAP C++ library mapped char[n] arrays to a DArray of element type DString, where each DString has length 1. Currently, the following convention is the "correct" way for an OpenDAP server to represent netCDF char[n] arrays:

  1. A char[n] array maps to a DString. (this is the common case)
  2. A rank k char[n, m, …, p, q] NetCDF array maps to a rank k-1 OpenDAP DArray[n, m, …, p] of element type DString, where each DString has length q. An attribute "strlen" is added to the variable, inside an attribute table called "DODS" (to distinguish it as an attribute added by the DODS layer). The strlen attribute means that all of the DString data elements have the same data length (in this example, length q).
Attributes {
    var1 {
        DODS {
            Int32 strlen 54;
        }
    }
}
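The mapping in the convention above can be sketched in Python. This is an illustration under stated assumptions, not the handler's implementation: the function name is hypothetical, the char data is modeled as one flat row-major string, and the attribute table is modeled as a nested dictionary. For a rank 1 input the sketch returns a single string with an empty shape and still records strlen, which the convention only describes for the higher-rank case.

```python
from math import prod


def collapse_char_array(flat_chars, shape):
    """Map a rank k char array to a rank k-1 array of fixed-length strings.

    flat_chars: all characters in row-major order (assumed layout).
    shape: dimension sizes (n, m, ..., p, q); the last one becomes strlen.
    """
    if not shape:
        raise ValueError("a char array needs at least one dimension")
    *outer, q = shape
    count = prod(outer)  # prod of an empty list is 1, i.e. a single string
    if len(flat_chars) != count * q:
        raise ValueError("data length does not match shape")
    strings = [flat_chars[i * q:(i + 1) * q] for i in range(count)]
    attributes = {"DODS": {"strlen": q}}
    return strings, tuple(outer), attributes
```

For example, a char[2][3] variable holding "abcdef" would become a String array of shape (2,) with elements "abc" and "def" and a DODS strlen of 3.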

Another Approach

Also from John: Another approach to this problem is "to add a 'this is really a char' attribute and use a byte type. At the moment, that seems like a simpler choice."