AVHRR Image |
DAS |
DDS
The DAP models all datasets as collections of
[[7][variables]].
The [[8][DDS and <nop>DataDDS]]
objects are containers for those variables. How you represent your dataset
using the three objects and the DAP's data type hierarchy is covered in
[[../ws-html/server-implementing-dds.html][Implementing the DDS object]]
in Writing an <nop>OPeNDAP Server.
The DAP also defines services
_Information on the DAP services is presented here for completeness and
because using them can speed and simplify development of your client.
For example, you can use the HTML and ASCII services to look at a data
source using only a web browser. Similarly, the INFO response can be used
to look at the attributes and variables in a given data source._
In the previous section we said that the DAP defined three objects and
all interaction with the server involved those three objects. In fact,
the DAP also defines other responses. They are:
| Object | Response |
| ASCII | Data can be requested in CSV form. |
| HTML | Each server can return an HTML form that facilitates building URLs. |
| INFO | Each server can combine the DDS and DAS and present that as HTML. |
| VERSION | Each server must be able to respond to a request for its version and the version of the DAP it implements. |
| HELP | Each server must be able to provide a rudimentary help response. |
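As a rough illustration of how a client might address these responses, the sketch below appends a per-service suffix to a dataset URL. The suffix table (<code>.das</code>, <code>.dds</code>, <code>.dods</code>, <code>.ascii</code>, <code>.html</code>, <code>.info</code>, <code>.ver</code>) reflects a convention used by many DAP 2 servers and is an assumption here, not something this document defines. The function name <code>service_url</code> and the example host are hypothetical. Check your server's documentation before relying on any of these suffixes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Build the URL for one service response by appending a suffix to the
// dataset URL. The suffix table is an ASSUMPTION based on common DAP 2
// server conventions; it is not taken from this document.
std::string service_url(const std::string &dataset_url,
                        const std::string &service)
{
    assert(!dataset_url.empty());
    static const std::map<std::string, std::string> suffix = {
        {"DAS", ".das"},     {"DDS", ".dds"},   {"DATA", ".dods"},
        {"ASCII", ".ascii"}, {"HTML", ".html"}, {"INFO", ".info"},
        {"VERSION", ".ver"}
    };
    const std::map<std::string, std::string>::const_iterator it =
        suffix.find(service);
    if (it == suffix.end())
        return "";  // unknown service name: nothing to build
    return dataset_url + it->second;
}
```

For example, <code>service_url("http://example.org/data/sst.nc", "INFO")</code> would yield a URL that can be pasted into a web browser, provided the server follows the assumed convention.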
In each case the server's response to these requests is built using
one or more of the basic three objects. Here are some links to
various datasets' ASCII, HTML and INFO responses:
Connecting to the server
To manage the connection between the client application and the remote
server, the DAP uses two classes. The <code>Connect</code> class manages
one connection, either to a remote data server or for local access. The
<code>Connections</code> class manages a set of <code>Connect</code>
instances. For each data source that the client
opens, there must be exactly one instance of the <code>Connect</code>
class. The \OPDapi\ describes how these classes are used in the C++
toolkit.
Getting ready to write your client
An <nop>OPeNDAP-enabled client application creates a connection to the
remote server using the <code>Connect</code> class, and then issues
requests to the remote server through the methods of that class.
Please refer to the [[9][Geturl.java]] and
[[10][geturl.cc]] sources as examples of command-line
DAP client applications written in Java and C++, respectively.
Most of the software is boilerplate. Sections of the
Geturl.java client application follow; a description of the same
example in C++ comes after them. Again, please refer to the complete
source listings referenced above.
<code>
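DConnect url = null;
try {
    // nextURL names the remote data source; accept_deflate says whether
    // the client can accept compressed responses.
    url = new DConnect(nextURL, accept_deflate);
}
// The catch blocks that complete this try statement are not part of this excerpt.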
</code>
This code snippet creates a new instance of the <code>DConnect</code>
class, passing the URL referencing the remote data server, and a
boolean flag indicating that the client can accept compressed responses
from the server.
<br><code>
if (get_data) {
    if ((cexpr==false) && (nextURL.indexOf('?') == -1)) {
        System.err.println("Must supply a constraint expression with -D.");
        continue;
    }
    for (int j=0; j < times; j++) {
        try {
            StatusUI ui = null;
            if (gui)
                ui = new StatusWindow(nextURL);
            DataDDS dds = url.getData(expr, ui);
            processData(url, dds, verbose, dump_data, accept_deflate);
        }
        catch (DODSException e) {
            System.err.println(e);
            System.exit(1);
        }
        catch (java.io.FileNotFoundException e) {
            System.err.println(e);
            System.exit(1);
        }
        catch (Exception e) {
            System.err.println(e);
            e.printStackTrace();
            System.exit(1);
        }
    }
}
</code></br>
This compound statement block initiates a data request to the remote
server. The <code>DConnect</code> method <code>getData</code> forms the data
request by appending the string <code>expr</code>, which contains the
constraint expression, to the URL used in creating
the initial <code>DConnect</code> to the remote site. The parameter
<code>ui</code> supplies an optional <code>StatusWindow</code> object that
reports the status of the current request to the client.
<br><code>
catch (DODSException e) {
    System.err.println(e);
    System.exit(1);
}
catch (java.io.FileNotFoundException e) {
    System.err.println(e);
    System.exit(1);
}
catch (Exception e) {
    System.err.println(e);
    e.printStackTrace();
    System.exit(1);
}
</code><br>
Completing the try block is a series of catch blocks that catch
exceptions thrown by the DAP library code, and Java input/output and
general exceptions.
The C++ toolkit provides functionality similar to that of the Java toolkit,
though the parameters to the individual <code>Connect</code> methods may vary.
The C++ client application [[geturl.cc.html][geturl.cc]]
uses C++ toolkit classes similar to those of the Java toolkit to implement the
Geturl client application:
<br><code>
string name = argv[i];
Connect url(name, trace, accept_deflate);
</code><br>
This code fragment declares an instance of the <code>Connect</code> class,
passing the URL referencing the remote data server, a <code>trace</code>
argument, and a boolean flag indicating that the client can accept
compressed responses from the server.
<br><code>
else if (get_data) {
    if (expr.empty() && name.find('?') == string::npos)
        expr = "";
    for (int j = 0; j < times; ++j) {
        <nop>DataDDS dds;
        try {
            DBG(cerr << "URL: " << url->URL(false) << endl);
            DBG(cerr << "CE: " << expr << endl);
            url->request_data(dds, expr);
            if (verbose)
                fprintf(stderr, "Server version: %s\n",
                        url->get_version().c_str());
            print_data(dds, print_rows);
        }
        catch (Error &e) {
            e.display_message();
            continue;
        }
    }
}
</code><br>
This compound statement block initiates a data request to the remote
server. The <code>Connect</code> method <code>request_data</code> forms the data request
by appending the string <code>expr</code>, which contains the
constraint expression, to the URL used in creating the initial
<code>Connect</code> to the remote site.
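The URL handling described above can be sketched roughly as follows. This is not the library's implementation: the function name <code>data_request_url</code> is hypothetical, the <code>.dods</code> suffix for data requests is an assumed server convention, and a real client would also need to escape characters that are not legal in a URL.

```cpp
#include <cassert>
#include <string>

// Form a data request URL from a dataset URL and a constraint expression.
// ASSUMPTIONS: data requests use a ".dods" suffix, and a constraint already
// embedded in the URL (the text after '?') is used only when no separate
// expression is supplied.
std::string data_request_url(const std::string &url, const std::string &expr)
{
    assert(!url.empty());
    const std::string::size_type q = url.find('?');
    // substr(0, npos) returns the whole string when there is no '?'.
    const std::string base = url.substr(0, q);
    const std::string embedded =
        (q == std::string::npos) ? std::string() : url.substr(q + 1);
    const std::string &ce = expr.empty() ? embedded : expr;

    std::string request = base + ".dods";
    if (!ce.empty())
        request += "?" + ce;
    return request;
}
```

With an empty expression and no <code>?</code> in the URL, the sketch asks for the whole dataset, which is why a client may prefer to insist on a constraint before requesting data.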
<pre>
<br><code>
catch (Error &e) {
    e.display_message();
    continue;
}
</code><br>
Completing the try block is a catch block that picks up exceptions thrown by
the DAP library code. The DAP C++ library throws several types of exceptions,
the most common of which are <code>Error</code> and <code>InternalErr</code>. All of
the exceptions are either instances of <code>Error</code> or are specializations
of it, so catching just <code>Error</code> will get everything.
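The following self-contained sketch shows why a single handler is enough. The two classes are simplified stand-ins written for this example (the <code>kind()</code> method and the function <code>caught_kind</code> are hypothetical and are not part of the DAP library); the point is only that a handler taking the base class by reference also receives specializations, with their dynamic type intact.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the library's exception classes (hypothetical).
class Error {
public:
    explicit Error(const std::string &msg) : d_msg(msg) {}
    virtual ~Error() {}
    virtual std::string kind() const { return "Error"; }
    const std::string &message() const { return d_msg; }
private:
    std::string d_msg;
};

class InternalErr : public Error {
public:
    explicit InternalErr(const std::string &msg) : Error(msg) {}
    std::string kind() const override { return "InternalErr"; }
};

// Throw one of the two exception types and report what the handler saw.
std::string caught_kind(bool internal)
{
    try {
        if (internal)
            throw InternalErr("unimplemented method called");
        throw Error("server returned an error");
    }
    catch (Error &e) {   // by reference, so the dynamic type is preserved
        return e.kind();
    }
}
```

Catching by value instead of by reference would copy only the <code>Error</code> part of an <code>InternalErr</code>, so the reference in the handler matters.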
Subclassing the data types
The DAP defines a data type hierarchy as the core of its data model. This
collection of data types includes scalar, vector and constructor types. Most
of the types are available in all modern programming languages, the
exceptions being <code>Url</code>, <code>Sequence</code> and <code>Grid</code>. In the DAP
library, the class <code>BaseType</code> is the root of the data type tree.
==A quick review of the data types supported by the DAP==
The DAP supports the common scalar data types such as Byte, 16- and 32-bit
signed and unsigned integers, and 32- and 64-bit floating point numbers. The
DAP also supports Strings and Urls as basic scalar types. The DAP includes
Arrays of unlimited size and dimensionality. The DAP also supports three
type constructors: <code>Structure</code>, <code>Sequence</code> and <code>Grid</code>. A
<code>Structure</code> in the DAP mimics a struct in C. A <code>Sequence</code> is a
table-like data structure inherited from the JGOFS data system. It can be
used to hold information that might be stored in relational databases or
tables, either flat or hierarchical. The JGOFS, FreeForm and HDF servers all
use the <code>Sequence</code> data type. Lastly, the <code>Grid</code> data type is used to
bind an array to a group of "map vectors", single-dimension arrays that
provide non-integral values for the indexes of the array. The most typical
use of a Grid is to provide latitude and longitude registration for some
georeferenced array data (e.g., a projected satellite image). The DAP does
not have a pointer data type, but in some cases the <code>Url</code> data type can
be used as a pointer to variables across files. More information about the
[[http://www.unidata.ucar.edu/packages/dods/api/pguide-html/pguide_9.html][DAP's data type hierarchy]]
is given in the Programmer's Guide.
==Creating the subclasses==
When you build a DAP client, you must create a collection of data type
subclasses. That is, each of the leaf classes in the preceding class
diagram must be subclassed by your client. This is pretty easy since a
good bit of the work is rote.
First we'll illustrate the parts that are mechanical. Here's an example from
the C++ Matlab client. The class is the Byte class. In the case of the Matlab
client, this class doesn't do anything beyond the bare minimum, so it's a
good starting point:
<pre>
<br><code>
Byte *
NewByte(const string &n)
{
    return new ClientByte(n);
}

BaseType *
ClientByte::ptr_duplicate()
{
    return new ClientByte(*this);
}

bool
ClientByte::read(const string &)
{
    throw InternalErr(__FILE__, __LINE__, "Called unimplemented read method");
}
</code><br>
To create a child of any of the data type leaf classes, you must define three
methods and one function. Let's talk about the function first. The function
<code>NewByte</code> is what Meyers[[meyers:ecpp]] calls a _virtual
constructor_. It's similar to a low-budget factory class ("low-budget"
because it's not a class). This function is used at various places in the DAP
library when it needs to create instances of <code>Byte</code> without knowing in
advance the dynamic type of the object that actually will be created. If all
this sounds a little weird, just remember that your <code>Byte</code>, <code>Int16</code>,
..., <code>Grid</code> classes, whatever they may be called, must each come with
an implementation of this function, and each function should return a pointer to an
instance of the appropriate child class. These functions will be used by the
library to create instances of the classes you have defined when writing your
client. In the example above, it's an instance of the
<code>ClientByte</code> class. If you look in the files for the Matlab server, you'll
see that the function <code>NewGrid</code> returns a pointer to a new <code>MATGrid</code>,
and so on.
Second, a constructor must be implemented and should take the name of the
variable as its sole argument.
Third, your child classes should also define the <code>ptr_duplicate()</code>
method. This method returns a pointer to a new instance of an object in the
same class. Occasionally, in the DAP library, objects are declared with
pointers specified as <code>BaseType *</code>. If the <code>new</code> operator were used to
copy such an object, the copied object would be an instance of BaseType (the
static type of the object), not the type of the thing referenced (the dynamic
type). (This is the oft-discussed phenomenon of "slicing"; see
Meyers[[meyers:ecpp]], Stroustrup[[stroustrup:cpp]], etc., for a
complete explanation.) By using the <code>ptr_duplicate()</code> method the DAP
library is sure that when it copies an object, it's getting an instance of
the subclass defined by your client.
Unlike the case where you are subclassing the DAP variable classes to build a
server, there's no need to implement <code>read()</code> when building a client. The
classes contain a default implementation of <code>read()</code> that throws
InternalErr if it is ever called (which no simple client should ever
do). (If you're building a gateway, something that is both a client
and a server, you'll need to implement <code>read()</code>.)
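To make the slicing problem concrete, here is a small self-contained sketch. The classes are simplified stand-ins that only borrow the names used above; their members, the <code>class_name()</code> method and the two helper functions are hypothetical and do not reproduce the DAP library's declarations. It contrasts a copy made through the static type with a copy made through <code>ptr_duplicate()</code>.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for the library's type classes (hypothetical).
class BaseType {
public:
    explicit BaseType(const std::string &n) : d_name(n) {}
    virtual ~BaseType() {}
    virtual BaseType *ptr_duplicate() const { return new BaseType(*this); }
    virtual std::string class_name() const { return "BaseType"; }
    const std::string &name() const { return d_name; }
private:
    std::string d_name;
};

class ClientByte : public BaseType {
public:
    explicit ClientByte(const std::string &n) : BaseType(n) {}
    BaseType *ptr_duplicate() const override { return new ClientByte(*this); }
    std::string class_name() const override { return "ClientByte"; }
};

// Copy through the static type: only the BaseType part survives (slicing).
std::string sliced_copy_class(const BaseType &v)
{
    BaseType copy(v);
    return copy.class_name();
}

// Copy through ptr_duplicate(): the dynamic type is preserved.
std::string duplicated_class(const BaseType &v)
{
    std::unique_ptr<BaseType> copy(v.ptr_duplicate());
    return copy->class_name();
}
```

Given a <code>ClientByte</code>, the first helper reports the base class while the second reports the subclass, which is the behavior the library relies on when it copies variables it only knows by their base type.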
=Accessing the DDS object=
The Data Descriptor Structure (DDS) is a data structure used by the
DODS software to describe datasets and subsets of those datasets. The
DDS may be thought of as the declarations for the data structures that
will hold data requested by some DODS client. Part of the job of a
DODS server is to build a suitable DDS for a specific dataset and to
send it to the client. Depending on the data access API in use, this
may involve reading part of the dataset and inferring the DDS. Other
APIs may require the server simply to read some ancillary data file
with the DDS in it.
For the client, the DDS object includes methods for reading the
persistent form of the object sent from a server. This includes
parsing the ASCII representation of the object and, possibly, reading
data received from a server into a data object.
Note that the class DDS is used to instantiate both DDS and <nop>DataDDS objects.
A DDS that is empty (contains no actual data) is used by servers to send
structural information to the client. The same DDS can be treated as a
<nop>DataDDS when data values are bound to the variables it defines.
For a complete description of the DDS layout and protocol, please
refer to \OPDuser\ and \OPDapi .
The DDS has an ASCII representation, which is what is transmitted from
a DODS server to a client. Here is the DDS representation of an entire
dataset containing a time series of worldwide grids of sea surface
temperatures:
<pre>
<br><code>
Dataset {
    Grid {
      ARRAY:
        Int32 sst[time = 404][lat = 180][lon = 360];
      MAPS:
        Float64 time[time = 404];
        Float64 lat[lat = 180];
        Float64 lon[lon = 360];
    } sst;
} weekly;
</code><br>
If the data request to this dataset includes a constraint expression,
the corresponding DDS might be different. For example, if the request
was only for northern hemisphere data at a specific time, the above
DDS might be modified to appear like this:
<br><code>
Dataset {
    Grid {
      ARRAY:
        Int32 sst[time = 1][lat = 90][lon = 360];
      MAPS:
        Float64 time[time = 1];
        Float64 lat[lat = 90];
        Float64 lon[lon = 360];
    } sst;
} weekly;
</code><br>
The constraint has narrowed the area of interest; the range of latitude
values has been halved and there is only one time value in the returned
array.
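The arithmetic behind the two DDS examples can be sketched as below, assuming the array subset notation <code>[start:stride:stop]</code> with inclusive indices. The struct and function names are hypothetical, and the particular index ranges (for instance, that indices 0 through 89 of <code>lat</code> cover the northern hemisphere) are assumptions made for illustration; a real client would read the map vectors to find the right indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One dimension of an array subset: indices start, start + stride, ...,
// up to and including stop (when stop is reachable).
struct Slice {
    std::size_t start;
    std::size_t stride;
    std::size_t stop;
};

// Number of indices selected along one dimension.
std::size_t slice_length(const Slice &s)
{
    assert(s.stride > 0 && s.start <= s.stop);
    return (s.stop - s.start) / s.stride + 1;
}

// Total number of array elements selected by a list of per-dimension slices.
std::size_t element_count(const std::vector<Slice> &dims)
{
    std::size_t n = 1;
    for (const Slice &s : dims)
        n *= slice_length(s);
    return n;
}
```

Under these assumptions the full <code>sst</code> array holds 404 x 180 x 360 = 26,179,200 values, while one time step over 90 latitudes holds 1 x 90 x 360 = 32,400, which is the reduction shown in the constrained DDS.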
See \OPDuser\ , \OPDapiref\ for descriptions of the DODS data types.
Reading data from a DDS object is the heart of writing your own <nop>OPeNDAP
client. To integrate the information contained in the DDS, you must do two
things. First you must decide how the data type hierarchy that is part of the
DAP can be represented in your client application. Some client applications
cannot represent all possible DAP data types directly. Where possible the
client developer should strive to support as many data types as possible to
facilitate access to the wide variety of data accessible through <nop>OPeNDAP
servers. In practice, once you know how to map variables from the DAP into
your client application, writing code to build the DDS instance is easy.
<br><code>
else if (get_dds) {
    for (int j = 0; j < times; ++j) {
        DDS dds;
        try {
            url->request_dds(dds);
        }
        catch (Error &e) {
            e.display_message();
            delete url; url = 0;
            break;  // url is gone: leave this loop and go to the next URL.
        }
        if (verbose) {
            fprintf(stderr, "Server version: %s\n",
                    url->get_version().c_str());
            fprintf(stderr, "DDS:\n");
        }
        dds.print(stdout);
    }
}
</code><br>
Above, the <code>Connect</code> method <code>request_dds</code> is called, passing a
reference to a DDS object. The following example from the C++ Matlab client
illustrates a simple traversal of the DDS object returned from the
<code>Connect::request_data</code> method.
<pre>
<br><code>
static void
process_data(Connect &url, DDS &dds)
{
    if (verbose)
        cerr << "Server version: " << url.server_version() << endl;

    for (DDS::Vars_iter i = dds.var_begin(); i != dds.var_end(); i++) {
        BaseType *v = *i;
        v->print_decl(cout, "", true);
        smart_newline(cout, v->type());
    }
}
</code><br>
In the C++ DAP classes, STL iterators are used to iterate over the members
(i.e., variables) in the DDS object. The iterator <code>i</code> references pointers
to the top-level <code>BaseType</code> objects held by the DDS. See
Josuttis[[josuttis:cpp-stl]] for information about the Standard Template
Library and STL iterators. The \OPDapi\ provides
a description of each of the data types and the methods available to
operate on them.
=Accessing the DAS object=
The Data Attribute Structure (DAS) is a set of name-value pairs used to
describe the data in a particular dataset. (Often this is referred to
as the data set's "meta data" or "semantic meta data".) The
name-value pairs are called the "attributes". The values may be of any
of the DODS simple data types (<code>Byte</code>, <code>Int16</code>, <code>UInt16</code>,
<code>Int32</code>, <code>UInt32</code>, <code>Float32</code>, <code>Float64</code>, <code>String</code> and
<code>URL</code>), and may be scalar or vector. (Note that all values are actually
stored as string data.)
A value may also consist of a set of other name-value pairs. This makes it
possible to nest collections of attributes, giving rise to a hierarchy of
attributes. The DAP uses this structure to provide information about
variables in a dataset.
In the following example of a DAS, several of the attribute collections have
names corresponding to the names of variables in a hypothetical dataset. The
attributes in such a collection are said to belong to that variable. For
example, the <code>lat</code> variable has a <code>units</code> attribute with the value
<code>degrees_north</code>.
<br><code>
Attributes {
    GLOBAL {
        String title "Reynolds Optimum Interpolation (OI) SST";
    }
    lat {
        String units "degrees_north";
        String long_name "Latitude";
        Float64 actual_range 89.5, -89.5;
    }
    lon {
        String units "degrees_east";
        String long_name "Longitude";
        Float64 actual_range 0.5, 359.5;
    }
    time {
        String units "days since 1-1-1 00:00:00";
        String long_name "Time";
        Float64 actual_range 726468., 729289.;
        String delta_t "0000-00-07 00:00:00";
    }
    sst {
        String long_name "Weekly Means of Sea Surface Temperature";
        Float64 actual_range -1.8, 35.09;
        String units "degC";
        Float64 add_offset 0.;
        Float64 scale_factor 0.0099999998;
        Int32 missing_value 32767;
    }
}
</code><br>
Attributes may have arbitrary names, although in most datasets it is
important to choose these names so a reader will know what they
describe. In the above example, the GLOBAL attribute provides
information about the entire dataset.
Data attribute information is an important part of the data
provided to a DODS client by a server, and the DAS is how this data is
packaged for sending (and how it is received).
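As one illustration of why a client needs the attributes, the sketch below uses the <code>sst</code> attributes from the example DAS to turn a stored integer into a temperature. It assumes the common netCDF packing convention (value = raw x scale_factor + add_offset, with <code>missing_value</code> marking absent data); this document does not state that convention, so treat it as an assumption. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Packing attributes as they appear for sst in the example DAS.
struct Packing {
    double scale_factor;
    double add_offset;
    int missing_value;
};

// True when the stored integer marks a missing observation.
bool is_missing(int raw, const Packing &p)
{
    return raw == p.missing_value;
}

// Convert a stored integer to a physical value.
// ASSUMPTION: netCDF-style packing, value = raw * scale_factor + add_offset.
double unpack(int raw, const Packing &p)
{
    assert(!is_missing(raw, p));
    return raw * p.scale_factor + p.add_offset;
}
```

With the attribute values shown above, a stored value of 2500 unpacks to roughly 25 degC, and 32767 is reported as missing rather than converted.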
An example of Attribute handling in a client application is provided in
the {www-int} C++ source:
<pre>
<br><code>
void
LoaddodsProcessing::print_attr_table(AttrTable &at, ostream &os)
{
    for (AttrTable::Attr_iter i = at.attr_begin(); i != at.attr_end(); ++i) {
        int attr_num = at.get_attr_num(i);
        switch (at.get_attr_type(i)) {
        case Attr_container: {
            AttrTable *cont_atp = at.get_attr_table(i);
            os << "Structure" << endl << names.lookup(at.get_name(i), translate)
               << " " << cont_atp->get_size() << endl;
            print_attr_table(*cont_atp, os);
            break;
        }
        case Attr_string:
        case Attr_url:
            if (attr_num == 1) {
                os << "String" << endl << names.lookup(at.get_name(i), translate) << endl
                   << at.get_attr(i) << endl;
            }
            else {
                os << "Array" << endl << "String " << names.lookup(at.get_name(i), translate)
                   << " 1" << endl << attr_num << endl;
                for (int j = 0; j < attr_num; ++j)
                    os << at.get_attr(i, j) << endl;
                os << endl;
            }
            break;
        // The remainder of this method's code has been elided. To see the
        // complete method, look at the source file
        // DODS/src/clients/ml-cmdln/LoaddodsProcessing.cc
        }
    }
}
</code><br>
As with the DDS, the DAS object is a container and STL iterators are used to
access its members (the attributes). There are several differences, however,
between the two containers. The DDS holds complete objects, each of which is
an instance of the class <code>BaseType</code>. A DAS, however, holds a collection
of attributes. Unless the attribute is itself a container for other attribute
type-name-value tuples, there is no contained object with methods
to run. Instead the DAS and AttrTable classes themselves provide methods that
are used to access the type, name and value of the attributes. These accessor
methods take an STL iterator as their argument.
For example, in the first case, the method <code>AttrTable::get_attr_table()</code>
is used to get a pointer to an AttrTable (which is the "value" of a container
attribute). In the second case the <code>AttrTable::get_name()</code> and
<code>AttrTable::get_attr()</code> methods are used to get the name and value of
simple attributes. In each case the <code>Attr_iter</code> <code>i</code> is passed to the
methods.
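The recursive shape of such a traversal can be tried out without the library. In the sketch below the <code>Attribute</code> struct and the <code>print_attributes</code> function are hypothetical stand-ins, not the library's <code>AttrTable</code>; they only mirror the idea that an attribute is either a typed name with string values or a container of further attributes.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for one attribute (hypothetical; not the library's class).
struct Attribute {
    std::string type;                 // "String", "Float64", ... or "Container"
    std::string name;
    std::vector<std::string> values;  // values are kept as strings
    std::vector<Attribute> children;  // used only when type == "Container"
};

// Print each simple attribute as "type dotted.name = values"; recurse into
// containers, extending the dotted prefix with the container's name.
void print_attributes(const std::vector<Attribute> &table,
                      const std::string &prefix, std::ostream &os)
{
    for (std::vector<Attribute>::const_iterator i = table.begin();
         i != table.end(); ++i) {
        const std::string full =
            prefix.empty() ? i->name : prefix + "." + i->name;
        if (i->type == "Container") {
            print_attributes(i->children, full, os);
            continue;
        }
        os << i->type << " " << full << " =";
        for (std::size_t j = 0; j < i->values.size(); ++j)
            os << " " << i->values[j];
        os << "\n";
    }
}
```

Fed the <code>lat</code> collection from the example DAS, the sketch writes one line per simple attribute, such as <code>String lat.units = degrees_north</code>.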
=Getting Data: Accessing the <nop>DataDDS object=
Up till now we have talked about access to a data source's meta data. The DDS
provides access to the syntactic meta data and the DAS provides semantic meta
data. Use the <nop>DataDDS to access data values held by the data source.
The <code><nop>DataDDS</code> class is an extension of class <code>DDS</code> which
contains the binary data values returned by the remote server. It supports
the same methods as the class <code>DDS</code>; in addition, the <code>BaseType::buf2val()</code>
methods can be used to extract data held in the variable instances of
<code>BaseType</code>. Use the <code>DDS</code>, <code>BaseType</code> and their iterators
to access the variables.
Use <code>Connect</code> to ask a remote server for a <code><nop>DataDDS</code>. The
<code>Connect</code> class provides <code>Connect::request_data()</code> to get the
<code><nop>DataDDS</code>. The <code>request_data()</code> method accepts a constraint
expression which can be used to restrict the data returned by the remote
server. See \OPDuser\ for more information about constraint expression
syntax.\footnote{The details of the constraint expression syntax are covered
in The \xlink{DODS User Guide[constraint]} {\OPDuserUrl/constraint.html}.}
Use iterators to traverse the <code><nop>DataDDS</code> and access the individual DAP
data variable objects. The following example
prints the declarations of the DAP data objects. To access the binary data
returned by the server, the DAP provides access methods that retrieve the
contents of a data object's buffer.
<pre>
<br><code>
static void
process_data(Connect &url, DDS *dds)
{
    if (verbose)
        cerr << "Server version: " << url.server_version() << endl;

    // dds is a pointer here, so its members are reached with ->.
    for (DDS::Vars_iter i = dds->var_begin(); i != dds->var_end(); i++) {
        BaseType *v = *i;
        v->print_decl(cout, "", true);
        smart_newline(cout, v->type());
    }
}
</code><br>
The binary data returned by the server is stored in the <code>_buf</code> member of
each of the DAP's atomic data types. To retrieve the contents of an atomic
data type's buffer, the <code>BaseType::buf2val</code> method is used.
<pre>
<br><code>
n_bytes = dds->var(q)->buf2val((void **) &localVar);
</code><br>
The following C++ example is from the C++ Matlab client
application and illustrates the use of the C++ DAP classes
and methods to access elements from a <code>Sequence</code> data type.
<br><code>
void
ClientSequence::print_one_row(ostream &os, int row, string space,
                              bool print_row_num)
{
    const int elements = element_count();
    for (int j = 0; j < elements; ++j) {
        BaseType *bt_ptr = var_value(row, j);
        if (bt_ptr) {  // data
            bt_ptr->print_val(os, space, true);
        }
    }
}

void
ClientSequence::print_val_by_rows(ostream &os, string space,
                                  bool print_decl_p,
                                  bool print_row_numbers)
{
    const int rows = number_of_rows();
    for (int i = 0; i < rows; ++i) {
        print_one_row(os, i, space, false);
    }
}
</code><br>
The C++ client uses rows and columns to access the individual elements of the
<code>Sequence</code>. The C++ Matlab client uses two methods to accomplish the
extraction. The first, <code>print_val_by_rows()</code>, determines the number of
rows in the <code>Sequence</code> and calls <code>print_one_row()</code> for each
of them. The C++ DAP implementation of the
<code>Sequence</code> data type provides the <code>var_value(row, col)</code> method to
access the individual elements of the <code>Sequence</code>. The
<code>var_value()</code> method returns a <code>BaseType</code> pointer to the element
at the given row and column of the <code>Sequence</code>. To access the binary data value
stored in that element, the <code>BaseType</code> method <code>buf2val()</code> can be
used. The preceding example simply prints the contents of the element; most
client applications would instead assign the contents to a local variable in the
workspace.
<pre>
<br><code>
void
ClientArray::print_val(ostream &os, string, bool print_decl_p)
{
    if (print_decl_p) {
        os << type_name() << endl << var()->type_name() << " "
           << get_matlab_name() << " " << dimensions(true)
           << endl;
        // Write the actual dimension sizes on a separate line.
        for (Pix p = first_dim(); p; next_dim(p))
            os << dimension_size(p, true) << " ";
        os << endl;
    }
    for (int i = 0; i < length(); ++i)
        var(i)->print_val(os, "", false);
}
</code><br>
Notes
Here's a collection of information that might be important to specific
clients but is hard to fit into a general tutorial.
- Because there are no meta data requirements to serve data via the <nop>OPeNDAP protocol, client applications may not find all the information they require to make use of the data. The DAP currently supports ancillary DAS files at the remote server site. In development is an Ancillary Information Service (AIS) which will permit these external meta data resources to be located with the remote data itself, at other remote server sites, or on the client. Any meta data augmented by the AIS will be clearly indicated in the attributes.
- It is the client application's responsibility to provide the initial base URLs to the remote server site. In development are data discovery services, including an ImportWizard which can query existing directory services such as the GCMD, to provide the base URLs to providers with installed <nop>OPeNDAP data servers.
- The DODS Project has developed two tools to help with serving datasets that contain many files. The first is to set up a "file server", a kind of catalog of URLs that is itself a DODS data set. The second is called the Aggregation Server (AS). The AS can automatically aggregate discrete datasets, accessed either as files (in some cases) or as URLs, to produce a single data set. See the \OPDhome\ and/or contact tech support (\OPDsupport) for help with this.
- You can get help from the \OPDhome\, the \OPDtechList\ and the DODS user support desk (\OPDsupport).
How the Java DAP library differs
- Enumerations instead of iterators
- Factory classes instead of virtual constructors
- DAP core with separate client and server specializations via Java interfaces
- The constraint evaluator is an object passed into the DDS (Java) rather than a set of methods embedded in the DDS (C++)
|