Wiki Testing/ServerInstallationGuide

From OPeNDAP Documentation

The OPeNDAP software provides a way to access data over the Internet from programs that weren't originally designed for that purpose, as well as some that were. The OPeNDAP software implements the Data Access Protocol (DAP), version 2. A DAP server is a program that sends data in a standard transmission format to a client that has requested it. A DAP client is a program, running on a networked computer somewhere in the world, that requests and receives data. A client can be a specialized DAP client or a standard web browser.

Though originally developed to deal with oceanographic data, the OPeNDAP software has found wide use outside that community, and is now used for several different kinds of science data.

Nothing limits the use of the DAP to science data; its framework will support many different data types. But the DAP has facilities for accommodating large arrays, relational tables, irregular grids, and many other otherwise anomalous data types. DAP servers can be adapted to serve data from any kind of storage format, and versions that support several popular data storage formats are readily available.

The OPeNDAP Server

The OPeNDAP DAP server is just an ordinary WWW server (httpd) equipped with a CGI program that enables it to respond to requests for data from DAP client programs. Web servers and CGI programs are standard parts of the Web, and the details of their operation and installation are beyond the scope of this guide. (That is, there are too many different varieties of web servers out there for us to help you install each one of them.) Once you have one installed, this guide will explain how to use it to serve data using the OPeNDAP server for the DAP.

Entire books are written about the operation of the Internet and the WWW, and about client/server systems. This is not one of them. To understand the OPeNDAP server's architecture, you need only understand the following:


  • A Web server is a process that runs on a computer (the host machine) connected to the Internet. When it receives a URL from some Web client, such as a user somewhere operating Netscape (or a specialized DAP client), it packages and returns the data specified by the URL to that client. The data can be text, as in a web page, but it may also be images, sounds, a program to be executed on the client machine, or some other data.
  • A properly specified URL can cause a Web server to invoke a CGI program on its host machine, accepting as input a part of the URL, or some other data, and returning the output of that program to the client that sent the URL in the first place. The CGI is executed on the server.

The way the server is configured depends on the storage format of the data you intend to serve. The OPeNDAP server supports a variety of storage formats, including netCDF, HDF4, and DSP. The server can also read data using the FreeForm and JGOFS libraries, which can be configured to read files with nearly any data format.


Server Architecture

The OPeNDAP server consists of a set of programs and a CGI dispatch script used to decide which program can handle whatever request is at hand. A user can make three different sorts of requests to the server. The first request is for the "shape" of the data, and returns the data descriptor structure (DDS). (You need to know the shape of the data before you request the data.) The second request is for the data attribute structure (DAS) of the data types described in the DDS. (The Quick Start Guide and User Guide contain more information about these structures.) Both of these requests return plain text data, readable in a web browser like Netscape Navigator. The third request, for the data itself, is described below. You can see a typical DDS here:

[Figure: A DDS (one of the basic response types defined by the DAP, version 2), from sst.mnmean.nc.dds]


Here's a DAS from the same dataset:


[Figure: A DAS, from sst.mnmean.nc.das]


After getting the metadata (defined, for DAP purposes, as the contents of the DAS and DDS), the DAP client can request actual data. This is binary data, and is often too large to view easily in a web browser. (See Quick Start Guide for strategies to use to examine DAP data from a standard web browser.)

Depending on the data format in use, the DAS and DDS are either generated from the data served, or from ancillary information you have to supply (or both). The data in these structures may be cached by the client system.

In addition to the three basic message types (DAS, DDS, and data), the OPeNDAP server can also provide information about the server operation and about the data, can return data in ASCII comma-separated tables, and can provide a query form allowing users to craft subsampling requests to the server. Some of these services are provided by other service programs that must be installed with the dispatch script and its companions.

More Explanation of How It Works

To understand the operation of the OPeNDAP server, it is useful to follow the actions taken to reply to a data request. The diagrams in Figure 1 and Figure 2 lay out the relationship between the various entities. Consider a DAP request URL such as the following:

http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc

The URL as written refers to the entire data file, but any particular request must be slightly more specific. The precision is supplied by appending a suffix to the data URL. Do you want binary data (.dods), ASCII data (.asc), the DDS (.dds), the DAS (.das), usage information (.info), or a query form (.html)? To get a DDS, for example, you would use this URL:

http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc.dds

A DAP client may silently add the appropriate suffix to the URL, but if you're using a standard WWW client, such as Netscape, you have to add the suffix yourself.
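The suffix rule described above can be sketched in a few lines of Python. This is only an illustration of how a request URL is formed, not part of the OPeNDAP software; the function name request_url and the keys of the SUFFIXES table are hypothetical names chosen for this example, while the suffixes themselves are the ones listed above.

```python
# Suffixes listed in the text above, keyed by the kind of response wanted.
# The key names ("data", "form", ...) are invented for this sketch.
SUFFIXES = {
    "data": ".dods",   # binary data
    "ascii": ".asc",   # ASCII data
    "dds": ".dds",     # data descriptor structure
    "das": ".das",     # data attribute structure
    "info": ".info",   # usage information
    "form": ".html",   # query form
}


def request_url(dataset_url, kind):
    """Return dataset_url with the suffix for the requested kind of response."""
    try:
        return dataset_url + SUFFIXES[kind]
    except KeyError:
        raise ValueError(f"unknown request kind: {kind!r}") from None


DATASET = "http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc"
print(request_url(DATASET, "dds"))
```

A DAP client library would normally do this for you; the sketch is what you do by hand when typing a URL into a browser.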

Once the proper suffix has been appended to the URL, the URL is sent out into the world. Through the magic of IP addressing, it makes its way to the web server (httpd) running on the platform test.opendap.org. Figure 1 shows these first steps. The client makes an Internet connection to the test.opendap.org machine, and the httpd daemon executes the dispatch script (opendap/nph-dods) and forwards to it the remaining parts of the URL it received (note 2). (In this case, that would be data/nc/fnoc1.nc.dods.) DAP requests are "GET" requests, not "POST" requests, so all the information forwarded is in the URL (note 3).
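To make the division of the URL concrete, here is a minimal Python sketch of the split the httpd performs. The web server does this itself, so you never write this code; the function name split_request is hypothetical, and the script path is simply the one used in the example URL. (CGI servers conventionally hand the remainder to the script in the PATH_INFO environment variable, with a leading slash.)

```python
from urllib.parse import urlsplit


def split_request(url, script_path="/opendap/nph-dods"):
    """Split a DAP URL into (host, dispatch script path, forwarded remainder)."""
    parts = urlsplit(url)
    if not parts.path.startswith(script_path + "/"):
        raise ValueError("URL does not address the dispatch script")
    # Everything after the script name is what gets forwarded to the script.
    remainder = parts.path[len(script_path) + 1:]
    return parts.hostname, script_path, remainder


URL = "http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc.dods"
print(split_request(URL))
```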

[Figure 1: The Architecture of the OPeNDAP Server, part I.]

Figure 2 illustrates what happens next. Sitting in the CGI directory (here called opendap) with the dispatch script are several service programs, also called services or helper programs. The dispatch script (nph-dods) analyzes the suffix on the URL to figure out what kind of request this is, and executes the corresponding service program.

As of DODS release 3.2, the dispatch script is named nph-dods, and OPeNDAP will use this name for all of its version 3.x releases. Earlier releases of DODS used different dispatch scripts, depending on the storage format of the data. For earlier releases, there would be an nph-nc to handle netCDF data, an nph-jg to handle JGOFS data, and so on.


In the case illustrated, the .dods suffix indicates that this is a request for binary data. Therefore, the dispatch script executes the dap_nc_handler service, and forwards to it the rest of the URL, which includes a data object name (which may be a file or not, depending on the API), and possibly a constraint expression (note 4). It's up to the service program to find the data, read it, read and parse the constraint expression (if any), and output the data message. If the service requires any ancillary data, it may also read an ancillary data file or two, as necessary.

[Figure 2: The Architecture of the OPeNDAP Server, part II.]

The standard output of the service program is redirected to the output of the httpd, so the client will receive the program output as the reply to its request.
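The idea that a program's standard output becomes the reply can be illustrated with a short Python sketch. This is not how the httpd or the dispatch script is implemented; it only shows one process collecting what another writes to standard output. The function run_service is a hypothetical name, and a child Python process stands in for a real service program.

```python
import subprocess
import sys


def run_service(argv):
    """Run a service program and return whatever it wrote to standard output."""
    completed = subprocess.run(argv, capture_output=True, check=True)
    return completed.stdout


# Stand-in for a real service program: a child process that prints one line.
STAND_IN = [sys.executable, "-c", "print('reply from service program')"]

reply_body = run_service(STAND_IN)
print(reply_body.decode().strip())
```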

For APIs that are designed to read data in files, such as netCDF, the CGI program will be executed with the working directory (also called the default directory) specified by the httpd configuration. However, the OPeNDAP software will look for its data relative to the document root tree. On the test.opendap.org server, for example, the nph-dods CGI program is executed in the directory /usr/local/share/dap-server/, but the document root directory is /var/www/html/. The last section of the URL, then, specifies the file fnoc1.nc in the directory:

/var/www/html/data/nc/
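The lookup relative to the document root can be sketched as follows. The function resolve_dataset is a hypothetical name, and the check that rejects paths climbing out of the document root is a precaution added for this sketch, not something this guide says the OPeNDAP software does.

```python
import posixpath


def resolve_dataset(document_root, url_path):
    """Map the dataset part of a URL to a file path under the document root."""
    root = posixpath.normpath(document_root)
    candidate = posixpath.normpath(posixpath.join(root, url_path.lstrip("/")))
    # Refuse paths that climb out of the document root (for example via "..").
    if candidate != root and not candidate.startswith(root + "/"):
        raise ValueError("path escapes the document root")
    return candidate


print(resolve_dataset("/var/www/html/", "data/nc/fnoc1.nc"))
```

With the values from the example above, the dataset part data/nc/fnoc1.nc resolves to /var/www/html/data/nc/fnoc1.nc.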

Some existing data APIs, such as JGOFS, are not designed with file access as their fundamental paradigm. The JGOFS system, for example, uses an arrangement of "dictionaries" that define the location and method of access for specified data "objects." A URL addressing a JGOFS object may appear to represent a file, like the netCDF URL above:

http://test.opendap.org/opendap/nph-dods/station43

However, the identifier (station43) after the CGI program name (nph-dods) represents, not a file, but an entry in the JGOFS data dictionary. The entry will, in turn, identify a file or a database index entry (possibly on yet another system) and a method to access the data indicated. These are JGOFS server-specific installation issues covered in the installation documentation for that server.

Note that the name and location of the CGI directory (opendap), as well as the name and location of the working directory used by the CGI programs, are local configuration details of the particular web server in use. The location of the JGOFS data dictionary is a configuration issue of the JGOFS installation. That is to say, these details will probably be different on different machines.

Service Programs

When the server gets a DAP request, it executes the dispatch script, which then figures out which service program should be invoked. The output from that program is what gets returned to the client.

Here is a table of the service programs required for each of the services of the server. The dispatch script is called nph-dods. (Though see note 1.2.) For another DODS server, the names of some of the helper programs would have a different root than nc. (For example, ff identifies the FreeForm server, jg for JGOFS, and so on.)


DODS Services, with their suffixes and helper programs. In the OPeNDAP server, different handlers are used for each supported data access format, e.g. nc for netCDF, jg for JGOFS, and so on.

Service          Suffix           Helper Program
Data Attribute   .das             dap_nc_handler
Data Descriptor  .dds             dap_nc_handler
DODS Data        .dods            dap_nc_handler
ASCII Data       .asc or .ascii   dap_asciival
Information      .info            dap_usage (see section "document-data")
Query Form       .html            dap_www_int
Version          .ver             None
Compression      None             deflate
Help             Anything else    None

The service programs are started by the dispatch script depending on the extension given with the URL. If the URL ends with '.das' and the file name in the URL ends in '.nc', then the DAS service program (dap_nc_handler) is started using the -O das argument. Similarly, the extension '.dds' will cause the dap_nc_handler service to be run with the -O dds argument, and so on.
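The dispatch rule can be summarized in a short Python sketch. This is not the nph-dods script itself, only an illustration of the lookup for the netCDF case. The name dispatch and the shape of its return value are hypothetical; the suffixes and helper program names come from the table of services, with "Query Form" used here as a label for the .html service. Only the -O das and -O dds arguments are documented above, so the sketch does not guess at the arguments used for other suffixes.

```python
import posixpath

# Suffix -> (service, helper program), following the table of services.
# The handler shown is the netCDF one; other formats use other handlers.
SERVICES = {
    ".das": ("Data Attribute", "dap_nc_handler"),
    ".dds": ("Data Descriptor", "dap_nc_handler"),
    ".dods": ("DODS Data", "dap_nc_handler"),
    ".asc": ("ASCII Data", "dap_asciival"),
    ".ascii": ("ASCII Data", "dap_asciival"),
    ".info": ("Information", "dap_usage"),
    ".html": ("Query Form", "dap_www_int"),
    ".ver": ("Version", None),
}


def dispatch(path):
    """Return (service, helper program, arguments) for a forwarded URL path."""
    suffix = posixpath.splitext(path)[1]
    # Any suffix not in the table falls through to the Help service.
    service, helper = SERVICES.get(suffix, ("Help", None))
    # Only "-O das" and "-O dds" are documented; leave other cases empty.
    args = ["-O", suffix[1:]] if suffix in (".das", ".dds") else []
    return service, helper, args


print(dispatch("data/nc/fnoc1.nc.das"))
```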

On the client side, when using a DAP client, the user may never see the '.das', '.dds', or '.dods' URL extensions. Nor will the user necessarily be aware that each data URL given to the DAP client may produce three different requests for information. These manipulations happen within the DAP client software, and the user need never be aware of them.

This is only true when using a DAP client. Programs that don't use the OPeNDAP client libraries, or similar, can still be clients of a DAP server. You can use Netscape to contact a DAP server and get data, in which case you have unmoderated access to the server and need to include the service program URL extensions.

Choosing a Data Handler

There are a variety of data format handlers available for the OPeNDAP data server, each designed to handle a different data storage format. Data handlers from OPeNDAP exist to serve data stored in the netCDF, HDF and HDF-EOS, and DSP storage formats (other groups have built data handlers for other formats). If you have data stored with one of these formats, the choice is quite simple: choose the one that works with your data.


There is also a JDBC server, written in Java, for serving data stored in relational databases. See chapter 4 for more information about installing that software. See the DRDS Download page for more information about the DODS Java software.


If your data is not already stored in one of the supported formats, don't despair. Some standard API formats include tools for translating data into that format. For example, netCDF includes an application called ncgen you can use to translate array data into standard netCDF files by writing a data description in the netCDF CDL (Common Data Language). See the [netCDF documentation] for more information about this (note 5).
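As a rough illustration of the CDL route, the following Python sketch builds a small CDL description and the ncgen command line that would translate it. The helper names build_cdl and ncgen_command, the variable name sst, the dimension name time, and the file names are all invented for this example; consult the netCDF documentation for the full CDL syntax and ncgen options.

```python
def build_cdl(name, values, units):
    """Return CDL text describing a one-dimensional float variable."""
    data = ", ".join(repr(float(v)) for v in values)
    return (
        f"netcdf {name} {{\n"
        f"dimensions:\n"
        f"    time = {len(values)} ;\n"
        f"variables:\n"
        f"    float sst(time) ;\n"
        f'        sst:units = "{units}" ;\n'
        f"data:\n"
        f"    sst = {data} ;\n"
        f"}}\n"
    )


def ncgen_command(cdl_path, nc_path):
    """Return the argument list that asks ncgen to write nc_path from cdl_path."""
    return ["ncgen", "-o", nc_path, cdl_path]


cdl_text = build_cdl("example", [20.1, 20.4, 19.8], "degC")
print(cdl_text)
print(" ".join(ncgen_command("example.cdl", "example.nc")))
```

After saving the CDL text to example.cdl, the printed command can be run from a shell on a machine where the netCDF utilities are installed.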


If your data is not in a supported format and you don't want to translate it into one of those formats, there is still a way to serve your data. There are two other data handlers available that can be used to serve data that are not already in one of these formats. These are the FreeForm and JGOFS servers. It may be that one of these servers can be easily adapted to your uses. The FreeForm handler is somewhat easier to set up, and the JGOFS handler is more flexible. A key difference is that the JGOFS handler can process data contained in several different data files. (The FreeForm handler can, as well, but, being slightly less flexible, it may require the files to be rearranged or renamed.)


Here's a brief comparison of the two:

  • FreeForm
      Advantages: Simple to set up. Serving data in a new format requires only creating a text file describing that format. Serves data in Arrays or Sequences.
      Disadvantages: Not quite as flexible as its name implies. If the format in question is too complex or too variable, the FreeForm API cannot handle it. Sequences can be served, but only flat ones. (That is, Sequences that contain other Sequences will not work.) Generally, data must line up in columns.
  • JGOFS
      Advantages: Extremely flexible. Uses specialized access methods to read data, and these methods can be extensively customized. Optimized for Sequence data (relational tables), including hierarchical Sequences (Sequences that contain other Sequences).
      Disadvantages: Writing a data access method can be complex, since it involves writing a program in C or C++. Does not support Array data types.

   Advantages and Disadvantages of the Two Flexible DODS Servers

It is possible that none of these options is the right one for you, in which case you can use the OPeNDAP DAP library to craft a server of your very own. The library is available in both C++ and Java. If you choose this route, contact OPeNDAP; we may be able to direct you to someone who has already done something like it. The [Programmer Guide] contains useful information about the DAP library, including instructions on how to construct servers and clients.