Wiki Testing/ServerInstallationGuide: Difference between revisions
Revision as of 11:26, 3 January 2008
The OPeNDAP Server
The OPeNDAP software provides a way to access data over the Internet, from programs that weren't originally designed for that purpose, as well as some that were. The OPeNDAP software implements the Data Access Protocol <ref>Wiki_Testing/ServerInstallationGuideFootnotes#ref1</ref>, version 2. A \new{DAP server} is a program that sends data in a standard transmission format to a client that has requested it. A \new{DAP client} is a program, running on a networked computer somewhere in the world, that requests and receives data. A client can be a specialized DAP client, or a standard web browser.
Though originally developed to deal with oceanographic data, the OPeNDAP software has found wide use outside that community, and is now used for several different kinds of science data.
Nothing limits the use of the DAP to science data; its framework will support many different data types. But the DAP has facilities for accommodating large arrays, relational tables, irregular grids, and many other otherwise anomalous data types. DAP servers can be adapted to serve data from any kind of storage format, and versions that support several popular data storage formats are readily available.
\subj{The OPeNDAP DAP server is just an HTTP server with special CGI programs.} The OPeNDAP DAP server is just an ordinary WWW server equipped with CGI programs that enable it to respond to requests for data from DAP client programs. Web servers and CGI programs are standard parts of the Web, and the details of their operation and installation are beyond the scope of this guide. (That is, there are too many varieties of web server for us to help you install each one.) Once you have one installed, this guide will explain how to use it to serve data using the OPeNDAP server for the DAP.
Entire books are written about the operation of the Internet and the WWW, and about client/server systems. This is not one of them. To understand the OPeNDAP server's architecture, you need only understand the following:
- A Web server is a process that runs on a computer (the host machine) connected to the Internet. When it receives a URL from some Web client, such as a user somewhere operating Netscape (or a specialized DAP client), it packages and returns the data specified by the URL to that client. The data can be text, as in a web page, but it may also be images, sounds, a program to be executed on the client machine, or some other data.
- A properly specified URL can cause a Web server to invoke a \new{CGI program} on its host machine, accepting as input a part of the URL, or some other data, and returning the output of that program to the client that sent the URL in the first place. The CGI program is executed on the server's host machine, not on the client.
\subj{The CGI programs you need depend on the data you want to serve.} The way the server is configured depends on the storage format of the data you intend to serve. The OPeNDAP server supports a variety of storage formats, including \netcdf, HDF4, and DSP. The server can also read data using the \ffnd and \jgofs libraries, which can be configured to read files with nearly any data format.
Server Architecture
The OPeNDAP server consists of a set of programs, and a CGI \new{dispatch script} used to decide which program can handle whatever request is at hand.
\subj{You need to know the shape of the data before you request the data.} A user can make three different sorts of requests to the server. The first request is for the "shape" of the data, and its response is the \new{data descriptor structure} (DDS). The second request is for the \new{data attribute structure} (DAS) of the data types described in the DDS. (\DODSquick and \DODSuser contain more information about these structures.) Both of these requests return plain text, readable in a web browser like Netscape Navigator. You can see a typical DDS \texorhtml{in \figureref{reynolds,dds}.}{here:}
\figureplace{A DDS (One of the basic response types defined by the
DAP, version 2)(sst.mnmean.nc.dds)}{htb}
{reynolds,dds}{reynolds-dds.ps}{reynolds-dds.gif}{http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.dds}
\texorhtml{There's a DAS from the same dataset in \figureref{reynolds,das}.}{Here's a DAS from the same dataset:}
\subj{Other metadata is helpful, too.}
\figureplace{A DAS (sst.mnmean.nc.das)}{h} {reynolds,das}{reynolds-das.ps}{reynolds-das.gif} {http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.das}
After getting the \new{metadata} (defined, for DAP purposes, as the contents of the DAS and DDS), the DAP client can request actual data. This is binary data, and is often too large to view easily in a web browser. (See \DODSquick for strategies to use to examine DAP data from a standard web browser.)
\subj{The DODS/OPeNDAP Quick Start Guide shows how to look at all the services provided by the OPeNDAP Server.} Depending on the data format in use, the DAS and DDS are generated from the data served, from ancillary information text files you have to supply, or from both. The data in these structures may be cached by the client system.
In addition to the three basic message types (DAS, DDS, and data), the OPeNDAP server can also provide information about the server operation and about the data, can return data in ASCII comma-separated tables, and can provide a query form allowing users to craft subsampling requests to the server. Some of these services are provided by other service programs that must be installed with the dispatch script and its companions.
More Explanation of How It Works
To understand the operation of the OPeNDAP server, it is useful to follow the actions taken to reply to a data request. The diagrams in \figureref{dods-server,fig,server-design1} and \figureref{dods-server,fig,server-design2} lay out the relationship between the various entities. Consider a DAP request URL such as the following:
http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc
\subj{A description of the different URL parts.} The URL as written refers to the entire data file, but any particular request must be slightly more specific. The precision is supplied by appending a suffix to the data URL. Do you want binary data (.dods), ASCII data (.asc), the DDS (.dds), the DAS (.das), usage information (.info), or a query form (.html)? To get a DDS, for example, you would use this URL:
http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc.dds
A DAP client may silently add the appropriate suffix to the URL, but if you're using a standard WWW client, such as Netscape, you have to add the suffix yourself.
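As a minimal sketch of the suffix rule just described, the following Python fragment appends each suffix named in the text to the example dataset URL. The dictionary and function names are illustrative only and are not part of any OPeNDAP library.

```python
# Suffixes named in the text, keyed by the kind of response each one selects.
DAP_SUFFIXES = {
    "binary data": ".dods",
    "ASCII data": ".asc",
    "DDS": ".dds",
    "DAS": ".das",
    "usage information": ".info",
    "query form": ".html",
}


def request_url(dataset_url, response):
    """Return dataset_url with the suffix for the named response appended."""
    return dataset_url + DAP_SUFFIXES[response]


base = "http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc"
for response in DAP_SUFFIXES:
    # One line per request a browser user would have to type by hand.
    print(f"{response:18} {request_url(base, response)}")
```

A DAP client would do the equivalent of this internally; a user of a standard web browser types the suffixed URL directly.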
Once the proper suffix has been appended to the URL, the URL is sent out into the world. Through the magic of IP addressing, it makes its way to the web server (httpd) running on the platform test.opendap.org. \Figureref{dods-server,fig,server-design1} shows these first steps. The client makes an Internet connection to the test.opendap.org machine, and the httpd executes the dispatch script (opendap/nph-dods) and forwards to it the remaining part of the URL it received.\footnote{The actual directory name, and whether the CGI programs are kept in a particular directory (or named with a particular convention), is another detail of the specific web server and configuration used. The web server might refer to the directory as the ScriptAlias directory, as Apache does.} (In this case, that would be data/nc/fnoc1.nc.dods.) DAP requests are "GET" requests, not "POST" requests, so all the information forwarded is in the URL.\footnote{When you fill out an HTML form, you are usually sending data in a "POST" request. When you type a URL into your web browser, you are sending a "GET" request. HTTP servers can respond to both kinds of request.}
\figureplace{The Architecture of the OPeNDAP Server, part I.}{htbp} {dods-server,fig,server-design1}{installfig1.ps}{installfig1.gif}{}
\Figureref{dods-server,fig,server-design2} illustrates what happens next. Sitting in the CGI directory (here called opendap) with the dispatch script are several \new{service programs}, also called \new{services} or \new{helper programs}. The dispatch script (nph-dods) analyzes the suffix on the URL to figure out what kind of request this is, and executes the corresponding service program.
As of DODS release 3.2, the dispatch script is named nph-dods, and OPeNDAP will use this name for all of its version 3.x releases. Earlier releases of DODS used different dispatch scripts, depending on the storage format of the data: an nph-nc to handle netCDF data, an nph-jg to handle JGOFS data, and so on.
In the case illustrated, the .dods suffix indicates that this is a request for binary data. Therefore, the dispatch script executes the dap_nc_handler service, and forwards to it the rest of the URL, which includes a data object name (which may or may not be a file, depending on the API), and possibly a \new{constraint expression}.\footnote{This is not shown in this illustration, but it would follow a question mark in the URL, like this: http://test.opendap.org/opendap/nph-dods/temp.nc.asc?temp[0:180][0:45]. For more information about constraint expressions, see \DODSquick or \DODSuser.} It's up to the service program to find the data, read it, read and parse the constraint expression (if any), and output the data message. If the service requires any ancillary data, it may also read an ancillary data file or two, as necessary.
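The pieces of a request URL described above can be separated with a few lines of Python. This is only a sketch of the idea, not the dispatch script's actual code: the function name, the suffix list, and the assumption that the script name appears as a path component are all illustrative.

```python
from urllib.parse import urlsplit

# Suffixes taken from the text and the services table; the list is illustrative.
KNOWN_SUFFIXES = (".dods", ".ascii", ".asc", ".dds", ".das", ".info", ".html", ".ver")


def split_request(url, script="nph-dods"):
    """Split a DAP request URL into (data object name, suffix, constraint)."""
    parts = urlsplit(url)
    # Everything after the dispatch script name is what the web server forwards.
    _, _, forwarded = parts.path.partition("/" + script + "/")
    suffix = ""
    for candidate in KNOWN_SUFFIXES:
        if forwarded.endswith(candidate):
            suffix = candidate
            forwarded = forwarded[: -len(candidate)]
            break
    # The constraint expression, if any, follows the question mark.
    return forwarded, suffix, parts.query


print(split_request("http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc.dods"))
print(split_request("http://test.opendap.org/opendap/nph-dods/temp.nc.asc?temp[0:180][0:45]"))
```

The second call separates the constraint expression temp[0:180][0:45] from the data object name temp.nc and the .asc suffix.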
\figureplace{The Architecture of the OPeNDAP Server, part II.}{htbp} {dods-server,fig,server-design2}{installfig2.ps}{installfig2.gif}{}
The standard output of the service program is redirected to the output of the httpd, so the client will receive the program output as the reply to its request.
\subj{Data files are specified relative to the document root directory.} For APIs that are designed to read data in files, such as netCDF, the CGI program will be executed with the working directory (also called the default directory) specified by the httpd configuration. However, the OPeNDAP software will look for its data relative to the document root tree. On the test.opendap.org server, for example, the nph-dods CGI program is executed in the directory /usr/local/share/dap-server/, but the document root directory is /var/www/html/. The last section of the URL, then, specifies the file fnoc1.nc in the directory:
/var/www/html/data/nc/
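The mapping from the data object name in the URL to a file under the document root can be sketched as follows. The function name is hypothetical, and the check that rejects absolute names and ".." components is an added precaution for the sketch; the text does not say how the server itself treats such names.

```python
from pathlib import PurePosixPath


def data_file_path(document_root, dataset):
    """Resolve a data object name from the URL against the document root."""
    relative = PurePosixPath(dataset)
    # Assumption for this sketch: refuse names that could leave the document root.
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError("data object name must stay inside the document root")
    return PurePosixPath(document_root) / relative


# The working directory of the CGI program plays no part in the result.
print(data_file_path("/var/www/html/", "data/nc/fnoc1.nc"))
```

With the values from the example, the result is the file fnoc1.nc in /var/www/html/data/nc/.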
Some existing data APIs, such as JGOFS, are not designed with file access as their fundamental paradigm. The JGOFS system, for example, uses an arrangement of "dictionaries" that define the location and method of access for specified data "objects." A URL addressing a JGOFS object may appear to represent a file, like the netCDF URL above:
http://test.opendap.org/opendap/nph-dods/station43
However, the identifier (station43) after the CGI program name (nph-dods) represents, not a file, but an entry in the JGOFS data dictionary. The entry will, in turn, identify a file or a database index entry (possibly on yet another system) and a method to access the data indicated. These are JGOFS server-specific installation issues covered in the installation documentation for that server.
Note that the name and location of the CGI directory (opendap), as well as the name and location of the working directory used by the CGI programs, are local configuration details of the particular web server in use. The location of the JGOFS data dictionary is a configuration issue of the JGOFS installation. That is to say these details will probably be different on different machines.
Service Programs
\indc{services!helper programs} When the server gets a DAP request, it executes the dispatch script, which then figures out which service program should be invoked. The output from that program is what gets returned to the client.
\subj{The service programs do the real work of the server.} \texorhtml{\Tableref{dods-server,tab,suffixes} contains a list of}{Here is a table of} the service programs required for each of the services of the server. The dispatch script is called nph-dods. (Though see the note \texorhtml{on page \pageref{note,dispatch-name}}{(note,dispatch-name)}.) For another DODS server, the names of some of the helper programs would have a different root than nc. (For example, ff identifies the FreeForm server, jg the JGOFS server, and so on.)
\begin{table}[htbp]
\caption[DODS Services, with their suffixes and helper programs\@.] {Services, with their suffixes and helper programs\@. In the OPeNDAP server, different handlers are used for each supported data access format, e.g. nc for netCDF, jg for JGOFS, and so on.}
Service | Suffix | Helper Program |
---|---|---|
Data Attribute | .das | dap_nc_handler |
Data Descriptor | .dds | dap_nc_handler |
DODS Data | .dods | dap_nc_handler |
ASCII Data | .asc or .ascii | dap_asciival |
Information | .info | dap_usage, see |
\ifh | .html | dap_www_int |
Version | .ver | None |
Compression | None | deflate |
Help | Anything else | None |
\end{table}
\subj{The service program invoked, and its arguments, depend on the URL suffix.} The service programs are started by the dispatch script depending on the extension given with the URL. If the URL ends with `.das' and the file name in the URL ends in `.nc', then the DAS service program (dap_nc_handler) is started using the -O das argument. Similarly, the extension `.dds' will cause the dap_nc_handler service to be run with the -O dds argument, and so on.
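The selection rule can be pictured as a lookup table keyed by suffix. The sketch below covers only datasets whose names end in .nc and is not the real dispatch script. The helper names and the -O das and -O dds arguments come from the text and the table; the -O dods argument is an assumption extrapolated from "and so on," and the arguments of the other helpers are not described here, so they are left empty.

```python
# suffix -> (helper program, arguments); None means no helper is listed.
NC_SERVICES = {
    ".das": ("dap_nc_handler", ["-O", "das"]),
    ".dds": ("dap_nc_handler", ["-O", "dds"]),
    ".dods": ("dap_nc_handler", ["-O", "dods"]),  # assumed, by analogy with das/dds
    ".asc": ("dap_asciival", []),   # arguments not described in the text
    ".ascii": ("dap_asciival", []),
    ".info": ("dap_usage", []),
    ".html": ("dap_www_int", []),
    ".ver": (None, []),             # the table lists no helper for this service
}


def choose_service(forwarded):
    """Return (helper, args, data object name) for the forwarded URL part."""
    for suffix, (helper, args) in NC_SERVICES.items():
        if forwarded.endswith(suffix):
            return helper, args, forwarded[: -len(suffix)]
    # Anything else falls through to the help response, which has no helper.
    return None, [], forwarded


print(choose_service("data/nc/fnoc1.nc.das"))
print(choose_service("data/nc/fnoc1.nc.asc"))
```

A table like this keeps the suffix-to-program mapping in one place, which is convenient when a different handler root (ff, jg) replaces nc.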
On the client side, when using a DAP client, the user may never see the `.das,' `.dds,' or `.dods' URL extensions. Nor will the user necessarily be aware that each data URL given to the DAP client may produce three different requests for information. These manipulations happen within the DAP client software, and the user need never be aware of them.
This is only true when using a DAP client. Programs that don't use the OPeNDAP client libraries, or similar, can still be clients of a DAP server. You can use Netscape to contact a DAP server and get data, in which case you have unmediated access to the server and need to include the service program URL extensions yourself.
Choosing a Data Handler
There are a variety of data format handlers available for the OPeNDAP data server, each designed to handle a different data storage format. Data handlers from OPeNDAP exist to serve data stored in the netCDF, HDF and HDF-EOS, and DSP storage formats (other groups have built data handlers for other formats). If you have data stored with one of these formats, the choice is quite simple: choose the one that works with your data.
There is also a JDBC server, written in Java, for serving data stored in relational databases. See (server,java) for more information about installing that software. See the DRDS Download page for more information about the DODS Java software.
If your data is not already stored in one of the supported formats, don't despair. Some standard data APIs include tools for translating data into their format. For example, netCDF includes an application called ncgen that you can use to translate array data into standard netCDF files by writing a data description in the netCDF CDL (Common Data Language). See the [http://www.unidata.ucar.edu/packages/netcdf/guidef/guidef-15.html#MARKER-2-3320 netCDF documentation] for more information about this.\footnote{A user has contributed examples for this. Go to http://seawater.tamu.edu/noppdodsgom and click on "Resources."}
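To give a feel for what such a data description looks like, here is a small Python sketch that writes a CDL description for a single array. The function name, dataset name, variable name, and values are all made up for illustration; only the general CDL layout (dimensions, variables, and data sections) is taken from netCDF.

```python
def array_to_cdl(dataset, variable, dims, values):
    """Describe one float array in CDL; dims maps dimension name to length."""
    size = 1
    for length in dims.values():
        size *= length
    if size != len(values):
        raise ValueError("number of values does not match the dimensions")
    lines = [f"netcdf {dataset} {{", "dimensions:"]
    lines += [f"    {name} = {length} ;" for name, length in dims.items()]
    lines += ["variables:", f"    float {variable}({', '.join(dims)}) ;", "data:"]
    lines.append(f"    {variable} = {', '.join(str(v) for v in values)} ;")
    lines.append("}")
    return "\n".join(lines) + "\n"


cdl = array_to_cdl("example", "temp", {"lat": 2, "lon": 3},
                   [1.0, 2.5, 3.0, 4.0, 5.5, 6.0])
print(cdl)
```

If the printed text is saved as example.cdl, running ncgen -o example.nc example.cdl should produce the corresponding netCDF file, which the netCDF handler can then serve.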
\subj{There are two data handlers that can accommodate various file formats.} If your data is not in a supported format and you don't want to translate it into one of those formats, there is still a way to serve your data. Two other data handlers, the FreeForm and JGOFS servers, can be used to serve data that are not already in one of these formats. It may be that one of them can be easily adapted to your uses. The FreeForm handler is somewhat easier to set up, and the JGOFS handler is more flexible. A key difference is that the JGOFS handler can process data contained in several different data files. (The FreeForm handler can, as well, but, being slightly less flexible, it may require the files to be rearranged or renamed.)
\texorhtml{There's a brief comparison of the two in
\tableref{tab:jgofs,vs,ff}}{Here's a brief comparison of the two:}
\begin{table}[htbp]
Server | Advantages | Disadvantages |
---|---|---|
FreeForm | Simple to set up. Serving data in a new format requires only creating a text file describing that format. Serves data in Arrays or Sequences. | Not quite as flexible as its name implies. If the format in question is too complex or too variable, the FreeForm API cannot handle it. Sequences can be served, but only flat ones. (That is, Sequences that contain other Sequences will not work.) Generally, data must line up in columns. |
JGOFS | Extremely flexible. Uses specialized access methods to read data, and these methods can be extensively customized. Optimized for Sequence data (relational tables), including hierarchical Sequences (Sequences that contain other Sequences). | Writing a data access method can be complex, since it involves writing a program in C or \Cpp. Does not support Array data types. |
\caption{Advantages and Disadvantages of the Two Flexible DODS Servers}
\end{table}
It is possible that none of these options is the right one for you, in which case you can use the OPeNDAP DAP library to craft a server of your very own. The library is available in both \Cpp and Java. If you choose this route, contact OPeNDAP; we may be able to direct you to someone who has already done something like it. The \DODSapi contains useful information about the DAP library, including instructions on how to construct servers and clients.