| (and the motivation behind many of the design decisions) of the OPeNDAP
| software.
|
| |
| =Using OPeNDAP=
| |
|
| |
|
| |
| A user works with OPeNDAP through an OPeNDAP client program. This client may
| be a program the user has acquired (for example, the OPeNDAP Matlab and IDL
| graphical user interfaces, or Ferret, a freeware data analysis package,
| each of which uses OPeNDAP for data access), or it may be a program converted to
| use the OPeNDAP library for data access (see ([http://www <cite> opd-client</cite>])).
| |
|
| |
| In either case, a set of issues must be addressed in
| order to use a program to access data through OPeNDAP. The issues fall
| into two groups. One group involves configuring the
| system to provide OPeNDAP with the helper applications and environment
| variables it requires. The other concerns the manner in which a
| user communicates with an OPeNDAP server. We cover the latter first.
| |
|
| |
| ==How OPeNDAP Finds Data==
| |
|
| |
|
| |
|
| |
| Once linked to the
| |
| OPeNDAP libraries, an OPeNDAP client created from an existing program will
| |
| work exactly as before when run using local files. However, a user
| |
| can also specify an OPeNDAP Uniform Resource Locator (URL) to indicate
| |
| some data file on a remote host machine. When the program receives
| |
| this URL, the OPeNDAP libraries will recognize it as remote data, and
| |
| issue a network request for the data. If a user has also installed
| |
| an OPeNDAP server on the local machine, then local data may be accessed
| |
| either through their local filenames or their OPeNDAP URL.
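The distinction between a local file name and a remote URL can be sketched as follows. This is only an illustration of the idea, not the test the OPeNDAP libraries actually perform; the function name `is_remote` is hypothetical, and the sketch assumes that any name with an `http` or `https` scheme and a host counts as remote.

```python
from urllib.parse import urlparse


def is_remote(name: str) -> bool:
    """Return True when `name` looks like a URL rather than a local file name.

    Assumption for illustration: a remote dataset is any name that parses
    with an http/https scheme and a non-empty host part.
    """
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for candidate in ("fnoc1.nc",
                      "http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc"):
        kind = "remote" if is_remote(candidate) else "local"
        print(candidate, "->", kind)
```

A client built this way could pass local names straight to the original file-access code and route everything else to the network layer.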
| |
|
| |
| A URL is simply a unique name for some Internet resource.
| [[Image:opd-client,fig,url-parts]] shows the parts of a
| typical OPeNDAP URL.
| |
|
| |
| <pre>
| >dncview http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc.das
|
| Program       dncview
| Protocol      http
| Machine Name  dods.gso.uri.edu
| Server        cgi-bin/nph-nc
| Directory     data
| Filename      fnoc1.nc
| URL Suffix    .das
| </pre>
| Parts of an OPeNDAP URL (without a constraint expression)
| |
|
| |
| The parts of the URL are:
| |
|
| |
| <blockquote>
| |
|
| |
| ; protocol : The protocol of an Internet request may be thought of as the kind
| |
| of conversation the client expects to have with the target machine.
| |
| For example, a web browser like Netscape Navigator wants to find
| |
| a server that can return hypertext documents, while an ftp client
| |
| wants to find a server that can understand file transfer requests. A
| |
| web browser equipped to display hypertext documents will specify
| |
| <font color='green'>http</font> as the protocol for its conversation, and hope that the target
| |
| machine has an <font color='green'>httpd</font> daemon listening.
| |
|
| |
| ; host : The host name in a URL is simply the
| |
| Internet address of the host machine running whatever server can
| |
| reply to the specified protocol.
| |
|
| |
| ; server : A special feature of the <font color='green'>httpd</font> server process is
| |
| that it may be configured to execute Common Gateway Interface (CGI)
| |
| programs upon receipt of a properly specified URL. This is used, for example, by
| |
| Internet search engines that ask a user to fill out a form. The CGI
| |
| specification will be specific to the server in question, and the
| |
| part of the URL that follows the CGI name is passed to the CGI upon
| |
| invocation. This data may include a file name, but it may as easily
| |
| be some arbitrary string of instructions. The OPeNDAP server is simply
| |
| a set of CGI scripts executed on demand by the <font color='green'>httpd</font> server.
| |
| Here, the OPeNDAP server is represented by a CGI script called
| |
| <font color='green'>nph-nc</font>.
| |
|
| |
| ; filename : If a CGI is not
| |
| specified, the part of the URL after the host name is simply the
| |
| name of a file that is to be returned to the inquiring browser. If
| |
| a CGI is specified, the file is given to the program as its
| |
| argument.
| |
|
| |
| ; URL suffix : If you are issuing an OPeNDAP request
| |
| from a non-OPeNDAP client, such as a web browser, you can specify the
| |
| type of request by appending a suffix to the URL. Different
| |
| suffixes demand different services from the server. The different
| |
| services are listed in ([http://www <cite> opd-client,services</cite>]). If you
| |
| are using OPeNDAP from an OPeNDAP client, or a client program adapted to
| |
| use the OPeNDAP DAP library, you do not need to use a URL suffix. For
| |
| example, to use OPeNDAP from Matlab, with the Matlab GUI or
| |
| command-line clients, you do not need to use a suffix. To use OPeNDAP
| |
| from a simple web browser like Netscape Navigator, you will need to
| |
| use a suffix.
| |
|
| |
| </blockquote>
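The decomposition described above can be sketched in a few lines of Python. This is an illustrative helper, not part of any OPeNDAP library: the function name `split_opendap_url` and the `server_depth` parameter are hypothetical, and the sketch assumes that the server occupies the first two path components (as `cgi-bin/nph-nc` does in the example URL).

```python
from urllib.parse import urlparse

# Suffixes named in this guide; anything else is treated as "no suffix".
KNOWN_SUFFIXES = (".das", ".dds", ".dods", ".asc", ".ascii",
                  ".html", ".info", ".ver")


def split_opendap_url(url, server_depth=2):
    """Split a URL into the parts named in the list above.

    Assumption: the CGI server takes up the first `server_depth`
    path components; the last component is the file name.
    """
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    server = "/".join(segments[:server_depth])
    directory = "/".join(segments[server_depth:-1])
    filename = segments[-1] if len(segments) > server_depth else ""
    suffix = ""
    for candidate in KNOWN_SUFFIXES:
        if filename.endswith(candidate):
            filename = filename[:-len(candidate)]
            suffix = candidate
            break
    return {
        "protocol": parts.scheme,
        "host": parts.netloc,
        "server": server,
        "directory": directory,
        "filename": filename,
        "suffix": suffix,
    }


if __name__ == "__main__":
    url = "http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc.das"
    for name, value in split_opendap_url(url).items():
        print(f"{name:10s} {value}")
```

Because a site administrator chooses the server path, a real installation may need a different `server_depth`.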
| |
|
| |
| The URL in [[Image:opd-client,fig,url-parts]] shows a client
| request to the <font color='green'>httpd</font> server on the machine
| <font color='green'>dods.gso.uri.edu</font>, for a netCDF dataset (specified by the
| <font color='green'>nph-nc</font> in the <font color='green'>cgi-bin</font> directory) contained in a file
| called <font color='green'>fnoc1.nc</font>. Upon receiving this URL, the <font color='green'>httpd</font>
| server executes the specified OPeNDAP server module (<font color='green'>nph-nc</font>), which
| retrieves the file. The file is in a directory called <font color='green'>data</font>, relative to
| wherever the <font color='green'>httpd</font> server looks for its data.\footnote{The only
| parts of the URL whose spelling is not at the discretion of the
| administrator of the host machine are the <font color='green'>http</font> and the
| <font color='green'>nph-</font> at the beginning of the CGI script name. Even the
| <font color='green'>nc</font>, indicating netCDF, can be changed, although for clarity's
| sake, we hope people won't do so. Incidentally, the <font color='green'>nph-</font> is a
| relic, dating from the early days of the World Wide Web and the
| first hypertext protocol standards. It stands for "Non-Parsing
| Header" (see the CGI 1.1 Standard for more information), and is
| the only way to pass data through many httpd servers unparsed.}
| |
|
| |
| OPeNDAP URLs can get somewhat more complicated than this simple
| |
| description. In particular, they can contain "constraint
| |
| expressions" that limit a request to data satisfying a set of
| |
| conditions, and they can contain requests to specific OPeNDAP services,
| |
| besides the data delivery service suggested here. Constraint
| |
| expressions are described in more detail in
| |
| ([http://www <cite> opd-client,constraint</cite>]), while the array of services
| provided by OPeNDAP servers is described in
| ([http://www <cite> opd-client,services</cite>]).
| |
|
| |
| ===Security===
| |
| Some OPeNDAP data providers will choose to control access to some or all
| |
| of their data. When you request data from one of these servers, the
| |
| OPeNDAP client will prompt you for a username and password. If you want
| |
| to avoid the prompt, you can make the OPeNDAP URL even more baroque by
| |
| embedding a username and password in it, like this:
| |
|
| |
| <pre>
| http://user:password@www.dods.org/nph-dods/etc...
| </pre>
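If you build such a URL in a script, characters such as `@` or `:` in the username or password must be percent-encoded, or the URL will be parsed incorrectly. The following is a minimal sketch using only the Python standard library; the function name `with_credentials` and the dataset path in the usage example are hypothetical, and the sketch assumes the input URL does not already carry credentials.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url, user, password):
    """Return `url` with `user:password@` inserted before the host.

    Both values are percent-encoded so that reserved characters
    (for example '@' or ':') cannot be mistaken for URL delimiters.
    Assumption: `url` has no user information in it yet.
    """
    parts = urlsplit(url)
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


if __name__ == "__main__":
    # The path below is a made-up placeholder, not a real dataset.
    print(with_credentials("http://www.dods.org/nph-dods/data/example.nc",
                           "user", "p@ss:word"))
```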
| |
|
| |
|
| |
| ==The OPeNDAP Services==
| |
|
| |
| Up to now, we have treated the OPeNDAP server as if it has only one
| |
| service: providing data to clients who ask for it. It is true that
| |
| this is the most important service a server provides. However, it is
| |
| also true that the server provides several other services besides
| |
| that. In fact, fulfilling a request for data actually requires three
| |
| separate requests from the client, using three different services of
| |
| the OPeNDAP server.
| |
|
| |
| The services requested from an OPeNDAP server are specified in a suffix
| |
| appended to the URL described in
| |
| [[Image:opd-client,fig,url-parts]]. Depending on the suffix
| |
| supplied, the server will provide one of these services:
| |
|
| |
|
| |
| <blockquote>
| |
|
| |
| ; Data Attribute : This service returns the entire data
| attribute structure for the given dataset. This is a text file
| describing the attributes of each data quantity in that dataset.
| (See ([http://www <cite> data,das</cite>]) for more information about data
| attributes.) This service is activated when the
| server receives a URL ending with <font color='green'>.das</font>.
| |
|
| |
|
| |
| ; Data Descriptor : This service returns the entire data descriptor
| |
| structure for the given dataset. This is a text file describing the
| |
| structure of the variables in the dataset. (See
| |
| ([http://www <cite> data,dds</cite>]) for more information about data descriptors.)
| |
| This service is activated when the server receives a URL ending with
| |
| <font color='green'>.dds</font>.
| |
|
| |
| ; OPeNDAP Data : This service returns the actual data requested by
| |
| a given URL. This is not a text file, but is encoded as a
| |
| Multipurpose Internet Mail Extensions (MIME) document. This service
| |
| is activated when the server receives a URL ending with <font color='green'>.dods</font>.
| |
|
| |
| ; ASCII Data : This service returns an ASCII representation of
| |
| the requested data. This can make the data available to a wide
| |
| variety of browser programs. This service is activated when the
| |
| server receives a URL ending with <font color='green'>.asc</font> or <font color='green'>.ascii</font>.
| |
|
| |
| ; WWW Interface : When the server receives a URL ending in
| <font color='green'>.html</font>, it produces an HTML form containing information from
| the dataset that you can use to construct a sensible URL with which
| to request OPeNDAP data. The WWW Interface is also triggered when the OPeNDAP
| server receives a URL that references a directory instead of a file.
| |
|
| |
| ; Information : This service returns information about
| |
| the server and dataset, in human-readable HTML form. The returned
| |
| document may include information about both the data server itself
| |
| (e.g. server functions implemented), and the dataset referenced in
| |
| the URL. The server administrator determines what information is
| |
| returned in response to such a request. This service is activated
| |
| when the server receives a URL ending with <font color='green'>.info</font>. See
| |
| ([http://www <cite> sec,document-data</cite>]) for more information about how to
| |
| configure the information service.
| |
|
| |
| ; Version : This service returns the version information for the
| |
| OPeNDAP server software running on the server. This service is
| |
| triggered by a URL ending with <font color='green'>.ver</font>.
| |
|
| |
| ; Help : This service returns some help text in response to an
| |
| improperly specified URL. This service is triggered by a URL ending
| |
| in any suffix that is not recognized by the OPeNDAP server.
| |
|
| |
| </blockquote>
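The suffix-to-service mapping in the list above can be pictured as a lookup table with the Help service as the fallback. The sketch below only illustrates that idea and is not the server's implementation: the names `SERVICES` and `service_for` are hypothetical, and the special case of a URL that references a directory is not modeled.

```python
from urllib.parse import urlparse

# Suffixes and service names as listed in this section.
SERVICES = {
    ".das": "Data Attribute",
    ".dds": "Data Descriptor",
    ".dods": "OPeNDAP Data",
    ".asc": "ASCII Data",
    ".ascii": "ASCII Data",
    ".html": "HTML form",
    ".info": "Information",
    ".ver": "Version",
}


def service_for(url):
    """Name the service a URL would select, judging by its suffix alone.

    Any suffix that is not in the table falls through to "Help",
    mirroring the description of the Help service above.
    """
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    dot = last_segment.rfind(".")
    suffix = last_segment[dot:] if dot != -1 else ""
    return SERVICES.get(suffix, "Help")


if __name__ == "__main__":
    base = "http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc"
    for suffix in (".das", ".dds", ".dods", ".ascii", ".xyz"):
        print(suffix, "->", service_for(base + suffix))
```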
| |
|
| |
|
| |
| <blockquote>A request for data from an OPeNDAP client will generally make three
| |
| different service requests, for data attributes, data descriptors, and
| |
| for data. The prepackaged OPeNDAP clients do this for you, so you may
| |
| not be aware that three requests are made for each URL. That is, an OPeNDAP client may accept an OPeNDAP URL specifying some data, such as the
| |
| one shown in [[Image:opd-client,fig,url-parts]]. In this case, the
| |
| OPeNDAP client library (such as nc-dods) will accept the input URL, and
| |
| append the different suffixes to that URL, making three distinct
| |
| requests to the OPeNDAP server.</blockquote>
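As a rough sketch of what a client library does with a single input URL, the three request URLs can be derived by appending the three suffixes in turn. The function name `request_urls` is hypothetical, and the sketch only builds the URLs; it does not contact a server.

```python
def request_urls(base_url):
    """Return the three URLs a data request expands into.

    The order follows the text above: data attributes, then data
    descriptors, then the data itself.
    """
    return [base_url + suffix for suffix in (".das", ".dds", ".dods")]


if __name__ == "__main__":
    for url in request_urls(
            "http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc"):
        print(url)
```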
| |
|
| |
| ===WWW Interface===
| |
|
| |
|
| |
| Each OPeNDAP server implements a service called the WWW Interface. This is a way
| to use a standard Web client, such as Netscape, to get information
| about the data served by a specific server.\footnote{The WWW Interface is only
| available for servers later than version 3.1.} The WWW Interface has two
| modes of operation: the directory level and the file level.
| |
|
| |
| If an OPeNDAP URL references a directory instead of a file on the server
| |
| machine, the server produces a listing similar to that shown in
| |
| [[Image:opd-client,fig,ifh-dir]].
| |
|
| |
| ''Figure opd-client,fig,ifh-dir: WWW Interface - Directory Level (ifh-dir.gif)''
| |
|
| |
| Clicking on a dataset shown in the directory-level listing
| |
| will produce an HTML form similar to the one in
| |
| [[Image:opd-client,fig,ifh]]. The top line in the window ("Data
| |
| URL") shows a URL that makes a request for an OPeNDAP dataset. The
| |
| windows below it show the variables that make up the dataset. You can
| |
| edit the form to select the data you'd like to see from this dataset,
| |
| and the WWW Interface will edit the Data URL so that it only requests the data
| |
| you are interested in. When done, you can push the "ASCII" button,
| |
| to see an ASCII representation of the data you've requested. Netscape
| |
| cannot handle binary data, so if you
| |
| want to use the binary data, you should copy the URL in the Data URL
| |
| window to the OPeNDAP client you'd like to use.
| |
|
| |
| ''Figure opd-client,fig,ifh: WWW Interface (ifh.gif)''
| |
|
| |
| ==Using an OPeNDAP Program==
| |
|
| |
|
| |
| There are some
| |
| configuration issues a user must consider in order to use an OPeNDAP
| |
| client application program. There is a short list of software that is
| |
| required for some of the advanced features of OPeNDAP, and some
| |
| environment variables that control the execution of the OPeNDAP software.
| |
| For a piece of software that has been converted to use OPeNDAP, after
| |
| these conditions are satisfied, the program will run in the same
| |
| manner it ran before. Aside from network delays, the user should not
| |
| be able to tell that they are accessing data from the Internet.
| |
|
| |
|
| |
| Finally, though it may seem unnecessary to mention, in order for an OPeNDAP client application to communicate with an OPeNDAP server, the
| |
| computer running the OPeNDAP client must be connected to the Internet.
| |
|
| |
| ===Requirements===
| |
|
| |
| In order to make use of some of the features of the OPeNDAP core software, a
| user's computer must have some additional software installed, and
| available on the user's <font color='green'>PATH</font>, in
| <font color='green'>$DODS_ROOT/bin</font> or <font color='green'>$DODS_ROOT/etc</font>.
| |
|
| |
|
| |
| |
|
| |
|
| |
| *The <font color='green'>wish</font> Tcl/Tk interpreter (or whatever
| program is indicated by the <font color='green'>DODS_GUI</font> environment variable) is
| used by the "GUI manager" to provide a progress indicator
| that displays the status of a pending data request as it is being
| processed. It is also used by the error reporting system to display
| error messages received from the server. \tbd{and by the data
| locator, to display information and query the user}
| *The <font color='green'>gzip</font> program, the GNU compression
| software, is used to decompress data messages received from an OPeNDAP
| server. If this program is not installed, the OPeNDAP core software
| tells the server not to send compressed messages, so data may still
| be received. However, having the compression software installed and
| available will increase the data transfer rate.
| |
|
| |
| The required software, like OPeNDAP itself, is free software. Refer to
| |
| the installation appendix for information about acquiring that software.
| |
|
| |
| ===Environment Variables===
| |
|
| |
| After successfully relinking an application program with the OPeNDAP
| |
| libraries, there is a short list of environment variables that
| |
| may be defined. Only <font color='green'>DODS_ROOT</font> is required. The other three
| |
| variables are only used to override default values controlling the GUI
| |
| manager process. Most users may safely ignore them.
| |
|
| |
| <blockquote>
| |
|
| |
| ; <font color='green'>DODS_ROOT</font> : indicates the root directory of the OPeNDAP
| software. The OPeNDAP core software must be able to locate utilities
| that are located in this directory tree.
|
| ; <font color='green'>DODS_GUI</font> : can contain the name of the program used by the
| GUI manager. A user might wish to change this variable to
| point to a "safe" Tcl/Tk interpreter; whatever program is used
| here must be able to process Tcl and Tk commands. The default value
| is the <font color='green'>wish</font> program.
|
| ; <font color='green'>DODS_GUI_INIT</font> : indicates the name of any initialization
| command required by the "GUI manager". The default
| initialization string executes the Tcl program in
| <font color='green'>$DODS_ROOT/etc/dods_gui.tcl</font>.
|
| ; <font color='green'>DODS_USE_GUI</font> : may be used to turn off the GUI manager. Set
| the value of this variable to <font color='green'>no</font>, and the progress indicator
| and the error message windows will not be displayed.
| |
|
| |
|
| |
| </blockquote>
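When an OPeNDAP client is launched from a script, these variables can be set for that one process rather than for the whole login session. The sketch below shows one way to do this in Python; the function name `client_environment` is hypothetical, and `/usr/local/DODS` is only the typical installation directory mentioned in this guide, so adjust it to your system.

```python
import os


def client_environment(dods_root, use_gui=True, base=None):
    """Build an environment mapping for running an OPeNDAP client.

    DODS_ROOT is always set, since it is the only required variable.
    DODS_USE_GUI is set to "no" only when the GUI manager should be
    turned off; otherwise the default behavior is left alone.
    `base` defaults to a copy of the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["DODS_ROOT"] = dods_root
    if not use_gui:
        env["DODS_USE_GUI"] = "no"
    return env


if __name__ == "__main__":
    env = client_environment("/usr/local/DODS", use_gui=False)
    print(env["DODS_ROOT"], env["DODS_USE_GUI"])
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when starting the client program.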
| |
|
| |
| <blockquote>The user has substantial control over the GUI manager. You can
| change the program that listens for GUI commands from <font color='green'>wish</font> to
| anything else, and you can change the action of the GUI
| commands by editing the Tcl code in the files <font color='green'>dods_gui.tcl</font>,
| <font color='green'>error.tcl</font>, and <font color='green'>progress.tcl</font>. (These are in the
| <font color='green'>$DODS_ROOT/etc</font> directory.) However, editing these files and
| variables will not change the form of the messages, sent by the OPeNDAP
| server and by the core software, that are meant to invoke these
| programs. In other words, you may modify these, but must be
| careful to leave the GUI manager in a form that will be able to
| process the messages it receives.</blockquote>
| |
|
| |
| ===The Error System===
| |
|
| |
|
| |
|
| |
| The GUI manager is used to display error messages
| to the user. The messages themselves will vary with the server
| implementation. Refer to the documentation of the particular server,
| or consult the server's <font color='green'>info</font> service (see
| ([http://www <cite> opd-server,service</cite>])), for a list of the error messages
| that might be issued by a particular server. \tbd{As error codes are
| finalized, they should be included in an Appendix of this document,
| and a pointer to them included here.}
| |
|
| |
| ===Temporary Files===
| |
|
| |
|
| |
| Using an OPeNDAP client application will
| |
| create a number of temporary files. They are created with the
| |
| <font color='green'>tmpnam()</font> function, so their names will follow the rules
| for that function on your system. (See the manual page for
| <font color='green'>tmpnam(3)</font>, or type <font color='green'>man tmpnam</font>, for more information.)
| |
| During normal operation, OPeNDAP will delete the temporary files it
| |
| creates as it goes. However, if execution of the OPeNDAP client is
| |
| somehow interrupted, these files may remain, and will have to be
| |
| deleted by hand.
| |
|
| |
| =The OPeNDAP Client=
| |
|
| |
|
| |
| There are many different data analysis packages in use. Some packages, such
| |
| as MATLAB and IDL, are commercially available, but many more are written for
| |
| a specialized need or application. Many of these use one of the widely
| available sets of scientific data access functions (called an
| ''Application Program Interface'', or API) such as NetCDF, JGOFS, or HDF. There is great variety
| among all these programs, but one feature they share is that they all access
| data through files containing that data.\footnote{This is not true of some
| APIs, such as JGOFS. That API, however, uses a data dictionary to allow
| the user to think that the data access is through files.} That is to say
| that each program begins by identifying a file containing the data the user
| wishes to examine or analyze.
| |
|
| |
| An OPeNDAP client is simply a data
| |
| analysis application linked with the OPeNDAP libraries instead of the
| |
| standard data access API. Using this program, a user can look at files
| |
| containing data in the same way as was possible without the OPeNDAP
| |
| libraries. However, by using these libraries, a user can also use a
| Uniform Resource Locator (URL), instead
| of a simple file name, to specify data located anywhere on the
| Internet. [[Image:intro,fig,unlinked]] and
| [[Image:intro,fig,linked]] illustrate the operation of an
| application program linked with a standard data access API, and the
| same program linked with the OPeNDAP version of that API.
| |
|
| |
| An OPeNDAP client is then a data analysis application program
| modified to become a web browser, somewhat like any other web
| browser (NCSA Mosaic, for example) with
| which you may be familiar. A web browser can only display the data it
| receives, however. What makes an OPeNDAP client different from
| another web browser is that, unlike Netscape, once the data has been
| received from an OPeNDAP server, the OPeNDAP client application can
| compute with it.
| |
|
| |
| Like a web browser, an OPeNDAP client accepts a URL from a user, and
| |
| parses it to come up with a protocol, an address, and a message. (See
| |
| ([http://www <cite> opd-client,url</cite>]) for more information about URLs.) The
| |
| browser then sends a message to the address, directed to the server
| that can service the desired protocol, asking for the information
| |
| specified in the remainder of the URL. Unlike a typical web browser, an OPeNDAP client will not know what to do with data returned for a web page
| |
| containing text and pictures, but an OPeNDAP server will return scientific
| |
| data that an OPeNDAP client can understand and process.
| |
|
| |
| Here is a simple example, using the <font color='green'>ncview</font> program. This program
| |
| simply prints out the contents of a netCDF formatted data file,
| |
| specified on the command line, like this:
| |
|
| |
| <pre>
| |
| > ncview fnoc1.nc
| |
| </pre>
| |
|
| |
| Using OPeNDAP, this same function may be executed from any computer connected to
| |
| the Internet by substituting a URL for the
| |
| filename above:
| |
|
| |
| <pre>
| |
| > dncview http://dods.gso.uri.edu/cgi-bin/nph-nc/data/fnoc1.nc
| |
| </pre>
| |
|
| |
|
| |
| (See [[Image:opd-client,fig,url-parts]].) Aside from the fact that
| |
| the data is remote, and must be specified with a URL, the program will
| |
| seem to function in the same way it had with the simple netCDF library
| |
| (albeit somewhat more slowly due to having to make network connections
| |
| instead of local file operations). You can find <font color='green'>dncview</font> (the
| |
| <font color='green'>ncview</font> program linked with the OPeNDAP library) in the
| |
|
| |
| <pre>
| |
| $DODS_ROOT/src/nc-dods/ncview
| |
| </pre>
| |
|
| |
|
| |
| directory. Running the above command will produce the following output:
| |
|
| |
| <pre>
| netcdf fnoc1 {
| dimensions:
|     time_a = 16 ;
|     lat = 17 ;
|     lon = 21 ;
|     time = 16 ;
| variables:
|     long u(time_a, lat, lon) ;
|         u:units = "meter per second" ;
|         u:long_name = "Vector wind eastward component" ;
|         u:missing_value = "-32767" ;
|         u:scale_factor = "0.005" ;
|     long v(time_a, lat, lon) ;
|         v:units = "meter per second" ;
|         v:long_name = "Vector wind northward component" ;
|         v:missing_value = "-32767" ;
|         v:scale_factor = "0.005" ;
|     double lat(lat) ;
|         lat:units = "degree North" ;
|     double lon(lon) ;
|         lon:units = "degree East" ;
|     double time(time) ;
|         time:units = "hours from base_time" ;
| // global attributes:
|         :base_time = "88- 10-00:00:00" ;
|         :title = "FNOC UV wind components from 1988- 10 to 1988- 13." ;
| data:
|  u =
|     -1728, -2449, -3099, -3585, -3254, -2406, -1252,
|     662, 2483, 2910, 2819, 2946, 2745, 2734,
|     2931, 2601, 2139, 1845, 1754, 1897, 1854, -1686,
| ...
| </pre>
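The values printed under `data:` are raw integers. Assuming the usual netCDF convention that a physical value is the raw value multiplied by `scale_factor`, and that `missing_value` marks absent data, the first wind component above works out to about -8.64 meters per second. The following sketch illustrates that arithmetic; the function name `to_physical` is hypothetical, and the attribute values are copied from the listing above (where they appear as strings).

```python
def to_physical(raw, scale_factor=0.005, missing_value=-32767):
    """Convert a raw stored value to physical units.

    Assumption: physical = raw * scale_factor, the usual netCDF
    convention. Values equal to `missing_value` are reported as None.
    """
    if raw == missing_value:
        return None
    return raw * scale_factor


if __name__ == "__main__":
    # First few raw values of u from the listing above.
    for raw in (-1728, -2449, -3099):
        print(raw, "->", round(to_physical(raw), 3), "meter per second")
```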
| |
|
| |
| Although there are packaged OPeNDAP browsing programs that a user can use
| |
| to look at data, the user can also construct his or her own. Linking
| |
| an OPeNDAP API with an already existing program allows a user to create a
| |
| customized web browser that can access data available from any OPeNDAP
| |
| server connected to the Internet.
| |
|
| |
| The OPeNDAP APIs are designed to accurately mimic the behavior of several
| |
| different commonly used scientific data APIs. As of this writing,
| the OPeNDAP API set includes:
| |
|
| |
|
| |
|
| |
| {| border="1"
| |
| |+ Supported APIs
| ! API !! Description !! Components
| |
| |-
| |
| |netCDF
| |
| || Support for gridded data, such as satellite data,
| |
| interpolated ship station data, or current meter data.
| |
| || Server and client.
| |
| |-
| |
| |JGOFS
| |
| || Support for relational data, such as Sequences.
| Created by the Joint Global Ocean Flux Study (JGOFS) project for use
| |
| with oceanographic station data.
| |
| || Server and client.
| |
| |-
| |
| |HDF
| |
| || Support for gridded data. Commonly used for astronomical
| |
| data and model data.
| |
| || Server only.
| |
| |-
| |
| |DSP
| |
| || Oceanographic and geophysical satellite data. Provides
| |
| support for image processing. Developed at the University of
| |
| Miami/RSMAS. Primarily used for AVHRR and CZCS data.
| |
| || Server only.
| |
| |-
| |
| |GRIB
| |
| || Support for gridded binary data. GRIB is the World
| |
| Meteorological Organization (WMO) format for the storage of weather
| |
| information and the exchange of weather product messages.
| |
| || Server only, due in early 1999.
| |
| |-
| |
| |BUFR
| |
| || The WMO's standard set of codes for the transmission and
| |
| storage of meteorological data, using a compressed format with each
| |
| data value occupying the least number of bits necessary to contain
| |
| its range of values. Suitable for meteorological observations made
| |
| from a single point or set of points.
| |
| || Server only, due in early 1999.
| |
| |-
| |
| |FreeForm
| |
| || On-the-fly conversion of arbitrarily formatted data, including
| |
| relational data and gridded data. May be used for sequence data,
| |
| satellite data, model data, or any other data format that can be
| |
| described in the flexible FreeForm format definition
| |
| language. This server can be used to serve data stored in almost
| |
| all home-grown data formats.
| |
| || Server only; no client required.
| |
| |-
| |
| | native OPeNDAP
| |
| || The OPeNDAP class library may be used directly by a client program. It
| |
| supports relational data, array data, gridded data, and
| |
| a flexible assortment of data types that can be combined to
| |
| accommodate most data models.
| |
| || Client.
| |
|
| |
| |}
| |
|
| |
|
| |
| The API set is extensible, meaning that developers can use the OPeNDAP
| |
| software toolkit to write OPeNDAP-compliant versions of new APIs. See
| |
| [http://www.opendap.org/support/docs.html/api/pguide-html/ <cite>The OPeNDAP Programmer's Guide</cite>] for more information.
| |
|
| |
| The most important result of this architecture is that, just as the
| |
| use of the <font color='green'>dncview</font> program above is identical to the original
| |
| <font color='green'>ncview</font>, a user can use remote OPeNDAP data ''and'' continue to
| use the same data analysis and display programs with which he or she
| is familiar. Any program that uses one of the OPeNDAP-supported APIs may
| be re-linked to use the OPeNDAP version of that API. This creates an OPeNDAP
| client. That, and a connection to the Internet, are all that a
| researcher requires to gain access to the available OPeNDAP data.
| |
|
| |
| ==Configuring Programs to Use OPeNDAP==
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
| Relinking an existing program with the OPeNDAP implementation of some
| |
| data API is a simple procedure. Find the directory that contains the
| |
| source/object code of the program you want to re-link and modify the
| |
| makefile (typically called <font color='green'>Makefile</font>) for the program so that the
| |
| OPeNDAP-compliant API library is used in place of the standard API
| |
| library. (If you can't find the libraries on your system, see
| the installation appendix, or ask the system administrator.) These
| libraries are:
| |
|
| |
|
| |
| <blockquote>
| |
| ; <font color='green'>libdap++.a</font> : Software common to all of the OPeNDAP-supported APIs.
| |
| </blockquote>
| |
|
| |
| OPeNDAP also uses facilities from some standard libraries, and these must
| |
| also be included in the link to resolve all the symbols.
| |
|
| |
| <blockquote>
| |
| ; <font color='green'>libwww.a</font> : The World Wide Web library.
| This contains the functions used to communicate
| between the OPeNDAP client and server.
| |
|
| |
| ; <font color='green'>libexpect.a</font> : Functions from the <font color='green'>expect</font>
| |
| library are used to communicate between
| |
| OPeNDAP client processes.
| |
|
| |
| ; <font color='green'>libtcl.a</font> : Contains definitions necessary for the
| |
| <font color='green'>expect</font> library. The use of this library in the link is not
| |
| related to the use of Tcl by OPeNDAP clients.
| |
|
| |
| ; <font color='green'>libstdc++.a</font> :
| |
| The GNU C++ class library (This is not necessary if using <font color='green'>g++</font>
| |
| to re-link.)
| |
|
| |
| </blockquote>
| |
|
| |
| You will also need to include the library containing the
| |
| OPeNDAP-compliant version of the API. The name of this library of course
| |
| depends on the API, but it is generally in the form
| |
|
| |
| <pre>
| libAPI-dods.a
| </pre>
|
| where "API" is an abbreviation indicating the API emulated by the
| |
| specified library. For example, the OPeNDAP-compliant netCDF library is
| |
| called <font color='green'>libnc-dods.a</font> and the JGOFS version is <font color='green'>libjg-dods.a</font>.
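The naming rule can be expressed as a pair of small helpers, shown here as a sketch for use in a build script. The function names are hypothetical; only the abbreviations `nc` and `jg` are taken from this guide, and other abbreviations should be checked against the libraries actually installed on your system.

```python
def dods_library(api_abbrev):
    """Return the archive name of the OPeNDAP version of an API,
    following the lib<API>-dods.a pattern described above."""
    return f"lib{api_abbrev}-dods.a"


def link_flag(api_abbrev):
    """Return the matching linker flag (the name without 'lib' and '.a')."""
    return f"-l{api_abbrev}-dods"


if __name__ == "__main__":
    for abbrev in ("nc", "jg"):
        print(abbrev, dods_library(abbrev), link_flag(abbrev))
```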
| |
|
| |
| ===An Example Using netCDF===
| |
|
| |
|
| |
| The <font color='green'>ncview</font> program is a simple utility that prints the contents
| |
| of a netCDF-format file to standard output. This section outlines the
| |
| process used to modify the <font color='green'>ncview</font> makefile to link that program
| |
| with the OPeNDAP netCDF API, thereby turning <font color='green'>ncview</font> into a
| |
| network-ready OPeNDAP client. The process of linking any other program
| |
| with the corresponding OPeNDAP library is entirely analogous to this one
| |
| and only requires the substitution of the program name and the
| |
| appropriate library.
| |
|
| |
| First the link flags were modified so that the library search path
| |
| would include the likely places to find the OPeNDAP libraries:
| |
|
| |
| <pre>
| |
| LDFLAGS = -g -L$(DODS_ROOT)/lib
| |
| </pre>
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
| <font color='green'>DODS_ROOT</font> is an environment variable that indicates the root
| |
| directory of the OPeNDAP installation, and in this manual is used as
| |
| shorthand for this directory. It is typically called something like
| |
| <font color='green'>/usr/local/DODS</font>. If you cannot find these directories on your
| |
| system, consult your system administrator, or refer to
| |
| the installation appendix for information about acquiring and installing
| |
| the OPeNDAP software.
| |
|
| |
| After the link flags were modified, the OPeNDAP libraries were added to the list
| |
| of libraries used. The order in which the libraries are listed is important.
| |
|
| |
| <pre>
| |
| LIBS = -lnc-dods -ldap++ -lnc-dods -ldap++ -lwww -ltcl -lexpect -lz -lrx
| |
| </pre>
| |
|
| |
|
| |
| <blockquote>Because OPeNDAP is implemented as a core set of classes contained in one
| |
| library (<font color='green'>libdap++.a</font>) and a set of specializations of those classes in a
| |
| second library (<font color='green'>libnc-dods.a</font>), and because there is a circular
| |
| dependence between those two libraries, they must be included twice in the
| |
| linker command.</blockquote>
| |
|
| |
| Finally, <font color='green'>g++</font> was substituted for the link command. (It
| |
| is possible to use <font color='green'>gcc</font> instead of <font color='green'>g++</font>, but in that
| |
| case, <font color='green'>-lg++</font> must be added to the end of the library list.)
| |
|
| |
|
| |
| ===Potential Problems===
| |
|
| |
|
| |
| When a user links an existing program to the OPeNDAP libraries, there are
| |
| several possible conditions that may cause problems.
| |
|
| |
|
| |
| *Some programs use more than one API.
| |
| *Some programs access data using both API and UNIX system calls.
| |
| *Some programs use undocumented features of the APIs.
| |
|
| |
| If this is the case for a given program, there is generally no good solution
| |
| besides rewriting the software to conform to a strict usage of the data
| |
| reading parts of the given API. Of course if the problem is that the
| |
| program uses more than one API, you can try linking the program with an OPeNDAP-compliant version of the second API as well.
| |
|
| |
|
| |
| *Re-linked programs can be very large.
| |
|
| |
|
| |
| |
| The OPeNDAP libraries are large, and the <font color='green'>g++</font>, <font color='green'>www</font>,
| |
| <font color='green'>expect</font>, and <font color='green'>tcl</font> libraries on which they are built are even
| |
| larger. This means that the executable version of a re-linked OPeNDAP
| |
| client can seem unreasonably obese. Much of the disk space is occupied
| |
| by symbol tables, which can be removed from the executable file with
| |
| the <font color='green'>strip</font> utility. In many cases, a user can recover a
| |
| substantial amount of disk space this way.
| |
|
| |
|
| |
| <blockquote>CAUTION: Without familiarity with the OPeNDAP software, it is best
| |
| only to strip the executable files. Stripping object files or
| |
| libraries might leave them in a useless condition for the linker.
| |
| Furthermore, stripping an executable file removes symbol names,
| |
| which may make diagnosing problems more difficult.</blockquote>
| |
|
| |
| The OPeNDAP libraries only affect the data ''reading'' functionality
| |
| of the specified API. There are no OPeNDAP replacements for functions
| |
| like netCDF's <font color='green'>ncputrec()</font>, that ''write'' data to a disk file.
| |
| These functions are included in the OPeNDAP-compliant API library, but
| |
| they operate in a manner identical to the original (non-OPeNDAP)
| |
| versions. That is, they work on local files only; attempting to write
| |
| "over the network" will result in an error.
| |
|
| |
| ==Writing New OPeNDAP Programs==
| |
|
| |
| The OPeNDAP software may also be used to write new programs. This may be
| |
| done either through one of the OPeNDAP-supported API libraries, such as
| |
| netCDF or JGOFS, or by using the OPeNDAP data access protocol directly.
| |
| There are advantages and disadvantages to each approach.
| |
|
| |
|
| |
|
| |
| The biggest advantage of writing new code using an OPeNDAP-supported API
| |
| such as netCDF or JGOFS is that the programmer in question is probably
| |
| already familiar with the use of that API. Writing an OPeNDAP program using
| |
| an adapted API is not significantly different than writing the same
| |
| program with the original API. While writing this new program, it will be
| |
| useful to remember that the data the program uses will often be remote,
| |
| implying that data retrieval may not be instantaneous, and that
| |
| implementation of local caching to store requested data might be a good
| |
| idea, but other than that, the process is the same as writing a program
| |
| using the regular API.
| |
|
| |
|
| |
|
| |
| It is also possible to use the OPeNDAP data access protocol directly.
| |
| This is somewhat more involved than using one of the OPeNDAP-compliant
| |
| API libraries, and C++ is the only language supported for this.
| |
| However, this approach can provide substantially more efficient
| |
| programs. For further information about this approach, refer to the
| |
| technical information about the DAP in [http://www.opendap.org/support/docs.html/api/pguide-html/ <cite>The OPeNDAP Programmer's Guide</cite>].
| |
|
| |
| [[http://docs.opendap.org/index.php/UserGuide1]]
| |
What is OPeNDAP?
OPeNDAP provides a way for ocean researchers to
access oceanographic data anywhere on the Internet from a wide variety of new
and existing programs. By developing network versions of commonly used
data access Application Program Interface (API) libraries, such as
NetCDF ,
HDF ,
JGOFS , and others,
the OPeNDAP project can capitalize on years of development of data analysis and
display packages that use those APIs, allowing users to continue to use
programs with which they are already familiar.
The OPeNDAP architecture uses a client/server model, with a
"client" that sends requests for data out onto the network to some
"server" that answers with the requested data. This is exactly
the model used by the World Wide Web where client programs
called browsers submit requests to web servers for the data that make up web
pages. Of course, OPeNDAP clients can do much more than browse this data. Using
flexible data types suitable for many uses, including scientific data, the
OPeNDAP servers deliver real data directly to the client program in the format
needed by that client.
In fact, the network communication model used by OPeNDAP uses URL
addresses and web servers ("httpd") to deliver data to the
researcher. This is done by using the OPeNDAP software to convert a
researcher's data analysis software into a sophisticated (though
specialized) web browser. In addition to providing network-compatible
versions of popular data access APIs, the OPeNDAP project also
provides a software client and server toolkit to help other developers
create network-compatible OPeNDAP versions of other APIs.
To expand the universe of data available to a user, OPeNDAP incorporates
a powerful data translation facility, so that data may be stored in
data structures and formats defined by the data provider, but may be
accessed by the user in a manner identical to the access of local data
files on the user's own system. Though there are limitations on the
types of data that may be translated (See ( data,trans)),
the facility is flexible and general enough to handle many of the
possible translations. There are two important results:
- A user may not need to know that data from one set are stored in a format different from data in another set. Further, it may be possible that "neither" data set is stored in a format readable by the original (i.e. without OPeNDAP) version of the data analysis and display program he or she uses.
- No segment of OPeNDAP users will be effectively cut off from accessing data because of its storage format. A scientist who wishes to make his or her data available to other OPeNDAP users may do so while keeping that data in what may actually be a highly idiosyncratic storage format. Of course, it doesn't have to be in a highly idiosyncratic format. The point is that OPeNDAP can handle a wide variety of possible cases.
The combination of the OPeNDAP network communication model and the data
translation facility makes OPeNDAP a powerful tool for the retrieval,
sampling, and display of large distributed datasets. Though OPeNDAP was
developed by oceanographers, its application is not constrained to
oceanographic data. The organizing principles and algorithms may be
applied to many other fields where data can be stored on computers.
The population of people who may be interested in a system such as
OPeNDAP may be divided into data consumers and data providers. Though it
was an important observation to the development of OPeNDAP that the two
roles are often assumed by the same scientists, the division is a
useful one for the introduction of the system. The following two
sections provide a broad introduction to the roles of data consumer
and data provider. The remainder of this guide is organized around
this distinction between classes of users.
Why Use OPeNDAP to Read Data?
A scientist wishing to examine and sample some dataset will typically
be comfortable using a relatively small number of data analysis and
display programs or packages. Some of these packages will use one of
the popular data access APIs currently available. However, few data
access APIs provide direct access to distributed data, so this access
must be made with network tools, such as web browsers or "ftp".
(Distributed data refers to datasets that reside on different computers
which are linked by a network such as the Internet. The computers may or
may not be physically remote from each other. The main point is that the
computers manage their data resources independently. In this guide the
terms "remote" and "distributed" are used to imply independently managed
resources.) While relatively straightforward in principle, this process
can nonetheless become time-consuming and somewhat challenging in practice.
The following example illustrates some of the differences between
accessing distributed data with the tools currently in widespread use,
and the same operation using OPeNDAP.
An Example: Using ftp
The advent of the WWW has made possible simple data browsers that
allow sophisticated interactive sampling of on-line datasets. Using a
web browser and "ftp", a user can sample any of several large
oceanographic datasets available on the Internet. However, there are
several problems with these data search engines that may only become
apparent when a user actually tries to use the data.
Among the problems that can arise are those that appear when a user
tries to use the results of one dataset to search a second
dataset. Suppose that a user wishes to choose a sea-surface
temperature image from the NOAA/NASA Pathfinder AVHRR archive at:
http://podaac-www.jpl.nasa.gov/mcsst/mcsst_subset.html
using the results of a
time-series generated from the COADS Climatology archive at:
http://ferret.wrc.noaa.gov/fbin/climate_server
The steps are theoretically straightforward:
1. Create the time series from the COADS Climatology archive. This is done by answering the menu of options on the COADS web page.
2. Import the time series from step 1 to the user's local data analysis system. Note that this step may itself require several steps:
   - The data must be down-loaded, using "ftp" or a similar program.
   - Once down-loaded, the data may have to be converted into a format that can be read by the data analysis program.
3. Examine the data and formulate a request to the AVHRR archive. This is again done by answering the menu of options on the AVHRR web page. Note that the COADS and AVHRR pages are not completely compatible in this respect. For example, the date formats of the two pages are different.
4. Import the result of step 3 to the user's local data display system. This may also require several steps:
   - The data must be down-loaded again.
   - And again, once down-loaded, the data may have to be converted into a format that can be read by the data analysis program. Note that the set of formats available on the COADS page is distinct from the set available from the AVHRR archive.
5. Think about the results.
Though the procedure is straightforward and the web servers are designed
to make sampling the datasets a simple task, upon close examination,
the combination of the steps may create unforeseen difficulties. For
example, a request to the COADS server will return either a spreadsheet
suitable for use on a PC, a netCDF format file, or a file in one
of a selection of simple ASCII formats.
If the user is fortunate, the returned file will already be in a
format compatible with the desired analysis package. But not all users
will be so fortunate. Often this file must be converted to some
other file format before it can be imported to the user's analysis
program. This may or may not be a simple task.
Even a file format for which a user is properly equipped may be used
in an unfamiliar manner. For example, the independent and dependent
variables might be in a different order or an ASCII data file may use
tabs instead of spaces.
Assuming the import of the COADS data has been accomplished and
boundaries for the AVHRR search identified, the task of selecting from
the second archive may begin. Unfortunately, the request to the AVHRR
archive will return either a GIF picture, an HDF format file, or a raw
(binary) data file. Again, importing this output into the user's
analysis program may or may not be simple, but it will not be the same
procedure as the one used for the first data request.
Other problems are also apparent. The COADS Climatology sampling
program requests the user supply dates (month and day), whereas the
AVHRR archive asks for the "Julian day" (an integer between 1 and
365 or 366). One server will accept "S" and "W" to indicate South
latitudes and West longitudes, while the other requires that these be
indicated with negative coordinate values. The sampling of the COADS
dataset, while flexible, may not allow sampling in the manner the user
needs. It cannot, for example, provide a section except along a line
of constant latitude or longitude. If a user wanted to see a section
along a NE-SW line, it would be a challenging and time-consuming
task to assemble one from many small data requests.
Further, it might be desirable to use the results of sampling these
two databases to construct a time series. This could conceivably mean
repeating the entire procedure many times.
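The mismatched conventions described above (month and day versus day of the year, and "S"/"W" suffixes versus negative coordinates) are the kind of glue work a user ends up doing by hand. The following is a minimal sketch of those two conversions. The function names are hypothetical and are not part of either archive's interface or of OPeNDAP.

```python
from datetime import date


def day_of_year(year: int, month: int, day: int) -> int:
    """Return the day within the year: 1 to 365, or 366 in a leap year."""
    return date(year, month, day).timetuple().tm_yday


def signed_coordinate(text: str) -> float:
    """Convert a value such as '23.5S' or '120W' to a signed number.

    South latitudes and West longitudes become negative. A value with no
    hemisphere letter is assumed to be a signed number already.
    """
    text = text.strip().upper()
    hemisphere = text[-1]
    if hemisphere not in "NSEW":
        return float(text)
    magnitude = float(text[:-1])
    return -magnitude if hemisphere in "SW" else magnitude
```

For instance, `day_of_year(1996, 3, 1)` accounts for the leap day in February 1996, and `signed_coordinate("120W")` yields a negative longitude.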
An Example: Using OPeNDAP
To produce the same data selection using OPeNDAP, a user would follow
essentially the same steps. However, the steps themselves would be
performed differently. Once the user's data analysis package has been
converted to an OPeNDAP client (( opd-client,link)),
accesses to the remote datasets are made through the analysis package
itself. Instead of specifying a data file by a pathname reference to
some local disk file, the user specifies a URL, which may point to
either a local or a remote dataset. Here is a recap of the same operations,
outlined as they would be performed by an OPeNDAP application program:
1. Create the time series from the COADS Climatology archive. This is done by using the sampling facilities of whatever data analysis program a scientist is familiar with. If desired, OPeNDAP constraint expressions may be used to reduce the network load, or to provide a sampling scheme not supported by the data analysis program.
2. The data need not be imported to the user's data analysis program, since it was down-loaded and converted automatically in step 1.
3. Examine the data and formulate a request to the AVHRR archive. This is again done through the sampling facilities of whatever data analysis program the user is using, and OPeNDAP constraint expressions. Note that, whatever their actual format, both COADS and AVHRR archives appear to the OPeNDAP client to be stored in identical formats.
4. The data need not be imported to the user's data analysis program, since it was down-loaded and converted automatically in step 3.
5. Think about the results.
It is important to note that "any" data analysis package that can
handle one of the DODS-supported data access APIs can be converted
into an OPeNDAP client program capable of reading data stored by "all"
of the DODS-supported data access APIs. (There are some limitations on
translation. See ( intro,opd-client) and
( data,trans) for more information.) Therefore, assuming
the user has some analysis package capable of doing the required
sampling and analysis on local data, all the steps would be performed
from within that package, just as if the user were operating on local
files. The result is a simpler procedure, even though the same
essential steps are followed.
The OPeNDAP scenario has, among others, the following advantages:
- The user need not learn about any of the archival formats, since the OPeNDAP server and client cooperate to deliver the data in the format in which the analysis package expects to see it. Whereas the user of the ftp server has to worry about importing the data into the analysis program, the OPeNDAP client program imports it transparently.
- The user can sample the distant datasets in any fashion supported by his or her own (local) analysis package. Unnecessary data need not be sent over the Internet.
- By appending a "constraint expression" to the URLs given to the analysis program, the user can sample data using techniques that their analysis program cannot do. (For example, suppose a user wishes to access the NODC XBT database using a program that uses the netCDF API. A program that can process the arrays that netCDF manipulates is largely unsuitable for XBT station data. However, a user can define constraint expressions in the URL to sample the data and deliver it in a form the netCDF API can use. For more information about constraint expressions, see Section (opd-client,constraint). For more information about data models and translation, see Chapter (data).)
- A substantial amount of the searching and sampling is performed on the server machines. This reduces Internet traffic, as well as decreasing the load on the local machine.
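To make the idea of appending a constraint expression to a URL concrete, here is a small sketch that builds such a URL for an array variable. It assumes the DAP 2 array-subsetting form `variable[start:stride:stop]`. The server address and the variable name `sst` are hypothetical, and the section on constraint expressions remains the authority on the syntax.

```python
def hyperslab(start: int, stop: int, stride: int = 1) -> str:
    """Format one dimension's index range as [start:stride:stop]."""
    return f"[{start}:{stride}:{stop}]"


def constrained_url(dataset_url: str, variable: str, *ranges: tuple) -> str:
    """Append a projection of one variable to a dataset URL.

    Each range is a (start, stop) or (start, stop, stride) tuple, given in
    the order of the variable's dimensions.
    """
    projection = variable + "".join(hyperslab(*r) for r in ranges)
    return f"{dataset_url}?{projection}"


# Hypothetical dataset: request rows 0-9 and every second column from 20 to 40.
url = constrained_url("http://example.org/opendap/sst.nc", "sst", (0, 9), (20, 40, 2))
```

Only the requested subset would then travel over the network, which is the traffic reduction described in the list above.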
The OPeNDAP Client
OPeNDAP uses a client/server model. As mentioned, the OPeNDAP
servers are simply "httpd" web servers, equipped to interpret an OPeNDAP URL sent to them. (See ( opd-server).) The OPeNDAP client
program can be any program that uses one of the supported APIs, such
as JGOFS or netCDF, or a program specially developed to
read data from OPeNDAP servers.
Without OPeNDAP, an application program that uses one of the common data
access APIs such as netCDF will operate as shown in File:Intro,fig,unlinked.
The user makes a request for data from the application program. The program in turn
uses procedures defined by the data access API to access the data,
which is stored locally on the host machine. Some APIs are somewhat more
sophisticated than this, of course, but their general operation is
similar to this outline.
[Figure: The Architecture of a Data Analysis Package. (unlinked.gif)]
The operation of an OPeNDAP client is illustrated in File:Intro,fig,linked.
Here, the
same application program that was used in File:Intro,fig,unlinked
has been linked
with an OPeNDAP version of the data access API. Now, in addition to being
able to use local data as before, the application program is able to access
data from OPeNDAP servers anywhere on the Internet in the same manner as the
local data.
To make some program into an OPeNDAP client, it must only be re-linked with
the OPeNDAP implementation of the supported API library. This is a simple
process, generally requiring only a few minutes. The process will
create a program that accepts URLs, specifying a location for the data
somewhere on the Internet, in addition to file pathnames which only
specify a location on the local platform's file system. (See
( opd-client,link).)
[Figure: The Architecture of a Data Analysis Package Using OPeNDAP. (linked.gif)]
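A re-linked program accepts either a local pathname or a URL for its data. The sketch below shows one way a client could tell the two apart. It is an illustration only: the function names are hypothetical, and the actual OPeNDAP libraries may use a different test.

```python
from urllib.parse import urlparse


def is_remote_dataset(name: str) -> bool:
    """Return True when the dataset name looks like an http or https URL."""
    return urlparse(name).scheme in ("http", "https")


def describe_source(name: str) -> str:
    """Say which access path a client would take for this dataset name."""
    if is_remote_dataset(name):
        return "remote: issue a network request to the server"
    return "local: read the file through the original API"
```

With this test, a name such as `/usr/local/data/sst.nc` is treated exactly as it was before re-linking, while a name beginning with `http://` is routed to the network.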
OPeNDAP also provides a data translation facility. Data from the original
data file is translated by the OPeNDAP server into an OPeNDAP data model for
transmission to the client. Upon receiving the data, the client
translates the data into the data model it understands. (See
( data) for more information about the OPeNDAP data model.)
Because the data transmitted from an OPeNDAP server to the client travel
in the OPeNDAP format, the data set's original storage format is completely
irrelevant to the user of an OPeNDAP client. If the client was originally
designed to read netCDF format files, the data returned by the
OPeNDAP-netCDF library will appear to have been read from a netCDF file,
whatever the actual format of the files from which the data were
read. (Note that there is a limit to what can be translated. An
API meant to support two-dimensional arrays may be able to handle
one-dimensional vector data, but a program designed to process
one-dimensional vector data will not know what to do with a
two-dimensional array. The set of data access APIs supported by OPeNDAP
contains several such mismatches. See
Section (data,trans) for more information.) If the
program expects JGOFS data, the DODS-JGOFS library will return data
that seem to have come from a JGOFS dataset, again, no matter what the
actual input file format.
OPeNDAP does not pretend to remove all the overhead of data searches. A
user will still have to keep track of the URLs of interesting data
sets in the same way a user must now keep track of the names of files
containing interesting data. An OPeNDAP catalog service is in the
process of being constructed that will help users scan the available
datasets.
Providing Data with OPeNDAP
The OPeNDAP data provider is the person or organization willing to make
their digital datasets available to the community with an OPeNDAP server.
The designers of OPeNDAP recognized that many of the data users are also
the data providers, and OPeNDAP was built with a recognition that
providing the data should be as simple and as straightforward as
possible. In many cases, once a local web server is equipped to become
an OPeNDAP server, a scientist need do very little beyond what must
be done simply to make the data available locally (i.e., put the data
into a file format that can be read by the locally used data analysis
and display programs). The tasks of a data provider can be separated
into the following parts:
- Install and configure the OPeNDAP server. (( opd-server,install).)
- Create whatever ancillary data files are needed by the data set (if any). (( intro,ancillary).)
- Register the data set with the master directory (optional).
- Create the data catalog.
The OPeNDAP Server
The OPeNDAP data server is simply made up of a regular httpd server
equipped with CGI programs (or filters) that will respond to requests
for dataset structure, data attributes, and data itself. (See
( data,dap) for a description of the data returned by these
requests and see ( opd-client,url) for a description of the
OPeNDAP URL syntax used to send these requests.) Most of the task of a
data provider consists of configuring this server. While perhaps not
a trivial task, it potentially represents far less effort than
packaging a dataset for submission to some central data archive.
Furthermore, modifying a server's configuration to accommodate new
data will be an almost trivial task, involving the simple editing of a
configuration file.
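The three kinds of request the server answers (dataset structure, data attributes, and the data itself) can be sketched as three URLs derived from one dataset address. This sketch assumes the DAP 2 convention of appending `.dds`, `.das`, and `.dods` to the dataset URL. The host and path are hypothetical, and the section on OPeNDAP URL syntax is the authority on the actual form.

```python
def request_urls(dataset_url: str) -> dict:
    """Return the URL for each kind of request against one dataset."""
    return {
        "structure": dataset_url + ".dds",    # dataset structure
        "attributes": dataset_url + ".das",   # data attributes
        "data": dataset_url + ".dods",        # the data itself
    }


# Hypothetical dataset address, used only for illustration.
urls = request_urls("http://example.org/opendap/sst.nc")
```

A client library would normally build and send these requests itself, so the user only ever supplies the base dataset URL.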
Ancillary Data
In order for an OPeNDAP client to accept data from an OPeNDAP server, it must be able
to allocate the data structures and arrange internal labels to organize the
incoming data. The information the client library needs to do this
organizing is called the ancillary data. (It is also referred to as
the Data Descriptor Structure and the Data Attribute Structure. See
Chapter (data) for more details about these structures.) For many
APIs, the ancillary data is inherent in the data files themselves, and the
OPeNDAP server can glean that information by scanning the data files. For large
data archives, where scanning the data files is impractical, and that might
not change often, OPeNDAP can cache the ancillary data to speed access times.
When a client requests the ancillary data, the OPeNDAP server can check this
data cache first before scanning the data files.
This feature is useful in other cases because not all data file formats
are self-describing. For example, a data set might contain several files of
time vs. temperature data; the header information describing which numbers
are temperature and which time may be in a different file or may simply
be understood by the user of the local data analysis program equipped
to look at this data. As an example, data accessed by OPeNDAP servers using
the FreeForm data access API require provider-created ancillary data files.
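The cache lookup described in this section (check the ancillary data cache first, and scan the data files only on a miss) can be sketched as follows. This illustrates the idea and is not the server's implementation. The class, the stand-in scanning function, and the dataset name are all hypothetical.

```python
class AncillaryCache:
    """Serve ancillary data from a cache, scanning data files only on a miss."""

    def __init__(self, scan):
        self._scan = scan      # function: dataset name -> ancillary description
        self._cache = {}

    def get(self, dataset: str) -> dict:
        if dataset not in self._cache:
            # Cache miss: fall back to the (possibly slow) scan of the files.
            self._cache[dataset] = self._scan(dataset)
        return self._cache[dataset]


scan_calls = []


def scan_data_file(dataset: str) -> dict:
    """Stand-in for scanning a data file; records each call for illustration."""
    scan_calls.append(dataset)
    return {"dataset": dataset, "variables": ["time", "temperature"]}


cache = AncillaryCache(scan_data_file)
first = cache.get("buoy42")
second = cache.get("buoy42")   # answered from the cache, no second scan
```

For formats that are not self-describing, the same cache could be filled from provider-created ancillary files instead of a scan.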
Administration and Centralization of Data
Under OPeNDAP, there is no central archive of data. Data under OPeNDAP
is organized in a manner similar to the World Wide Web itself. That
is, all one need do to make one's data available is to start up a
properly configured "httpd" server on an Internet node that has
access to the data to be served. Each data provider is free to join
and to leave the system when it is convenient, just as any proprietor
of a web page is free to delete it or add to it as whimsy demands.
Of course, as can also be seen on the World Wide Web, there are some
disadvantages to the lack of central authority. If no one knows about
a web site, no one will visit it. Similarly, listing a dataset in a
central data catalog, such as the Global Change Master Directory
(http://gcmd.gsfc.nasa.gov/), can make data available to other researchers in a way that simply
configuring an OPeNDAP server does not. OPeNDAP provided a facility for
registering a data set with the GCMD catalog, which makes the data set
known to the OPeNDAP data location service.
The remainder of this book will be divided into three major sections:
instructions on the building and operating of OPeNDAP clients; a tutorial
and reference on running OPeNDAP servers and making data available to OPeNDAP
clients; and technical documentation describing the implementation details
(and the motivation behind many of the design decisions) of the OPeNDAP
software.