Wiki Testing/ServerInstallationGuide2

From OPeNDAP Documentation
Revision as of 03:12, 4 October 2007 by Yuan

Installing the OPeNDAP Server

Most of the task of installing the OPeNDAP server consists of getting the required Web server installed and running. The variety of available Web servers puts that task beyond the scope of this guide. Proceed with the following steps only after the Web server itself works. See ( server,testing) for hints on how to tell whether the server is working. In short: first get the web server working, then install the OPeNDAP server.

If you want to install the DODS Relational Database Server (DRDS) to serve data stored in a relational DBMS such as Oracle or SQL Server, see ( server,java).

Step by Step

Here are the steps for installing the server's base software and data handlers (together, these are the components of the server) and the data to be served.

  1. Install the web server.
  2. Download and install the OPeNDAP server base software.
  3. Choose one or more data handlers, download and install.
  4. Configure your data.


In a little more detail, here are those same steps:


  1. Install the web server to be used. This is not an OPeNDAP program, and will have its own documentation. If the server is already installed, figure out how to run a CGI program with that server. Testing: If your web server is running, you should be able to request a web page from it. From a web browser, try sending a simple request containing only the machine name: http://machine, where machine should be replaced by the name of the computer on which you're doing the installation. When the simple page works, try executing a CGI program. (The simplest CGI Perl program is on ( simple,perl).) See ( server,testing) for more ways to check if the web server is working.
  2. Download and install the OPeNDAP server base software. It is usually easiest to use the pre-compiled binary distributions. Look at the OPeNDAP home page to get the software. See the installation appendix ( install) for more information. Testing: The software should be installed in three directories: 1) the Perl software in /usr/local/share/dap-server; 2) the CGI program in /usr/local/share/dap-server-cgi; and 3) the binary programs dap_usage, dap_asciival and dap_www_int in /usr/local/bin. If you build the software from sources, you can install the server software under a root other than /usr/local.
  3. Once the server software has been installed, you'll need to install one or more data handlers. OPeNDAP provides data handlers for NetCDF, HDF4 (and HDF-EOS), DSP, JGOFS, and FreeForm. You need to download these separately and follow their installation instructions.
  4. Next you will need to configure the server's CGI so that it can find the handlers. During this step a handful of configuration parameters must be set. Locate the file named dap-server.rc in /usr/local/share/dap-server-cgi. See ( install,config,server) for instructions about how to configure this file. Testing: When the server is working, you should get a response to a version request, where you query the software for its version (release) number. To do this, enter a URL like the following into a web browser:
     http://machine/cgi-bin/nph-dods/version 
    Remember to replace machine with the name of the machine you're using. Also, the CGI directory you're using may not be called cgi-bin. If you're not using a CGI directory, see the next item.
    • If your server uses name conventions to identify CGI programs, change the name of the dispatch script to conform with the local convention: e.g. nph-dods to nph-dods.cgi. The other service program names do not need to be changed. Testing: Try to get a version number from the server. If your CGI programs are identified with a suffix like .cgi, try a URL like this:
       http://machine/nph-dods.cgi/version 
    • If you are using DODS release 3.5 or later, you also need to make sure that the dispatch configuration file dap-server.rc is copied into the directory with the CGI and correctly protected. See ( install,config,server) for instructions about how to configure this file. Note that earlier servers (from version 3.2 through 3.4) used a configuration file named dods.in or dods.rc. Testing: This cannot be tested without having some data installed.
  5. See ( install,data-install) for instructions on installing and testing the data to be served.

See ( server,testing) for more tests to make sure you've done each step correctly.
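The version request described in the steps above can also be made from a script instead of a browser. The following is a minimal sketch, not part of the OPeNDAP distribution: the function names version_url and fetch_version are hypothetical, and the layout of the server's reply is not assumed, so the response text is returned unparsed.

```python
from urllib.request import urlopen


def version_url(machine, cgi_dir="cgi-bin", script="nph-dods"):
    """Build the version-request URL for a server running on `machine`.

    Pass cgi_dir="" when the web server identifies CGI programs by suffix
    rather than by directory (for example script="nph-dods.cgi").
    """
    parts = [p for p in ((cgi_dir or "").strip("/"), script) if p]
    return "http://" + machine + "/" + "/".join(parts) + "/version"


def fetch_version(url, timeout=10):
    """Request `url` and return the response body as text.

    Nothing is assumed about the layout of the reply; the text is
    returned as-is so that you can inspect it.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

To try it, call print(fetch_version(version_url("machine"))) with machine replaced by the name of your server. An error from urlopen at this point usually means the web server or the CGI setup needs attention before the OPeNDAP software itself is suspected.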

NOTE: In addition to some specialized Perl modules, the OPeNDAP server uses the HTML::Parser Perl module. You should have installed this prior to installing the dap-server package.

The CGI configuration of a web server is dependent on the particular web server you use. Consult the documentation for that server for more information. In our experience, most servers are configured with a CGI directory by default, and that configuration avoids a couple of potential security holes.

How to find the CGI directory. To find which directory is the cgi-bin directory, you can look in the server's configuration file for a line like:

ScriptAlias /cgi-bin/ /var/www/cgi-bin/

For both the NCSA and Apache servers, the option ScriptAlias defines where CGI programs may reside. In this case they are in the directory /var/www/cgi-bin. URLs with cgi-bin in their path will automatically refer to programs in this directory.
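If the configuration file is long, a short script can list its ScriptAlias lines for you. This is a minimal sketch under a few assumptions: the function name find_script_aliases is hypothetical, the directive is assumed to sit on a single line, and the path to the configuration file varies between installations, so the function takes the file's text rather than guessing a location.

```python
import shlex


def find_script_aliases(config_text):
    """Return (url_prefix, directory) pairs for each ScriptAlias line.

    Comment lines (starting with #) and blank lines are skipped.
    Related directives such as ScriptAliasMatch are ignored.
    """
    aliases = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if not stripped.lower().startswith("scriptalias"):
            continue
        # shlex.split copes with a quoted directory such as "/var/www/cgi-bin/".
        tokens = shlex.split(stripped)
        if tokens[0].lower() == "scriptalias" and len(tokens) >= 3:
            aliases.append((tokens[1], tokens[2]))
    return aliases
```

Read the configuration file into a string, pass it to find_script_aliases, and the second element of each pair is a directory where CGI programs such as nph-dods may reside.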

At this point, you will be wondering if things are working yet or not. Again, check out ( server,testing) for a detailed set of tests to get you going.

Configure the Server


Starting with version 3.5, the OPeNDAP server makes much more extensive use of its configuration file. The dap-server.rc configuration file is used to tailor the server to your site. The file contains a handful of parameters plus the mappings between different data sources (typically files, although that doesn't have to be the case) and handler programs. The format of the configuration file is:


     <parameter> <value> ... <value>
The configuration file is line-oriented, with each parameter appearing on
its own line. Blank lines are ignored and the # character is used to
begin comment lines. Comments have to appear on lines by themselves.
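To make the format concrete, here is a minimal sketch of a reader for files laid out this way. It is not the server's own parser (which is written in Perl); the function name parse_config is hypothetical, and values are assumed to be separated by whitespace.

```python
def parse_config(text):
    """Parse dap-server.rc style text into (parameter, [values]) pairs.

    Blank lines and lines beginning with # are skipped. A list of pairs
    is returned rather than a dict because a parameter such as handler
    may appear on several lines.
    """
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # The first word is the parameter name; the rest are its values.
        parameter, *values = stripped.split()
        entries.append((parameter, values))
    return entries
```

Splitting on whitespace keeps backslashes in regular expressions intact, which a shell-style tokenizer would not. The double quotes mentioned for the exclude parameter are left in place by this sketch.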
The parameters recognized are:
  • data_root
   data_root <path> 
Define this if you do not want the server to assume data are located under the web server's DocumentRoot. The value <path> should be the fully qualified path to the directory which you want to use as the root of your data tree. For example, we have a collection of netcdf, hdf, etc., files that we store in directories named /usr/local/test_data/data/nc, /usr/local/test_data/data/hdf, etc., and we set data_root to /usr/local/test_data. The value of <path> should not end in a slash.

The OPeNDAP server's directory browsing functions do not work when using the data_root option. They do still work when locating data under DocumentRoot. Also, it's not an absolute requirement that this path be a real directory or that your data are in 'files'. Data can reside in a relational database, for example, and in that case the base software will use <path> as a prefix to the path part of the URL it receives from the client.

  • timeout
   timeout <seconds> 
This sets the OPeNDAP server timeout value, in seconds. This is different from the httpd timeout. OPeNDAP servers run independently of httpd once the initial work of httpd is complete. Setting timeout ensures that your OPeNDAP server does not continue indefinitely if something goes wrong (e.g., a user makes a huge request to a database). The default is 0, which means no timeout.
  • cache_dir
   cache_dir <directory> 

When data files are stored in a compressed format such as gzip or UNIX compress, the OPeNDAP server first decompresses them and then serves the decompressed file. The files are cached as they are decompressed. This parameter tells the server where to put that cache. Default: /usr/tmp

  • cache_size
   cache_size <size in MB> 

How much space can the cached files occupy? This value is given in megabytes. When the total size of all the decompressed files exceeds this value, the oldest remaining files are removed, one at a time, until the total size drops below the parameter value. If you are serving large files, make sure this value is at least as large as the largest file.

  • maintainer
   maintainer <email address> 

The email address of the person responsible for this server. This email address will be included in many error messages returned by the server. Default: support@unidata.ucar.edu

  • curl
   curl <path> 

This parameter is used to set the path to the curl executable. The curl command line tool is used to dereference URLs when the server needs to do so. In some cases the curl executable might not be found by the CGI. This can be a source of considerable confusion because a CGI program run from a web daemon uses a very restricted PATH environment variable, much more restricted than a typical user's PATH. Thus, even if you, as the server installer, have curl on your PATH, nph-dods may not be able to find the program unless you tell it exactly where to look.

  • exclude
   exclude <handler> ... <handler> 

This is a list of handlers whose regular expressions should not be used when building the HTML form interface for this server. In general, this list should be empty. However, if you have a handler that is bound to a regular expression that is very general (such as .* which will match all files), then you should list that handler here, enclosing the name in double quotes. See the next item about the 'handler' parameter. Default: No handlers are excluded.

  • handler
   handler <regular expression> <handler name> 

The handler parameter is used to match data sources with particular handler programs used by the server. In a typical OPeNDAP server setup, the data sources are files and the regular expressions choose handlers based on the data file's extension. However, this need not be the case. The OPeNDAP server actually matches the entire pathname of the data source when searching for the correct handler to use. Here are the values assigned to the handler parameter in the default dap-server.rc file:

 
# Look for common file extensions.  
handler .*\.(HDF|hdf|EOS|eos)(.Z|.gz)*$ /usr/local/bin/dap_hdf_handler
handler .*\.(NC|nc|cdf|CDF)$ /usr/local/bin/dap_nc_handler 
handler .*\.(dat|bin)$ /usr/local/bin/dap_ff_handler 
handler .*\.(pvu)(.Z|.gz)*$ /usr/local/bin/dap_dsp_handler  

# For JGOFS datasets, match either the dataset name or the absence of an 
# extension. The latter case is sort of risky, but if you have lots of JGOFS 
# datasets it might be appealing.
handler .*/test$ /usr/local/bin/dap_jg_handler
# handler .*/[^/]+$ /usr/local/bin/dap_jg_handler

Consider a URL like this:

 http://test.opendap.org/opendap/nph-dods/data/nc/fnoc1.nc 

When the nph-dods script is executed, the "file name" part of the URL is /data/nc/fnoc1.nc. Testing this against the default configuration file matches the second handler line, which indicates that the dap_nc_handler (NetCDF) data handler should be used to process this request. The request is then dispatched to the handler for processing.
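The lookup just described can be imitated in a few lines. This is a sketch only: the server itself is written in Perl, the function names data_path and find_handler are hypothetical, and the use of re.search (a match anywhere in the path) is an assumption meant to mirror Perl's default matching. The rules are the handler lines from the default dap-server.rc.

```python
import re
from urllib.parse import urlparse

# (regular expression, handler program) pairs from the default dap-server.rc
DEFAULT_RULES = [
    (r".*\.(HDF|hdf|EOS|eos)(.Z|.gz)*$", "/usr/local/bin/dap_hdf_handler"),
    (r".*\.(NC|nc|cdf|CDF)$", "/usr/local/bin/dap_nc_handler"),
    (r".*\.(dat|bin)$", "/usr/local/bin/dap_ff_handler"),
    (r".*\.(pvu)(.Z|.gz)*$", "/usr/local/bin/dap_dsp_handler"),
    (r".*/test$", "/usr/local/bin/dap_jg_handler"),
]


def data_path(url, script="nph-dods"):
    """Return the part of the URL path that follows the dispatch script."""
    path = urlparse(url).path
    _, found, rest = path.partition(script)
    return rest if found else None


def find_handler(path, rules=DEFAULT_RULES):
    """Return the handler of the first rule whose pattern matches `path`."""
    for pattern, handler in rules:
        if re.search(pattern, path):
            return handler
    return None
```

For the URL above, data_path gives /data/nc/fnoc1.nc and find_handler picks /usr/local/bin/dap_nc_handler. A path that matches no rule yields None in this sketch; what the real server does in that case is not covered here.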

The default configuration file used to be set up so that files without extensions were handled by the JGOFS data handler. This caused some problems for sites whose data files did not always use common extensions. If you need to serve many JGOFS data files, then uncomment that line or write special regular expressions for the JGOFS data sources.

"Regular expressions", advanced pattern-matching languages, are a powerful feature of Perl and many other computer languages. Powerful enough, in fact, to warrant at least one book about them (Mastering Regular Expressions by Jeffrey Friedl, O'Reilly, 1997). (For a complete reference online, which is not a particularly good place to learn about them for the first time, see http://www.perldoc.com/perl5.6/pod/perlre.html.) Briefly, however, the above patterns test whether a filename is of the form file.ext.comp, where comp (if present) is Z or gz, and ext is one of several possible filename extensions that might indicate a specific storage API. If these default rules will not work for your installation, you can rewrite them. For example, if all your files are HDF files, you could replace the default configuration file with one that looks like this:
 handler .* /usr/local/bin/dap_hdf_handler 

The .* pattern matches any file name (the . matches any single character and * matches zero or more occurrences of the previous character or metacharacter), and indicates that whatever the name of the file sought, the HDF service programs are the ones to use. If you have a situation where all the files in a particular directory (whatever their extensions) are to be handled by the DSP service programs, and all other files served are JGOFS files, try this:

 
handler \/dsp_data\/.*$ /usr/local/bin/dap_dsp_handler 
handler .*/[^/]+$ /usr/local/bin/dap_jg_handler 

The rules are applied in order, and the first rule with a successful match returns the handler that will be applied. The above set of rules implies that everything in the dsp_data directory will be processed with the DSP handler, and everything else will be sent to the JGOFS handler.
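Because the first match wins, the order of the handler lines matters. The sketch below, which uses a hypothetical first_match helper and again assumes Perl-style matching via re.search, applies the two rules above in the given order and then in reverse.

```python
import re

DSP = "/usr/local/bin/dap_dsp_handler"
JGOFS = "/usr/local/bin/dap_jg_handler"

# The two rules from the example, in the order they appear in the file.
RULES = [
    (r"\/dsp_data\/.*$", DSP),
    (r".*/[^/]+$", JGOFS),
]


def first_match(path, rules):
    """Return the handler of the first rule that matches, or None."""
    for pattern, handler in rules:
        if re.search(pattern, path):
            return handler
    return None


in_order = first_match("/dsp_data/image.pvu", RULES)
reversed_order = first_match("/dsp_data/image.pvu", list(reversed(RULES)))
```

Here in_order is the DSP handler, while reversed_order is the JGOFS handler, because the general pattern .*/[^/]+$ also matches paths inside dsp_data. Specific rules therefore belong before general ones.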

Compression