THREDDS using XSLT

From OPeNDAP Documentation

Background

Prior to Hyrax 1.5, THREDDS catalog functionality in Hyrax was provided by the UNIDATA implementation of THREDDS. In essence, Hyrax depended on the entire TDS code bundle. This was a large and complex dependency for Hyrax, and the UNIDATA THREDDS implementation (version 3.16) had significant scalability problems with large catalogs: they simply consumed all available memory in the JVM.

In response, we have written new code for Hyrax that replaces the UNIDATA code with two OLFS handlers.

BES THREDDS Handler

The opendap.bes.BESThreddsDispatchHandler provides THREDDS catalogs for all data served from a BES. It requires no configuration. Simply adding it to the OLFS configuration (file: $CATALINA_HOME/content/opendap/olfs.xml) will provide THREDDS catalogs for data served from the BES.

This handler uses XSL transforms to convert the BES <showCatalog> response into a THREDDS catalog.

Example Configuration

           <Handler className="opendap.bes.BESThreddsDispatchHandler" />

THREDDS Dispatch Handler

The opendap.threddsHandler.Dispatch handler provides THREDDS catalog functionality for static THREDDS catalogs located on the same system as the OLFS. The handler uses XSL transforms to provide HTML presentation views of both the catalogs and the individual datasets within a catalog. Like the TDS, it makes data access links available on the dataset pages (if the catalog contains the information for the access links).

Memory Caching

The implementation can be configured to use memory caching of THREDDS catalogs to improve speed and reduce disk thrashing.

When memory caching is enabled, the handler will traverse the local THREDDS catalogs at startup. Each catalog file will be read into a memory buffer and cached. The memory buffer is parsed to verify that the catalog represents valid XML, but the resulting document is not saved. When a thredds:catalogRef element is encountered during the traversal its href is evaluated:

  • If the href is a relative URL (does not begin with "/" or "http://"), then the referenced catalog is traversed and cached.
  • If the href begins with a "/" character, it is assumed that the catalog is being provided by another service on the same system, and it is not traversed or cached.
  • If the href begins with "http://", it is assumed to be a remotely hosted catalog provided by another service on a different system, and it is not traversed or cached.
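
The three rules above can be sketched as a small classifier. This is only an illustration in Python, not the handler's actual code (the OLFS is a Java application); the function name and the returned labels are hypothetical.

```python
def classify_catalog_ref(href: str) -> str:
    """Classify a thredds:catalogRef href using the three rules listed above.

    The labels are hypothetical names chosen for this sketch.
    """
    if href.startswith("http://"):
        # Remote catalog on a different system: not traversed or cached.
        return "remote"
    if href.startswith("/"):
        # Another service on the same system: not traversed or cached.
        return "same-system-service"
    # Anything else is treated as a relative URL: traversed and cached.
    return "relative"


def should_traverse(href: str) -> bool:
    """Only relative hrefs are followed during the startup traversal."""
    return classify_catalog_ref(href) == "relative"
```

For example, `should_traverse("subdir/catalog.xml")` is true, while `should_traverse("/opendap/catalog.xml")` and `should_traverse("http://example.org/thredds/catalog.xml")` are both false.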

When a client asks for an XML catalog response, the entire cached buffer for the catalog is dumped to the client in a single write command. This should be very fast, as all that must happen is a byte buffer is written to the response stream.

If the client is asking for an HTML view of the catalog, the buffer is parsed and passed through an XSL transform to generate the HTML page. The thinking behind this is that machines are likely to be traversing the XML files and require very fast response times, while humans would be traversing the HTML views of the catalog, and the latency introduced by parsing and performing the transform would be acceptable to most users.

If memory caching is disabled, startup proceeds in the same way, except that no data is cached. Subsequent client requests for THREDDS products are handled in the same manner as before, except that the catalog content is read from disk each time. Although this makes XML responses much slower, it allows the handler to scale to much larger static catalog collections.
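
The difference between the two modes can be sketched as follows. This is a minimal Python illustration of the behavior described above, not the handler's implementation; the class name, method names, and the injectable `reader` parameter are all assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict


def read_from_disk(path: str) -> bytes:
    """Default reader: load the raw catalog bytes from the file system."""
    return Path(path).read_bytes()


class CatalogStore:
    """Hypothetical stand-in for the handler's catalog storage."""

    def __init__(self, use_memory_cache: bool,
                 reader: Callable[[str], bytes] = read_from_disk) -> None:
        self.use_memory_cache = use_memory_cache
        self._reader = reader
        self._cache: Dict[str, bytes] = {}

    def ingest(self, path: str) -> None:
        """Startup step: read the catalog and check that it parses as XML."""
        data = self._reader(path)
        # Raises ET.ParseError if the buffer is not well-formed XML.
        # The parsed document is discarded; only the raw bytes are kept.
        ET.fromstring(data)
        if self.use_memory_cache:
            self._cache[path] = data

    def xml_response(self, path: str) -> bytes:
        """Return the raw catalog bytes for an XML request."""
        if self.use_memory_cache:
            return self._cache[path]   # served straight from memory
        return self._reader(path)      # read from disk on every request
```

With `use_memory_cache=True`, each catalog is read once at startup and every later XML request is answered from the cached buffer; with `False`, every request goes back to the reader, which trades response time for a smaller memory footprint.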

prefix

This handler requires a prefix element in the configuration: <prefix>thredds</prefix>. The value of the prefix element is used by the handler to identify requests intended for it. Basically, it will claim any request whose path begins with the prefix.

For example, if the prefix is set to "thredds", then this request:

http://localhost:8080/opendap/thredds/catalog.xml

will be claimed by the handler, while this request:

http://localhost:8080/opendap/catalog.xml

will not (although it would be claimed by the BES THREDDS Handler).
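
The prefix test can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the servlet context path "/opendap" is assumed from the example URLs above. The sketch uses a plain "begins with" comparison, as described; whether the real handler also requires a path-segment boundary after the prefix is not stated here.

```python
from urllib.parse import urlparse


def handler_claims(url: str, prefix: str = "thredds",
                   context_path: str = "/opendap") -> bool:
    """Return True if the request path, relative to the assumed context
    path, begins with the configured prefix."""
    path = urlparse(url).path
    if not path.startswith(context_path):
        return False
    # Path relative to the servlet context, without the leading slash.
    relative = path[len(context_path):].lstrip("/")
    return relative.startswith(prefix)
```

Applied to the two example requests, `handler_claims("http://localhost:8080/opendap/thredds/catalog.xml")` is true and `handler_claims("http://localhost:8080/opendap/catalog.xml")` is false.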

Presentation View (HTML)

Replacing the .xml at the end of a catalog's name with .html will cause the handler to return an HTML presentation view of the catalog.
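
As a small illustration of the URL change, the following Python helper derives the presentation-view URL from a catalog URL. The function name is hypothetical, and the example URL is the one used in the prefix section above.

```python
def html_view_url(catalog_url: str) -> str:
    """Swap a trailing .xml for .html to get the HTML presentation view URL."""
    suffix = ".xml"
    if not catalog_url.endswith(suffix):
        raise ValueError("expected a catalog URL ending in .xml")
    return catalog_url[:-len(suffix)] + ".html"
```

For example, `html_view_url("http://localhost:8080/opendap/thredds/catalog.xml")` yields `http://localhost:8080/opendap/thredds/catalog.html`.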

Example Configuration

           <Handler className="opendap.threddsHandler.Dispatch">
               <prefix>thredds</prefix>
               <useMemoryCache>true</useMemoryCache>
           </Handler>