Hyrax - THREDDS Configuration: Difference between revisions

From OPeNDAP Documentation

Revision as of 03:33, 2 July 2008

@TODO: Revise this page to improve clarity and usability

This release of Hyrax supports the complete THREDDS catalog service stack. THREDDS catalogs are controlled by a catalog.xml file located in the (persistent) content directory for the OLFS (More on that here). Rather than provide an exhaustive explanation of the THREDDS catalog functionality and configuration I will appeal to the existing documents provided by our fine colleagues at UNIDATA:

Did you read all that? Excellent!



Configuration Instructions

  • The THREDDS catalog configuration is stored in the file $CATALINA_HOME/content/opendap/catalog.xml

  • Each item that appears in the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) should have a corresponding element as a child of the top level <catalog> element in the catalog.xml file. Collections (aka directories) are represented by a <datasetScan> element. Granules (files) are represented as <dataset> elements. It is not possible to map the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) to a single <datasetScan> element in the THREDDS catalog.
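As a rough sketch of that layout, assume a BES top level that holds two collections, "nc" and "hdf", plus one file, "sample.nc" (the file name and the catalog name are hypothetical). The <dataset> attributes shown (name, ID, urlPath, serviceName) come from the THREDDS catalog schema; how Hyrax resolves urlPath for a top level granule is an assumption here, not something this page confirms. Filters and metadata are left out and are covered in the steps that follow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog name="Layout Sketch"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink">

     <service name="OPeNDAP-Hyrax" serviceType="OPeNDAP" base="/opendap/"/>

     <!-- One <datasetScan> per top level collection (directory) -->
     <datasetScan location="/bes/nc" path="nc" name="nc" serviceName="OPeNDAP-Hyrax">
          <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />
     </datasetScan>

     <datasetScan location="/bes/hdf" path="hdf" name="hdf" serviceName="OPeNDAP-Hyrax">
          <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />
     </datasetScan>

     <!-- One <dataset> per top level granule (file).
          Assumption: urlPath is the file name relative to the service base. -->
     <dataset name="sample.nc" ID="sample.nc" urlPath="sample.nc" serviceName="OPeNDAP-Hyrax" />

</catalog>
```

Note that nothing in this sketch points a single <datasetScan> at the BES root itself; each top level item gets its own element.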

Representing Collections (directories): The <datasetScan> element

  1. For each collection that appears in the top level of the OPeNDAP directory response (probably http://localhost:8080/opendap/contents.html if Hyrax is running on your machine) you SHOULD create a <datasetScan> in the catalog.xml file.

    The THREDDS catalog views will NOT include a collection for which this is not done!

  2. The serviceName attribute in the <datasetScan> element must be set to "OPeNDAP-Hyrax", matching the name attribute of the <service> element at the top of the file.

  3. Each <datasetScan> element has three crucial attributes that must be set to correspond to the collection that is meant to be traversed: location, path, and name. These attributes should be set as follows:

    • location="/bes/collectionName"
          Where collectionName is the name of the top level collection in the BES that is to be traversed. The prefix /bes/ is required.

    • path="collectionName"
          Where collectionName is the same value as used in the location attribute. The collectionName MUST NOT start with a "/" character.

    • name="collectionName"
          Where collectionName is the same value as used in the path attribute. The name attribute is used for presentation and could be set differently from the collectionName, but doing so will invariably lead to confusion.

  4. In each <datasetScan> element that you create you MUST include the following element: <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" /> This is the CrawlableDataset implementation that allows the THREDDS implementation to work with the BES.

  5. You should apply a filter to the data that coincides with the value of "BES.Catalog.catalog.TypeMatch" for the data types being served. I suggest that you make the filter expose ALL of the data types served by the BES. See the THREDDS pages on the DatasetScan Element for filter details. The point of this is to remove files from the catalog view that are NOT OPeNDAP data, for example README files.

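  6. The <datasetScan> is allowed to contain a THREDDS <metadata> element. The details of its use can be found in the THREDDS datasetScan element documentation: http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/datasetScanElement.html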
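To illustrate step 5, here is a minimal sketch of a single <datasetScan> whose filter exposes several data types at once. It assumes a hypothetical top level collection named "mixed" and a BES TypeMatch setting that serves files ending in .nc, .hdf and .dat; adjust the wildcards to whatever your own TypeMatch value actually covers.

```xml
<!-- Sketch only: "mixed" is a hypothetical top level BES collection -->
<datasetScan location="/bes/mixed" path="mixed" name="mixed" serviceName="OPeNDAP-Hyrax">

     <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

     <filter>
          <!-- Hide dot-files and dot-directories -->
          <exclude wildcard=".*" atomic="true" collection="true" />
          <!-- One include per served data type; extensions are assumed to mirror TypeMatch -->
          <include wildcard="*.nc" />
          <include wildcard="*.hdf" />
          <include wildcard="*.dat" />
     </filter>

</datasetScan>
```

With include rules like these, a file that matches none of the wildcards (a README, for instance) should be left out of the catalog view.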

Representing Granules (files): The <dataset> element


Reinitializing THREDDS

The THREDDS catalog is read when Tomcat is started. Hyrax will check the last modified date of the catalog.xml file prior to responding to a THREDDS catalog request. If the last modified date has changed since Tomcat started, then Hyrax will reload all of the THREDDS catalog information.

So if you make changes to ANY of the THREDDS catalog files in the $CATALINA_HOME/content/opendap directory tree, then there are two ways for you to get Hyrax to update:

  1. Change the last modified date of the file: $CATALINA_HOME/content/opendap/catalog.xml

    This can be accomplished with the Unix "touch" command:

        touch $CATALINA_HOME/content/opendap/catalog.xml

    This will cause Hyrax to reload all of the THREDDS catalogs the next time that a THREDDS catalog request is made. If you have a big THREDDS catalog configuration, you might want to make this request yourself so that an unsuspecting user doesn't have to wait for a response while Hyrax is reloading.

    OR you could:

  2. Restart Tomcat.



THREDDS Catalog Examples

Example 1

Here is an example catalog.xml file for a Hyrax installation in which the top level of the BES shows only ONE collection called "data":

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="Hyrax Test Catalog"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink">

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

     <service name="OPeNDAP-Hyrax" serviceType="OPeNDAP" base="/opendap/"/>

     <datasetScan location="/bes/data" path="data" name="AllMyData" serviceName="OPeNDAP-Hyrax">

	       <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

           <filter>
               <exclude wildcard=".*" atomic="true" collection="true" />
               <include wildcard="*" />
           </filter>
           <addDatasetSize />

           <metadata inherited="true">
               <serviceName>OPeNDAP-Hyrax</serviceName>
               <authority>opendap.org</authority>
               <dataType>Random</dataType>
           </metadata>
     </datasetScan>

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
</catalog>

Example 2

Here is an example catalog.xml file for a Hyrax installation in which the top level of the BES contains 3 collections called "nc", "hdf", and "ff":

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="Hyrax Test Catalog"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink">

     <service name="OPeNDAP-Hyrax" serviceType="OPeNDAP" base="/opendap/"/>

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

     <datasetScan location="/bes/nc" path="nc" name="NetCDF Archive" serviceName="OPeNDAP-Hyrax">

	       <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

           <filter>
               <exclude wildcard=".*" atomic="true" collection="true" />
               <include wildcard="*.nc" />
           </filter>
           <addDatasetSize />

           <metadata inherited="true">
               <serviceName>OPeNDAP-Hyrax</serviceName>
               <authority>opendap.org</authority>
               <dataType>Random</dataType>
           </metadata>
     </datasetScan>

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

     <datasetScan location="/bes/hdf" path="hdf" name="HDF Archive" serviceName="OPeNDAP-Hyrax">

	       <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

           <filter>
               <exclude wildcard=".*" atomic="true" collection="true" />
               <include wildcard="*.hdf" />
           </filter>
           <addDatasetSize />

           <metadata inherited="true">
               <serviceName>OPeNDAP-Hyrax</serviceName>
               <authority>opendap.org</authority>
               <dataType>Random</dataType>
           </metadata>
     </datasetScan>

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

     <datasetScan location="/bes/ff" path="ff" name="FreeForm Archive" serviceName="OPeNDAP-Hyrax">

	       <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

           <filter>
               <exclude wildcard=".*" atomic="true" collection="true" />
               <include wildcard="*.dat" />
           </filter>
           <addDatasetSize />

           <metadata inherited="true">
               <serviceName>OPeNDAP-Hyrax</serviceName>
               <authority>opendap.org</authority>
               <dataType>Random</dataType>
           </metadata>
     </datasetScan>

     <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
</catalog>



In Particular Note The Following

1. The line in which the CrawlableDataset implementation is defined:

    <crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" />

This line identifies the correct CrawlableDataset class for Hyrax: the one that works with the BES to automatically generate catalogs.


2. In the <datasetScan> element the location attribute's value MUST begin with /bes. So, if the top level collection in the BES contains 4 sub collections they may each be identified using a separate <datasetScan> element like so:

	<datasetScan location="/bes/nc"  path="nc"  name="NetCDF Archive"   serviceName="OPeNDAP-Hyrax"> . . . </datasetScan>
	<datasetScan location="/bes/hdf" path="hdf" name="HDF Archive"      serviceName="OPeNDAP-Hyrax"> . . . </datasetScan>
	<datasetScan location="/bes/jg"  path="jg"  name="JGOFS Archive"    serviceName="OPeNDAP-Hyrax"> . . . </datasetScan>
	<datasetScan location="/bes/ff"  path="ff"  name="FreeForm Archive" serviceName="OPeNDAP-Hyrax"> . . . </datasetScan>

Each <datasetScan> element may have its own filter and inheritance rules. You MUST NOT lump them all into one <datasetScan> element with one set of filter rules like so:

	<datasetScan location="/bes"     path="DATA"  name="AllYourDataAreUs"   serviceName="OPeNDAP-Hyrax"> . . . </datasetScan>

This does not work. If you want them all to be in a single collection, then configure the BES so that it has one top level collection (see Example 1).


3. The path attribute in the <datasetScan> element appears in the URL after the servlet name, and MUST be the same as the value of the location attribute with the leading "/bes/" removed. In other words, it MUST NOT start with a "/" character.