Hyrax - THREDDS Configuration: Difference between revisions

From OPeNDAP Documentation
@TODO: '''Revise this page to improve clarity and usability'''


Hyrax now uses its own implementation of the THREDDS catalog services and supports the complete THREDDS catalog service stack. The implementation relies on two DispatchHandlers in the OLFS and utilizes XSLT to provide HTML versions (presentation views) for human consumption.
 
# Dynamic THREDDS catalogs for the holdings served by the BES are provided by the opendap.bes.BESThreddsDispatchHandler.
# Static THREDDS catalogs are provided by the opendap.threddsHandler.Dispatch.
 
The default [[Hyrax_-_OLFS_Configuration#olfs.xml_Configuration_File |olfs.xml]] file now contains both of these handlers in the correct locations for Hyrax operations.
 
Static THREDDS catalogs are "rooted" in the ''catalog.xml'' file located in the (persistent) content directory for the OLFS (typically $CATALINA_HOME/content/opendap). The default ''catalog.xml'' that comes with Hyrax contains a simple catalogRef element that points to the dynamic THREDDS catalogs generated from the BES holdings. Additional catalog components may be added to the ''catalog.xml'' file to build (potentially large) static catalogs.
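
As a rough illustration of the arrangement described above, a root ''catalog.xml'' holding a single catalogRef might look like the sketch below. The catalog name, the ''xlink:title'', and especially the ''xlink:href'' value are placeholders invented for this example; they are not copied from the file shipped with Hyrax, so check your installed ''catalog.xml'' for the actual values.
<pre>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;catalog name=&quot;Example Root Catalog&quot;
        xmlns=&quot;http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0&quot;
        xmlns:xlink=&quot;http://www.w3.org/1999/xlink&quot;&gt;

    &lt;!-- Hypothetical link to the dynamic (BES-generated) catalogs.
         Replace the href with the value used by your installation. --&gt;
    &lt;catalogRef name=&quot;BES Holdings&quot;
                xlink:title=&quot;BES Holdings&quot;
                xlink:href=&quot;/path/to/dynamic/catalog.xml&quot; /&gt;

&lt;/catalog&gt;
</pre>
Further static components (additional catalogRef or dataset elements) would be added as siblings of the catalogRef inside the ''&lt;catalog&gt;'' element.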
 
Caveats:
 
* Since the BESThreddsDispatchHandler provides catalogs for the BES holdings, the static catalogs do not support the THREDDS datasetScan element, other than to provide links in the presentation view.
* The BES catalogs do not support a mechanism for including inherited metadata down the catalog tree (normally this functionality comes from the datasetScan element in a static catalog). If the need for this arises, it can be done; just let us know!
 
 
Rather than provide an exhaustive explanation of the THREDDS catalog functionality and configuration, I will defer to the existing documents provided by our fine colleagues at [http://www.unidata.ucar.edu/projects/THREDDS/ UNIDATA]:


* [http://www.unidata.ucar.edu/projects/THREDDS/tech/index.html#catalog Catalog Basics]


*The THREDDS catalog configuration is stored in the file '''$CATALINA_HOME/content/opendap/catalog.xml'''<br /><br />
*Each item that appears in the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory)  should have a corresponding element as a child of the top level ''<catalog>'' element in the '''catalog.xml''' file. Collections (aka directories) are represented by a ''<datasetScan>'' element. Granules (files) are represented as ''<dataset>'' elements. It is not possible to map the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) to a single <datasetScan> element in the THREDDS catalog.<br /><br />


=== Representing Collections (directories): The '''''<datasetScan>'''''  element ===


# For each collection that appears in the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory)  you '''SHOULD''' create a corresponding ''<datasetScan>'' in the '''catalog.xml''' file.<br /><br />''The THREDDS catalog views will NOT include top level collections for which this is not done!'' <br /><br />
# The ''serviceName'' attribute in the <''datasetScan''> element must be set to "''OPeNDAP-Hyrax''", matching the ''name'' attribute of the ''<service>'' element at the top of the file. <br /><br />
#* '''''serviceName="OPeNDAP-Hyrax"'''''<br /><br />
# Each ''<datasetScan>'' element has three crucial attributes that must be set to correspond to the collection that is meant to be traversed: '''''location''''', '''''path''''', and '''''name'''''. These attributes should be set as follows:<br /><br />
#* '''''location="/bes/collectionName"''''' <br />&nbsp;&nbsp;&nbsp;&nbsp;Where collectionName is the name of the top level collection in the BES that is to be traversed. The prefix '''''/bes/''''' is required.<br /><br />
#* '''''path="collectionName"'''''<br />&nbsp;&nbsp;&nbsp;&nbsp;Where collectionName is the same value as used in the ''location'' attribute. The collectionName '''MUST NOT''' start with a "/" character.<br /><br />
#* '''''name="collectionName"'''''<br /> &nbsp;&nbsp;&nbsp;&nbsp;Where collectionName is the same value as used in the ''path'' attribute. The ''name'' attribute is used for presentation and could be set differently from the '''''collectionName''''', but doing so will most likely lead to confusion for people navigating the OPeNDAP contents.html view of the server and the THREDDS catalog view of the server.<br /><br />
# In each <''datasetScan''> element that you create you '''MUST''' include the following element:<br/><br/>&nbsp;&nbsp;&nbsp;&nbsp;'''''<''crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset"'''''/> <br/><br/>This is the ''CrawlableDataset'' implementation that allows the THREDDS implementation to work with the BES.<br /><br />
# You should apply a filter to the data that coincides with the value of the "''BES.Catalog.catalog.TypeMatch''" for the data types being served. I suggest that you make the filter expose ALL of the data types served by the BES. See the [http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/datasetScanElement.html THREDDS pages on the DatasetScan Element] for filter details. The point of this is to remove files from the catalog view that are NOT being served as OPeNDAP data, for example README files.<br /><br />
# The ''<datasetScan>'' is allowed to contain a THREDDS ''<metadata>'' element. The details of its use can be found [http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/datasetScanElement.html HERE]<br /><br />
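
The filter advice above can be sketched as follows. This assumes, purely for illustration, a BES whose ''BES.Catalog.catalog.TypeMatch'' setting serves files ending in ''.nc'', ''.hdf'', and ''.dat''; substitute the patterns that match your own configuration. The element and attribute names are the same ones used in the examples later on this page.
<pre>
&lt;filter&gt;
    &lt;!-- Hide files and directories whose names start with a dot. --&gt;
    &lt;exclude wildcard=&quot;.*&quot; atomic=&quot;true&quot; collection=&quot;true&quot; /&gt;

    &lt;!-- Expose one pattern per data type served by the BES
         (these three patterns are assumptions for this sketch). --&gt;
    &lt;include wildcard=&quot;*.nc&quot; /&gt;
    &lt;include wildcard=&quot;*.hdf&quot; /&gt;
    &lt;include wildcard=&quot;*.dat&quot; /&gt;
&lt;/filter&gt;
</pre>
With a filter along these lines, a file such as README that matches none of the include patterns should be left out of the catalog view.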
 
=== Representing Granules (files): The '''''<dataset>''''' element ===
# For each granule (file)  that appears in the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) you '''SHOULD''' create a corresponding ''<dataset>'' in the '''catalog.xml''' file.<br /><br />''The THREDDS catalog views will NOT include top level granules for which this is not done!'' <br /><br />
# Each ''<dataset>'' element has three crucial attributes that must be set to correspond to the granule that is to be included in the catalog: '''''name''''', '''''urlPath''''', and '''''ID'''''. These attributes should be set as follows:<br /><br />
#* '''''name="granuleName"''''' <br />&nbsp;&nbsp;&nbsp;&nbsp;Where granuleName is the name of the top level granule (file) in the BES that is to be included in the catalog.<br /><br />
#* '''''ID="granuleName"''''' <br />&nbsp;&nbsp;&nbsp;&nbsp;Where granuleName is the name of the top level granule (file) in the BES that is to be included in the catalog.<br /><br />
#* '''''urlPath="granuleName"''''' <br />&nbsp;&nbsp;&nbsp;&nbsp;Where granuleName is the name of the top level granule (file) in the BES that is to be included in the catalog.<br /><br />
# In each <''dataset''> element that you create you '''MUST''' include the following element:<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;'''''<serviceName>OPeNDAP-Hyrax</serviceName>''''' <br /><br />Where the text value of the ''<serviceName>'' element is equal to the value of the ''name'' attribute of the corresponding ''<service>'' element at the top of the document.<br /><br />
# You can find more about the THREDDS ''<dataset>'' element [http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataset HERE]
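
Put together, the steps above suggest an entry along the following lines for a single top level granule. The file name ''example.nc'' is hypothetical and stands in for whatever file sits in the top level directory of your BES; the ''<service>'' element is repeated here only to show the name that ''<serviceName>'' must match.
<pre>
&lt;service name=&quot;OPeNDAP-Hyrax&quot; serviceType=&quot;OPeNDAP&quot; base=&quot;/opendap/&quot;/&gt;

&lt;!-- &quot;example.nc&quot; is a placeholder granule name, not a file shipped with Hyrax. --&gt;
&lt;dataset name=&quot;example.nc&quot; ID=&quot;example.nc&quot; urlPath=&quot;example.nc&quot;&gt;
    &lt;!-- Must equal the name attribute of the service element above. --&gt;
    &lt;serviceName&gt;OPeNDAP-Hyrax&lt;/serviceName&gt;
&lt;/dataset&gt;
</pre>
Both elements would be children of the top level ''&lt;catalog&gt;'' element, alongside any ''&lt;datasetScan&gt;'' elements for top level collections.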
 
 
 
 
 
 
----
 
=Reinitializing THREDDS=
The THREDDS catalog is read when Tomcat is started. Hyrax will check the last modified date of the catalog.xml file prior to responding to a THREDDS catalog request. If the last modified date has changed since Tomcat started, then Hyrax will reload all of the THREDDS catalog information.
 
So if you make changes to ANY of the THREDDS catalog files in the ''$CATALINA_HOME/content/opendap'' directory tree, then there are two ways for you to get Hyrax to update:
 
# Change the last modified date of the file: ''$CATALINA_HOME/content/opendap/catalog.xml''<br /><br /> This can be accomplished with the Unix "touch" command: <code>touch $CATALINA_HOME/content/opendap/catalog.xml</code> This will cause Hyrax to reload all of the THREDDS catalogs the next time that a THREDDS catalog request is made. (If you have a big THREDDS catalog configuration, you might want to make this request yourself so that a user doesn't have to wait for a response while Hyrax is working.) <br /><br />OR you could:<br /><br />
#Restart Tomcat.
 
 
-----
 
=THREDDS Catalog Examples=
 
===Example 1===
Here is an example ''catalog.xml'' file for a Hyrax installation in which the top level of the BES shows only ONE collection called "''data''":
<pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;catalog name=&quot;Hyrax Test Catalog&quot;
        xmlns=&quot;http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0&quot;
        xmlns:xlink=&quot;http://www.w3.org/1999/xlink&quot;&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
 
    &lt;service name=&quot;OPeNDAP-Hyrax&quot; serviceType=&quot;OPeNDAP&quot; base=&quot;/opendap/&quot;/&gt;
 
    &lt;datasetScan location=&quot;/bes/data&quot; path=&quot;data&quot; name=&quot;data&quot; serviceName=&quot;OPeNDAP-Hyrax&quot;&gt;
 
      &lt;crawlableDatasetImpl className=&quot;opendap.bes.BESCrawlableDataset&quot; /&gt;
 
          &lt;filter&gt;
              &lt;exclude wildcard=&quot;.*&quot; atomic=&quot;true&quot; collection=&quot;true&quot; /&gt;
              &lt;include wildcard=&quot;*&quot; /&gt;
          &lt;/filter&gt;
          &lt;addDatasetSize /&gt;
 
          &lt;metadata inherited=&quot;true&quot;&gt;
              &lt;serviceName&gt;OPeNDAP-Hyrax&lt;/serviceName&gt;
              &lt;authority&gt;opendap.org&lt;/authority&gt;
              &lt;dataType&gt;Random&lt;/dataType&gt;
          &lt;/metadata&gt;
    &lt;/datasetScan&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
&lt;/catalog&gt;
</pre>
 
===Example 2===
 
Here is an example ''catalog.xml'' file for a Hyrax installation in which the top level of the BES contains three collections called "''nc''", "''hdf''", and "''ff''":
<pre>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;catalog name=&quot;Hyrax Test Catalog&quot;
        xmlns=&quot;http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0&quot;
        xmlns:xlink=&quot;http://www.w3.org/1999/xlink&quot;&gt;
 
    &lt;service name=&quot;OPeNDAP-Hyrax&quot; serviceType=&quot;OPeNDAP&quot; base=&quot;/opendap/&quot;/&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
 
    &lt;datasetScan location=&quot;/bes/nc&quot; path=&quot;nc&quot; name=&quot;nc&quot; serviceName=&quot;OPeNDAP-Hyrax&quot;&gt;
 
      &lt;crawlableDatasetImpl className=&quot;opendap.bes.BESCrawlableDataset&quot; /&gt;
 
          &lt;filter&gt;
              &lt;exclude wildcard=&quot;.*&quot; atomic=&quot;true&quot; collection=&quot;true&quot; /&gt;
              &lt;include wildcard=&quot;*.nc&quot; /&gt;
          &lt;/filter&gt;
          &lt;addDatasetSize /&gt;
 
          &lt;metadata inherited=&quot;true&quot;&gt;
              &lt;serviceName&gt;OPeNDAP-Hyrax&lt;/serviceName&gt;
              &lt;authority&gt;opendap.org&lt;/authority&gt;
              &lt;dataType&gt;Random&lt;/dataType&gt;
          &lt;/metadata&gt;
    &lt;/datasetScan&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
 
    &lt;datasetScan location=&quot;/bes/hdf&quot; path=&quot;hdf&quot; name=&quot;hdf&quot; serviceName=&quot;OPeNDAP-Hyrax&quot;&gt;
 
      &lt;crawlableDatasetImpl className=&quot;opendap.bes.BESCrawlableDataset&quot; /&gt;
 
          &lt;filter&gt;
              &lt;exclude wildcard=&quot;.*&quot; atomic=&quot;true&quot; collection=&quot;true&quot; /&gt;
              &lt;include wildcard=&quot;*.hdf&quot; /&gt;
          &lt;/filter&gt;
          &lt;addDatasetSize /&gt;
 
          &lt;metadata inherited=&quot;true&quot;&gt;
              &lt;serviceName&gt;OPeNDAP-Hyrax&lt;/serviceName&gt;
              &lt;authority&gt;opendap.org&lt;/authority&gt;
              &lt;dataType&gt;Random&lt;/dataType&gt;
          &lt;/metadata&gt;
    &lt;/datasetScan&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
 
    &lt;datasetScan location=&quot;/bes/ff&quot; path=&quot;ff&quot; name=&quot;ff&quot; serviceName=&quot;OPeNDAP-Hyrax&quot;&gt;
 
      &lt;crawlableDatasetImpl className=&quot;opendap.bes.BESCrawlableDataset&quot; /&gt;
 
          &lt;filter&gt;
              &lt;exclude wildcard=&quot;.*&quot; atomic=&quot;true&quot; collection=&quot;true&quot; /&gt;
              &lt;include wildcard=&quot;*.dat&quot; /&gt;
          &lt;/filter&gt;
          &lt;addDatasetSize /&gt;
 
          &lt;metadata inherited=&quot;true&quot;&gt;
              &lt;serviceName&gt;OPeNDAP-Hyrax&lt;/serviceName&gt;
              &lt;authority&gt;opendap.org&lt;/authority&gt;
              &lt;dataType&gt;Random&lt;/dataType&gt;
          &lt;/metadata&gt;
    &lt;/datasetScan&gt;
 
    &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;
&lt;/catalog&gt;
</pre>
 
 
-----
 
=In Particular Note The Following=
'''1.''' The line in which the ''CrawlableDataset'' implementation is defined:
<pre>
    &lt;crawlableDatasetImpl className="opendap.bes.BESCrawlableDataset" /&gt;
</pre>
identifies the correct ''CrawlableDataset'' class for Hyrax: the one that works with the BES to automatically generate catalogs.
 
 
'''2.''' In the <''datasetScan''> element the location attribute's value '''MUST''' begin with ''/bes''. So, if the top level collection in the BES contains 4 sub collections they may each be identified using a separate <''datasetScan''> element like so:
<pre>
&lt;datasetScan location=&quot;/bes/nc&quot;  path=&quot;nc&quot;  name=&quot;nc&quot;  serviceName=&quot;OPeNDAP-Hyrax&quot;&gt; . . . &lt;/datasetScan&gt;
&lt;datasetScan location=&quot;/bes/hdf&quot; path=&quot;hdf&quot; name=&quot;hdf&quot;  serviceName=&quot;OPeNDAP-Hyrax&quot;&gt; . . . &lt;/datasetScan&gt;
&lt;datasetScan location=&quot;/bes/jg&quot;  path=&quot;jg&quot;  name=&quot;jg&quot;  serviceName=&quot;OPeNDAP-Hyrax&quot;&gt; . . . &lt;/datasetScan&gt;
&lt;datasetScan location=&quot;/bes/ff&quot;  path=&quot;ff&quot;  name=&quot;ff&quot;  serviceName=&quot;OPeNDAP-Hyrax&quot;&gt; . . . &lt;/datasetScan&gt;
</pre> Each <''datasetScan''> element may have its own filter and inheritance rules. You '''MUST NOT''' lump them all into one <''datasetScan''> element with one set of filter rules like so:
<pre> &lt;datasetScan location=&quot;/bes&quot;    path=&quot;DATA&quot;  name=&quot;DATA&quot;  serviceName=&quot;OPeNDAP-Hyrax&quot;&gt; . . . &lt;/datasetScan&gt;</pre> This configuration does not work. If you want them all to be in a single collection, then configure the BES so that the BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory have a single top level collection (see Example 1).
 
 
'''3.''' The path attribute in the <''datasetScan''> element appears in the URL after the servlet name, and '''MUST''' be the same as the value of the location attribute with the leading "''/bes/''" removed. In other words, it '''MUST NOT''' start with a "/" character.

Revision as of 18:51, 24 February 2009

@TODO: Revise this page to improve clarity and usability

Hyrax now uses its own implementation of the THREDDS catalog services and supports the complete THREDDS catalog service stack. The implementation relies on two DispatchHandlers in the OLFS and utilizes XSLT to provide HTML versions (presentation views) for human consumption.

  1. Dynamic THREDDS catalogs for the holdings served by the BES are provided by the opendap.bes.BESThreddsDispatchHandler.
  2. Static THREDDS catalogs are provided by the opendap.threddsHandler.Dispatch.

The default olfs.xml file now contains both of these handlers in the correct locations for Hyrax operations.

Static THREDDS catalogs are "rooted" in the catalog.xml file located in the (persistent) content directory for the OLFS (typically $CATALINA_HOME/content/opendap). The default catalog.xml that comes with Hyrax contains a simple catalogRef element that points to the dynamic THREDDS catalogs generated from the BES holdings. Additional catalog components may be added to the catalog.xml file to build (potentially large) static catalogs.

Caveats:

  • Since the BESThreddsDispatchHandler provides catalogs for the BES holdings, the static catalogs do not support the THREDDS datasetScan element, other than to provide links in the presentation view.
  • The BES catalogs do not support a mechanism for including inherited metadata down the catalog tree (normally this functionality comes from the datasetScan element in a static catalog). If the need for this arises, it can be done; just let us know!


Rather than provide an exhaustive explanation of the THREDDS catalog functionality and configuration I will appeal to the existing documents provided by our fine colleagues at UNIDATA:

Did you read all that? Excellent!



Configuration Instructions

  • The THREDDS catalog configuration is stored in the file $CATALINA_HOME/content/opendap/catalog.xml


  • Each item that appears in the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) should have a corresponding element as a child of the top level <catalog> element in the catalog.xml file. Collections (aka directories) are represented by a <datasetScan> element. Granules (files) are represented as <dataset> elements. It is not possible to map the top level directory of the BES (BES.Catalog.catalog.RootDirectory and BES.Data.RootDirectory) to a single <datasetScan> element in the THREDDS catalog.