RDH Catalog Organization: Difference between revisions

From OPeNDAP Documentation
Revision as of 17:05, 7 May 2009

RDH Catalog

The RDH needs to be able to provide a catalog of its data holdings. Because the RDH relies on a simple list of ODBC Data Sources in its configuration file (see the RDH Configuration Use Case and the RDH Catalog Use Case), its catalog is flat: just the list of Data Source names.

In BES terms, this means that the RDH catalog contains no sub-collections (also known as containers or child catalogs). The RDH holdings can and should be represented as a simple list of DAP datasets. Each of these datasets has a DDS/DDX/DAS representation that can be accessed in the typical DAP manner. When responding to a bes:showCatalog request, the RDH should return a catalog composed of a top-level dataset that contains a list of dataset elements, each of which has a node attribute whose value is "false" and (at minimum) a child bes:serviceRef element with a value of "dap":

 <response xmlns="http://xml.opendap.org/ns/bes/1.0#" reqID="some_unique_value">
   <showCatalog>
       <dataset catalog="rdh" count="13" lastModified="2009-03-24T19:02:46" name="/" node="true" size="578">
           <dataset catalog="catalog" lastModified="-1" name="dataSourceOne" node="false" size="-1">
               <serviceRef>dap</serviceRef>
           </dataset>
           <dataset catalog="catalog" lastModified="-1" name="dataSourceTwo" node="false" size="-1">
               <serviceRef>dap</serviceRef>
           </dataset>
           <dataset catalog="catalog" lastModified="-1" name="dataSourceThree" node="false" size="-1">
               <serviceRef>dap</serviceRef>
           </dataset>
       </dataset>
   </showCatalog>
 </response>
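
As a rough illustration of the flat structure described above, the following Python sketch builds a response of this shape from a list of Data Source names. It is not the RDH implementation: the function name build_show_catalog and the helper q are hypothetical, the count value is assumed to be the number of Data Sources, and the attribute values on the child datasets simply mirror the example above.

```python
import xml.etree.ElementTree as ET

BES_NS = "http://xml.opendap.org/ns/bes/1.0#"
# Serialize the BES namespace as the default namespace (no prefix).
ET.register_namespace("", BES_NS)


def q(tag):
    """Qualify a tag name with the BES namespace (hypothetical helper)."""
    return "{%s}%s" % (BES_NS, tag)


def build_show_catalog(data_sources, req_id="some_unique_value"):
    """Return a flat showCatalog response as a string (illustrative only)."""
    response = ET.Element(q("response"), {"reqID": req_id})
    show_catalog = ET.SubElement(response, q("showCatalog"))
    # Top-level dataset: the only node in the catalog.
    # Assumption: count is the number of Data Sources; size and
    # lastModified are left as "-1" (not known).
    top = ET.SubElement(show_catalog, q("dataset"), {
        "catalog": "rdh",
        "count": str(len(data_sources)),
        "lastModified": "-1",
        "name": "/",
        "node": "true",
        "size": "-1",
    })
    for name in data_sources:
        # Each Data Source is a leaf: node="false" plus a "dap" serviceRef.
        leaf = ET.SubElement(top, q("dataset"), {
            "catalog": "catalog",
            "lastModified": "-1",
            "name": name,
            "node": "false",
            "size": "-1",
        })
        ET.SubElement(leaf, q("serviceRef")).text = "dap"
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_show_catalog(["dataSourceOne", "dataSourceTwo", "dataSourceThree"]))
```

Because there are no child catalogs, the loop never recurses: every Data Source becomes a direct child of the top-level dataset.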

size Attribute

What does the size of the dataset mean in this context?

  • Is it the aggregate size of all of the holdings in the Data Source?
  • Can that be easily determined through the ODBC API?

If determining the size of the holdings is practical (a decent API exists, it is time-efficient, etc.), then we should do it. Otherwise, return "-1" to indicate that the value is not known.
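
The fallback rule above could be sketched as follows, assuming the "aggregate size" interpretation from the first question. The function name aggregate_size is hypothetical, and how the per-table sizes would be obtained through ODBC is left open here, as it is in the text.

```python
def aggregate_size(table_sizes):
    """Return the value for the size attribute as a string.

    table_sizes: one entry per table or view in the Data Source, either a
    byte count or None when the size could not be determined (assumption).
    """
    sizes = list(table_sizes)
    # Assumption: an empty list or any unknown entry makes the total unknown.
    if not sizes or any(s is None or s < 0 for s in sizes):
        return "-1"
    return str(sum(sizes))
```

Returning "-1" whenever any single size is unknown is a conservative choice: a partial sum would look like a real value but understate the holdings.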

lastModified Attribute

What does the last modified date of the dataset mean in this context?

  • Is it the last time that data was added to the Data Source?
  • Is it the last time the Data Source definition was changed?
  • What happens if the Data Source defines a subset of the available holdings in the underlying RDBMS?
    • Is last modified then the last time that one of the tables or views in the RDBMS made available through the Data Source was changed?
  • Can that be easily determined through the ODBC API?

If determining the last modified time of the holdings is practical (a decent API exists, it is time-efficient, etc.), then we should do it. Otherwise, we should return "-1" to indicate that the value is not known. (Is that right? What's the missing value for this in the BES XML API?)
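
One possible reading of the questions above is that lastModified is the most recent change to any table or view exposed through the Data Source. The sketch below assumes that reading, and it also assumes that "-1" is an acceptable missing value, which the text leaves as an open question. The function name last_modified_value is hypothetical. The timestamp layout follows the lastModified value of the top-level dataset in the example response.

```python
from datetime import datetime

UNKNOWN = "-1"  # assumption: the missing value for lastModified is "-1"


def last_modified_value(table_times):
    """Return the value for the lastModified attribute as a string.

    table_times: one datetime per exposed table or view, or None where the
    modification time could not be determined (assumption).
    """
    times = list(table_times)
    # If any table's time is unknown, the latest change cannot be trusted.
    if not times or any(t is None for t in times):
        return UNKNOWN
    return max(times).strftime("%Y-%m-%dT%H:%M:%S")
```

If the Data Source definition itself can change independently of the tables, its modification time could be passed in as one more entry in table_times.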

Integration with existing BES catalog services

The RDH will need to have its own catalog representation in the BES.

http://localhost:8080/opendap/ http://localhost:8080/opendap/catalog/

http://localhost:8080/opendap/rdh/ http://localhost:8080/opendap/cedar/ http://localhost:8080/opendap/jgofs/



http://localhost:8080/opendap/