RDH Catalog Organization
Revision as of 17:05, 7 May 2009
RDH Catalog
The RDH needs to be able to provide a catalog of its data holdings. Because the RDH relies on a simple list of ODBC Data Sources in its configuration file (see the RDH Configuration Use Case and the RDH Catalog Use Case), its catalog is flat: just the list of Data Source names.
In BES terms, this means that the RDH catalog contains no sub-collections (also known as containers or child catalogs). The RDH holdings can and should be represented as a simple list of DAP datasets. Each of these datasets has a DDS/DDX/DAS representation that can be accessed in the typical DAP manner. When responding to a bes:showCatalog request, the RDH should return a catalog composed of a top-level dataset that contains a list of dataset elements, each of which has a node attribute whose value is false and (at minimum) a child bes:serviceRef element with a value of "dap":
<response xmlns="http://xml.opendap.org/ns/bes/1.0#" reqID="some_unique_value">
    <showCatalog>
        <dataset catalog="rdh" count="13" lastModified="2009-03-24T19:02:46" name="/" node="true" size="578">
            <dataset catalog="catalog" lastModified="-1" name="dataSourceOne" node="false" size="-1">
                <serviceRef>dap</serviceRef>
            </dataset>
            <dataset catalog="catalog" lastModified="-1" name="dataSourceTwo" node="false" size="-1">
                <serviceRef>dap</serviceRef>
            </dataset>
            <dataset catalog="catalog" lastModified="-1" name="dataSourceThree" node="false" size="-1">
                <serviceRef>dap</serviceRef>
            </dataset>
        </dataset>
    </showCatalog>
</response>
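As a rough illustration of how such a flat response could be assembled from the configured Data Source names, here is a minimal Python sketch. It is not the RDH implementation: the function name build_show_catalog_response is hypothetical, the attribute values simply mirror the example response above, and setting count to the number of child datasets is an assumption about what that attribute means.

```python
import xml.etree.ElementTree as ET

BES_NS = "http://xml.opendap.org/ns/bes/1.0#"


def build_show_catalog_response(data_source_names, req_id):
    """Build a flat showCatalog response (hypothetical helper).

    One top-level dataset (node="true") holds one leaf dataset
    (node="false") per ODBC Data Source name.
    """
    # The namespace is written as a plain xmlns attribute to keep the
    # sketch short; element names are therefore unqualified in the tree.
    response = ET.Element("response", {"xmlns": BES_NS, "reqID": req_id})
    show_catalog = ET.SubElement(response, "showCatalog")

    top = ET.SubElement(show_catalog, "dataset", {
        "catalog": "rdh",
        "count": str(len(data_source_names)),  # assumption: number of children
        "lastModified": "-1",                  # unknown
        "name": "/",
        "node": "true",
        "size": "-1",                          # unknown
    })

    for name in data_source_names:
        leaf = ET.SubElement(top, "dataset", {
            "catalog": "catalog",  # copied from the example response
            "lastModified": "-1",
            "name": name,
            "node": "false",
            "size": "-1",
        })
        # Every leaf advertises at least the "dap" service.
        ET.SubElement(leaf, "serviceRef").text = "dap"

    return response


if __name__ == "__main__":
    doc = build_show_catalog_response(
        ["dataSourceOne", "dataSourceTwo", "dataSourceThree"],
        "some_unique_value",
    )
    print(ET.tostring(doc, encoding="unicode"))
```

Because the catalog is flat, no recursion is needed: the top-level dataset is the only element with node set to true.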
size Attribute
What does the size of the dataset mean in this context?
- Is it the aggregate size of all of the holdings in the Data Source?
- Can that be easily determined through the ODBC API?
If determining the size of the holdings is practical (there is a decent API, it is time-efficient, etc.), then we should do it. Otherwise, return "-1" to indicate that the value is not known.
lastModified Attribute
What does the last modified date of the dataset mean in this context?
- Is it the last time that data was added to the Data Source?
- Is it the last time the Data Source definition was changed?
- What happens if the Data Source defines a subset of the available holdings in the underlying RDBMS?
  - Is last modified then the last time that one of the tables or views in the RDBMS that are made available through the Data Source was changed?
- Can that be easily determined through the ODBC API?
If determining the last modified time of the holdings is practical (there is a decent API, it is time-efficient, etc.), then we should do it. Otherwise, we should return "-1" to indicate that the value is not known. (Is that right? What's the missing value for this in the BES XML API?)
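The fallback rule for both size and lastModified can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the function names are hypothetical, the timestamp layout is inferred from the example response (2009-03-24T19:02:46), and the use of "-1" for an unknown lastModified is exactly the open question raised above, not a confirmed BES convention.

```python
from datetime import datetime
from typing import Optional

UNKNOWN = "-1"  # sentinel used in the example response for unknown values


def size_attribute(size: Optional[int]) -> str:
    """Return the size attribute value, or "-1" when the size is unknown."""
    return UNKNOWN if size is None else str(size)


def last_modified_attribute(modified: Optional[datetime]) -> str:
    """Return the lastModified attribute value, or "-1" when unknown.

    The timestamp layout is an assumption taken from the example response.
    """
    if modified is None:
        return UNKNOWN
    return modified.strftime("%Y-%m-%dT%H:%M:%S")
```

A caller that cannot cheaply obtain either value from the Data Source would pass None and get the sentinel back, so the rest of the catalog code never has to special-case missing metadata.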
Integration with existing BES catalog services
The RDH will need to have its own catalog representation in the BES.
http://localhost:8080/opendap/ http://localhost:8080/opendap/catalog/
http://localhost:8080/opendap/rdh/ http://localhost:8080/opendap/cedar/ http://localhost:8080/opendap/jgofs/