Use cases for swath and time series aggregation

From OPeNDAP Documentation

Use cases for satellite Swath and Time Series aggregation. Our general approach is to use the Sequence data type to aggregate granules from Swath and Time Series data sets (each kind aggregated with its own kind; we would not mix the two, although that would be possible in general). Data will be read from arrays and loaded into a Sequence object, where it will be filtered and concatenated with other Sequence objects. The result will be the aggregate. Of course, this will have to be optimized...
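
As a rough illustration of that approach, here is a minimal sketch in Python (not the server-side implementation itself) that treats each granule as a set of equal-length arrays, flattens them into a row-per-sample table, filters each table, and concatenates the results into the aggregate. The function names and the toy granules are hypothetical; the d_lat, d_lon and UTCTime names follow the GLAS example described below.

 import numpy as np

 def granule_to_table(granule):
     """Flatten one granule's 1-D arrays into a (names, rows) table.
     'granule' is a dict of equal-length arrays, e.g. d_lat, d_lon, UTCTime
     plus any number of dependent variables."""
     names = list(granule)
     rows = np.column_stack([np.asarray(granule[n]).ravel() for n in names])
     return names, rows

 def aggregate(granules, predicate):
     """Concatenate the per-granule tables, keeping only the rows that pass
     'predicate'. Filtering each table before concatenation is one of the
     optimizations mentioned below."""
     names, kept = [], []
     for g in granules:
         names, rows = granule_to_table(g)
         mask = np.array([predicate(dict(zip(names, row))) for row in rows])
         kept.append(rows[mask])
     return names, np.concatenate(kept)

 # Two tiny made-up "granules", aggregated and filtered to a latitude band.
 g1 = {"d_lat": np.array([10.0, 20.0]), "d_lon": np.array([5.0, 6.0]),
       "UTCTime": np.array([0.0, 1.0]), "elevation": np.array([1.0, 2.0])}
 g2 = {"d_lat": np.array([30.0, 15.0]), "d_lon": np.array([7.0, 8.0]),
       "UTCTime": np.array([2.0, 3.0]), "elevation": np.array([3.0, 4.0])}
 names, rows = aggregate([g1, g2], lambda r: 10.0 <= r["d_lat"] <= 25.0)
 print(names)
 print(rows)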

Sample data

(Information from Patrick Quinn)

  • Level 3 data are easy to find. For example, you can use Earthdata Search: https://search.earthdata.nasa.gov/search?m=0.0703125!0.140625!2!1!0!&ff=Subsetting+Services Click on the icon next to any dataset, then click on the "API Endpoints" tab; that will give you the OPeNDAP endpoint.
  • For Level 2 data, here are a couple of examples:
    • GLAS: The folders are all named GLA<N>* or GLAH<N>*; N >= 10 are L2 products (GLA10, GLA11, GLAH10, etc.): http://nsidc.org/opendap/GLAS/contents.html
    • MODAPS (MODIS): The FTP server is here: ftp://nrt1.modaps.eosdis.nasa.gov/allData/1/. Log in with your URS credentials (create an account here: https://urs.earthdata.nasa.gov); folders ending in _L2 contain level 2 data.
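
Whichever source the endpoint comes from, it can be opened directly with a DAP client. Below is a minimal sketch using the pydap client; the URL is a placeholder rather than a real granule, and the variable name in the final comment is only an example.

 from pydap.client import open_url

 # Placeholder endpoint; substitute a real OPeNDAP URL found via Earthdata
 # Search or on http://nsidc.org/opendap/GLAS/contents.html.
 url = "http://example.org/opendap/path/to/granule.h5"

 dataset = open_url(url)        # fetches only the dataset's metadata (DDS/DAS)
 print(list(dataset.keys()))    # names of the variables in the granule

 # Slicing a variable triggers an actual data request, e.g.:
 # lat = dataset["d_lat"][:1000]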

My notes:

  • Level 3 data will be DAP2 Grids or DAP4 Coverages, and it looks like they can easily be aggregated using NcML. We might think about a function that could aggregate them, but that is not in scope for this task.
  • The GLAS data are stored in one-dimensional arrays. These are time series data (HDF5_GLOBAL.featureType: timeSeries). The GLAH files are HDF5 files. The one I looked at has 1 Hz, 40 Hz and 0.25 Hz (4 s) data. For each of the sample rates, there are d_lat, d_lon and UTCTime arrays along with a large number of dependent variables in arrays. There are also some browse images. For some of the time series data there are two dimensions, where the second dimension provides cloud layer information (that is, values were gathered for cloud top and bottom for each of 10 layers).
    • Suppose we want to aggregate a number of granules of these data. We can build a table of lat, lon, time[, cloud layer] and zero or more dependent variables for each granule, concatenate the tables and filter them. Optimizations include filtering before concatenating and, going further, reading only data that would pass the filter in the first place.
    • By including a granule name and using nested Sequences, we can include useful metadata and make it easier to transform the resulting Sequence back into an array. The nested Sequence could be flattened for the return (as DAP binary or CSV).
  • The MODIS data are typical MODIS L2 products, with a number of dependent variables in 2D arrays and two more 2D arrays, one for lat and one for lon.
    • We could read these data into a table with lat, lon and zero or more dependent values, then concatenate and filter. Optimizations are to read just the data needed and/or to filter before concatenation. We could also add granule and array index information to simplify the transformation from the Sequence back to an array (see the sketch after this list).
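
To make the last two notes concrete, here is a minimal sketch (hypothetical function names, toy data) that flattens 2-D swath arrays into rows carrying the granule name and the original array indices, filters them by a bounding box before concatenation, and so preserves enough information to map the selected values back onto arrays.

 import numpy as np

 def swath_to_rows(granule_name, lat2d, lon2d, data2d):
     """Flatten one 2-D swath variable into (granule, i, j, lat, lon, value)
     rows so the result can be filtered like a Sequence and later mapped
     back onto the original array shape."""
     ni, nj = data2d.shape
     ii, jj = np.meshgrid(np.arange(ni), np.arange(nj), indexing="ij")
     return [(granule_name, int(i), int(j), float(lat2d[i, j]),
              float(lon2d[i, j]), float(data2d[i, j]))
             for i, j in zip(ii.ravel(), jj.ravel())]

 def aggregate_and_filter(granules, keep):
     """Concatenate the per-granule row lists, keeping rows where keep(row)
     is True; filtering before concatenation is one of the optimizations
     noted above."""
     rows = []
     for name, lat2d, lon2d, data2d in granules:
         rows.extend(r for r in swath_to_rows(name, lat2d, lon2d, data2d)
                     if keep(r))
     return rows

 # Two tiny made-up 2x2 "granules", filtered to a small bounding box.
 lat = np.array([[10.0, 10.0], [20.0, 20.0]])
 lon = np.array([[100.0, 101.0], [100.0, 101.0]])
 var1 = np.array([[1.0, 2.0], [3.0, 4.0]])
 var2 = np.array([[5.0, 6.0], [7.0, 8.0]])

 def in_box(row):
     _, _, _, r_lat, r_lon, _ = row
     return 5.0 <= r_lat <= 15.0 and 100.5 <= r_lon <= 101.5

 table = aggregate_and_filter([("granuleA", lat, lon, var1),
                               ("granuleB", lat, lon, var2)], in_box)
 for row in table:
     print(row)
 # Because each row carries (granule, i, j), the selected values could be
 # scattered back into per-granule arrays if an array result is wanted.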

I'm going to close this spike. The larger task in this sprint for this aggregation topic is to design the function; I'm going to write up some use cases and ask Patrick if they describe his needs.

Here are some URLs I used to get data: