OPULS: UGrid Subsetting

From OPeNDAP Documentation

Revision as of 00:42, 26 November 2013

Overview

This project adds a server-side function to Hyrax that lets the user subset unstructured grid (triangular mesh) data objects based on values in their spatial range.


10/29/2013 Update

A new version of the server is up and running with an updated ugrid subsetting function. This version allows constraining the additional dimensions of the range variables as they are passed into the ugrid subsetting function, so you can ask for one or more time steps, conditions, etc.

Please check it out, details below. It's a "test" server that I am using to work with NOAA to explore the use of AWS storage services, so it's not going to be rock solid. If you want to use it for a demo, tell me and I'll work with you to see it running when you want it.


The Function

Name
ugr5 - Restrict the domain of an unstructured 2D Mesh Topology grid
Syntax
ugr5(0, rangeVariable:string, [rangeVariable:string, ...], condition:string)
  • The first parameter must currently be zero (0) or two (2). It indicates whether the supplied relational constraint condition (the last parameter in the function call) is applied to coordinate values at the nodes (0) or the faces (2) of the mesh. Edges (represented by 1) are not currently supported.
  • The second parameter is a list of one or more data variables that you wish to subset; they may be associated with nodes or faces of the mesh.
  • The last parameter is a string whose value is an expression that defines the conditions that the domain variables must meet in order to be included in the result.
Here is an example:
ugr5(0,depth,"28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0")
Used in a URL:
http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/hyrax/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml.dds?ugr3(0,depth,"28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0")
And after your browser URL-encodes it:
http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/hyrax/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml.dds?ugr3(0,depth,%2228.0%3Clat%20&%20lat%3C29.0%20&%20-89.0%3Clon%20&%20lon%3C-88.0%22)
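The encoding step above can also be done in a script rather than left to the browser. The following is a minimal sketch using only the Python standard library; the helper name build_ugrid_url and its parameters are hypothetical and are not part of Hyrax. It leaves '&' unencoded so that the result matches the browser-encoded URL shown above.

```python
from urllib.parse import quote


def build_ugrid_url(dataset_url, function_name, rank, range_vars, condition):
    """Return a .dds request URL that calls the subsetting function.

    Hypothetical helper for illustration; not part of Hyrax.
    """
    # Wrap the condition in double quotes, then percent-encode it.
    # '&' is kept as-is to match the browser-encoded form.
    encoded_condition = quote(f'"{condition}"', safe="&")
    args = ",".join([str(rank), *range_vars, encoded_condition])
    return f"{dataset_url}.dds?{function_name}({args})"


url = build_ugrid_url(
    "http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/hyrax/ebs/"
    "Ike/2D_varied_manning_windstress/test_dir-norename.ncml",
    "ugr3",
    0,
    ["depth"],
    "28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0",
)
print(url)
```

The range variables are passed through unchanged, so a constrained variable can be supplied in the same list.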
Subsetting the range variables
In the example dataset
http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml.dds
We can see that there are three range variables associated with the nodes of the mesh that have a second dimension, 'time'. Since the 'time' dimension is fairly large, with 1081 values, you may wish to retrieve only one time slice. This can be accomplished by using a DAP4 array constraint on the variable as it is passed to the ugrid function. For example, to retrieve the 5th time slice of the variable zeta you would constrain it like 'zeta[4][*]' (note that the indices begin at zero). In the request URL this would look like
http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml.dds?ugr5(0,zeta[4][*],%2229.3%3Elat&lat%3C29.8&-95.0%3Elon&lon%3C-94.4%22)
It is also possible to request, say, every 5th time slice with 'zeta[0:5:*][*]', or every 3rd time slice from 3 to 30 with 'zeta[3:3:30][*]'. In the request URL the latter would look like
http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml.dds?ugr5(0,zeta[3:3:30][*],%2229.3%3Elat&lat%3C29.8&-95.0%3Elon&lon%3C-94.4%22)
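To check which time slices a given array constraint selects, the index arithmetic can be sketched in a few lines of Python. This assumes the usual start:stride:stop reading in which the stop index is inclusive and '*' means the last index of the dimension; the function name selected_indices is hypothetical and is not part of Hyrax.

```python
def selected_indices(start, stride, stop, dim_size):
    """Indices picked by a [start:stride:stop] constraint; stop may be '*'."""
    last = dim_size - 1 if stop == "*" else int(stop)
    if stride < 1 or not (0 <= start <= last < dim_size):
        raise ValueError("constraint is outside the dimension")
    # The stop index is inclusive, unlike a Python slice.
    return list(range(start, last + 1, stride))


TIME_SIZE = 1081  # size of the 'time' dimension in the example dataset

every_third = selected_indices(3, 3, 30, TIME_SIZE)   # zeta[3:3:30][*]
every_fifth = selected_indices(0, 5, "*", TIME_SIZE)  # zeta[0:5:*][*]
one_slice = selected_indices(4, 1, 4, TIME_SIZE)      # zeta[4][*]

print(every_third)
print(len(every_fifth))
print(one_slice)
```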

The Data

I have a new server up with the ugrid data and subsetting function on board here: http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/

In this directory, http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/ you can find the datasets (hex.nc, test4.nc) that were used by Bill & Scott at UW to develop the first ugrid restrict function using the gridfields library. However, their code was based on an older version of the ugrid spec, and it seems the specification shifted out from under their code. I worked on the code to get it to respond to datasets organized as described here: https://publicwiki.deltares.nl/display/NETCDF/Deltares+CF+proposal+for+Unstructured+Grid+data+model

I adapted their original test data using ncml files. I wrote an ncml file for test4.nc (called test4-ugrid.ncml, in the same directory) that makes it compliant with the Deltares specification, so it can be subset with the ugrid restrict function: http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/test4-ugrid.ncml Also in that directory is the time-sliced dataset (fvcom_1step.nc) that you provided, http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/fvcom_1step.nc which can also be subset using the ugrid restrict function.

Also, a while back I tracked down and downloaded one of the larger, multi-time-step fvcom ugrid data files. It's now set up here: http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/ Inside you'll find that the 2D models have useful stuff. In http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/2D_varied_manning_windstress/ you should see the original ncml files from the source server. These weren't compatible with Hyrax, mostly the result of Hyrax not being as "elastic" as the TDS with its inputs.

The file test_dir-norename.ncml can be subset using the ugrid restrict function: http://ec2-54-242-224-73.compute-1.amazonaws.com:8080/opendap/ebs/Ike/2D_varied_manning_windstress/test_dir-norename.ncml The file test_dir.ncml is closer to the original ncml file 00_dir.ncml. Unfortunately, it only works superficially: you can get the metadata responses for this dataset, but there is a bug in the ncml code that prevents the ugrid restrict function from returning data values for renamed variables. The file test_dir-norename.ncml is simply test_dir.ncml without the renamed variables.



Places for improvement

ugr3(0,zeta[18][*],"28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0")
where zeta is defined "Float32 zeta[time = 1081][node = 417642];"??
UPDATE: YES! This works now!


  1. Make individual functions that can be 'nested'
    regrid(ugr(....),....)
    ugr(regrid(....))
    Do we let the user's expression dictate the efficiency of the operation, or do we try to optimize (subset before regrid etc.), and how would we do that?
  2. Can we restrict time series like this (using the current syntax)?? As of 10/29/13 we have this functionality (see above syntax description)
    We need a way to do it, either in the ugrid code or elsewhere. This example (potentially) uses DAP, but it may not be supported and there may be issues making it happen. Maybe DAP4 syntax would make this easier? James?
    ?zeta[0:15:1000][*],ugr3(0,zeta,"28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0")
  3. What about nesting using some kind of wildcard for the current (ugrid) dataset? This keeps the type signature constant. ($ is the entire dataset, which is a ugrid dataset, and is the input to the interior ugr3 function call.)
    datasetName?ugr3(2,ugr3(0,$,"28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0"),"temp>12")
  4. Create a super restrict function that can operate at any rank and that always operates on the entire ugrid dataset
    datasetName?superrestrict("28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0","","temp>12")
  5. If I understood Bill correctly, we may be able to reorganize the code and make a couple more calls into gridfields, which would allow us to restrict the coordinate topology (but preserve the coordinate index mapping). Then we could load one variable at a time, subset the variable, and transmit it. I think we can make this work for both one- and N-dimensional variables.
    While we still marshal the response in memory the new code (10/29/13) uses index variables to determine the subset by utilizing the ugrid library once, and then it reads just the minimum data required to fulfill the restricted response.
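The index-based approach described in item 5 can be illustrated with a small, self-contained Python sketch. It is only a stand-in for the idea and makes several assumptions: it does not use the gridfields library, it ignores face connectivity, and the names restricted_node_indices, subset_variable, and read_value are hypothetical. The node indices are computed once from the coordinates, and each variable is then read only at those indices and only for the requested time steps.

```python
def restricted_node_indices(lat, lon, keep):
    """Compute, once, the indices of the nodes that satisfy the condition."""
    return [i for i, (la, lo) in enumerate(zip(lat, lon)) if keep(la, lo)]


def subset_variable(read_value, time_steps, node_indices):
    """Read only the values needed for the restricted response."""
    return [[read_value(t, n) for n in node_indices] for t in time_steps]


# Toy mesh with four nodes (made-up coordinates, for illustration only).
lat = [27.5, 28.5, 28.9, 29.5]
lon = [-88.5, -88.5, -88.2, -88.5]

# Stand-in for "28.0<lat & lat<29.0 & -89.0<lon & lon<-88.0".
indices = restricted_node_indices(
    lat, lon, lambda la, lo: 28.0 < la < 29.0 and -89.0 < lo < -88.0
)

# Toy variable zeta[time][node]; count how many values are actually read.
zeta = [[t * 10 + n for n in range(4)] for t in range(3)]
reads = []


def read_value(t, n):
    reads.append((t, n))
    return zeta[t][n]


result = subset_variable(read_value, [0, 2], indices)
print(indices, result, len(reads))
```

In this toy case only four of the twelve stored values are read, which is the point of computing the index subset before touching the data.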