OPeNDAP Developer's Workshop 2010

From OPeNDAP Documentation
Revision as of 18:59, 7 October 2010 by Ndp (talk | contribs) (→‎Thursday)

Where & When

  • Location: Troy, NY
  • Dates: 6-8 Oct

Meeting modus operandi

  • mockups
  • whiteboard designs
  • use cases
  • best to exploit face-to-face for what it affords.

Agenda


Wednesday

0830 - Coffee
0900 -


Server Side Functions

  • Registration
    • modularized
    • each function -> class
    • version
      • can you handle this
      • service
      • how do we package this
  • Discovery
    • Ask the OLFS/BES what functions for dataset
      • what functions
      • capabilities
  • Format
    • function(p1,p2,p3,p4...)
  • Usage
    • Run by handler during loading - selection
    • Run during constraint parsing
    • Run during transmission/serialization (constraint evaluation)
    • What about functions like "version"??


// Signature of the "version" function (same shape as btp_func below):
void
function_version(int argc, BaseType *argv[], DDS &dds, BaseType **btpp);

// Function-pointer types for server-side functions:
typedef void(*bool_func)(int argc, BaseType *argv[], DDS &dds, bool *result);
typedef void(*btp_func)(int argc, BaseType *argv[], DDS &dds, BaseType **btpp);
typedef void(*proj_func)(int argc, BaseType *argv[], DDS &dds, ConstraintEvaluator &ce);


Use Cases
  • Scientist has a dataset and wants to discover what functions are available to apply to that dataset
  • Data provider has a set of server-side functions that they wish to provide for data users. They want to add these functions to the OPeNDAP server via dynamically loaded modules
  • Scientist has a dataset and wants to apply geogrid function to that dataset
  • Scientist wants to discover the shape and size (metadata) of a particular function against a dataset
Plan: Define a class/hierarchy
  • Define a set of classes for server-side functions
  • Have the BES instantiate those from .so files
  • Those instances are then passed into a ConstraintEvaluator instance which is then passed to a handler
  • The BES conf file(s) can include information so that some functions are always loaded and some are loaded only for specific handlers (e.g., FreeForm).
  • Take the server side functions out of libdap and add to BES modules
  • Add the server side functions to server side class
  • These classes will be able to handle things like get version, given a ddx(?) can this function be run against it (return true/false)
  • At load time the classes are created within the BES and the functions are registered with libdap. This maintains binary compatibility.
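
A minimal sketch of the class hierarchy in the plan above, not the agreed design. The names ServerFunction, GeogridFunction, FunctionRegistry, can_operate_on and functions_for are hypothetical, DDS is a stand-in struct rather than the libdap class, and the applicability test inside GeogridFunction is invented purely for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for libdap's DDS so the sketch compiles on its own (assumption).
struct DDS { std::vector<std::string> variable_names; };

// Hypothetical base class: one subclass per server-side function.
class ServerFunction {
public:
    virtual ~ServerFunction() = default;
    virtual std::string name() const = 0;
    virtual std::string version() const = 0;
    // "Can you handle this?": may the function be run against this dataset?
    virtual bool can_operate_on(const DDS &dds) const = 0;
};

// Illustrative subclass; the latitude check is an invented criterion.
class GeogridFunction : public ServerFunction {
public:
    std::string name() const override { return "geogrid"; }
    std::string version() const override { return "1.0"; }
    bool can_operate_on(const DDS &dds) const override {
        for (const std::string &v : dds.variable_names)
            if (v == "lat" || v == "latitude") return true;
        return false;
    }
};

// Hypothetical registry that the BES would fill when it loads modules.
class FunctionRegistry {
    std::map<std::string, std::unique_ptr<ServerFunction>> functions_;
public:
    void add(std::unique_ptr<ServerFunction> f) {
        std::string key = f->name();   // read the name before moving f
        functions_[key] = std::move(f);
    }
    // Discovery: names of the functions applicable to a dataset.
    std::vector<std::string> functions_for(const DDS &dds) const {
        std::vector<std::string> names;
        for (const auto &entry : functions_)
            if (entry.second->can_operate_on(dds)) names.push_back(entry.first);
        return names;
    }
};
```

In this sketch the discovery use case reduces to a call to functions_for() with the dataset's DDS; how the result would be sent back through the OLFS is left open, as in the notes.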


Server Administration

Admin interface
  • admin connects to a listener
  • can restart a listener
  • can restart the BES with ADMIN
  • turn on/off debugging through admin interface
  • keep track of how many connections are allowed/active
  • re-write besdaemon/beslistener (2-3 hrs)
  • allowed size of response from BES (1 hr)
http://docs.opendap.org/index.php/Hyrax_Admin_Interface
BES design changes
  • drop the besdaemon from the architecture
  • listener is started instead of daemon
  • listener keeps track of connections made and socket connected to
  • listener keeps track of connections dropped, removes from list
  • limit the number of open connections
  • besdaemon will be able to have both hard and graceful restart
  • Out of band comm will enable access to logs and conf files in addition to the restarts
  • capability to send back configuration information
  • capability to receive new configuration information and write to disk and reload
  • we don't sweat the fact that the logs are shared - we can build a fancy interface out at the servlet level to filter by specific client using the PID info in the log.
  • Changes in the conf only are used when a process is started.
  • Soft shutdown, listener sends certain signal to child, when done handling current request, go down
  • Hard shutdown, listener sends certain signal to child, go down now
  • capability to stream back the log file
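
The listener's connection bookkeeping described above could look roughly like the following. This is a sketch only: ConnectionTable and its methods are hypothetical names, connections are identified by a plain integer socket descriptor, and the signals used for soft and hard shutdown are left out because the notes do not name them.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical table kept by the listener: which sockets are connected,
// and whether another connection may be accepted.
class ConnectionTable {
    std::size_t max_open_;
    std::set<int> open_;   // socket descriptors of active connections
public:
    explicit ConnectionTable(std::size_t max_open) : max_open_(max_open) {}

    // Record a new connection; refuse it when the limit is reached
    // or the descriptor is already in the table.
    bool accept(int socket_fd) {
        if (open_.size() >= max_open_) return false;
        return open_.insert(socket_fd).second;
    }

    // A connection was dropped: remove it from the list.
    void drop(int socket_fd) { open_.erase(socket_fd); }

    std::size_t active() const { return open_.size(); }
    std::size_t allowed() const { return max_open_; }
};
```

The active() and allowed() accessors correspond to the "how many connections are allowed/active" item that the admin interface is meant to report.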
Authentication and Authorization
  • Will people be authenticating through ESG to get to Hyrax? Or will Hyrax need to have an authentication piece? Authorization to ESG AuthZ.
  • encrypting the data? Is it through the BES or through the middle tier? If middle tier then no need to encrypt the information through the BES.
  • client application to need to allow login to get data? Scientist gets a URL back to access data through OPeNDAP. To get that data, have to be logged in? Authorization needs to happen.
  • Certificate is on the client's disk; when a request is made within a MATLAB client, the client grabs the certificate and sends it along with the data URL
Throttling the response
  • Function added to DDS and base types to be able to pre-compute the response size based on projections and selections
  • Should also work for sequences, at least be able to say that you can have x number of rows returned.
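
As a rough illustration of the size pre-computation, the sketch below counts the elements selected by an array projection written as [start:stride:stop] with an inclusive stop, then multiplies by the element size. The names DimSlice, projected_elements, projected_bytes and max_rows_for are hypothetical, and the estimate ignores any encoding overhead in the actual response. For sequences the number of rows is not known before the selection runs, so the sketch only derives a row cap from a byte limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One dimension of an array projection: [start:stride:stop], stop inclusive.
struct DimSlice { std::size_t start, stride, stop; };

// Number of elements selected by a projection over all dimensions.
std::size_t projected_elements(const std::vector<DimSlice> &dims) {
    std::size_t count = 1;
    for (const DimSlice &d : dims) {
        assert(d.stride > 0 && d.start <= d.stop);
        count *= (d.stop - d.start) / d.stride + 1;
    }
    return count;
}

// Estimated payload size in bytes (no encoding overhead included).
std::size_t projected_bytes(const std::vector<DimSlice> &dims,
                            std::size_t element_size) {
    return projected_elements(dims) * element_size;
}

// For a sequence: the largest number of rows that fits in limit_bytes.
std::size_t max_rows_for(std::size_t limit_bytes, std::size_t row_bytes) {
    assert(row_bytes > 0);
    return limit_bytes / row_bytes;
}
```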

Lunch

James : Pizza Margarita
Dan : Corned Beef
Patrick : Veggy Pizza
Michael : Chikin Sandwich
Nathan : Turkey Club Sandwich

Easier install for Hyrax

  1. Modify nightly build to use shrew
  2. Build RPMs from the NB shrew
    1. Add dependencies to all RPM spec files
  3. Modify all Makefiles to support pkg/dmg binaries
  4. Integrate into NB on OS/X Server VMs
  5. build meta package for OS/X using above
  6. Run RPM/Linux NB on VMs too
  7. Test both RPMs and pkg/dmgs on completely clean VMs

SQL Handler

  1. Add to README and INSTALL; fill these in and make them standard WRT the other README/INSTALL files.
  2. Add to the README so that there is a simple 'How To' for the server's configuration and then write stuff up for the docs wiki.
  3. Look at SQL Handler requirements in docs wiki and match up to current developed functionality
  4. Getting attributes

JGOFS

  • Still a lot of organizations using jgofs and wanting opendap access
  • Current library is very difficult to work with and still uses fork/exec to run the methods
  • Could take the library and convert to use the autotools
    • Also change from fork and exec to dynamically loaded module
  • Or be a pass-thru/gateway/proxy module, like WCS, and just pass the request back to the proper JGOFS server that is already running.
    • Figure out how to return catalog information from the jgofs servers
    • How do we get attributes/metadata...

Active file system

A filesystem that creates a "signal" whenever something changes.

In many cases this boils down to caching binary objects in/around the BES




1800 - Dinner

Mmmmmmm Irish/Mexican



Thursday

0830 - Coffee

Mmmmmm that was tasty...

NcML Ingestion

  • Sending in NcML from a client
  • Use a proxy server to shield clients from the mechanics
    • The aggregations need to persist and at least some of this must be at the origin server
    • The NcML is held at a proxy server and sent to the origin server where it's validated, etc. and the origin server (must be a Hyrax) returns an id that can be used to reference the aggregation
    • The proxy retains that id and takes requests against it.
  • Can we simplify this and remove the proxy given that the aggregation has to persist?
  • How long does the aggregation persist? That is, what criteria are used to 'flush' an 'uploaded' aggregation.
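
To make the id and flush questions concrete, here is a small sketch of a store for uploaded aggregations. Everything in it is an assumption for discussion: the AggregationStore name, the "agg-N" id format, and in particular the flush rule (drop an aggregation once it has been idle longer than some limit), which is only one possible answer to the open question above. Times are plain integers supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical store of uploaded NcML documents, keyed by the id
// that is handed back to the uploader.
class AggregationStore {
    struct Entry { std::string ncml; long last_used; };
    std::map<std::string, Entry> entries_;
    unsigned long next_id_ = 0;
public:
    // Keep an (already validated) NcML document and return its id.
    std::string add(const std::string &ncml, long now) {
        std::string id = "agg-" + std::to_string(next_id_++);
        entries_[id] = Entry{ncml, now};
        return id;
    }

    // Look up an aggregation; a hit refreshes its last-used time.
    const std::string *find(const std::string &id, long now) {
        auto it = entries_.find(id);
        if (it == entries_.end()) return nullptr;
        it->second.last_used = now;
        return &it->second.ncml;
    }

    // Assumed flush rule: remove entries idle for more than max_idle.
    std::size_t flush_idle(long now, long max_idle) {
        std::size_t removed = 0;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (now - it->second.last_used > max_idle) {
                it = entries_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};
```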
Why we want this?
  • EML --> Building aggregations at a client
  • TDS migration
Design mechanics for this feature
  • Errors: How to get information about bad ncml back to a client
  • Assume that the server (Hyrax) is modified to use POST for this feature
  • Handle errors by returning text as built by the NCML handler
  • OLFS currently has no POST termini; we could make an explicit terminus for this, or we could make a special servlet for this feature
Alternatives
  • Two things: We add support for HTTP access to the NcML handler - this is how we handle aggregations of remote entities
  • For the NcML migration we modify the BES/NcML commands to support uploading NCML from the OLFS to a specific BES/NCML-handler instance
  • Advantages: This solves the first problem w/o any changes to existing origin servers at the cost of poor access to real-world aggregations. Actually useful aggregations would be sent to the 'origin server' by hand between people.
  • There seem to be no real increased costs for NCML migration.
To support external access
  • NcML handler becomes a HTTP client
  • Might require remote URLs to be listed explicitly? Performance implemented

BES Internal Caching

Caching compressed data
  • The BES caching issue is only regarding the compressed files, their decompression and maintaining a restricted 'cache' size.
  • This caching scheme needs to include a two stage cache where a single space with no size limit is used for at most one item. This is used for the actual decompression phase. Only one writer may access this at any time.
  • A second space is used to store N-bytes of decompressed files. This space must lock all accesses during a write (and must block on both reads and writes). Call this the 'main cache.'
  • The main cache must store spin locks for read access to all of the cache items.
  • Modify access to items in the cache so that they are accessed using an object whose dtor updates the spin lock (so that the spin lock is managed even when exceptions are thrown).
  • The object (above) also provides access to a path to the cached item; this path is accessed and passed to the handlers.
  • See libdap's HTTPCache and Response classes for an example implementation
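
The accessor object described above might be sketched as follows. This is not libdap's HTTPCache code: CacheItem and CachedFile are hypothetical names, and an atomic reader count stands in for the spin lock. The point is only that the destructor releases the read access, so the count is restored even when a handler throws.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical bookkeeping for one decompressed file in the main cache.
struct CacheItem {
    std::string path;
    std::atomic<int> readers{0};   // stand-in for the per-item spin lock
    explicit CacheItem(std::string p) : path(std::move(p)) {}
};

// Hypothetical accessor: holds read access for as long as it is alive.
class CachedFile {
    CacheItem *item_;
public:
    explicit CachedFile(CacheItem &item) : item_(&item) { ++item_->readers; }
    ~CachedFile() { --item_->readers; }   // runs during stack unwinding too

    // Copying would release the read access twice, so it is disabled.
    CachedFile(const CachedFile &) = delete;
    CachedFile &operator=(const CachedFile &) = delete;

    // Path of the cached item, to be passed to the handlers.
    const std::string &path() const { return item_->path; }
};
```

A cache purge would then skip any item whose reader count is non-zero; that policy is an assumption and is not shown here.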
Caching binary objects
  • While it's desirable, it's also really hard to retrofit
  • To serialize the DDS we would have to serialize BaseType and all of the concrete classes that are derived from it (NCByte, ...)
  • It's not clear that caching these kinds of objects will provide a huge benefit for users/clients.
The NcML handler will have to do its own caching
  • It will need to support the Last Modified Time of aggregations if we're going to get caching systems like Squid working correctly.

Lunch

James : Veggie Burger
Dan : Something tasty
Patrick : Veggy Burger
Michael : Corned Beef on Rye
Nathan : Chikin Salad

Strategy Breakout

  • Peter, Dan, James

NcML Ingestion Break Out

  • Patrick, Michael, Nathan
  • Instead of reading NCML file, taking an NCML document (1 hr)
  • Issues with adding new BES commands? While adding this new functionality document current interface for adding commands to the BES architecture.


1800 - Dinner


Friday

0830 - Coffee

Module check

Making sure a loaded module matches the version of BES and any required modules
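
One way such a check could work is sketched below. It rests on assumptions the notes do not settle: that each module records the BES version it was built for and the names of the modules it requires, and that "matches" means an exact version string match. ModuleInfo and module_ok are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description a module would carry about itself.
struct ModuleInfo {
    std::string name;
    std::string built_for_bes;                  // BES version it was built against
    std::vector<std::string> required_modules;  // names of modules it depends on
};

// True when the module matches the running BES (exact match assumed)
// and every module it requires has already been loaded.
bool module_ok(const ModuleInfo &m, const std::string &bes_version,
               const std::set<std::string> &loaded) {
    if (m.built_for_bes != bes_version) return false;
    for (const std::string &dep : m.required_modules)
        if (loaded.count(dep) == 0) return false;
    return true;
}
```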


SSL authentication/authorization issues

There is currently a problem with SSL authentication and with keeping the SSL channel open for secure communication


Multiple catalogs

BES and OLFS integration


1230 - Lunch

Strategy Breakout (cont)

1800 - Dinner and Departure