Difference between revisions of "BES - Modules - Gateway Module"

From OPeNDAP Documentation

Revision as of 21:53, 15 April 2011

The Gateway Service provides interoperability between Hyrax and other web services. Using the Gateway module, Hyrax can access and subset data served by other web services, so long as those services return the data in a form Hyrax has been configured to serve. For example, if a web service returns data as HDF4 files, then Hyrax, using the Gateway module, can subset those data and return DAP responses for them.

1 Special options supported by the handler

1.1 Limiting access to specific hosts

Because this handler behaves like a web client, some special options must be configured to make it work. As distributed, the handler is limited to accessing only the local host. This prevents misuse (where your copy of Hyrax might be used to access all kinds of other sites). The gateway's configuration file contains a 'whitelist' of allowed hosts; only hosts on the whitelist will be accessed by the gateway.

Gateway.Whitelist
provides a list of URLs of the form protocol://host.domain:port that will be passed through the gateway module. If a request is made to access a web service not listed on the whitelist, Hyrax returns an error. Note that a whitelist entry can be more specific than just a hostname: it could in principle limit access to a specific set of requests to a particular web service.

example:

Gateway.Whitelist=http://test.opendap.org/opendap
Gateway.Whitelist+=http://opendap.rpi.edu/opendap
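The whitelist check described above can be pictured with a small Python sketch. This is not the BES implementation: it assumes that a request URL is allowed when it begins with one of the whitelist entries (simple prefix matching), and the function name is_allowed is hypothetical.

```python
# Illustrative sketch only; not part of Hyrax or the BES.
# Assumption: a URL is allowed if it starts with any whitelist entry.
WHITELIST = [
    "http://test.opendap.org/opendap",
    "http://opendap.rpi.edu/opendap",
]


def is_allowed(url, whitelist=WHITELIST):
    """Return True if url begins with one of the whitelist entries."""
    return any(url.startswith(prefix) for prefix in whitelist)


if __name__ == "__main__":
    # A request under a whitelisted prefix passes; anything else is refused.
    print(is_allowed("http://test.opendap.org/opendap/data/nc/fnoc1.nc"))
    print(is_allowed("http://example.com/data.nc"))
```

Under this assumption, a longer entry (one that includes a path) narrows access to a subset of a service's URLs, which is how a whitelist entry can be more specific than a hostname.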

1.2 Recognizing responses

Gateway.MimeTypes
provides a list of mappings from data handler modules to returned MIME types. When the remote service returns a response, if that response contains one of the listed MIME types (e.g., application/x-hdf5), then the gateway will process it using the named handler (e.g., h5). Note that if the service does not include this information, the gateway will try other ways to figure out how to work with the response.

These are the default types:

Gateway.MimeTypes=nc:application/x-netcdf
Gateway.MimeTypes+=h4:application/x-hdf
Gateway.MimeTypes+=h5:application/x-hdf5
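To make the handler:type mapping concrete, here is a minimal Python sketch that reads the default lines shown above and picks a handler for a returned MIME type. It is only an illustration, not the gateway's own code. It assumes that '=' starts the list over while '+=' appends to it, and that any parameters after a ';' in the returned type are ignored; the function names parse_mime_types and handler_for are hypothetical.

```python
# Illustrative sketch only; not the gateway's own code.
CONFIG_LINES = [
    "Gateway.MimeTypes=nc:application/x-netcdf",
    "Gateway.MimeTypes+=h4:application/x-hdf",
    "Gateway.MimeTypes+=h5:application/x-hdf5",
]


def parse_mime_types(lines):
    """Build a {mime_type: handler} dict from Gateway.MimeTypes lines."""
    mapping = {}
    for line in lines:
        key, _, value = line.strip().partition("=")
        if key.rstrip("+") != "Gateway.MimeTypes":
            continue  # skip unrelated parameters
        if not key.endswith("+"):
            mapping.clear()  # assumption: plain '=' resets the list
        handler, _, mime = value.partition(":")
        mapping[mime.strip().lower()] = handler.strip()
    return mapping


def handler_for(content_type, mapping):
    """Return the handler name for a returned MIME type, or None."""
    # Assumption: parameters such as "; charset=..." are ignored.
    mime = content_type.split(";")[0].strip().lower()
    return mapping.get(mime)


if __name__ == "__main__":
    types = parse_mime_types(CONFIG_LINES)
    print(handler_for("application/x-hdf5", types))
    print(handler_for("text/html", types))
```

A result of None corresponds to the case where the gateway has to find another way to work out how to handle the response.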

1.3 Network proxies and performance optimizations

Gateway.ProxyProtocol
Gateway.ProxyHost
Gateway.ProxyPort
define a proxy server through which the gateway makes its remote requests. A proxy can improve performance (for example, by caching responses) and can help navigate firewalls. The three parameters are:
Gateway.ProxyProtocol= 
Gateway.ProxyHost=
Gateway.ProxyPort=

1.3.1 Using Squid

Squid makes a great cache for the gateway. In our testing we have used Squid only for services running on port 80.

1.3.1.1 Using Squid on OS/X

If you're using OS/X to run Hyrax, the easiest Squid port to use is SquidMan (http://web.me.com/adg/squidman/index.html). We tested SquidMan 3.0 (Squid 3.1.1). Run the SquidMan application and, under Preferences... General, set the port to something like 3218, the cache size to something big (16GB), and the maximum object size to 256M. Click 'Save' and you're almost done.

Now in the gateway.conf file, set the proxy parameters like so:

Gateway.ProxyProtocol=http
Gateway.ProxyHost=localhost
Gateway.ProxyPort=3218

This assumes you're running both Squid and Hyrax on the same host.

Restart the BES and you're all set.

Make some requests using the gateway (http://localhost/opendap/gateway) and click on SquidMan's 'Access Log' button to see the caching at work. The first access, which fetches the data, will say 'DIRECT/<ip number>' while cache hits will be labeled 'NONE/-'.
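If you would rather count fetches and cache hits than read the log by eye, a short Python sketch like the following could help. It relies only on the two labels mentioned above ('DIRECT/<ip number>' and 'NONE/-') appearing as whitespace-separated fields in each log line; the function name summarize and the placeholder lines in the usage example are hypothetical, and the location of the access log on your system is not assumed here.

```python
# Illustrative sketch only; assumes each log line contains either a
# field starting with "DIRECT/" (data fetched from the remote service)
# or a field starting with "NONE/" (served from Squid's cache).
from collections import Counter


def classify(line):
    """Label one access-log line as 'fetched', 'cache hit', or 'other'."""
    for field in line.split():
        if field.startswith("DIRECT/"):
            return "fetched"
        if field.startswith("NONE/"):
            return "cache hit"
    return "other"


def summarize(lines):
    """Count how many lines fall into each category."""
    return Counter(classify(line) for line in lines)


if __name__ == "__main__":
    # Placeholder lines; real log lines carry additional fields.
    sample = [
        "<other fields> DIRECT/192.0.2.10 <other fields>",
        "<other fields> NONE/- <other fields>",
        "<other fields> NONE/- <other fields>",
    ]
    print(summarize(sample))
```

To run it on a real log, pass an open file object to summarize, since iterating over a file yields its lines.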