AMQP Support in Hyrax

From OPeNDAP Documentation

Revision as of 22:41, 7 December 2009

Figure 1. Hyrax Architecture at a High-level

Support for AMQP in Hyrax is best handled by adding a new front-end to the server that can act as an AMQP client, reading information from an AMQP queue. Hyrax has an overall architecture that already supports this. Figure 1 shows a high-level view of the Hyrax architecture. The BES is the part of Hyrax that builds the bodies of DAP responses. The front-end (the OLFS) contains a set of handlers which respond to requests made using HTTP. Based on the request, the OLFS sends commands over a stateful connection to the BES asking it to build the correct response. Generally, the OLFS will have to parse the request URL and pass information from that URL to the BES. Even though the OLFS is designed to support several different 'protocols' like DAP or THREDDS, it can respond to HTTP only (it is a Java Servlet; see Server Dispatch Operations for information about the OLFS design, implementation and extension capabilities). Thus, it makes the most sense to build a new front-end dedicated to AMQP.

While the architecture chosen to add support for AMQP is very important, other considerations are also critical to the success of the overall effort to run DAP over AMQP. One is the mapping of the different DAP versions onto AMQP. Since DAP was designed with HTTP in mind, how DAP and AMQP can best be matched merits serious consideration. This will entail looking at the current DAP implementation along with the evolving DAP version 4 specification and its implementation.

Proposed Architecture for AMQP Support in Hyrax

Hyrax with AMQP

Figure 2 shows how an AMQP module could be added to work with Hyrax. The diagram implies that an actual installation could support both DAP/HTTP and DAP/AMQP interactions using one BES. That might be true, or it might not, depending on how the connection pooling is handled by the OLFS and the new AMQP front-end. The DAP/SOAP interface of the OLFS is quite different from the other three interfaces shown in the diagram because the SOAP messaging software uses the request document body instead of the URL, while the other three interfaces/protocols all use the URL and ignore the request document body. However, it's still part of the OLFS because Hyrax uses SOAP messaging over HTTP.

One option we should consider is supporting DAP over AMQP using SOAP.

Sharing one BES between two front-ends

Because there is no limit within the BES on the number of beslistener processes created, any number of front ends can make connections to the BES and hold those connections in a pool without concern that the 'pool will fill up'. This is a result of the Unix fork/exec model and of the architectural design of Hyrax, which places the limit on the number of outstanding BES connections within the OLFS. A second front end could establish its own set of connections to completely separate processes. In the OLFS, the software that manages the connections to the BES can be found in BES.java.

How the OLFS connects to the BES

On start-up the OLFS makes a connection to the BES. When the OLFS is started, the BES Daemon (besdaemon) has already started and bound a well-known port (10002 by default; set in both the BES and OLFS configuration files). The OLFS starts when Tomcat starts or restarts the servlet and initially makes a pool of connections. These are really TCP socket connections to specific instances of the BES listener (beslistener). When the OLFS gets a request it needs to process using the BES, it checks this pool of connections and picks the next available one. If no connections are available, then a new connection is made unless the maximum number of allowed connections has already been made. In the latter case the request for the next available connection blocks until there is an available connection. The maximum number of connections to the BES, which is really the maximum number of BES listeners (i.e., processes) to make, is set in the OLFS configuration file.

To see how the OLFS does this, look at BES.java, OPeNDAPClient.java, and NewPPTClient.java.

Important points:

  1. Because the OLFS would block indefinitely if all of the beslisteners get 'stuck', the OLFS uses an inactivity timeout to kill beslisteners. (This is not the case in Hyrax 1.5 but will be in Hyrax 1.6; the timeout is 300 seconds, hard coded, but it could be a configuration parameter.)
  2. The OLFS will dump connections from the pool after 2000 commands have been sent to a particular beslistener. This parameter is hard coded into the OLFS, but it could be read from the configuration file.
  3. The OLFS initially makes zero connections and not the maximum number of connections.
  4. The besdaemon and master beslistener do not know about the child processes that have been created.
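
The pooling behavior described above can be sketched in a few lines of Java. This is a minimal illustration only, not the code in BES.java: the class names SketchConnection and SketchConnectionPool are hypothetical, the connection is a stand-in that only counts commands, and the inactivity timeout is left out. It shows a pool that starts empty, grows up to a configured maximum, blocks when every connection is checked out, and retires a connection after a fixed number of commands.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for one connection to a beslistener; not the real OLFS class.
class SketchConnection {
    private int commandsSent = 0;
    private boolean closed = false;

    void send(String command) {
        if (closed) throw new IllegalStateException("connection is closed");
        commandsSent++; // a real client would write the command to a socket here
    }

    int commandsSent() { return commandsSent; }
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

// A bounded, lazily filled pool: it starts with zero connections, grows up to
// maxConnections, blocks callers when every connection is checked out, and
// retires a connection once it has carried maxCommands commands.
class SketchConnectionPool {
    private final int maxConnections;
    private final int maxCommands;
    private final Deque<SketchConnection> idle = new ArrayDeque<>();
    private int open = 0; // connections currently alive (idle or checked out)

    SketchConnectionPool(int maxConnections, int maxCommands) {
        this.maxConnections = maxConnections;
        this.maxCommands = maxCommands;
    }

    synchronized SketchConnection checkOut() {
        while (idle.isEmpty() && open >= maxConnections) {
            try {
                wait(); // block until checkIn() frees a connection or a slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a connection", e);
            }
        }
        if (!idle.isEmpty()) return idle.pop();
        open++; // below the limit, so make a new connection on demand
        return new SketchConnection();
    }

    synchronized void checkIn(SketchConnection c) {
        if (c.commandsSent() >= maxCommands) {
            c.close(); // retire a heavily used connection
            open--;
        } else {
            idle.push(c);
        }
        notifyAll(); // wake any caller blocked in checkOut()
    }

    synchronized int openCount() { return open; }
}
```

With the values given above, such a pool would be built as new SketchConnectionPool(max, 2000), where max is the limit read from the OLFS configuration file.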

Abstracting the OLFS/BES connection logic

To see how to abstract the connection logic so that the OLFS' connection pooling and configuration logic can be reused with a different transport protocol, look at the NewPPTClient.java class. This class effectively implements a simple interface with the methods:

init
Make the object that holds state for the request
open
Connect to a new BES listener
send request
Given that a connection to a server exists, send a request
process response
Wait for, and then process, a response to a request
close
Deallocate resources associated with this connection

In the explanation above, I used the word connection but it is really a virtual connection. In Java the InetAddress and Socket classes abstract the operations of socket-based IPC. The actual transport can be TCP, UDP, ...

Todo: Extrapolate from this class an interface, make this class implement that interface and then write a second class that provides for TCP tunneling over AMQP (for example) with a second implementation of that interface. It may also be that RabbitMQ provides a tight enough integration with Java's IPC classes that a more straightforward implementation is possible.
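
As a starting point for that Todo, the five operations listed above could be pulled out into an interface along the following lines. This is a sketch under assumptions, not the existing code: BesTransport, LoopbackTransport and TransportClient are hypothetical names, the method signatures are guesses at a minimal shape, and the loopback class only echoes requests from memory so the example can run without a BES or an AMQP broker.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical interface extrapolated from the five operations listed above.
interface BesTransport {
    void init();                      // make the object that holds state for the request
    void open();                      // connect to a new BES listener
    void sendRequest(String request); // send a request over the open connection
    String processResponse();         // wait for, then process, the response
    void close();                     // deallocate resources for this connection
}

// Toy implementation that answers from an in-memory queue. A real second
// implementation would speak PPT over TCP or tunnel over AMQP instead.
class LoopbackTransport implements BesTransport {
    private Queue<String> pending;
    private boolean connected = false;

    public void init() { pending = new ArrayDeque<>(); }

    public void open() {
        if (pending == null) throw new IllegalStateException("init() not called");
        connected = true;
    }

    public void sendRequest(String request) {
        if (!connected) throw new IllegalStateException("not connected");
        pending.add(request);
    }

    public String processResponse() {
        if (!connected) throw new IllegalStateException("not connected");
        String request = pending.poll();
        if (request == null) throw new IllegalStateException("no outstanding request");
        return "response to: " + request;
    }

    public void close() {
        connected = false;
        pending = null;
    }
}

class TransportClient {
    // Depends only on the interface, so the transport can be swapped
    // without touching the pooling or configuration logic.
    static String roundTrip(BesTransport transport, String request) {
        transport.init();
        transport.open();
        try {
            transport.sendRequest(request);
            return transport.processResponse();
        } finally {
            transport.close();
        }
    }
}
```

The point of the design is that callers such as roundTrip never name a concrete transport, so an AMQP-backed class could be substituted for the TCP one by changing only the object that is constructed.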

Abstracting the request-response logic of the OLFS

The OLFS is a dispatch handler, a giant switch statement that looks at each incoming request and shunts it to the correct software for processing. In many cases the BES is not actually involved in the processing or is involved in only a tangential way. In fact, however, the OLFS' dispatch code is more sophisticated than a switch statement. It consists of two layers of processing where the outer layer is made up of a set of DispatchHandler classes. Each of these classes implements the DispatchHandler interface. This interface has five methods:

init
Initialize the handler; called when the OLFS starts
request can be handled
Called when a new request is presented to the OLFS and the dispatch logic is looking for a handler to process it. Returns true or false.
handle request
Perform whatever is required to build a response and process it
get last modified
Ask the BES for the last modified date of some resource that's central to processing the request.
destroy
Remove the dispatch handlers

Three of these methods take the request object defined by the Servlet class (request can be handled, handle request and get last modified). One of the methods also takes the response object from Servlet (handle request).
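
The handler contract and the outer dispatch loop described above can be sketched as follows. To keep the example self-contained, it does not use the Servlet API: SketchRequest and SketchResponse are hypothetical stand-ins for the Servlet request and response objects, and SketchDispatchHandler, SuffixHandler and SketchDispatcher are illustrative names, not the OLFS classes. The method names follow the five methods listed above, but their exact signatures here are assumptions.

```java
import java.util.List;

// Hypothetical stand-ins for the Servlet request and response objects.
class SketchRequest {
    final String path;
    SketchRequest(String path) { this.path = path; }
}

class SketchResponse {
    final StringBuilder body = new StringBuilder();
}

// Sketch of the five-method handler contract described above.
interface SketchDispatchHandler {
    void init();
    boolean requestCanBeHandled(SketchRequest request);
    void handleRequest(SketchRequest request, SketchResponse response);
    long getLastModified(SketchRequest request);
    void destroy();
}

// Example handler that claims any request whose path ends in a given suffix.
class SuffixHandler implements SketchDispatchHandler {
    private final String suffix;

    SuffixHandler(String suffix) { this.suffix = suffix; }

    public void init() { }

    public boolean requestCanBeHandled(SketchRequest request) {
        return request.path.endsWith(suffix);
    }

    public void handleRequest(SketchRequest request, SketchResponse response) {
        response.body.append("handled ").append(request.path).append(" as ").append(suffix);
    }

    public long getLastModified(SketchRequest request) {
        return -1L; // unknown; a real handler might ask the BES
    }

    public void destroy() { }
}

// The outer layer: offer the request to each handler in order and let the
// first one that claims it build the response.
class SketchDispatcher {
    private final List<? extends SketchDispatchHandler> handlers;

    SketchDispatcher(List<? extends SketchDispatchHandler> handlers) {
        this.handlers = handlers;
        for (SketchDispatchHandler h : handlers) h.init();
    }

    boolean dispatch(SketchRequest request, SketchResponse response) {
        for (SketchDispatchHandler h : handlers) {
            if (h.requestCanBeHandled(request)) {
                h.handleRequest(request, response);
                return true;
            }
        }
        return false; // no handler claimed the request
    }
}
```

An AMQP front-end could reuse this pattern if the request and response types were abstracted away from the Servlet classes, which is why three of the five methods matter for that effort.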

How the current OLFS dispatching works

How Best to Combine AMQP and HTTP

One approach to adapting DAP to a messaging architecture has already been implemented in our interface to Hyrax for SOAP and this might be the best starting point for an adaptation of DAP/HTTP to DAP over AMQP. One difference, however, that is likely to play a role in DAP over AMQP that doesn't show up in the SOAP interface is that DAP4 is now much farther along than it was when that software was written. The feature of DAP4 most important to this project is that DAP4 over HTTP no longer relies on HTTP headers as the sole way to return certain information. Instead, all information about a response is contained in the body of the response, and some information is also contained in HTTP response headers to simplify writing HTTP clients and/or working with DAP2 clients. So, for example, the information about the version of DAP used to build a particular response is now part of the response body (in the <Dataset> element) and in the HTTP response header XDAP. This means that HTTP clients can figure out the version before the response document is parsed and other protocols (e.g., AMQP) can get it from the response itself.

I'll chime in here and say that as far as I can see the primary obstacle in moving the protocol to AMQP is the use of HTTP headers as the mechanism for version negotiation between the client and the server. The client tells the server what it wants and the server hands back a response that is the requested version or lower. If this were moved into the request URL via a mandatory server side function, say something like "version(x.y)" where "x" is the DAP major and "y" the DAP minor version, then I think using AMQP would be simplified.--ndp 12:05, 3 December 2009 (PST)
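
To make the suggestion concrete, here is a rough sketch of how a front-end might read such a function out of a request and pick the response version. Everything in it is hypothetical: "version(x.y)" is only the proposal made above, not an existing server function, the VersionNegotiation class and the sample constraint syntax are invented for illustration, and falling back to the server's own version when no version is requested is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionNegotiation {
    // Matches the hypothetical "version(x.y)" function proposed above.
    private static final Pattern VERSION_CALL =
            Pattern.compile("version\\((\\d+)\\.(\\d+)\\)");

    // Returns {major, minor} as requested in the constraint, or null if absent.
    static int[] requestedVersion(String constraint) {
        Matcher m = VERSION_CALL.matcher(constraint);
        if (!m.find()) return null;
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // The server answers with the requested version or a lower one: the
    // smaller of what was requested and what the server supports.
    // Assumption: with no requested version, the server uses its own.
    static int[] negotiate(int[] requested, int[] supported) {
        if (requested == null) return supported;
        boolean requestedIsLower = requested[0] < supported[0]
                || (requested[0] == supported[0] && requested[1] < supported[1]);
        return requestedIsLower ? requested : supported;
    }
}
```

Because the version travels inside the request itself, the same parsing would work whether the request arrives in an HTTP URL or in the body of an AMQP message.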

See Also

  1. BES XML Commands and Hyrax - BES Client commands
  2. How to build the DataDDX response in/with Hyrax
  3. Hyrax SOAP API

Use Cases

In order to move forward and define the most useful way to use DAP2 and/or DAP4 over AMQP, we need to make sure there's a clear understanding of how the server is supposed to interact with the AMQP broker and how Hyrax will be used within the OOI system.

http://www.oceanobservatories.org/spaces/display/CIDev/Data+Exchange