Difference between revisions of "Hyrax - BES PPT"

From OPeNDAP Documentation

Revision as of 05:48, 28 January 2009

1 What is PPT

PPT stands for Point to Point Transport. It is a protocol developed for OPeNDAP by Jose Garcia at UCAR and is based on TFTP, the Trivial File Transfer Protocol defined in RFC 1350 (see http://en.wikipedia.org/wiki/TFTP for more information). Our implementation uses strings as tokens, but that is not part of the protocol; you may implement your tokens any way you want.

PPT started as an implementation of the RFC for TFTP using UDP, then moved to TCP in order to avoid the ACK required for each packet transmitted.

Here is some information about TFTP and how PPT differs:

  • It uses UDP (port 69) as its transport protocol (unlike FTP, which uses TCP port 21). - PPT uses TCP.
  • It cannot list directory contents. - PPT does not deal with any commands; it passes those to the next layer up.
  • It has no authentication or encryption mechanisms. - We added those using standard SSL/X509.
  • It is used to read files from, or write files to, a remote server. - The same is true of PPT, except that for PPT, files are data objects.
  • It supports three different transfer modes, "netascii", "octet" and "mail", with the first two corresponding to the "ASCII" and "image" (binary) modes of the FTP protocol; the third is now obsolete and is hardly ever used. - PPT uses only tuneable binary buffers.
  • The original protocol has a file size limit of 32 MB, although this was extended when RFC 2347 and RFC 2348 introduced option negotiation and a negotiable block size in 1998 (allowing a maximum of 4 GB and potentially higher throughput). - PPT has no size limit.
  • Since TFTP utilizes UDP, it has to supply its own transport and session support. Each file transferred via TFTP constitutes an independent exchange. That transfer is performed in lock-step, with only one packet (either a block of data or an acknowledgement) ever in flight on the network at any time. Due to this lack of windowing, TFTP provides low throughput over high-latency links. - This no longer applies to PPT, which runs over TCP.
  • Due to the lack of security, it is dangerous over the open Internet. Thus, TFTP is generally only used on private, local networks. - For this reason PPT used to sit behind HTTPS and GridFTP. PPT now has its own layer of security, but it can still be set up to run without security behind HTTPS and GridFTP (three-tier model).

The BES architecture is a client/server architecture in which a server application listens for OPeNDAP requests. These requests can be for the OPeNDAP data structures that the CGI version serves (DAS, DDS, DataDDS, etc.), for other data formats, or for other kinds of requests. An OPeNDAP client in this architecture connects to an OPeNDAP server via TCP or UNIX sockets. The client sends a message to the server requesting a PPT connection using a simple token. The server responds with a token letting the client know whether the connection is accepted or rejected. Once a connection has been established, the client sends OPeNDAP requests to the server.

The initial implementation of the PPT protocol used tokens to signify the end of a transmission. This token was placed at the end of every transmission, both from client to server and from server to client. The receiving end had to search the entire buffer, character by character, for this end-of-transmission token. For smaller transmissions this isn't a big issue. But for large data transmissions, for example in response to a 'get dods' request, the receiving end would spend quite a bit of processing time searching for the token.

To get around this we decided to switch to chunking, where the first characters received by an application give the size of the chunk and are followed by exactly that many bytes of data. To signify the end of the transmission, a chunk is sent with a chunk size of zero.

2 Initial handshaking in the PPT

In the PPT scheme the server is the software that is listening for a connection, accepting commands, and providing responses. The client is the software that initiates a connection, issues commands, and receives responses.

  1. The client opens a socket connection to the server.
  2. The server accepts the connection.
  3. The client sends the string "PPTCLIENT_TESTING_CONNECTION" to the server.
  4. The server responds with one of the following tokens:
    1. "PPTSERVER_CONNECTION_OK" - If it will accept the session connection for PPT-based message exchange.
    2. "PPT_PROTOCOL_UNDEFINED" - If it is busy or otherwise unavailable.
    3. "PPTSERVER_AUTHENTICATE" - If authentication is required for the connection.

Any other response from the server indicates that initiating the session failed and that the client should abandon the connection.
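
The handshake steps above can be sketched from the client side as follows. This is a minimal illustration, not the BES client code: the function name ppt_handshake, the return values, and the 64-byte read are assumptions made for the example, and the sketch assumes the server's token arrives in a single read, since this page does not say how handshake tokens are framed on the wire.

```python
import socket

# Tokens taken from the handshake description above.
CLIENT_HELLO = b"PPTCLIENT_TESTING_CONNECTION"
SERVER_OK = b"PPTSERVER_CONNECTION_OK"
SERVER_AUTH = b"PPTSERVER_AUTHENTICATE"
SERVER_UNDEFINED = b"PPT_PROTOCOL_UNDEFINED"


def ppt_handshake(sock):
    """Send the client token on a connected socket and classify the reply.

    Returns "ok", "authenticate", or "rejected" (hypothetical labels).
    """
    sock.sendall(CLIENT_HELLO)
    # Assumption: the whole reply token arrives in one recv() call.
    reply = sock.recv(64)
    if reply == SERVER_OK:
        return "ok"
    if reply == SERVER_AUTH:
        return "authenticate"
    # PPT_PROTOCOL_UNDEFINED or any other reply: abandon the connection.
    return "rejected"
```

A caller would open the connection first (for example with socket.create_connection), call ppt_handshake on it, and close the socket if the result is "rejected".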

3 PPT Authentication

If PPTSERVER_AUTHENTICATE is returned, then the client must make an SSL connection to the server in order to authenticate. The hand-shaking continues in this case as follows:

  1. The client sends the token "PPTCLIENT_REQUEST_AUTHPORT", which tells the server that the client understands the request for authentication and asks for the SSL port to connect to.
  2. The server responds with the SSL port to connect to.
  3. The client opens an SSL connection to the server on this port.
  4. SSL authentication proceeds as normal.

For SSL authentication, both the client and the server must be configured through a BES configuration file. The server side requires the following parameters:

BES.ServerSecure=yes|no
BES.ServerSecurePort=<port number>
BES.ServerCertFile=/full/path/to/serverside/certificate/file.pem
BES.ServerCertAuthFile=/full/path/to/serverside/certificate/authority/file.pem
BES.ServerKeyFile=/full/path/to/serverside/key/file.pem

The client side requires the following parameters:

BES.ClientCertFile=/full/path/to/clientside/certificate/file.pem
BES.ClientCertAuthFile=/full/path/to/clientside/certificate/authority/file.pem
BES.ClientKeyFile=/full/path/to/clientside/key/file.pem
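
As a rough illustration of how these parameters might be checked before attempting authentication, the sketch below parses key=value lines and reports which required parameters are missing. It is not the BES configuration reader: the function names are hypothetical, and treating lines that start with "#" as comments is an assumption of this example.

```python
# Parameter names taken from the two lists above.
SERVER_KEYS = (
    "BES.ServerSecure",
    "BES.ServerSecurePort",
    "BES.ServerCertFile",
    "BES.ServerCertAuthFile",
    "BES.ServerKeyFile",
)
CLIENT_KEYS = (
    "BES.ClientCertFile",
    "BES.ClientCertAuthFile",
    "BES.ClientKeyFile",
)


def parse_bes_config(text):
    """Parse key=value lines into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, required):
    """Return the required parameter names that are absent from settings."""
    return [key for key in required if key not in settings]
```

For example, missing_keys(parse_bes_config(text), CLIENT_KEYS) returns an empty list when all three client-side parameters are present.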

Once the secure connection has been accepted and authentication has succeeded, the SSL connection is dropped, and requests and responses are handled through the initial non-secure connection.

Once the connection is established all further communications will be chunked as described in the following sections.

4 PPT Chunking

4.1 Chunking Scheme

       chunked-body     =  *chunked-section last-chunk

       chunked-section  = chunk | chunk-extension
      
       chunk-extension  = chunk-size "x"  1*(chunk-ext-name [ "=" chunk-ext-val ] ";")

       chunk            = chunk-size "d" chunk-data 

       chunk-size       = 7HEXDIG
       last-chunk       = 7("0") "d"

       chunk-ext-name   = token
                          ; sequence of 7-bit ASCII printable chars

       chunk-ext-val    = token | quoted-string
                          ; quoted-string is DQUOTE token DQUOTE

       chunk-data       = chunk-size(OCTET)
                          ; exactly chunk-size bytes
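
The grammar above can be exercised with a short encoder and decoder. This is a sketch under stated assumptions rather than the BES implementation: the function names and the default chunk size are hypothetical, hex digits are written in lower case, and the size field of an extension chunk is assumed to count the bytes of extension text that follow its header, which the grammar does not state.

```python
import io

HEADER_LEN = 8            # 7 hex digits plus one type character ("d" or "x")
MAX_CHUNK = 0xFFFFFFF     # largest size that fits in 7 hex digits
LAST_CHUNK = b"0000000d"  # zero-length data chunk ends the body


def encode_chunk(data):
    """Encode one non-empty data chunk: 7-digit hex size, "d", then the bytes."""
    if not 0 < len(data) <= MAX_CHUNK:
        raise ValueError("chunk must hold between 1 and 0xFFFFFFF bytes")
    return b"%07x" % len(data) + b"d" + data


def encode_body(payload, chunk_size=4096):
    """Split payload into data chunks and append the last-chunk marker."""
    parts = [encode_chunk(payload[i:i + chunk_size])
             for i in range(0, len(payload), chunk_size)]
    parts.append(LAST_CHUNK)
    return b"".join(parts)


def _read_exact(stream, n):
    """Read exactly n bytes from a binary file-like object or fail."""
    buf = stream.read(n)
    if len(buf) != n:
        raise EOFError("stream ended in the middle of a chunk")
    return buf


def decode_body(stream):
    """Read chunks until the last-chunk; return (data, extensions)."""
    data = bytearray()
    extensions = []
    while True:
        header = _read_exact(stream, HEADER_LEN)
        size = int(header[:7].decode("ascii"), 16)
        kind = header[7:8]
        if kind == b"d":
            if size == 0:
                return bytes(data), extensions
            data += _read_exact(stream, size)
        elif kind == b"x":
            # Assumption: size counts the extension text that follows.
            extensions.append(_read_exact(stream, size))
        else:
            raise ValueError("unknown chunk type: %r" % kind)
```

The decoder never scans the payload for a terminator; it reads a fixed 8-byte header and then exactly the announced number of bytes, which is the cost saving that motivated the move to chunking. On a socket, a binary file object from sock.makefile("rb") can serve as the stream.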

5 Communication Flow

Communication flow.jpg

6 State Diagram

Client state diagram.jpg

Server state diagram.jpg