Hyrax - BES PPT

From OPeNDAP Documentation
Revision as of 06:11, 28 January 2009 by PatrickWest (talk | contribs) (PPTCAPI)

1 What is PPT

PPT stands for Point to Point Transport. It is a protocol developed for OPeNDAP by Jose Garcia at UCAR and is modeled on TFTP (Trivial File Transfer Protocol) as defined in RFC 1350 (see http://en.wikipedia.org/wiki/TFTP for more information). Our implementation uses strings as tokens, but that is not part of the protocol; you may implement your tokens any way you want.

PPT started as an implementation of the RFC for TFTP using UDP, then moved to TCP in order to avoid the ACK required for each packet transmitted.

Here is some information about TFTP and how PPT differs:

  • It uses UDP (port 69) as its transport protocol (unlike FTP, which uses TCP port 21). - PPT uses TCP.
  • It cannot list directory contents. - PPT does not deal with any commands; it passes those to the next layer up.
  • It has no authentication or encryption mechanisms. - We added that using standard SSL/X509.
  • It is used to read files from, or write files to, a remote server. - The same holds for PPT, except that for PPT files are data objects.
  • It supports three different transfer modes, "netascii", "octet" and "mail", with the first two corresponding to the "ASCII" and "image" (binary) modes of the FTP protocol; the third is now obsolete and is hardly ever used. - PPT uses only tuneable binary buffers.
  • The original protocol has a file size limit of 32 MB, although this was extended when RFC 2347 introduced block-size negotiation in 1998 (allowing a maximum of 4 GB and potentially higher throughput). - No limit on size for PPT.
  • Since TFTP utilizes UDP, it has to supply its own transport and session support. Each file transferred via TFTP constitutes an independent exchange. That transfer is performed in lock-step, with only one packet (either a block of data, or an 'acknowledgement') ever in flight on the network at any time. Due to this lack of windowing, TFTP provides low throughput over high latency links. - This no longer applies to PPT, which moved to TCP.
  • Due to the lack of security, it is dangerous over the open Internet. Thus, TFTP is generally only used on private, local networks. - For this reason PPT used to sit behind HTTPS and GridFTP. PPT now has its own layer of security, but it can still be set up to run without security behind HTTPS and GridFTP (three tier model).

The BES architecture is a client/server architecture in which a server application sits listening for OPeNDAP requests. These requests can be for the same OPeNDAP data structures that the CGI version serves (DAS, DDS, DataDDS, etc.), or for other data formats and other kinds of requests. An OPeNDAP client in this architecture connects to an OPeNDAP server via TCP or UNIX sockets. The client sends a message to the server requesting a PPT connection using a simple token. The server responds with a token letting the client know whether the connection is accepted or rejected. Once a connection has been established, the client sends OPeNDAP requests to the server.

The initial implementation of the PPT protocol used a token to signify the end of a transmission. This token was placed at the end of every transmission, both from client to server and from server to client. The receiving end had to search the entire buffer, character by character, for this end-of-transmission token. For smaller transmissions this isn't a big issue. But for large data transmissions, for example in response to a 'get dods' request, the receiving end would spend quite a bit of processing time searching for the token.

To get around this we decided to switch to chunking, where the first characters received by an application would be the size of the chunk followed by that many bytes of data. To signify the end of the transmission, a chunk is sent with chunk size of zero.

2 Initial handshaking in the PPT

In the PPT scheme the server is the software that is listening for a connection, accepting commands, and providing responses. The client is the software that initiates a connection, issues commands, and receives responses.

  1. The client opens a socket connection to the server.
  2. The server accepts the connection.
  3. The client sends the string "PPTCLIENT_TESTING_CONNECTION" to the server.
  4. The server responds with one of the following tokens:
    1. "PPTSERVER_CONNECTION_OK" - If it will accept the session connection for PPT based message exchange.
    2. "PPT_PROTOCOL_UNDEFINED" - If it is busy or otherwise unavailable.
    3. "PPTSERVER_AUTHENTICATE" - Authentication is required for connection

Any other response from the server indicates that initiating the session failed and that the client should abandon the connection.
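
The handshake steps above can be sketched from the client side as follows. This is a minimal illustration in Python, not the BES implementation: the function name start_session and its return values are hypothetical, and it assumes the server's token arrives in a single read.

```python
import socket

# Tokens taken from the handshake description above.
PPT_CLIENT_TESTING = b"PPTCLIENT_TESTING_CONNECTION"
PPT_SERVER_OK = b"PPTSERVER_CONNECTION_OK"
PPT_SERVER_AUTH = b"PPTSERVER_AUTHENTICATE"


def start_session(sock: socket.socket, bufsize: int = 64) -> str:
    """Send the client token and classify the server's reply.

    Returns "ok", "authenticate", or "abandon" (hypothetical labels).
    """
    sock.sendall(PPT_CLIENT_TESTING)
    # ASSUMPTION: the whole reply token arrives in one recv() call.
    reply = sock.recv(bufsize)
    if reply == PPT_SERVER_OK:
        return "ok"
    if reply == PPT_SERVER_AUTH:
        return "authenticate"
    # PPT_PROTOCOL_UNDEFINED or anything else: give up on this connection.
    return "abandon"
```

A caller would close the socket on "abandon" and continue with the authentication steps on "authenticate".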

3 PPT Authentication

If PPTSERVER_AUTHENTICATE is returned, then the client must make an SSL connection to the server in order to authenticate. The hand-shaking continues in this case as follows:

  1. The client sends the token "PPTCLIENT_REQUEST_AUTHPORT", which tells the server that the client understands the request for authentication and asks for the SSL port to connect to.
  2. The server responds with the SSL port to connect to.
  3. The client opens an SSL connection to the server on this port.
  4. SSL authentication proceeds as normal.
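
The first two steps of this exchange might look like the sketch below. The wire format of the port reply is not specified on this page, so the code assumes the server sends the port as decimal ASCII text; that assumption, and the function name request_auth_port, are illustrative only.

```python
import socket


def request_auth_port(sock: socket.socket, bufsize: int = 64) -> int:
    """Ask the server for its SSL port and return it as an integer."""
    sock.sendall(b"PPTCLIENT_REQUEST_AUTHPORT")
    reply = sock.recv(bufsize)
    # ASSUMPTION: the port is sent as decimal ASCII digits, e.g. b"10003".
    # int() raises ValueError if the reply is not a number.
    port = int(reply.decode("ascii").strip())
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return port
```

The client would then open an SSL connection to the returned port (step 3), for example with Python's ssl module, and let the SSL handshake perform the authentication.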

For SSL authentication, both the client and the server must be configured through a BES configuration file. The server side requires the following parameters:

BES.ServerSecurePort=<port number>

The client side requires the following parameters:


Once the secure connection has been accepted and authentication has succeeded, the SSL connection is dropped, and requests and responses are handled through the initial non-secure connection.

Once the connection is established all further communications will be chunked as described in the following sections.

4 PPT Chunking

4.1 Chunking Scheme

       chunked-body     =  *chunked-section last-chunk

       chunked-section  = chunk | chunk-extension
       chunk-extension  = chunk-size "x"  1*(chunk-ext-name [ "=" chunk-ext-val ] ";")

       chunk            = chunk-size "d" chunk-data 

       chunk-size       = 7HEXDIG
       last-chunk       = 7("0") "d"

       chunk-ext-name   = token
                          ; sequence of 7-bit ASCII printable chars

       chunk-ext-val    = token | quoted-string
                          ; quoted-string is DQUOTE token DQUOTE

       chunk-data       = chunk-size(OCTET)
                          ; exactly chunk-size bytes
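
As an illustration of the grammar above, the following Python sketch builds data chunks, extension chunks, and the last chunk. The function names are hypothetical. It assumes the size field is zero-padded lowercase hexadecimal and that an extension chunk's size counts the bytes of its name=value; text; neither detail is stated by the grammar itself.

```python
MAX_CHUNK_SIZE = 0xFFFFFFF  # largest value that fits in 7 hex digits

# last-chunk = 7("0") "d"
LAST_CHUNK = b"0000000d"


def encode_chunk(payload: bytes, kind: bytes = b"d") -> bytes:
    """Return chunk-size, the type byte, then the payload."""
    if kind not in (b"d", b"x"):
        raise ValueError("chunk type must be b'd' or b'x'")
    if len(payload) > MAX_CHUNK_SIZE:
        raise ValueError("payload too large for a single chunk")
    # ASSUMPTION: zero-padded, lowercase hex for the 7-digit size.
    return b"%07x" % len(payload) + kind + payload


def encode_extensions(pairs: dict) -> bytes:
    """Build an extension chunk from name/value pairs."""
    text = "".join("%s=%s;" % (name, value) for name, value in pairs.items())
    return encode_chunk(text.encode("ascii"), b"x")
```

For example, encode_chunk(b"hello") yields b"0000005dhello", and a complete transmission is one or more such chunks followed by LAST_CHUNK.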

4.2 Graphical representation of chunking scheme

BES chunking 7 1.jpg

size - The first seven bytes hold the size of the chunk in bytes. The size does not include the first 8 bytes, which are the chunk size and the chunk type (x or d). The size is an integer encoded as 7 ASCII characters, each a hexadecimal (base 16) digit, so a single chunk can carry at most 0xFFFFFFF (268,435,455) bytes. If the size is 0, this signifies the end of the transmission; no more chunks follow.

type - The eighth byte is the type of chunk that follows. The type can be one of

  • x - extension, one or more name=value; pairs
  • d - data, actual data

data - The data part of the chunk, meaning either extensions or data, not both
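
The 8-byte header described above can be decoded as in this Python sketch. The function name parse_chunk_header is hypothetical; the layout (seven hexadecimal digits followed by one type byte) follows the description on this page.

```python
def parse_chunk_header(header: bytes):
    """Split an 8-byte chunk header into (size, type).

    size is the number of payload bytes that follow the header;
    type is "x" (extension) or "d" (data).
    """
    if len(header) != 8:
        raise ValueError("chunk header must be exactly 8 bytes")
    # First seven bytes: the size as ASCII hexadecimal digits.
    size = int(header[:7].decode("ascii"), 16)
    # Eighth byte: the chunk type.
    kind = chr(header[7])
    if kind not in ("x", "d"):
        raise ValueError("unknown chunk type: %r" % kind)
    return size, kind
```

A returned size of 0 marks the last chunk of the transmission.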

Extensions are name/value pairs and can represent information needed by the underlying communication layer. For example, status=error; would mean that an error has occurred. This chunk would not contain the error/exception information itself. The chunks that follow would hold that information and would be of type d (for data).
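
A receiver could turn the payload of an extension chunk into a dictionary along these lines. This is a simplified sketch with a hypothetical function name: it splits on ";" and "=" only, so it does not handle a quoted value that itself contains a semicolon.

```python
def parse_extensions(payload: bytes) -> dict:
    """Parse b"name=value;name2;" into {"name": "value", "name2": None}."""
    result = {}
    for item in payload.decode("ascii").split(";"):
        if not item:
            continue  # skip the empty piece after the final ";"
        name, sep, value = item.partition("=")
        # The value is optional in the grammar; use None when it is absent.
        result[name] = value if sep else None
    return result
```

With this helper, parse_extensions(b"status=error;") gives {"status": "error"}.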

A chunk may not arrive in a single read call to the underlying socket layer. The first 8 bytes carry information about the chunk: the first seven bytes are the size of the chunk and the eighth byte is the type of data the chunk contains. Read should be called repeatedly until the entire chunk has been received.

Read/receive should be called until the last chunk is received. The last chunk is represented by a chunk size of 0.
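
The read loop described above can be sketched in Python as follows. The names read_exact and read_transmission are hypothetical, and stream stands for any object with a read(n) method that may return fewer than n bytes, such as the file object returned by socket.makefile("rb").

```python
def read_exact(stream, n: int) -> bytes:
    """Keep reading until exactly n bytes have been collected."""
    buf = bytearray()
    while len(buf) < n:
        piece = stream.read(n - len(buf))
        if not piece:
            raise EOFError("connection closed in the middle of a chunk")
        buf.extend(piece)
    return bytes(buf)


def read_transmission(stream) -> list:
    """Read chunks until the last chunk (size 0) and return them.

    Each entry is a (type, payload) tuple, with type "x" or "d".
    """
    chunks = []
    while True:
        header = read_exact(stream, 8)
        size = int(header[:7].decode("ascii"), 16)
        kind = chr(header[7])
        if size == 0:
            return chunks  # last chunk: end of transmission
        chunks.append((kind, read_exact(stream, size)))
```

Because the size is known before the payload is read, the receiver never has to scan the data for a terminator, which was the motivation for moving to chunking.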

4.3 Client/server exit

The exit handshake used to take place with an exit token. Now, instead, it is done with an exit extension. The chunk will look like this:


followed by the last chunk signifying the end of the transmission.


4.4 Chunking State Diagram


4.5 Error response

The BES currently has a built-in error response to a BES client request. Specifically, if the BES client issues a command and the BES server has a problem with it (with the command itself, with processing the response to the command, or with sending the response), then the BES server sends an extension chunk to the BES client with the name/value pair "status=error;". The data chunks that follow contain the error information, and they are followed by the last chunk. The extension chunk with the error status may arrive after a series of data chunks containing a partial response.
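
On the client side, separating a partial response from the error information might look like the Python sketch below. The function name split_response and the (type, payload) tuple representation are hypothetical; the logic follows the description above, treating every data chunk after the status=error; extension as error text.

```python
def split_response(chunks):
    """Split received chunks into (data, error).

    chunks is a list of (type, payload) tuples in arrival order,
    without the last chunk. error is None when no error was signalled.
    """
    data = bytearray()
    error = bytearray()
    failed = False
    for kind, payload in chunks:
        if kind == "x":
            # Simplification: look for the literal pair in the extension text.
            if b"status=error;" in payload:
                failed = True
        elif failed:
            error.extend(payload)  # data chunks after the error marker
        else:
            data.extend(payload)   # normal (possibly partial) response
    return bytes(data), (bytes(error) if failed else None)
```

A client would typically discard the partial data and report the error text when the second element is not None.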

5 Communication Flow

Communication flow.jpg

6 State Diagram

Client state diagram.jpg                              Server state diagram.jpg

7 The API

There are two versions of the PPT API available. The PPT C++ API is provided with the BES, in both the source tarball and the binary distributions. The PPTCAPI package is a separate package, not yet available as a source tarball or binary distribution. Both APIs are available from svn.