Hyrax - BES PPT
1 What is PPT
PPT stands for Point to Point Transport. It is based on TFTP, the Trivial File Transfer Protocol. PPT is a protocol developed for OPeNDAP by Jose Garcia at UCAR and is based on RFC 1350 (see http://en.wikipedia.org/wiki/TFTP for more information). Our implementation uses strings as tokens, but that is not part of the protocol; you may implement your tokens any way you want.
PPT started as an implementation of the RFC for TFTP using UDP, then moved to TCP in order to avoid the ACK required for each packet transmitted.
Here is some information about TFTP and how PPT differs:
- It uses UDP (port 69) as its transport protocol (unlike FTP, which uses TCP port 21). PPT: uses TCP.
- It cannot list directory contents. PPT: does not deal with any commands; it passes those to the next layer up.
- It has no authentication or encryption mechanisms. PPT: we added both using standard SSL/X509.
- It is used to read files from, or write files to, a remote server. PPT: the same, except that for PPT the files are data objects.
- It supports three different transfer modes, "netascii", "octet" and "mail", with the first two corresponding to the "ASCII" and "image" (binary) modes of the FTP protocol; the third is now obsolete and is hardly ever used. PPT: uses only tuneable binary buffers.
- The original protocol has a file size limit of 32 MB, although this was extended when RFC 2347 introduced option negotiation, which RFC 2348 used to add block-size negotiation in 1998 (allowing a maximum of 4 GB and potentially higher throughput). PPT: no limit on size.
- Since TFTP utilizes UDP, it has to supply its own transport and session support. Each file transferred via TFTP constitutes an independent exchange. That transfer is performed in lock-step, with only one packet (either a block of data, or an 'acknowledgement') ever in flight on the network at any time. Due to this lack of windowing, TFTP provides low throughput over high latency links. PPT: not applicable, since PPT runs over TCP.
- Due to the lack of security, it is dangerous over the open Internet. Thus, TFTP is generally only used on private, local networks. PPT: for this reason PPT used to sit behind HTTPS and GridFTP. PPT now has its own layer of security, but it can still be set up to run without security behind HTTPS and GridFTP (three tier model).
The BES architecture is a client/server architecture in which a server application listens for OPeNDAP requests. These requests can be for OPeNDAP data structures, just as the CGI version serves (DAS, DDS, DataDDS, etc.), or for other data formats and other kinds of requests. An OPeNDAP client in this architecture connects to an OPeNDAP server via TCP or UNIX sockets. The client sends a message to the server requesting a PPT connection using a simple token. The server responds with a token letting the client know whether the connection is accepted or rejected. Once a connection has been established, the client sends OPeNDAP requests to the server.
The initial implementation of the PPT protocol used tokens to signify the end of a transmission. This token was placed at the end of every transmission, whether from client to server or from server to client. The receiving end had to search the entire buffer, character by character, for this end-of-transmission token. For small transmissions this is not a big issue, but for large data transmissions, for example in response to a 'get dods' request, the receiving end would spend quite a bit of processing time searching for the token.
To get around this we decided to switch to chunking, where the first characters received by an application would be the size of the chunk followed by that many bytes of data. To signify the end of the transmission, a chunk is sent with chunk size of zero.
2 Initial handshaking in the PPT
In the PPT scheme the server is the software that is listening for a connection, accepting commands, and providing responses. The client is the software that initiates a connection, issues commands, and receives responses.
- The client opens a socket connection to the server.
- The server accepts the connection.
- The client sends the string "PPTCLIENT_TESTING_CONNECTION" to the server.
- The server responds with one of the following tokens:
  - "PPTSERVER_CONNECTION_OK" - If it will accept the session connection for PPT based message exchange.
  - "PPT_PROTOCOL_UNDEFINED" - If it is busy or otherwise unavailable.
  - "PPTSERVER_AUTHENTICATE" - If authentication is required for the connection.
Any other response from the server indicates that initiating the session failed and that the client should abandon the connection.
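The client side of this handshake can be sketched in a few lines of Python. This is an illustration only, not the BES implementation: the function name ppt_handshake and its return values are made up for the example, and it assumes the server's reply token arrives in a single recv() call, which the text above does not specify.

```python
import socket

CLIENT_HELLO = b"PPTCLIENT_TESTING_CONNECTION"
SERVER_OK = b"PPTSERVER_CONNECTION_OK"
SERVER_AUTH = b"PPTSERVER_AUTHENTICATE"


def ppt_handshake(sock):
    """Send the client token and classify the server's reply.

    Returns "ok" or "authenticate"; raises ConnectionError otherwise.
    Assumption: the whole reply token arrives in one recv() call.
    """
    sock.sendall(CLIENT_HELLO)
    reply = sock.recv(64)
    if reply == SERVER_OK:
        return "ok"
    if reply == SERVER_AUTH:
        return "authenticate"
    # PPT_PROTOCOL_UNDEFINED or any other reply: abandon the connection.
    sock.close()
    raise ConnectionError("PPT session refused: %r" % reply)
```

A caller would open a TCP or UNIX socket to the server, pass it to ppt_handshake, and continue with the authentication steps only when the result is "authenticate".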
3 PPT Authentication
If PPTSERVER_AUTHENTICATE is returned, then the client must make an SSL connection to the server in order to authenticate. The hand-shaking continues in this case as follows:
- The client sends the token "PPTCLIENT_REQUEST_AUTHPORT", which tells the server that the client understands the request for authentication and asks for the SSL port to connect to.
- The server responds with the SSL port to connect to.
- The client opens an SSL connection to the server on this port.
- SSL authentication proceeds as normal.
For SSL authentication, both the client and the server must be configured through a BES configuration file. The server side requires the following parameters:
BES.ServerSecure=yes|no
BES.ServerSecurePort=<port number>
BES.ServerCertFile=/full/path/to/serverside/certificate/file.pem
BES.ServerCertAuthFile=/full/path/to/serverside/certificate/authority/file.pem
BES.ServerKeyFile=/full/path/to/serverside/key/file.pem
The client side requires the following parameters:
BES.ClientCertFile=/full/path/to/clientside/certificate/file.pem
BES.ClientCertAuthFile=/full/path/to/clientside/certificate/authority/file.pem
BES.ClientKeyFile=/full/path/to/clientside/key/file.pem
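As a rough aid for checking a configuration before starting a client or server, the sketch below reads key=value settings and reports which of the parameters listed above are missing. It is not part of the BES: the helper names are invented for the example, and it assumes one setting per line with "#" starting a comment, which may not match every detail of the real BES configuration syntax.

```python
REQUIRED_SERVER_KEYS = (
    "BES.ServerSecure",
    "BES.ServerSecurePort",
    "BES.ServerCertFile",
    "BES.ServerCertAuthFile",
    "BES.ServerKeyFile",
)
REQUIRED_CLIENT_KEYS = (
    "BES.ClientCertFile",
    "BES.ClientCertAuthFile",
    "BES.ClientKeyFile",
)


def parse_settings(text):
    """Parse key=value lines into a dict (assumed syntax, see lead-in)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed '#' comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, required):
    """Return the required keys that are absent, in their listed order."""
    return [key for key in required if key not in settings]
```

For example, missing_keys(parse_settings(text), REQUIRED_CLIENT_KEYS) returns an empty list when all three client-side parameters are present in text.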
Once the secure connection has been accepted and authentication has succeeded, the SSL connection is dropped, and requests and responses are handled through the initial non-secure connection.
Once the connection is established all further communications will be chunked as described in the following sections.
4 PPT Chunking
4.1 Chunking Scheme
chunked-body = *chunked-section last-chunk
chunked-section = chunk | chunk-extension
chunk-extension = chunk-size "x" 1*(chunk-ext-name [ "=" chunk-ext-val ] ";")
chunk = chunk-size "d" chunk-data
chunk-size = 7HEXDIG
last-chunk = 7("0") "d"
chunk-ext-name = token ; sequence of 7-bit ASCII printable chars
chunk-ext-val = token | quoted-string ; quoted-string is DQUOTE token DQUOTE
chunk-data = chunk-size(OCTET) ; exactly chunk-size bytes
4.2 Graphical representation of chunking scheme
The first seven bytes hold the size of the chunk in bytes. The size does not include the first 8 bytes, which are the chunk size and the chunk type (x or d). The size is an integer written as 7 ASCII characters, each a hexadecimal (base 16) digit, so the largest possible chunk is 0xFFFFFFF (268,435,455) bytes. If the size is 0, this signifies the end of the transmission; no more chunks follow.
The eighth byte is the type of chunk that follows. The type can be one of:
- x - extension, one or more name=value; pairs
- d - data, actual data
The rest of the chunk is its payload: either extensions or data, never both.
Extensions are name/value pairs and can represent information needed by the underlying communication layer. For example, status=error; would mean that an error has occurred. This chunk would not contain the error/exception information itself. The chunks that follow would hold that information and would be of type d (for data).
It is possible that the chunk may not come across in one read call to the underlying socket layer. The first 8 bytes represent information about the chunk, the first seven bytes being the size of the chunk and the eighth byte being the type of data the chunk contains. Read should be called until the entire chunk has been received.
Read/receive should be called until the last chunk is received. The last chunk is represented by chunk size of 0.
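The two read loops described above can be sketched as follows. This is an illustration under stated assumptions rather than the BES code: recv stands for any callable that returns up to n bytes (such as a socket's recv method), the function names are invented, and extension values are split on ";" without handling quoted strings that contain a semicolon.

```python
def read_exact(recv, n):
    """Call recv until exactly n bytes have been collected."""
    buf = bytearray()
    while len(buf) < n:
        part = recv(n - len(buf))
        if not part:
            raise EOFError("connection closed in the middle of a chunk")
        buf += part
    return bytes(buf)


def read_message(recv):
    """Read chunks until the last chunk; return (data, extensions)."""
    data = bytearray()
    extensions = {}
    while True:
        header = read_exact(recv, 8)
        size = int(header[:7].decode("ascii"), 16)
        kind = header[7:8]
        if size == 0:
            return bytes(data), extensions  # last chunk: transmission done
        body = read_exact(recv, size)
        if kind == b"d":
            data += body
        elif kind == b"x":
            # Simplification: assumes no ';' inside quoted values.
            for pair in body.decode("ascii").split(";"):
                if pair:
                    name, _, value = pair.partition("=")
                    extensions[name] = value
        else:
            raise ValueError("unknown chunk type: %r" % kind)
```

Because read_exact keeps calling recv until the requested count is reached, the reader works even when the socket layer delivers a chunk, or its 8-byte header, in several pieces.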
4.3 Client/server exit
The exit handshake used to take place with an exit token. Now, instead, it is done with an exit extension. The chunk will look like this:
followed by the last chunk signifying the end of the transmission.