Caching that's multi-process safe

From OPeNDAP Documentation
Revision as of 15:36, 22 March 2012 by Jimg (Releasing locks)

1 Problem

Develop a general procedure (or software) for implementing a cache that can be shared by several processes. There seems to be no readily available library that implements this kind of caching. It certainly needs to work on Unix; Windows support would be good, but is of less importance because the OC library may take over our client role on that platform.

2 Background

We have several cases where files need to be cached, both in our server software and in the client code. These include at least files that have been decompressed and responses from the web that can be cached using the HTTP/1.1 rules. We need the caching to be both thread safe and multi-process safe. That is, the cache needs to be shared between processes that do not otherwise synchronize their operation.


3 Proposed solution

3.1 Assumptions

  1. At any given instant, the number of items in the cache can change by at most the number of active processes using the cache.
  2. The size of the cache will likely change only by the average size of a cached object times some constant that's less than or equal to the number of processes.
  3. We accept as tolerable that the cache might grow beyond its bounds by a 'little bit' but that's an OK tradeoff if we don't have to lock the cache to count its size.
  4. We will count the size of the cache before every attempt to write a new item.
  5. We can store the size of the cache in a file and access that using an exclusive lock (but that might make item #3 moot, because then we always know the exact size of the cache).

3.2 Definitions

shared lock
a synonym for a read-only lock; a program cannot acquire an exclusive lock when a shared lock exists, but it can acquire another shared lock. Thus, many programs can hold a shared lock for reading, and together they prevent any program from writing or deleting the file.
exclusive lock
synonym for a write lock; a program cannot acquire an exclusive lock when either a shared lock or another exclusive lock already exists.
semaphore file
a file that is used to serve as a proxy lock for another file. These can be used to indicate to cooperating processes that a file should not be touched (it is effectively locked) without actually locking the file. This can be used to establish overlapping locks, particularly so that a file can first be exclusively locked and then, without removing that lock, be given a shared lock too. Once the shared (i.e., read) lock is in place, the 'exclusive lock' can be removed, thus ensuring that at no time in the sequence is the file left in an unprotected state.

3.3 Releasing locks

The fcntl(2) call makes locks that are removed whenever the process holding them closes any of its file descriptors for the file. So long as we use advisory locking and the program logic follows the rules, we can use these locks and be sure that a file will be unlocked whenever a handler closes it, without any need to release the lock through the same file descriptor that was used to obtain it. This is apparently not the case for flock(2), whose locks belong to the open file description; lockf(3) varies by platform (on Linux it is an interface on top of fcntl locking). Both the BES and the HTTP caching code in libdap have support for releasing resources, so it's possible to use flock(2), etc., for this, which also means we can use the open(2) and creat(2) calls for lock management.
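The release behavior of fcntl locks can be checked with a small experiment. The following is a minimal sketch, assuming a POSIX system; the helper names set_lock, locked_by_other_process and lock_survives_unrelated_close are hypothetical and are not part of the BES or libdap. Because fcntl locks held by one process never conflict with each other, the sketch forks a child process to probe the lock.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: set or clear a whole-file fcntl lock without blocking.
// 'type' is F_RDLCK (shared), F_WRLCK (exclusive) or F_UNLCK.
bool set_lock(int fd, short type) {
    struct flock fl{};          // zero-initialized: l_start = 0, l_len = 0 (whole file)
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLK, &fl) == 0;
}

// Hypothetical helper: fork a child that asks, via F_GETLK, whether an
// exclusive lock on 'path' would conflict with a lock held by another process.
bool locked_by_other_process(const char *path) {
    pid_t pid = fork();
    if (pid < 0) return false;  // fork failed; treat as "cannot tell"
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        struct flock fl{};
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        int rc = (fd >= 0) ? fcntl(fd, F_GETLK, &fl) : -1;
        // F_GETLK sets l_type to F_UNLCK when no conflicting lock exists.
        _exit((rc == 0 && fl.l_type != F_UNLCK) ? 1 : 0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}

// Lock the file through one descriptor, then open and close a second,
// unrelated descriptor for the same file. Returns whether the lock is still held.
bool lock_survives_unrelated_close(const char *path) {
    int fd1 = open(path, O_CREAT | O_RDWR, 0600);
    if (fd1 < 0) return false;
    if (!set_lock(fd1, F_WRLCK)) { close(fd1); return false; }
    int fd2 = open(path, O_RDONLY);
    if (fd2 >= 0) close(fd2);   // closing fd2 drops the lock taken via fd1
    bool still_locked = locked_by_other_process(path);
    close(fd1);
    return still_locked;
}
```

On a POSIX system lock_survives_unrelated_close is expected to return false. That is convenient for cleanup, as noted above, but it also means that code holding a lock must take care not to open and close the same file elsewhere in the process.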

3.4 There are three basic operations that must be implemented

Is a file in the cache
Try to get a shared lock. If the shared lock attempt was successful, the file can be read (because it's locked for read). (It's important to lock the file for read, because locking it is the only way to ensure it won't be deleted by the time the caller reads & processes the response.) Return status (successful lock or not) with the lock as a side effect.
Write a new file
Make the file and get an exclusive lock on it. Write the data to the file. Release the exclusive lock. The trick here is that the 'make the file' step may return an error, because two processes could try to make the same file at the same time. The caller of this code must therefore deal with the error that it cannot write the file; it should then assume that another process is making a file of the same name and try to obtain a read lock on it.
Delete a file
Get an exclusive lock on the file. Delete it.
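The three operations above can be sketched with open(2) and fcntl(2). This is a minimal sketch under stated assumptions, not the BESCache implementation: the function names are hypothetical, O_CREAT | O_EXCL is assumed as the way to make creation fail when the file already exists, and the delete operation is assumed to use a non-blocking lock attempt so that a file still being read is skipped rather than waited for. The sketch does not close the short window between creating a file and locking it.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: lock the whole file. 'type' is F_RDLCK or F_WRLCK;
// when 'wait' is true the call blocks until the lock can be granted.
static bool lock_whole_file(int fd, short type, bool wait) {
    struct flock fl{};          // l_start = 0 and l_len = 0 cover the whole file
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl) == 0;
}

// "Is a file in the cache": returns a descriptor holding a shared lock,
// or -1 if the file is absent or cannot be locked. The lock is the side effect.
int get_read_lock(const std::string &path) {
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) return -1;      // not in the cache
    if (!lock_whole_file(fd, F_RDLCK, true)) { close(fd); return -1; }
    return fd;                  // caller reads, then close(fd) releases the lock
}

// "Write a new file": returns a descriptor holding an exclusive lock, or -1.
// With O_EXCL the open fails (errno == EEXIST) if another process already made
// the file; the caller should then fall back to get_read_lock().
int create_and_lock(const std::string &path) {
    int fd = open(path.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0) return -1;
    if (!lock_whole_file(fd, F_WRLCK, true)) { close(fd); return -1; }
    return fd;                  // caller writes, then close(fd) releases the lock
}

// "Delete a file": take an exclusive lock without waiting, then unlink.
// Returns false if the file is missing or still locked by another process.
bool delete_cached(const std::string &path) {
    int fd = open(path.c_str(), O_WRONLY);
    if (fd < 0) return false;
    bool ok = lock_whole_file(fd, F_WRLCK, false) && unlink(path.c_str()) == 0;
    close(fd);
    return ok;
}
```

A caller would typically try get_read_lock first, call create_and_lock on a miss, and retry get_read_lock if creation fails because another process won the race.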

3.5 Cache design

In addition to the files themselves, maintain information about the time each file was last used and its size. The cache should also probably have information about its total size. Store this information in files in the cache, so that it's accessible to the processes using the cache.
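One way to keep the total size in a file is to update it under an exclusive lock, as suggested in assumption 5. The following is a minimal sketch; the function name update_cache_size and the on-disk format (a single binary long long at offset 0) are assumptions made for illustration, not a description of the BES code.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: add 'delta' bytes (negative when an item is removed)
// to the total recorded in 'size_file'. Returns the new total, or -1 on error.
long long update_cache_size(const std::string &size_file, long long delta) {
    int fd = open(size_file.c_str(), O_CREAT | O_RDWR, 0644);
    if (fd < 0) return -1;

    // Block until this process holds an exclusive lock on the whole file.
    struct flock fl{};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_SETLKW, &fl) != 0) { close(fd); return -1; }

    // Read the current total; a new or short file counts as zero.
    long long total = 0;
    ssize_t n = pread(fd, &total, sizeof total, 0);
    if (n < 0) { close(fd); return -1; }
    if (n != static_cast<ssize_t>(sizeof total)) total = 0;

    total += delta;
    if (total < 0) total = 0;   // never record a negative size

    bool ok = pwrite(fd, &total, sizeof total, 0)
              == static_cast<ssize_t>(sizeof total);
    close(fd);                  // closing the descriptor releases the lock
    return ok ? total : -1;
}
```

Because every reader and writer of the size file goes through the same exclusive lock, the recorded total stays consistent across processes, at the cost of serializing updates.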

In the 'make the file' step of the 'Write a new file' operation, make sure to use open(2) in its atomic file creation mode (O_CREAT, ...).

In the BES software, the class BESContainer provides a partially abstract base class that holds two methods: access() and release() that are used to access and release the 'container' that holds the data. The class BESFileContainer has concrete implementations for these methods that uncompress files if need be. The class BESUncompressManager performs the decompression operation and uses the BESCache class to store the result. BESCache handles the locking operations.

To avoid making a new BESCache object on every call to BESContainer::access(), maybe move BESCache into BESUncompressManager.