Modules:

    pp
    logging
    time
    mmf

Functions:

    f(x, y, z)
        Function to execute on the remote server.
    launch([ppservers, process_jobs_in_order])
    callback(x, y, z, result_archive)
        Example of a callback: will be called after the job is done.
    run(archive)
        This function must be defined in an importable module so it can be pickled.
    process_job(key, job)
        Unpack and process the job after it is done.
Parallel Python Example
This example shows how to use Parallel Python to distribute tasks across several local processors and remote machines.
Note
You must install Parallel Python separately for this to work:
    wget http://www.parallelpython.com/downloads/pp/pp-1.6.0.tar.bz2
    tar -jxvf pp-1.6.0.tar.bz2
    cd pp-1.6.0
    python setup.py install
We will make two modifications. First, we will use the mmf.archive module to serialize our objects, since it provides additional flexibility such as the ability to serialize functions and classes.
Note
If performance is a concern, you might try to make your code work with the standard pickle serialization, as this allows you to use the server-side caching mechanism.
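To make the trade-off concrete, here is a minimal sketch of the standard pickle round trip that pp performs on job arguments. The function f and the arguments are toy stand-ins, not part of the original example; the point is that plain pickle serializes a module-level function by reference, which is what makes server-side caching possible, but it cannot serialize functions defined interactively.

```python
import pickle

def f(x, y, z):
    """Toy stand-in for the function to execute on the remote server."""
    return x + y + z

# pp pickles the arguments sent to each job, so every argument must
# survive a pickle round trip.
args = (1, 2, 3)
payload = pickle.dumps(args)      # what would travel over the wire
restored = pickle.loads(payload)  # what the server sees
result = f(*restored)
```

If any of your arguments (or the function itself) fail this round trip, that is when the extra flexibility of mmf.archive becomes useful.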
Second, we will launch the remote servers using an ssh tunnel and port forwarding for security.
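As a sketch of the tunnel setup (the host name, user, and port below are hypothetical placeholders, and the ppserver.py flags are from pp 1.6):

```shell
# On the remote machine, start a ppserver listening only on localhost:
#     ppserver.py -p 35000 -i 127.0.0.1
# On the local machine, forward a local port to it over ssh
# (-N: no remote command, -f: go to background):
ssh -N -f -L 35000:localhost:35000 user@remote.example.com
# Then point pp at the forwarded port as if the server were local:
#     ppservers = ("localhost:35000",)
```

Because the server binds to the loopback interface, only clients coming through the ssh tunnel can reach it.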
Function to execute on the remote server.
Example of a callback: will be called after the job is done.
This function must be defined in an importable module so it can be pickled. It does the packing and unpacking of the arguments and results.
Note that this function is executed in an isolated environment (the pp module stores the actual code and does not import the whole module), thus all symbols must be imported here or explicitly passed to the server. There are several options:
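The pack/unpack wrapper might look like the following sketch, which uses plain pickle in place of mmf.archive; the names and the target computation are hypothetical stand-ins for the real run() function.

```python
import pickle

def run(archive):
    """Hypothetical stand-in for run(): unpack the pickled arguments,
    do the work, and pack up the result.

    All imports must happen inside this function, because pp ships only
    the function's code to the server, not the enclosing module, so the
    code runs in an isolated environment.
    """
    import math  # example of an import done inside the isolated function
    x, y, z = pickle.loads(archive)
    result = math.fsum([x, y, z])
    return pickle.dumps(result)

# Client side: pack the arguments, run, and unpack the result.
archive = pickle.dumps((1, 2, 3))
result = pickle.loads(run(archive))
```

Doing the imports inside the function is the simplest of the options; alternatives include passing the needed modules or symbols explicitly when submitting the job.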
Unpack and process the job after it is done. This will block until the job completes, so you might like to check job.finished first.
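The check-then-block pattern can be sketched as follows. FakeJob is a toy stand-in for a pp job object (calling the job blocks until its result is ready, and finished reports completion), so this runs without pp installed; only the pattern in process_job mirrors the real code.

```python
import time

class FakeJob:
    """Toy stand-in for a pp job: callable, with a `finished` flag."""
    def __init__(self, result, delay=0.01):
        self._result = result
        self._done_at = time.time() + delay

    @property
    def finished(self):
        return time.time() >= self._done_at

    def __call__(self):
        # Calling the job blocks until the result is available.
        while not self.finished:
            time.sleep(0.001)
        return self._result

def process_job(key, job):
    """Block on the job and return (key, result).

    Check job.finished first if you want to avoid blocking, e.g. to
    poll several outstanding jobs and process whichever is done.
    """
    return key, job()

jobs = {"a": FakeJob(1), "b": FakeJob(2)}
results = dict(process_job(k, j) for k, j in jobs.items())
```

Polling finished before calling lets you interleave result processing with other work instead of stalling on the slowest job.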