# HG changeset patch
# User paulb
# Date 1189811737 0
# Node ID 31021014ccc7aed521cc485c116eb6aec5fcdb3b
# Parent 88f285db505f55461948b4299f7e7a69b79e2509
[project @ 2007-09-14 23:15:37 by paulb]
Added a pmap tutorial.

diff -r 88f285db505f -r 31021014ccc7 docs/tutorial.xhtml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/tutorial.xhtml Fri Sep 14 23:15:37 2007 +0000
@@ -0,0 +1,72 @@
+
+
+
+
+The pprocess module provides several mechanisms for running Python code
+concurrently in several processes. The most straightforward way of making
+a program parallel-aware - that is, where the program can take advantage
+of more than one processor to simultaneously process data - is to use the
+pmap function. Consider the following Python code:
+    t = time.time()
+
+    sequence = []
+    for i in range(0, N):
+        for j in range(0, N):
+            sequence.append((i, j))
+
+    results = map(calculate, sequence)
+
+    print "Time taken:", time.time() - t
+    for i in range(0, N):
+        for result in results[i*N:i*N+N]:
+            print result,
+        print
+
+
+(This code, together with its import statements and function
+definitions, can be found in the examples/simple_map.py file.)
+Here, we initialise a sequence of inputs for an N by N grid, perform a
+calculation on each element, then print the resulting sequence as a
+grid. Because the map function performs the calculations sequentially,
+each calculation must complete before the next one can begin, even
+though the calculate function could be invoked independently for each
+input value.
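
The printing loop above relies on the results forming a flat list in which row i occupies the slice from i*N to i*N+N. A minimal sketch of that indexing follows. It is written for Python 3, unlike the Python 2 code of the tutorial, and both the body of calculate and the helper name grid_rows are assumptions made for illustration; neither is taken from examples/simple_map.py.

```python
def calculate(t):
    """Hypothetical stand-in for the tutorial's calculate function."""
    i, j = t
    return i * j


def grid_rows(results, n):
    """Split a flat list of n*n results into n rows of n items each."""
    return [results[i * n:i * n + n] for i in range(n)]


N = 3
sequence = [(i, j) for i in range(N) for j in range(N)]

# In Python 3, map returns an iterator, so build a list before slicing.
results = list(map(calculate, sequence))

# Print the flat results as an N by N grid, one row per line.
for row in grid_rows(results, N):
    print(*row)
```

Each input is a single (i, j) tuple because map passes one sequence element per call, which is why calculate unpacks its argument.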
+In order to reduce the processing time - to speed the code up,
+in other words - we can make this code use several processes instead of
+just one. Here is the modified code:
+
+    t = time.time()
+
+    sequence = []
+    for i in range(0, N):
+        for j in range(0, N):
+            sequence.append((i, j))
+
+    results = pprocess.pmap(calculate, sequence, limit=limit)
+
+    print "Time taken:", time.time() - t
+    for i in range(0, N):
+        for result in results[i*N:i*N+N]:
+            print result,
+        print
+
+
+(This code, together with its import statements and function
+definitions, can be found in the examples/simple_pmap.py file.)
+By replacing the map function with the pprocess.pmap function, and by
+specifying a limit on the number of processes that may be active at any
+given time, several calculations can now be performed in parallel.
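
Since pmap is called with the same first two arguments as map, the mapping function can be passed in as a parameter, so that one piece of code serves both the sequential and the parallel case. The following is only a sketch under stated assumptions: it is written for Python 3, the names run_grid, with_limit and fake_pmap are hypothetical, and fake_pmap is a sequential stand-in that merely imitates the call shape pmap(calculate, sequence, limit=limit) shown above, so that the sketch runs without pprocess installed.

```python
from functools import partial


def calculate(t):
    """Hypothetical stand-in for the tutorial's calculate function."""
    i, j = t
    return i + j


def fake_pmap(func, sequence, limit=None):
    """Sequential stand-in with pmap's call shape; NOT part of pprocess."""
    return [func(item) for item in sequence]


def with_limit(pmap_func, limit):
    """Bind the limit argument so the result is called just like map."""
    return partial(pmap_func, limit=limit)


def run_grid(map_func, n):
    """Apply calculate to every cell of an n by n grid using map_func."""
    sequence = [(i, j) for i in range(n) for j in range(n)]
    return list(map_func(calculate, sequence))


sequential = run_grid(map, 2)
stand_in = run_grid(with_limit(fake_pmap, 4), 2)

# With pprocess available, the parallel call would presumably be:
#   run_grid(with_limit(pprocess.pmap, 4), 2)
```

Binding the limit beforehand keeps run_grid unaware of which implementation it was given, which mirrors the one-line change between the two listings above.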