1.1 --- a/docs/tutorial.xhtml Sat Sep 15 19:41:09 2007 +0000
1.2 +++ b/docs/tutorial.xhtml Sat Sep 15 19:41:49 2007 +0000
1.3 @@ -1,7 +1,7 @@
1.4 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
1.5 <html xmlns="http://www.w3.org/1999/xhtml" lang="en-gb">
1.6 <head>
1.7 - <meta content="text/html; charset=UTF-8" http-equiv="content-type" />
1.8 + <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
1.9 <title>pprocess - Tutorial</title>
1.10 </head>
1.11 <body>
1.12 @@ -13,32 +13,40 @@
1.13 in several processes. The most straightforward way of making a program
1.14 parallel-aware - that is, where the program can take advantage of more
1.15 than one processor to simultaneously process data - is to use the
1.16 -<code>pmap</code> function. Consider the following Python code:</p>
1.17 +<code>pmap</code> function.</p>
1.18 +
1.19 +<h2>Converting Map-Style Code</h2>
1.20 +
1.21 +<p>Consider a program using the built-in <code>map</code> function and a sequence of inputs:</p>
1.22
1.23 <pre>
1.24 t = time.time()
1.25
1.26 + # Initialise an array.
1.27 +
1.28 sequence = []
1.29 for i in range(0, N):
1.30 for j in range(0, N):
1.31 sequence.append((i, j))
1.32
1.33 + # Perform the work.
1.34 +
1.35 results = map(calculate, sequence)
1.36
1.37 - print "Time taken:", time.time() - t
1.38 + # Show the results.
1.39 +
1.40 for i in range(0, N):
1.41 for result in results[i*N:i*N+N]:
1.42 print result,
1.43 print
1.44 -</pre>
1.45 +
1.46 + print "Time taken:", time.time() - t</pre>
1.47
1.48 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_map.py</code> file.)</p>
1.49
1.50 -<p>Here, we initialise a sequence of inputs for an N by N grid, perform a
1.51 -calculation on each element, then print the resulting sequence as a
1.52 -grid. Since the <code>map</code> function performs the calculations sequentially, even if the <code>calculate</code>
1.53 +<p>The principal features of this program are the preparation of an array of inputs and the use of the <code>map</code> function to iterate over the combinations of <code>i</code> and <code>j</code> in the array. Even if the <code>calculate</code>
1.54 function could be invoked independently for each input value, we have
1.55 -to wait for each calculation to complete before initiating a new
1.56 +to wait for each computation to complete before initiating a new
1.57 one.</p>
1.58
1.59 <p>In order to reduce the processing time - to speed the code up,
1.60 @@ -48,19 +56,25 @@
1.61 <pre>
1.62 t = time.time()
1.63
1.64 + # Initialise an array.
1.65 +
1.66 sequence = []
1.67 for i in range(0, N):
1.68 for j in range(0, N):
1.69 sequence.append((i, j))
1.70
1.71 + # Perform the work.
1.72 +
1.73 results = <strong>pprocess.pmap</strong>(calculate, sequence<strong>, limit=limit</strong>)
1.74
1.75 - print "Time taken:", time.time() - t
1.76 + # Show the results.
1.77 +
1.78 for i in range(0, N):
1.79 for result in results[i*N:i*N+N]:
1.80 print result,
1.81 print
1.82 -</pre>
1.83 +
1.84 + print "Time taken:", time.time() - t</pre>
1.85
1.86 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_pmap.py</code> file.)</p>
1.87
1.88 @@ -68,5 +82,80 @@
1.89 function, and specifying the limit on the number of processes to be active at any
1.90 given time, several calculations can now be performed in parallel.</p>
1.91
1.92 +<h2>Converting Invocations to Parallel Operations</h2>
1.93 +
1.94 +<p>Although some programs make natural use of the <code>map</code> function, others may invoke a function directly within a nested loop. Such programs can also be converted to run in parallel. Consider the following Python code:</p>
1.95 +
1.96 +<pre>
1.97 + t = time.time()
1.98 +
1.99 + # Initialise an array.
1.100 +
1.101 + results = []
1.102 +
1.103 + # Perform the work.
1.104 +
1.105 + print "Calculating..."
1.106 + for i in range(0, N):
1.107 + for j in range(0, N):
1.108 + results.append(calculate(i, j))
1.109 +
1.110 + # Show the results.
1.111 +
1.112 + for i in range(0, N):
1.113 + for result in results[i*N:i*N+N]:
1.114 + print result,
1.115 + print
1.116 +
1.117 + print "Time taken:", time.time() - t</pre>
1.118 +
1.119 +<p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple1.py</code> file.)</p>
1.120 +
1.121 +<p>Here, a computation in the <code>calculate</code> function is performed for each combination of <code>i</code> and <code>j</code>
1.122 +in the nested loop, returning a result value. However, we must wait for
1.123 +the completion of this function for each element before moving on to
1.124 +the next element, and this means that the computations are performed
1.125 +sequentially. Consequently, on a system with more than one processor,
1.126 +even if we could call <code>calculate</code> for more than one combination of <code>i</code> and <code>j</code> and have the computations executing at the same time, the above program will not take advantage of such capabilities.</p>
1.127 +
1.128 +<p>In order to reduce the processing time - to speed the code up,
1.129 +in other words - we can make this code use several processes instead of
1.130 +just one. Here is the modified code:</p>
1.131 +
1.132 +<pre>
1.133 + t = time.time()
1.134 +
1.135 + # Initialise the results using a Map object with a limit on the number of
1.136 + # channels/processes.
1.137 +
1.138 + <strong>results = pprocess.Map(limit=limit)</strong>
1.139 +
1.140 + # Wrap the calculate function and manage it.
1.141 +
1.142 + <strong>calc = results.manage(pprocess.MakeParallel(calculate))</strong>
1.143 +
1.144 + # Perform the work.
1.145 +
1.146 + print "Calculating..."
1.147 + for i in range(0, N):
1.148 + for j in range(0, N):
1.149 + <strong>calc</strong>(i, j)
1.150 +
1.151 + # Show the results.
1.152 +
1.153 + for i in range(0, N):
1.154 + for result in results[i*N:i*N+N]:
1.155 + print result,
1.156 + print
1.157 +
1.158 + print "Time taken:", time.time() - t</pre>
1.159 +
1.160 +<p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_manage_map.py</code> file.)</p>
1.161 +
1.162 +<p>The principal changes in the above code involve the use of a <code>pprocess.Map</code> object to collect the results, and a version of the <code>calculate</code> function which is managed by the <code>Map</code> object. The <code>Map</code>
1.163 +object arranges the results of computations such that
1.164 +iterating over the object or accessing the object using list operations
1.165 +provides the results in the same order as their corresponding inputs.</p>
1.166 +
1.167 </body>
1.168 </html>
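
Note on the ordering guarantee described in the tutorial text above: the sketch below illustrates the same idea (results come back in the order of their inputs even when the work runs concurrently) using only the Python 3 standard library. It does not use pprocess. `ThreadPoolExecutor` is swapped in as a stand-in, so the work runs in threads rather than in separate processes, and the names `ordered_parallel_map`, `N`, `LIMIT` and the body of `calculate` are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

N = 3      # hypothetical grid size
LIMIT = 2  # hypothetical cap on concurrent workers, analogous to pmap's limit


def calculate(t):
    """Hypothetical stand-in for the tutorial's calculate function."""
    i, j = t
    return i * N + j


def ordered_parallel_map(func, inputs, limit):
    """Apply func to each input concurrently; return results in input order."""
    with ThreadPoolExecutor(max_workers=limit) as executor:
        # Executor.map yields results in the order of the inputs,
        # regardless of which call happens to finish first.
        return list(executor.map(func, inputs))


# Initialise the inputs for an N by N grid, as in the tutorial.
sequence = [(i, j) for i in range(N) for j in range(N)]

# Perform the work.
results = ordered_parallel_map(calculate, sequence, LIMIT)

# Show the results as a grid.
for i in range(N):
    print(*results[i * N:i * N + N])
```

For CPU-bound work closer to what the tutorial describes, `ProcessPoolExecutor` from the same module offers the same `map` interface with worker processes, provided the function and its arguments can be pickled.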