1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 2 <html xmlns="http://www.w3.org/1999/xhtml" lang="en-gb"> 3 <head> 4 <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> 5 <title>pprocess - Tutorial</title> 6 </head> 7 <body> 8 9 <h1>pprocess - Tutorial</h1> 10 11 <p>The <code>pprocess</code> 12 module provides several mechanisms for running Python code concurrently 13 in several processes. The most straightforward way of making a program 14 parallel-aware - that is, where the program can take advantage of more 15 than one processor to simultaneously process data - is to use the 16 <code>pmap</code> function.</p> 17 18 <h2>Converting Map-Style Code</h2> 19 20 <p>Consider a program using the built-in <code>map</code> function and a sequence of inputs:</p> 21 22 <pre> 23 t = time.time() 24 25 # Initialise an array. 26 27 sequence = [] 28 for i in range(0, N): 29 for j in range(0, N): 30 sequence.append((i, j)) 31 32 # Perform the work. 33 34 results = map(calculate, sequence) 35 36 # Show the results. 37 38 for i in range(0, N): 39 for result in results[i*N:i*N+N]: 40 print result, 41 print 42 43 print "Time taken:", time.time() - t</pre> 44 45 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_map.py</code> file.)</p> 46 47 <p>The principal features of this program involve the preparation of an array for input purposes, and the use of the <code>map</code> function to iterate over the combinations of <code>i</code> and <code>j</code> in the array. Even if the <code>calculate</code> 48 function could be invoked independently for each input value, we have 49 to wait for each computation to complete before initiating a new 50 one.</p> 51 52 <p>In order to reduce the processing time - to speed the code up, 53 in other words - we can make this code use several processes instead of 54 just one. Here is the modified code:</p> 55 56 <pre> 57 t = time.time() 58 59 # Initialise an array. 60 61 sequence = [] 62 for i in range(0, N): 63 for j in range(0, N): 64 sequence.append((i, j)) 65 66 # Perform the work. 67 68 results = <strong>pprocess.pmap</strong>(calculate, sequence<strong>, limit=limit</strong>) 69 70 # Show the results. 71 72 for i in range(0, N): 73 for result in results[i*N:i*N+N]: 74 print result, 75 print 76 77 print "Time taken:", time.time() - t</pre> 78 79 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_pmap.py</code> file.)</p> 80 81 <p>By replacing usage of the <code>map</code> function with the <code>pprocess.pmap</code> 82 function, and specifying the limit on the number of processes to be active at any 83 given time, several calculations can now be performed in parallel.</p> 84 85 <h2>Converting Invocations to Parallel Operations</h2> 86 87 <p>Although some programs make natural use of the <code>map</code> function, others may employ an invocation in a nested loop. This may also be converted to a parallel program. Consider the following Python code:</p> 88 89 <pre> 90 t = time.time() 91 92 # Initialise an array. 93 94 results = [] 95 96 # Perform the work. 97 98 print "Calculating..." 99 for i in range(0, N): 100 for j in range(0, N): 101 results.append(calculate(i, j)) 102 103 # Show the results. 104 105 for i in range(0, N): 106 for result in results[i*N:i*N+N]: 107 print result, 108 print 109 110 print "Time taken:", time.time() - t</pre> 111 112 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple1.py</code> file.)</p> 113 114 <p>Here, a computation in the <code>calculate</code> function is performed for each combination of <code>i</code> and <code>j</code> 115 in the nested loop, returning a result value. However, we must wait for 116 the completion of this function for each element before moving on to 117 the next element, and this means that the computations are performed 118 sequentially. Consequently, on a system with more than one processor, 119 even if we could call <code>calculate</code> for more than one combination of <code>i</code> and <code>j</code><code></code> and have the computations executing at the same time, the above program will not take advantage of such capabilities.</p> 120 121 <p>In order to reduce the processing time - to speed the code up, 122 in other words - we can make this code use several processes instead of 123 just one. Here is the modified code:</p> 124 125 <pre> 126 t = time.time() 127 128 # Initialise the results using map with a limit on the number of 129 # channels/processes. 130 131 <strong>results = pprocess.Map(limit=limit)</strong><code></code> 132 133 # Wrap the calculate function and manage it. 134 135 <strong>calc = results.manage(pprocess.MakeParallel(calculate))</strong> 136 137 # Perform the work. 138 139 print "Calculating..." 140 for i in range(0, N): 141 for j in range(0, N): 142 <strong>calc</strong>(i, j) 143 144 # Show the results. 145 146 for i in range(0, N): 147 for result in results[i*N:i*N+N]: 148 print result, 149 print 150 151 print "Time taken:", time.time() - t</pre> 152 153 <p>(This code in context with <code>import</code> statements and functions is found in the <code>examples/simple_manage_map.py</code> file.)</p> 154 155 <p>The principal changes in the above code involve the use of a <code>pprocess.Map</code> object to collect the results, and a version of the <code>calculate</code> function which is managed by the <code>Map</code> object. What the <code>Map</code> 156 object does is to arrange the results of computations such that 157 iterating over the object or accessing the object using list operations 158 provides the results in the same order as their corresponding inputs.</p> 159 160 </body> 161 </html>