<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-gb">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
<title>pprocess - Tutorial</title>
</head>
<body>

<h1>pprocess - Tutorial</h1>

<p>The <code>pprocess</code>
module provides several mechanisms for running Python code concurrently
in several processes. The most straightforward way of making a program
parallel-aware (that is, able to take advantage of more than one
processor to process data simultaneously) is to use the
<code>pmap</code> function. Consider the following sequential Python code:</p>

<pre>
t = time.time()

sequence = []
for i in range(0, N):
    for j in range(0, N):
        sequence.append((i, j))

results = map(calculate, sequence)

print "Time taken:", time.time() - t
for i in range(0, N):
    for result in results[i*N:i*N+N]:
        print result,
    print
</pre>

<p>(The complete program, including its <code>import</code> statements and function definitions, is found in the <code>examples/simple_map.py</code> file.)</p>

<p>Here, we initialise a sequence of inputs for an N by N grid, perform a
calculation on each element, then print the resulting sequence as a
grid. Because the <code>map</code> function performs the calculations
sequentially, we have to wait for each calculation to complete before
initiating the next one, even if the <code>calculate</code> function
could be invoked independently for each input value.</p>
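
<p>The grid printing relies on the order in which the inputs are appended:
the input <code>(i, j)</code> sits at position <code>i*N + j</code> of the
sequence, so the slice <code>results[i*N:i*N+N]</code> holds row <code>i</code>.
The following is a minimal sketch of that layout, written for Python 3 (hence
the <code>list()</code> call and the <code>print()</code> function). The
<code>calculate</code> body, the <code>row</code> helper and the value of
<code>N</code> are illustrative assumptions, not taken from the example file.</p>

```python
N = 3  # illustrative grid size (assumption)


def calculate(t):
    # Hypothetical stand-in for the real calculation: it returns the
    # position of the cell (i, j) in row-major order.
    i, j = t
    return i * N + j


def row(results, i, n):
    # Row i of an n by n grid stored as a flat, row-major list.
    return results[i * n:i * n + n]


# Same ordering as the nested loops in the tutorial code: j varies fastest.
sequence = [(i, j) for i in range(N) for j in range(N)]

# On Python 3, map() returns an iterator, so list() is needed before slicing.
results = list(map(calculate, sequence))

for i in range(N):
    print(*row(results, i, N))
# prints:
# 0 1 2
# 3 4 5
# 6 7 8
```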

<p>To reduce the processing time, we can make this code use several
processes instead of just one. Here is the modified code:</p>

<pre>
t = time.time()

sequence = []
for i in range(0, N):
    for j in range(0, N):
        sequence.append((i, j))

results = <strong>pprocess.pmap</strong>(calculate, sequence<strong>, limit=limit</strong>)

print "Time taken:", time.time() - t
for i in range(0, N):
    for result in results[i*N:i*N+N]:
        print result,
    print
</pre>

<p>(The complete program, including its <code>import</code> statements and function definitions, is found in the <code>examples/simple_pmap.py</code> file.)</p>

<p>Replacing the <code>map</code> function with the <code>pprocess.pmap</code>
function, and specifying a limit on the number of processes that may be active
at any given time, allows several calculations to be performed in parallel.</p>
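
<p>To illustrate the general idea of an order-preserving map spread over a
bounded number of processes, here is a rough sketch that uses only the standard
library's <code>multiprocessing</code> module. It is not how
<code>pprocess.pmap</code> is implemented, and it does not use
<code>pprocess</code> at all: the <code>chunked_pmap</code> and
<code>_run_chunk</code> names, the <code>calculate</code> body and the constants
are hypothetical. It assumes Python 3 on a Unix-like platform where the
<code>fork</code> start method is available.</p>

```python
import multiprocessing
import time

N = 4         # illustrative grid size (assumption)
LIMIT = 4     # illustrative cap on concurrent processes (assumption)
DELAY = 0.01  # seconds of simulated work per input (assumption)


def calculate(t):
    # Hypothetical stand-in for a slow calculation on one grid cell.
    i, j = t
    time.sleep(DELAY)
    return i * N + j


def _run_chunk(conn, func, chunk):
    # Runs in a child process: compute one chunk and send the results back.
    conn.send([func(item) for item in chunk])
    conn.close()


def chunked_pmap(func, sequence, limit):
    """Order-preserving map over `sequence` using at most `limit` processes."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like platform
    size = max(1, -(-len(sequence) // limit))  # ceiling division
    jobs = []
    for start in range(0, len(sequence), size):
        receiver, sender = ctx.Pipe(duplex=False)
        chunk = sequence[start:start + size]
        proc = ctx.Process(target=_run_chunk, args=(sender, func, chunk))
        proc.start()
        sender.close()  # the parent only reads from this pipe
        jobs.append((proc, receiver))

    # Collect the chunks in submission order so the results line up with
    # the inputs, just as they would with the built-in map().
    results = []
    for proc, receiver in jobs:
        results.extend(receiver.recv())
        proc.join()
    return results


if __name__ == "__main__":
    sequence = [(i, j) for i in range(N) for j in range(N)]
    t = time.time()
    results = chunked_pmap(calculate, sequence, LIMIT)
    print("Time taken:", time.time() - t)
    for i in range(N):
        print(*results[i * N:i * N + N])
```

<p>With 16 inputs and a limit of 4, each child process in this sketch handles
one row of the grid. The elapsed time should therefore be noticeably lower than
for the sequential version, although process start-up overhead means the gain
is rarely an exact factor of four.</p>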

</body>
</html>