pprocess

Change of docs/tutorial.html

170:2cd56ed1e0f7
docs/tutorial.html
     1.1 --- a/docs/tutorial.html	Fri Sep 25 16:02:56 2015 +0200
     1.2 +++ b/docs/tutorial.html	Fri Sep 25 16:40:39 2015 +0200
     1.3 @@ -16,6 +16,7 @@
     1.4  use the <code>pmap</code> function.</p>
     1.5  
     1.6  <ul>
     1.7 +<li><a href="#note">A Note on Parallel Processes</a></li>
     1.8  <li><a href="#pmap">Converting Map-Style Code</a></li>
     1.9  <li><a href="#Map">Converting Invocations to Parallel Operations</a></li>
    1.10  <li><a href="#Queue">Converting Arbitrarily-Ordered Invocations</a>
    1.11 @@ -35,6 +36,81 @@
    1.12  <p>For a brief summary of each of the features of <code>pprocess</code>, see
    1.13  the <a href="reference.html">reference document</a>.</p>
    1.14  
    1.15 +<h2 id="note">A Note on Parallel Processes</h2>
    1.16 +
    1.17 +<p>The way <code>pprocess</code> uses multiple processes to perform work in
    1.18 +parallel relies on the <code>fork</code> system call, which on modern operating
    1.19 +systems employs what is known as "copy-on-write" semantics. In plain language,
    1.20 +when <code>pprocess</code> creates a new <em>child</em> process to perform work
    1.21 +in parallel with other work that needs to be done, this new process will be a
    1.22 +near-identical copy of the original <em>parent</em> process, and the running
    1.23 +code will be able to access data resident in that parent process.</p>
    1.24 +
    1.25 +<p>However, when a child process modifies data, the modifications are not
    1.26 +made in such a way that the parent process can see them: the parent
    1.27 +process will, in fact, remain oblivious to such changes. What happens is that
    1.28 +as soon as the child process attempts to modify the data, it obtains its own
    1.29 +separate copy which is then modified independently of the original data. Thus,
    1.30 +a <em>copy</em> of any data is made when an attempt is made to <em>write</em>
    1.31 +to such data. Meanwhile, the parent's copy of that data will be left untouched
    1.32 +by the activities of the child.</p>
    1.33 +
    1.34 +<p>It is therefore essential to note that any data distributed to other
    1.35 +processes, and which will then be modified by those processes, will not appear
    1.36 +to change in the parent process even if the objects employed are mutable. This
    1.37 +is rather different to the behaviour of a normal Python program: a function
    1.38 +that is passed a list can mutate that list in such a way that, upon returning
    1.39 +from that function, the modifications are still visible to the caller. For
    1.40 +example:</p>
    1.41 +
    1.42 +<pre>
    1.43 +def mutator(l):
    1.44 +    l.append(3)
    1.45 +
    1.46 +l = [1, 2]
    1.47 +mutator(l) # l is now [1, 2, 3]
    1.48 +</pre>
    1.49 +
    1.50 +<p>In contrast, passing a list to a child process will cause the list to
    1.51 +mutate in the child process, but the parent process will not see the list
    1.52 +change. For example:</p>
    1.53 +
    1.54 +<pre>
    1.55 +def mutator(l):
    1.56 +    l.append(3)
    1.57 +
    1.58 +results = pprocess.Map()
    1.59 +mutator = results.manage(pprocess.MakeParallel(mutator))
    1.60 +
    1.61 +l = [1, 2]
    1.62 +mutator(l) # l is still [1, 2] in the parent process
    1.63 +</pre>
    1.64 +
    1.65 +<p>To communicate changes to data between processes, the modified objects must
    1.66 +be explicitly returned from child processes using the mechanisms described in
    1.67 +this documentation. For example:</p>
    1.68 +
    1.69 +<pre>
    1.70 +def mutator(l):
    1.71 +    l.append(3)
    1.72 +    return l       # the modified object is explicitly returned
    1.73 +
    1.74 +results = pprocess.Map()
    1.75 +mutator = results.manage(pprocess.MakeParallel(mutator))
    1.76 +
    1.77 +l = [1, 2]
    1.78 +mutator(l)
    1.79 +
    1.80 +all_l = results[:] # there are potentially many results, not just one
    1.81 +l = all_l[0]       # l is now [1, 2, 3], taken from the first result
    1.82 +</pre>
    1.83 +
    1.84 +<p>It is perhaps easiest to think of the communication mechanisms as
    1.85 +providing a gateway between processes through which information can be passed,
    1.86 +with the rest of a program's data being private and hidden from the other
    1.87 +processes (even if that data initially resembles what the other processes also
    1.88 +see within themselves).</p>
    1.89 +
    1.90  <h2 id="pmap">Converting Map-Style Code</h2>
    1.91  
    1.92  <p>Consider a program using the built-in <code>map</code> function and a sequence of inputs:</p>