Introduction
------------

The pprocess module provides elementary support for parallel programming in
Python using a fork-based process creation model in conjunction with a
channel-based communications model implemented using socketpair and poll. On
systems with multiple CPUs or multicore CPUs, processes should take advantage
of as many CPUs or cores as the operating system permits.
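
The mechanism described above can be illustrated with the standard library
alone. The following is a minimal sketch, not the pprocess implementation: the
run_in_children function and its parameters are hypothetical names chosen for
this illustration, and the sketch assumes a UNIX-like system with Python 3. It
forks one process per input, gives each child one end of a socketpair, and
uses poll in the parent to collect pickled results as they arrive.

```python
import os
import pickle
import select
import socket


def run_in_children(func, inputs):
    """Fork one child per input and gather the results using poll."""
    inputs = list(inputs)
    poller = select.poll()
    pending = {}  # file descriptor -> (index, socket, child pid, buffer)

    for index, value in enumerate(inputs):
        parent_sock, child_sock = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send the pickled result, then exit at once.
            parent_sock.close()
            try:
                child_sock.sendall(pickle.dumps(func(value)))
            finally:
                child_sock.close()
                os._exit(0)
        # Parent: keep its end of the channel and watch it for input.
        child_sock.close()
        poller.register(parent_sock.fileno(), select.POLLIN)
        pending[parent_sock.fileno()] = (index, parent_sock, pid, bytearray())

    results = [None] * len(inputs)
    while pending:
        for fd, _event in poller.poll():
            index, sock, pid, buf = pending[fd]
            data = sock.recv(4096)
            if data:
                buf.extend(data)
                continue
            # An empty read means the child closed its end: message complete.
            poller.unregister(fd)
            sock.close()
            os.waitpid(pid, 0)
            results[index] = pickle.loads(bytes(buf))
            del pending[fd]
    return results


if __name__ == "__main__":
    print(run_in_children(lambda x: x * x, [1, 2, 3, 4]))
```

The sketch imposes no limit on the number of processes and does not report
errors raised in a child; pprocess provides its own facilities for such
matters.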

Quick Start
-----------

Try running the simple examples. For example:

PYTHONPATH=. python examples/simple_create.py

(These examples show in different ways how a limited number of processes can
be used to perform a parallel computation. The simple.py, simple1.py,
simple2.py and simple_map.py programs are sequential versions of the other
programs.)

The following table summarises the features used in the programs:

Program (.py)         pmap  MakeParallel  manage  start  create  Map  Queue  Exchange
-------------         ----  ------------  ------  -----  ------  ---  -----  --------
simple_create_map                                        Yes     Yes
simple_create_queue                                      Yes          Yes
simple_create                                            Yes                 Yes
simple_managed_map          Yes           Yes                    Yes
simple_managed_queue        Yes           Yes                         Yes
simple_managed              Yes           Yes                                Yes
simple_pmap           Yes
simple_start_queue          Yes                   Yes                 Yes
simple_start                                      Yes                        Yes

The simplest parallel program is simple_pmap.py, which employs the pmap
function: a parallel counterpart to the built-in map function in Python.
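
As a rough sketch of how pmap might be used (see simple_pmap.py for the real
program), the following assumes that pmap accepts a callable, a sequence and
an optional limit keyword argument; that signature is an assumption and should
be checked against the reference documentation. The calculate and compute
functions are illustrative names only. If pprocess cannot be imported, the
sketch falls back to the built-in map function so that it remains runnable.

```python
try:
    import pprocess  # needs the distribution root on PYTHONPATH
except ImportError:
    pprocess = None  # fall back to the sequential built-in map


def calculate(t):
    """An ordinary function of one argument, unaware of any channels."""
    i, j = t
    return i * j


def compute(sequence, limit=4):
    """Apply calculate to each element, in parallel where possible."""
    if pprocess is None:
        return list(map(calculate, sequence))
    # Assumed signature: pmap(callable, sequence, limit=None).
    return list(pprocess.pmap(calculate, sequence, limit=limit))


if __name__ == "__main__":
    print(compute([(1, 2), (3, 4), (5, 6)]))
```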

Other simple programs are those employing the Queue class, together with those
using the manage method, which associates functions or callables with Queue or
Exchange objects for convenient invocation of those functions and the
management of their communications.

The most technically involved program is simple_start.py, which uses the
Exchange class together with a calculation function that is aware of the
parallel environment and that communicates directly with the creating process
over the supplied communications channel.

Note that, with the exception of simple_start.py, those examples employing
calculation functions (as opposed to doing a calculation inline in a loop
body) all use MakeParallel to make those functions parallel-aware, thus
permitting the conversion of "normal" functions to a form usable in the
parallel environment.
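
The idea behind such a wrapper can be sketched without pprocess itself. In the
following illustration, SendResult and ListChannel are hypothetical stand-ins
(they are not pprocess classes): the wrapper accepts a channel as its first
argument, calls the wrapped "normal" function with the remaining arguments,
and sends the result over the channel. The only assumption made about the
channel is that it offers a send method.

```python
class SendResult(object):
    """Hypothetical stand-in for a MakeParallel-style wrapper."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kw):
        # Call the plain function, then send its result over the channel.
        channel.send(self.func(*args, **kw))


class ListChannel(object):
    """Hypothetical in-memory channel which merely records what is sent."""

    def __init__(self):
        self.sent = []

    def send(self, obj):
        self.sent.append(obj)


def calculate(i, j):
    """A "normal" function with no knowledge of channels."""
    return i * j


channel = ListChannel()
SendResult(calculate)(channel, 6, 7)
print(channel.sent)  # prints [42]
```

Keeping the channel handling in the wrapper means that calculate can still be
called directly in a sequential program.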

An additional example not listed above, simple_managed_map_reusable.py,
employs the MakeReusable class instead of MakeParallel in order to demonstrate
reusable processes and channels.

The tutorial provides some information about the examples: docs/tutorial.xhtml

Parallel Raytracing with PyGmy
------------------------------

The PyGmy raytracer, modified to use pprocess, can be run to investigate the
potential for speed increases in "real world" programs:

cd examples/PyGmy
PYTHONPATH=../..:. python scene.py

(This should produce a file called test.tif - a TIFF file containing a
raytraced scene image.)

Test Programs
-------------

There are some elementary tests:

PYTHONPATH=. python tests/create_loop.py
PYTHONPATH=. python tests/start_loop.py

(Simple loop demonstrations which use two different ways of creating and
starting the parallel processes.)

PYTHONPATH=. python tests/start_indexer.py <directory>

(A text indexing demonstration, where <directory> should be a directory
containing text files to be indexed, although HTML files will also work well
enough. After indexing the files, a prompt will appear at which words or word
fragments can be entered; matching words and their locations will then be
shown. Run the program without arguments to see more information.)

Contact, Copyright and Licence Information
------------------------------------------

The current Web page for pprocess at the time of release is:

http://www.boddie.org.uk/python/pprocess.html

The author can be contacted at the following e-mail address:

paul@boddie.org.uk

Copyright and licence information can be found in the docs directory - see
docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.

For the PyGmy raytracer example, different copyright and licence information
is provided in the docs directory - see docs/COPYING-PyGmy.txt and
docs/LICENCE-PyGmy.txt for more information.

Dependencies
------------

This software depends on standard library features which are stated as being
available only on "UNIX"; it has only been tested on a GNU/Linux system.

New in pprocess 0.3.2 (Changes since pprocess 0.3.1)
----------------------------------------------------

  * Added a utility function to detect and return the number of processor
    cores available.
  * Added missing documentation stylesheet.

New in pprocess 0.3.1 (Changes since pprocess 0.3)
--------------------------------------------------

  * Moved the reference material out of the module docstring and into a
    separate document, converting it to XHTML in the process.
  * Fixed the project name in the setup script.

New in pprocess 0.3 (Changes since parallel 0.2.5)
--------------------------------------------------

  * Added managed callables: wrappers around callables which cause them to be
    automatically managed by the exchange from which they were acquired.
  * Added MakeParallel: a wrapper instantiated around a normal function which
    sends the result of that function over the supplied channel when invoked.
  * Added MakeReusable: a wrapper like MakeParallel which can be used in
    conjunction with the newly-added reuse capability of the Exchange class in
    order to reuse processes and channels.
  * Added a Map class which attempts to emulate the built-in map function,
    along with a pmap function using this class.
  * Added a Queue class which provides a simpler iterator-style interface to
    data produced by created processes.
  * Added a create method to the Exchange class and an exit convenience
    function to the module.
  * Changed the Exchange implementation to not block when attempting to start
    new processes beyond the process limit: such requests are queued and
    performed as running processes are completed. This permits programs using
    the start method to proceed to consumption of results more quickly.
  * Extended and updated the examples. Added a tutorial.
  * Added Ubuntu Feisty (7.04) package support.

New in parallel 0.2.5 (Changes since parallel 0.2.4)
----------------------------------------------------

  * Added a start method to the Exchange class for more convenient creation of
    processes.
  * Relicensed under the LGPL (version 3 or later) - this also fixes the
    contradictory situation where the GPL was stated in the pprocess module
    (which was not, in fact, the intention) and the LGPL was stated in the
    documentation.

New in parallel 0.2.4 (Changes since parallel 0.2.3)
----------------------------------------------------

  * Set buffer sizes to zero for the file object wrappers around sockets: this
    may prevent deadlock issues.

New in parallel 0.2.3 (Changes since parallel 0.2.2)
----------------------------------------------------

  * Added convenient message exchanges, offering methods handling common
    situations at the cost of having to define a subclass of Exchange.
  * Added a simple example of performing a parallel computation.
  * Improved the PyGmy raytracer example to use the newly added functionality.

New in parallel 0.2.2 (Changes since parallel 0.2.1)
----------------------------------------------------

  * Changed the status testing in the Exchange class, potentially fixing the
    premature closure of channels before all data was read.
  * Fixed the PyGmy raytracer example's process accounting by relying on the
    possibly more reliable Exchange behaviour, whilst also preventing
    erroneous creation of "out of bounds" processes.
  * Added a removed attribute on the Exchange to record which channels were
    removed in the last call to the ready method.

New in parallel 0.2.1 (Changes since parallel 0.2)
--------------------------------------------------

  * Added a PyGmy raytracer example.
  * Updated copyright and licensing details (FSF address, additional works).

New in parallel 0.2 (Changes since parallel 0.1)
------------------------------------------------

  * Changed the name of the included module from parallel to pprocess in order
    to avoid naming conflicts with PyParallel.

Release Procedures
------------------

Update the pprocess __version__ attribute.
Change the version number and package filename/directory in the documentation.
Update the release notes (see above).
Check the release information in the PKG-INFO file.
Tag, export.
Archive, upload.
Update PyPI.

Making Packages
---------------

To make Debian-based packages:

  1. Create new package directories under packages if necessary.
  2. Make a symbolic link in the distribution's root directory to keep the
     Debian tools happy:

     ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/

     Or:

     ln -s packages/ubuntu-feisty/python-pprocess/debian/

  3. Run the package builder:

     dpkg-buildpackage -rfakeroot

  4. Locate and tidy up the packages in the parent directory of the
     distribution's root directory.