Introduction
------------

The pprocess module provides elementary support for parallel programming in
Python using a fork-based process creation model in conjunction with a
channel-based communications model implemented using socketpair and poll. On
systems with multiple CPUs or multicore CPUs, processes should take advantage
of as many CPUs or cores as the operating system permits.
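
The general technique named above (fork, socketpair and poll) can be sketched
using only the Python standard library. This is an illustration under stated
assumptions, not the pprocess implementation: the run_in_child and collect
functions are hypothetical names invented for this sketch, and results are
simply pickled and sent once per child.

```python
import os
import pickle
import select
import socket

def run_in_child(func, *args):
    """Fork a child which computes func(*args) and sends back the result."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: use only its end of the channel, then exit immediately.
        status = 1
        try:
            parent_end.close()
            child_end.sendall(pickle.dumps(func(*args)))
            child_end.close()
            status = 0
        finally:
            os._exit(status)
    # Parent: close the child's end so that end-of-stream can be detected.
    child_end.close()
    return pid, parent_end

def collect(children):
    """Poll all channels, returning a dictionary mapping pids to results."""
    poller = select.poll()
    pending = {}
    for pid, sock in children:
        poller.register(sock.fileno(), select.POLLIN)
        pending[sock.fileno()] = (pid, sock, [])
    results = {}
    while pending:
        for fd, _event in poller.poll():
            pid, sock, chunks = pending[fd]
            data = sock.recv(4096)
            if data:
                chunks.append(data)
                continue
            # Empty read: the child has closed its end of the channel.
            poller.unregister(fd)
            del pending[fd]
            sock.close()
            os.waitpid(pid, 0)
            results[pid] = pickle.loads(b"".join(chunks)) if chunks else None
    return results

# Hypothetical workload: compute four powers of two in four child processes.
children = [run_in_child(pow, 2, n) for n in range(4)]
print(sorted(collect(children).values()))
```

When run on a system providing fork and poll, this prints [1, 2, 4, 8]. Each
child writes to its own channel, and the parent waits on all channels at once
instead of reading from them one after another.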

Tutorial
--------

The tutorial provides some information about the examples described below.
See the docs/tutorial.html file in the distribution for more details.

Reference
---------

A description of the different mechanisms provided by the pprocess module can
be found in the reference document. See the docs/reference.html file in the
distribution for more details.

Quick Start
-----------

Try running the simple examples. For example:

PYTHONPATH=. python examples/simple_create.py

(These examples show in different ways how a limited number of processes can
be used to perform a parallel computation. The simple.py, simple1.py,
simple2.py and simple_map.py programs are sequential versions of the other
programs.)

The following table summarises the features used in the programs:

Program (.py)         pmap  MakeParallel  manage  start  create  Map  Queue  Exchange
-------------         ----  ------------  ------  -----  ------  ---  -----  --------
simple_create_map                                        Yes     Yes
simple_create_queue                                      Yes          Yes
simple_create                                            Yes                 Yes
simple_managed_map          Yes           Yes                    Yes
simple_managed_queue        Yes           Yes                         Yes
simple_managed              Yes           Yes                                Yes
simple_pmap           Yes
simple_pmap_iter      Yes
simple_start_queue          Yes                   Yes                 Yes
simple_start                                      Yes                        Yes

The simplest parallel programs are simple_pmap.py and simple_pmap_iter.py,
which employ the pmap function, resembling the built-in map function in
Python.
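
A minimal sketch of such a call follows. It assumes that pmap accepts a
callable, a sequence and a limit keyword argument giving the maximum number of
processes. The calculate and compute functions and the value of N are
illustrative only. So that the sketch can also be run where pprocess is not
installed, it falls back to the built-in map function, which should produce
the same results sequentially.

```python
try:
    import pprocess  # the module described in this document
except (ImportError, SyntaxError):
    pprocess = None  # assumption: fall back to the built-in map below

N = 4  # illustrative problem size

def calculate(t):
    """An ordinary function of one argument, as map would expect."""
    i, j = t
    return i * N + j

def compute(sequence, limit=2):
    """Apply calculate to each element, in parallel where possible."""
    if pprocess is None:
        # Sequential equivalent using the built-in map function.
        return list(map(calculate, sequence))
    # Assumed signature: pmap(callable, sequence, limit=...).
    return list(pprocess.pmap(calculate, sequence, limit=limit))

sequence = [(i, j) for i in range(N) for j in range(N)]
results = compute(sequence)
print(results)
```

The results are expected in the same order as the inputs, just as with map,
regardless of the order in which the processes finish.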

Other simple programs are those employing the Queue class, together with those
using the manage method, which associates functions or callables with Queue or
Exchange objects for convenient invocation of those functions and the
management of their communications.

The most technically involved program is simple_start.py, which uses the
Exchange class together with a calculation function which is aware of the
parallel environment and which communicates over the supplied communications
channel directly to the creating process.

Note that, with the exception of simple_start.py, those examples employing
calculation functions (as opposed to doing a calculation inline in a loop
body) all use MakeParallel to make those functions parallel-aware, thus
permitting the conversion of "normal" functions to a form usable in the
parallel environment.
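
The idea of making a "normal" function parallel-aware can be sketched as a
wrapper which accepts a channel as its first argument and sends the function's
result over it. The ParallelWrapper and ListChannel classes below are
hypothetical stand-ins written for this illustration. They are not the
pprocess classes, and the channel here merely records what is sent instead of
communicating with another process.

```python
class ParallelWrapper:
    """Hypothetical stand-in illustrating the idea behind MakeParallel."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kwargs):
        # Invoke the ordinary function and send its result over the channel.
        channel.send(self.func(*args, **kwargs))

class ListChannel:
    """Hypothetical in-memory channel which records everything sent."""

    def __init__(self):
        self.sent = []

    def send(self, obj):
        self.sent.append(obj)

def calculate(i, j):
    """A "normal" function which knows nothing about channels."""
    return i * 10 + j

channel = ListChannel()
parallel_calculate = ParallelWrapper(calculate)
parallel_calculate(channel, 2, 3)
print(channel.sent)
```

This prints [23]. The calculation function itself stays unchanged, and only
the wrapper needs to know about the channel.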

Reusable Processes
------------------

An additional example not listed above, simple_managed_map_reusable.py,
employs the MakeReusable class instead of MakeParallel in order to demonstrate
reusable processes and channels:

PYTHONPATH=. python examples/simple_managed_map_reusable.py

Continuous Process Communications
---------------------------------

Another example not listed above, simple_continuous_queue.py, employs
continuous communications to monitor output from created processes:

PYTHONPATH=. python examples/simple_continuous_queue.py

Persistent Processes
--------------------

A number of persistent variants of some of the above examples employ a
persistent or background process which can be started by one process and
contacted later by another in order to collect the results of a computation.
For example:

PYTHONPATH=. python examples/simple_persistent_managed.py --start
PYTHONPATH=. python examples/simple_persistent_managed.py --reconnect

PYTHONPATH=. python examples/simple_background_queue.py --start
PYTHONPATH=. python examples/simple_background_queue.py --reconnect

PYTHONPATH=. python examples/simple_persistent_queue.py --start
PYTHONPATH=. python examples/simple_persistent_queue.py --reconnect

Parallel Raytracing with PyGmy
------------------------------

The PyGmy raytracer modified to use pprocess can be run to investigate the
potential for speed increases in "real world" programs:

cd examples/PyGmy
PYTHONPATH=../..:. python scene.py

(This should produce a file called test.tif - a TIFF file containing a
raytraced scene image.)

Examples from the Concurrency SIG
---------------------------------

The special interest group (SIG) for concurrency in Python proposed a
particular application as a showcase for concurrency libraries. Two examples
are included which demonstrate pprocess and the use of continuous processes to
implement the application concerned:

PYTHONPATH=. python examples/concurrency-sig/bottles.py
PYTHONPATH=. python examples/concurrency-sig/bottles_heartbeat.py

Test Programs
-------------

There are some elementary tests:

PYTHONPATH=. python tests/create_loop.py
PYTHONPATH=. python tests/start_loop.py

(Simple loop demonstrations which use two different ways of creating and
starting the parallel processes.)

PYTHONPATH=. python tests/start_indexer.py <directory>

(A text indexing demonstration, where <directory> should be a directory
containing text files to be indexed, although HTML files will also work well
enough. After the files have been indexed, a prompt will appear at which words
or word fragments can be entered; matching words and their locations will then
be shown. Run the program without arguments to see more information.)

Contact, Copyright and Licence Information
------------------------------------------

The current Web page for pprocess at the time of release is:

http://www.boddie.org.uk/python/pprocess.html

The author can be contacted at the following e-mail address:

paul@boddie.org.uk

Copyright and licence information can be found in the docs directory - see
docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.

For the PyGmy raytracer example, different copyright and licence information
is provided in the docs directory - see docs/COPYING-PyGmy.txt and
docs/LICENCE-PyGmy.txt for more information.

Dependencies
------------

This software depends on standard library features which are stated as being
available only on "UNIX"; it has only been tested repeatedly on a GNU/Linux
system, and occasionally on systems running OpenSolaris.

New in pprocess 0.5.1 (Changes since pprocess 0.5)
--------------------------------------------------

  * Added IOError handling when processes exit apparently without warning.

New in pprocess 0.5 (Changes since pprocess 0.4)
------------------------------------------------

  * Added proper support in the Exchange class for continuous communications
    between processes, providing examples: simple_continuous_queue.py and the
    concurrency-sig directory.
  * Changed the Map class to permit incremental access to received results
    from completed parts of the sequence of inputs, also adding an iteration
    interface.
  * Added an example, simple_pmap_iter.py, to demonstrate iteration over maps.
  * Fixed the get_number_of_cores function to work with /proc/cpuinfo where
    the "physical id" field is missing.
  * Tidied the Exchange class, adding distinct status methods: unfinished and
    busy.

New in pprocess 0.4 (Changes since pprocess 0.3.1)
--------------------------------------------------

  * Added support for persistent/background processes.
  * Added a utility function to detect and return the number of processor
    cores available.
  * Added missing documentation stylesheet.
  * Added support for Solaris using pipes instead of socket pairs, since
    the latter apparently do not work properly with poll on Solaris.

New in pprocess 0.3.1 (Changes since pprocess 0.3)
--------------------------------------------------

  * Moved the reference material out of the module docstring and into a
    separate document, converting it to XHTML in the process.
  * Fixed the project name in the setup script.

New in pprocess 0.3 (Changes since parallel 0.2.5)
--------------------------------------------------

  * Added managed callables: wrappers around callables which cause them to be
    automatically managed by the exchange from which they were acquired.
  * Added MakeParallel: a wrapper instantiated around a normal function which
    sends the result of that function over the supplied channel when invoked.
  * Added MakeReusable: a wrapper like MakeParallel which can be used in
    conjunction with the newly-added reuse capability of the Exchange class in
    order to reuse processes and channels.
  * Added a Map class which attempts to emulate the built-in map function,
    along with a pmap function using this class.
  * Added a Queue class which provides a simpler iterator-style interface to
    data produced by created processes.
  * Added a create method to the Exchange class and an exit convenience
    function to the module.
  * Changed the Exchange implementation to not block when attempting to start
    new processes beyond the process limit: such requests are queued and
    performed as running processes are completed. This permits programs using
    the start method to proceed to consumption of results more quickly.
  * Extended and updated the examples. Added a tutorial.
  * Added Ubuntu Feisty (7.04) package support.

New in parallel 0.2.5 (Changes since parallel 0.2.4)
----------------------------------------------------

  * Added a start method to the Exchange class for more convenient creation of
    processes.
  * Relicensed under the LGPL (version 3 or later) - this also fixes the
    contradictory situation where the GPL was stated in the pprocess module
    (which was not, in fact, the intention) and the LGPL was stated in the
    documentation.

New in parallel 0.2.4 (Changes since parallel 0.2.3)
----------------------------------------------------

  * Set buffer sizes to zero for the file object wrappers around sockets: this
    may prevent deadlock issues.

New in parallel 0.2.3 (Changes since parallel 0.2.2)
----------------------------------------------------

  * Added convenient message exchanges, offering methods handling common
    situations at the cost of having to define a subclass of Exchange.
  * Added a simple example of performing a parallel computation.
  * Improved the PyGmy raytracer example to use the newly added functionality.

New in parallel 0.2.2 (Changes since parallel 0.2.1)
----------------------------------------------------

  * Changed the status testing in the Exchange class, potentially fixing the
    premature closure of channels before all data was read.
  * Fixed the PyGmy raytracer example's process accounting by relying on the
    possibly more reliable Exchange behaviour, whilst also preventing
    erroneous creation of "out of bounds" processes.
  * Added a removed attribute on the Exchange to record which channels were
    removed in the last call to the ready method.

New in parallel 0.2.1 (Changes since parallel 0.2)
--------------------------------------------------

  * Added a PyGmy raytracer example.
  * Updated copyright and licensing details (FSF address, additional works).

New in parallel 0.2 (Changes since parallel 0.1)
------------------------------------------------

  * Changed the name of the included module from parallel to pprocess in order
    to avoid naming conflicts with PyParallel.

Release Procedures
------------------

Update the pprocess __version__ attribute and the setup.py file version field.
Change the version number and package filename/directory in the documentation.
Update the release notes (see above).
Check the release information in the PKG-INFO file.
Tag, export.
Archive, upload.
Update PyPI.

Making Packages
---------------

To make Debian-based packages:

  1. Create new package directories under packages if necessary.
  2. Make a symbolic link in the distribution's root directory to keep the
     Debian tools happy:

     ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/

     Or:

     ln -s packages/ubuntu-feisty/python-pprocess/debian/

  3. Run the package builder:

     dpkg-buildpackage -rfakeroot

  4. Locate and tidy up the packages in the parent directory of the
     distribution's root directory.