paulb@22 | 1 | Introduction
|
paulb@22 | 2 | ------------
|
paulb@22 | 3 |
|
paulb@40 | 4 | The pprocess module provides elementary support for parallel programming in
|
paulb@22 | 5 | Python using a fork-based process creation model in conjunction with a
|
paulb@68 | 6 | channel-based communications model implemented using socketpair and poll. On
|
paulb@68 | 7 | systems with multiple CPUs or multicore CPUs, processes should take advantage
|
paulb@68 | 8 | of as many CPUs or cores as the operating system permits.
|
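The model described above can be sketched using only the Python standard library. This is a minimal illustration under stated assumptions, not the actual pprocess implementation: run_in_child, collect and square are hypothetical names, results are pickled, and a message is treated as complete when the child closes its end of the socket pair.

```python
import os
import pickle
import select
import socket


def run_in_child(func, *args):
    """Fork a child computing func(*args); return (pid, parent end of the channel)."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: send the pickled result, then exit without returning to the caller.
        parent_sock.close()
        try:
            child_sock.sendall(pickle.dumps(func(*args)))
        finally:
            child_sock.close()
            os._exit(0)
    # Parent: keep only its own end of the socket pair.
    child_sock.close()
    return pid, parent_sock


def collect(children):
    """Use poll to read from all children at once; return {pid: result}."""
    poller = select.poll()
    by_fd = {}
    for pid, sock in children:
        poller.register(sock, select.POLLIN)
        by_fd[sock.fileno()] = (pid, sock, [])
    results = {}
    while by_fd:
        for fd, _event in poller.poll():
            pid, sock, chunks = by_fd[fd]
            data = sock.recv(4096)
            if data:
                chunks.append(data)
                continue
            # An empty read means the child closed its end: the message is complete.
            # (If the child failed before sending, unpickling raises EOFError.)
            poller.unregister(fd)
            sock.close()
            os.waitpid(pid, 0)
            results[pid] = pickle.loads(b"".join(chunks))
            del by_fd[fd]
    return results


def square(x):
    return x * x


if __name__ == "__main__":
    children = [run_in_child(square, n) for n in range(4)]
    print(sorted(collect(children).values()))
```

Each child is a copy of the parent at the moment of the fork, so the function itself never needs to be pickled; only the result travels over the channel.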
paulb@22 | 9 |
|
paul@168 | 10 | Since pprocess distributes work to other processes, code run in those
|
paul@168 | 11 | processes may behave differently from the same code run in a single
|
paul@168 | 12 | process. For example, any mutable objects distributed to other processes can
|
paul@168 | 13 | still be modified, but any modifications will not be visible outside the
|
paul@169 | 14 | processes making such modifications.
|
paul@168 | 15 |
|
paulb@140 | 16 | Tutorial
|
paulb@140 | 17 | --------
|
paulb@140 | 18 |
|
paulb@140 | 19 | The tutorial provides some information about the examples described below.
|
paulb@144 | 20 | See the docs/tutorial.html file in the distribution for more details.
|
paulb@140 | 21 |
|
paulb@140 | 22 | Reference
|
paulb@140 | 23 | ---------
|
paulb@140 | 24 |
|
paulb@140 | 25 | A description of the different mechanisms provided by the pprocess module can
|
paulb@144 | 26 | be found in the reference document. See the docs/reference.html file in the
|
paulb@140 | 27 | distribution for more details.
|
paulb@140 | 28 |
|
paulb@22 | 29 | Quick Start
|
paulb@22 | 30 | -----------
|
paulb@22 | 31 |
|
paulb@105 | 32 | Try running the simple examples. For example:
|
paulb@68 | 33 |
|
paulb@100 | 34 | PYTHONPATH=. python examples/simple_create.py
|
paulb@105 | 35 |
|
paulb@105 | 36 | (These examples show in different ways how a limited number of processes can be
|
paulb@113 | 37 | used to perform a parallel computation. The simple.py, simple1.py, simple2.py
|
paulb@113 | 38 | and simple_map.py programs are sequential versions of the other programs.)
|
paulb@105 | 39 |
|
paulb@105 | 40 | The following table summarises the features used in the programs:
|
paulb@105 | 41 |
|
paulb@113 | 42 | Program (.py)         pmap  MakeParallel  manage  start  create  Map  Queue  Exchange
|
paulb@113 | 43 | -------------         ----  ------------  ------  -----  ------  ---  -----  --------
|
paulb@113 | 44 | simple_create_map     -     -             -       -      Yes     Yes  -      -
|
paulb@113 | 45 | simple_create_queue   -     -             -       -      Yes     -    Yes    -
|
paulb@113 | 46 | simple_create         -     -             -       -      Yes     -    -      Yes
|
paulb@113 | 47 | simple_managed_map    -     Yes           Yes     -      -       Yes  -      -
|
paulb@113 | 48 | simple_managed_queue  -     Yes           Yes     -      -       -    Yes    -
|
paulb@113 | 49 | simple_managed        -     Yes           Yes     -      -       -    -      Yes
|
paulb@113 | 50 | simple_pmap           Yes   -             -       -      -       -    -      -
|
paul@156 | 51 |  simple_pmap_iter      Yes   -             -       -      -       -    -      -
|
paulb@113 | 52 | simple_start_queue    -     Yes           -       Yes    -       -    Yes    -
|
paulb@113 | 53 | simple_start          -     -             -       Yes    -       -    -      Yes
|
paulb@68 | 54 |
|
paul@156 | 55 | The simplest parallel programs are simple_pmap.py and simple_pmap_iter.py
|
paul@156 | 56 | which employ the pmap function resembling the built-in map function in
|
paul@156 | 57 | Python.
|
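The idea of a map-like function that spreads work over a limited number of processes can be sketched as follows. This is an illustration using only the standard library, not the pmap function from pprocess: fork_map, its limit parameter and square are hypothetical names, and the sketch simply waits for the oldest running child whenever the limit is reached.

```python
import os
import pickle
from collections import deque


def fork_map(func, sequence, limit=2):
    """Apply func to each item in forked children, at most `limit` at a time,
    returning the results in input order like the built-in map."""
    items = list(sequence)
    results = [None] * len(items)
    running = deque()  # (index, pid, read_fd) for children not yet collected

    def collect_oldest():
        index, pid, read_fd = running.popleft()
        with os.fdopen(read_fd, "rb") as pipe:
            results[index] = pickle.load(pipe)
        os.waitpid(pid, 0)

    for index, item in enumerate(items):
        if len(running) >= limit:
            collect_oldest()
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: send the pickled result to the parent and always exit here,
            # so that no exception can escape into the caller's code.
            status = 1
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as pipe:
                    pickle.dump(func(item), pipe)
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)
        running.append((index, pid, read_fd))

    while running:
        collect_oldest()
    return results


def square(x):
    return x * x


if __name__ == "__main__":
    print(fork_map(square, range(6), limit=2))
```

Storing each result at the index of its input keeps the output in the same order as the input, whichever child finishes first.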
paulb@105 | 58 |
|
paulb@105 | 59 | Other simple programs are those employing the Queue class, together with those
|
paulb@105 | 60 | using the manage method which associates functions or callables with Queue or
|
paulb@105 | 61 | Exchange objects for convenient invocation of those functions and the
|
paulb@105 | 62 | management of their communications.
|
paulb@105 | 63 |
|
paulb@105 | 64 | The most technically involved program is simple_start.py which uses the
|
paulb@105 | 65 | Exchange class together with a calculation function which is aware of the
|
paulb@105 | 66 | parallel environment and which communicates over the supplied communications
|
paulb@105 | 67 | channel directly to the creating process.
|
paulb@105 | 68 |
|
paulb@105 | 69 | It should be noted that with the exception of simple_start.py, those examples
|
paulb@105 | 70 | employing calculation functions (as opposed to doing a calculation inline in a
|
paulb@105 | 71 | loop body) all use MakeParallel to make those functions parallel-aware, thus
|
paulb@105 | 72 | permitting the conversion of "normal" functions to a form usable in the
|
paulb@105 | 73 | parallel environment.
|
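The conversion of a "normal" function into a parallel-aware one can be pictured as a thin wrapper which accepts a channel, calls the function and sends the result over that channel. The sketch below is hypothetical and is not the MakeParallel class itself: ResultSender and ListChannel are illustrative names, and the calling convention (channel first, then the function's own arguments) and the send method are assumptions.

```python
class ResultSender:
    """Hypothetical wrapper: call a plain function and send its result over a channel."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kwargs):
        # The wrapped function knows nothing about channels; the wrapper does the sending.
        channel.send(self.func(*args, **kwargs))


class ListChannel:
    """Stand-in channel which merely records what is sent (for illustration only)."""

    def __init__(self):
        self.sent = []

    def send(self, obj):
        self.sent.append(obj)


def calculate(i, j):
    """A "normal" function, unaware of any parallel environment."""
    return i * j


channel = ListChannel()
parallel_calculate = ResultSender(calculate)
parallel_calculate(channel, 3, 4)
print(channel.sent)
```

A function which communicates over the channel directly, as simple_start.py is described as doing, needs no such wrapper.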
paulb@100 | 74 |
|
paulb@140 | 75 | Reusable Processes
|
paulb@140 | 76 | ------------------
|
paulb@140 | 77 |
|
paulb@119 | 78 | An additional example not listed above, simple_managed_map_reusable.py,
|
paulb@119 | 79 | employs the MakeReusable class instead of MakeParallel in order to demonstrate
|
paulb@140 | 80 | reusable processes and channels:
|
paulb@140 | 81 |
|
paulb@140 | 82 | PYTHONPATH=. python examples/simple_managed_map_reusable.py
|
paulb@140 | 83 |
|
paul@158 | 84 | Continuous Process Communications
|
paul@158 | 85 | ---------------------------------
|
paul@158 | 86 |
|
paul@158 | 87 | Another example not listed above, simple_continuous_queue.py, employs
|
paul@158 | 88 | continuous communications to monitor output from created processes:
|
paul@158 | 89 |
|
paul@158 | 90 | PYTHONPATH=. python examples/simple_continuous_queue.py
|
paul@158 | 91 |
|
paulb@140 | 92 | Persistent Processes
|
paulb@140 | 93 | --------------------
|
paulb@119 | 94 |
|
paulb@140 | 95 | A number of persistent variants of some of the above examples employ a
|
paulb@140 | 96 | persistent or background process which can be started by one process and
|
paulb@140 | 97 | contacted later by another in order to collect the results of a computation.
|
paulb@140 | 98 | For example:
|
paulb@140 | 99 |
|
paulb@140 | 100 | PYTHONPATH=. python examples/simple_persistent_managed.py --start
|
paulb@140 | 101 | PYTHONPATH=. python examples/simple_persistent_managed.py --reconnect
|
paulb@140 | 102 |
|
paulb@144 | 103 | PYTHONPATH=. python examples/simple_background_queue.py --start
|
paulb@144 | 104 | PYTHONPATH=. python examples/simple_background_queue.py --reconnect
|
paulb@100 | 105 |
|
paulb@148 | 106 | PYTHONPATH=. python examples/simple_persistent_queue.py --start
|
paulb@148 | 107 | PYTHONPATH=. python examples/simple_persistent_queue.py --reconnect
|
paulb@148 | 108 |
|
paulb@105 | 109 | Parallel Raytracing with PyGmy
|
paulb@105 | 110 | ------------------------------
|
paulb@105 | 111 |
|
paulb@100 | 112 | The PyGmy raytracer, modified to use pprocess, can be run to investigate the
|
paulb@105 | 113 | potential for speed increases in "real world" programs:
|
paulb@68 | 114 |
|
paulb@100 | 115 | cd examples/PyGmy
|
paulb@100 | 116 | PYTHONPATH=../..:. python scene.py
|
paulb@100 | 117 |
|
paulb@100 | 118 | (This should produce a file called test.tif - a TIFF file containing a
|
paulb@100 | 119 | raytraced scene image.)
|
paulb@100 | 120 |
|
paul@158 | 121 | Examples from the Concurrency SIG
|
paul@158 | 122 | ---------------------------------
|
paul@158 | 123 |
|
paul@163 | 124 | The special interest group (SIG) for concurrency in Python proposed a
|
paul@163 | 125 | particular application as a showcase for concurrency libraries. Two examples
|
paul@163 | 126 | are included which demonstrate pprocess and the use of continuous processes to
|
paul@163 | 127 | implement the application concerned:
|
paul@163 | 128 |
|
paul@158 | 129 | PYTHONPATH=. python examples/concurrency-sig/bottles.py
|
paul@158 | 130 | PYTHONPATH=. python examples/concurrency-sig/bottles_heartbeat.py
|
paul@158 | 131 |
|
paul@168 | 132 | Examples of Modifying Mutable Objects
|
paul@168 | 133 | -------------------------------------
|
paul@168 | 134 |
|
paul@168 | 135 | Mutable objects can be modified in processes created by pprocess, but the
|
paul@168 | 136 | modifications will not be visible in the parent process. The following
|
paul@168 | 137 | examples illustrate the problem:
|
paul@168 | 138 |
|
paul@171 | 139 | PYTHONPATH=. python examples/simple_mutation.py
|
paul@171 | 140 | PYTHONPATH=. python examples/simple_mutation_queue.py
|
paul@168 | 141 |
|
paul@168 | 142 | The former, non-parallel program will display the expected result of the
|
paul@168 | 143 | computation, whereas the latter, parallel program will fail to do so. This is
|
paul@168 | 144 | because the latter attempts to modify the input collection in order to use it
|
paul@168 | 145 | as a result collection, but these modifications are not propagated back to the
|
paul@168 | 146 | parent process.
|
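The effect described above follows from the fork-based process model and can be reproduced without pprocess. In this minimal sketch, which uses only the standard library, double_in_child is a hypothetical helper: the child modifies its own copy of the list, and the parent sees the new values only because the child sends them back explicitly.

```python
import os
import pickle


def double_in_child(values):
    """Double each element in a forked child; return what the child sends back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.close(read_fd)
            for i, value in enumerate(values):
                values[i] = value * 2  # modifies the child's copy only
            with os.fdopen(write_fd, "wb") as pipe:
                pickle.dump(values, pipe)  # explicit transfer back to the parent
            status = 0
        finally:
            os._exit(status)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        received = pickle.load(pipe)
    os.waitpid(pid, 0)
    return received


data = [1, 2, 3]
received = double_in_child(data)
print(data)      # the parent's list is unchanged
print(received)  # the child's modified copy, sent back explicitly
```

Treating the input collection as read-only and building the results from what each process sends back avoids the problem.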
paul@168 | 147 |
|
paulb@105 | 148 | Test Programs
|
paulb@105 | 149 | -------------
|
paulb@105 | 150 |
|
paulb@100 | 151 | There are some elementary tests:
|
paulb@22 | 152 |
|
paulb@22 | 153 | PYTHONPATH=. python tests/create_loop.py
|
paulb@22 | 154 | PYTHONPATH=. python tests/start_loop.py
|
paulb@22 | 155 |
|
paulb@22 | 156 | (Simple loop demonstrations which use two different ways of creating and
|
paulb@22 | 157 | starting the parallel processes.)
|
paulb@22 | 158 |
|
paulb@36 | 159 | PYTHONPATH=. python tests/start_indexer.py <directory>
|
paulb@22 | 160 |
|
paulb@36 | 161 | (A text indexing demonstration, where <directory> should be a directory
|
paulb@36 | 162 | containing text files to be indexed, although HTML files will also work well
|
paulb@36 | 163 | enough. After indexing the files, a prompt will appear, words or word
|
paulb@36 | 164 | fragments can be entered, and matching words and their locations will be
|
paulb@36 | 165 | shown. Run the program without arguments to see more information.)
|
paulb@22 | 166 |
|
paulb@22 | 167 | Contact, Copyright and Licence Information
|
paulb@22 | 168 | ------------------------------------------
|
paulb@22 | 169 |
|
paulb@132 | 170 | The current Web page for pprocess at the time of release is:
|
paulb@132 | 171 |
|
paulb@132 | 172 | http://www.boddie.org.uk/python/pprocess.html
|
paulb@132 | 173 |
|
paulb@132 | 174 | The author can be contacted at the following e-mail address:
|
paulb@22 | 175 |
|
paulb@22 | 176 | paul@boddie.org.uk
|
paulb@22 | 177 |
|
paulb@22 | 178 | Copyright and licence information can be found in the docs directory - see
|
paulb@78 | 179 | docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.
|
paulb@22 | 180 |
|
paulb@48 | 181 | For the PyGmy raytracer example, different copyright and licence information
|
paulb@48 | 182 | is provided in the docs directory - see docs/COPYING-PyGmy.txt and
|
paulb@48 | 183 | docs/LICENCE-PyGmy.txt for more information.
|
paulb@48 | 184 |
|
paulb@22 | 185 | Dependencies
|
paulb@22 | 186 | ------------
|
paulb@22 | 187 |
|
paulb@22 | 188 | This software depends on standard library features which are stated as being
|
paul@156 | 189 | available only on "UNIX"; it has only been tested repeatedly on a GNU/Linux
|
paul@156 | 190 | system, and occasionally on systems running OpenSolaris.
|
paulb@22 | 191 |
|
paul@168 | 192 | New in pprocess 0.5.2 (Changes since pprocess 0.5.1)
|
paul@168 | 193 | ----------------------------------------------------
|
paul@168 | 194 |
|
paul@168 | 195 | * Added examples involving mutable objects and the inability of pprocess to
|
paul@168 | 196 | automatically propagate changes to such objects back to parent processes.
|
paul@171 | 197 | * Added an explanatory section to the tutorial about data exchange between
|
paul@171 | 198 | processes and the differences from "normal" Python program behaviour.
|
paul@168 | 199 |
|
paul@166 | 200 | New in pprocess 0.5.1 (Changes since pprocess 0.5)
|
paul@166 | 201 | --------------------------------------------------
|
paul@166 | 202 |
|
paul@166 | 203 | * Added IOError handling when processes exit apparently without warning.
|
paul@166 | 204 |
|
paul@160 | 205 | New in pprocess 0.5 (Changes since pprocess 0.4)
|
paul@160 | 206 | ------------------------------------------------
|
paul@155 | 207 |
|
paul@160 | 208 | * Added proper support in the Exchange class for continuous communications
|
paul@160 | 209 | between processes, providing examples: simple_continuous_queue.py and the
|
paul@160 | 210 | concurrency-sig directory.
|
paul@156 | 211 | * Changed the Map class to permit incremental access to received results
|
paul@156 | 212 | from completed parts of the sequence of inputs, also adding an iteration
|
paul@156 | 213 | interface.
|
paul@156 | 214 | * Added an example, simple_pmap_iter.py, to demonstrate iteration over maps.
|
paul@160 | 215 | * Fixed the get_number_of_cores function to work with /proc/cpuinfo where
|
paul@160 | 216 | the "physical id" field is missing.
|
paul@160 | 217 | * Tidied the Exchange class, adding distinct status methods: unfinished and
|
paul@160 | 218 | busy.
|
paul@155 | 219 |
|
paulb@144 | 220 | New in pprocess 0.4 (Changes since pprocess 0.3.1)
|
paulb@144 | 221 | --------------------------------------------------
|
paulb@135 | 222 |
|
paulb@140 | 223 | * Added support for persistent/background processes.
|
paulb@135 | 224 | * Added a utility function to detect and return the number of processor
|
paulb@135 | 225 | cores available.
|
paulb@137 | 226 | * Added missing documentation stylesheet.
|
paulb@150 | 227 | * Added support for Solaris using pipes instead of socket pairs, since
|
paulb@150 | 228 | the latter apparently do not work properly with poll on Solaris.
|
paulb@135 | 229 |
|
paulb@131 | 230 | New in pprocess 0.3.1 (Changes since pprocess 0.3)
|
paulb@131 | 231 | --------------------------------------------------
|
paulb@131 | 232 |
|
paulb@131 | 233 | * Moved the reference material out of the module docstring and into a
|
paulb@131 | 234 | separate document, converting it to XHTML in the process.
|
paulb@131 | 235 | * Fixed the project name in the setup script.
|
paulb@131 | 236 |
|
paulb@126 | 237 | New in pprocess 0.3 (Changes since parallel 0.2.5)
|
paulb@100 | 238 | --------------------------------------------------
|
paulb@84 | 239 |
|
paulb@84 | 240 | * Added managed callables: wrappers around callables which cause them to be
|
paulb@84 | 241 | automatically managed by the exchange from which they were acquired.
|
paulb@84 | 242 | * Added MakeParallel: a wrapper instantiated around a normal function which
|
paulb@84 | 243 | sends the result of that function over the supplied channel when invoked.
|
paulb@119 | 244 | * Added MakeReusable: a wrapper like MakeParallel which can be used in
|
paulb@119 | 245 | conjunction with the newly-added reuse capability of the Exchange class in
|
paulb@119 | 246 | order to reuse processes and channels.
|
paulb@89 | 247 | * Added a Map class which attempts to emulate the built-in map function,
|
paulb@89 | 248 | along with a pmap function using this class.
|
paulb@100 | 249 | * Added a Queue class which provides a simpler iterator-style interface to
|
paulb@100 | 250 | data produced by created processes.
|
paulb@100 | 251 | * Added a create method to the Exchange class and an exit convenience
|
paulb@100 | 252 | function to the module.
|
paulb@100 | 253 | * Changed the Exchange implementation to not block when attempting to start
|
paulb@100 | 254 | new processes beyond the process limit: such requests are queued and
|
paulb@100 | 255 | performed as running processes are completed. This permits programs using
|
paulb@100 | 256 | the start method to proceed to consumption of results more quickly.
|
paulb@105 | 257 | * Extended and updated the examples. Added a tutorial.
|
paulb@100 | 258 | * Added Ubuntu Feisty (7.04) package support.
|
paulb@84 | 259 |
|
paulb@78 | 260 | New in parallel 0.2.5 (Changes since parallel 0.2.4)
|
paulb@78 | 261 | ----------------------------------------------------
|
paulb@78 | 262 |
|
paulb@78 | 263 | * Added a start method to the Exchange class for more convenient creation of
|
paulb@78 | 264 | processes.
|
paulb@78 | 265 | * Relicensed under the LGPL (version 3 or later) - this also fixes the
|
paulb@78 | 266 | contradictory situation where the GPL was stated in the pprocess module
|
paulb@78 | 267 | (which was not, in fact, the intention) and the LGPL was stated in the
|
paulb@78 | 268 | documentation.
|
paulb@78 | 269 |
|
paulb@73 | 270 | New in parallel 0.2.4 (Changes since parallel 0.2.3)
|
paulb@73 | 271 | ----------------------------------------------------
|
paulb@73 | 272 |
|
paulb@73 | 273 | * Set buffer sizes to zero for the file object wrappers around sockets: this
|
paulb@73 | 274 | may prevent deadlock issues.
|
paulb@73 | 275 |
|
paulb@68 | 276 | New in parallel 0.2.3 (Changes since parallel 0.2.2)
|
paulb@68 | 277 | ----------------------------------------------------
|
paulb@68 | 278 |
|
paulb@68 | 279 | * Added convenient message exchanges, offering methods handling common
|
paulb@68 | 280 | situations at the cost of having to define a subclass of Exchange.
|
paulb@68 | 281 | * Added a simple example of performing a parallel computation.
|
paulb@68 | 282 | * Improved the PyGmy raytracer example to use the newly added functionality.
|
paulb@68 | 283 |
|
paulb@55 | 284 | New in parallel 0.2.2 (Changes since parallel 0.2.1)
|
paulb@55 | 285 | ----------------------------------------------------
|
paulb@55 | 286 |
|
paulb@55 | 287 | * Changed the status testing in the Exchange class, potentially fixing the
|
paulb@55 | 288 | premature closure of channels before all data was read.
|
paulb@55 | 289 | * Fixed the PyGmy raytracer example's process accounting by relying on the
|
paulb@55 | 290 | possibly more reliable Exchange behaviour, whilst also preventing
|
paulb@55 | 291 | erroneous creation of "out of bounds" processes.
|
paulb@58 | 292 | * Added a removed attribute on the Exchange to record which channels were
|
paulb@58 | 293 | removed in the last call to the ready method.
|
paulb@55 | 294 |
|
paulb@48 | 295 | New in parallel 0.2.1 (Changes since parallel 0.2)
|
paulb@48 | 296 | --------------------------------------------------
|
paulb@48 | 297 |
|
paulb@48 | 298 | * Added a PyGmy raytracer example.
|
paulb@53 | 299 | * Updated copyright and licensing details (FSF address, additional works).
|
paulb@48 | 300 |
|
paulb@40 | 301 | New in parallel 0.2 (Changes since parallel 0.1)
|
paulb@40 | 302 | ------------------------------------------------
|
paulb@40 | 303 |
|
paulb@40 | 304 | * Changed the name of the included module from parallel to pprocess in order
|
paulb@40 | 305 | to avoid naming conflicts with PyParallel.
|
paulb@40 | 306 |
|
paulb@22 | 307 | Release Procedures
|
paulb@22 | 308 | ------------------
|
paulb@22 | 309 |
|
paul@155 | 310 | Update the pprocess __version__ attribute and the setup.py file version field.
|
paulb@22 | 311 | Change the version number and package filename/directory in the documentation.
|
paulb@22 | 312 | Update the release notes (see above).
|
paulb@22 | 313 | Check the release information in the PKG-INFO file.
|
paulb@22 | 314 | Tag, export.
|
paulb@22 | 315 | Archive, upload.
|
paulb@68 | 316 | Update PyPI.
|
paulb@26 | 317 |
|
paulb@26 | 318 | Making Packages
|
paulb@26 | 319 | ---------------
|
paulb@26 | 320 |
|
paulb@44 | 321 | To make Debian-based packages:
|
paulb@26 | 322 |
|
paulb@44 | 323 | 1. Create new package directories under packages if necessary.
|
paulb@26 | 324 | 2. Make a symbolic link in the distribution's root directory to keep the
|
paulb@26 | 325 | Debian tools happy:
|
paulb@26 | 326 |
|
paulb@44 | 327 | ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/
|
paulb@26 | 328 |
|
paulb@100 | 329 | Or:
|
paulb@100 | 330 |
|
paulb@100 | 331 | ln -s packages/ubuntu-feisty/python-pprocess/debian/
|
paulb@100 | 332 |
|
paulb@26 | 333 | 3. Run the package builder:
|
paulb@26 | 334 |
|
paulb@26 | 335 | dpkg-buildpackage -rfakeroot
|
paulb@26 | 336 |
|
paulb@26 | 337 | 4. Locate and tidy up the packages in the parent directory of the
|
paulb@26 | 338 | distribution's root directory.
|