paulb@22 | 1 | Introduction
|
paulb@22 | 2 | ------------
|
paulb@22 | 3 |
|
paulb@40 | 4 | The pprocess module provides elementary support for parallel programming in
|
paulb@22 | 5 | Python using a fork-based process creation model in conjunction with a
|
paulb@68 | 6 | channel-based communications model implemented using socketpair and poll. On
|
paulb@68 | 7 | systems with multiple CPUs or multicore CPUs, processes should take advantage
|
paulb@68 | 8 | of as many CPUs or cores as the operating system permits.
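The model described above can be illustrated without pprocess itself. The
following is a minimal sketch using only the Python standard library (written
for Python 3). The function name compute_in_child is hypothetical and the code
is not the pprocess implementation: it only shows a forked child sending a
pickled result over a socketpair channel while the parent waits using poll.

```python
import os
import pickle
import select
import socket


def compute_in_child(func, *args):
    """Fork a child, run func(*args) there, and return the result to the parent."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process: send the pickled result and exit immediately,
        # without running any of the parent's cleanup code.
        parent_end.close()
        try:
            child_end.sendall(pickle.dumps(func(*args)))
        finally:
            child_end.close()
            os._exit(0)

    # Parent process: wait with poll until the channel has something to read.
    child_end.close()
    poller = select.poll()
    poller.register(parent_end, select.POLLIN)
    chunks = []
    while True:
        poller.poll()
        data = parent_end.recv(4096)
        if not data:  # the child closed its end of the channel
            break
        chunks.append(data)
    parent_end.close()
    os.waitpid(pid, 0)  # reap the child so that it does not linger as a zombie
    return pickle.loads(b"".join(chunks))


if __name__ == "__main__":
    print(compute_in_child(pow, 2, 10))
```

Like pprocess, this sketch relies on fork and therefore only works on
UNIX-like systems.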
|
paulb@22 | 9 |
|
paulb@140 | 10 | Tutorial
|
paulb@140 | 11 | --------
|
paulb@140 | 12 |
|
paulb@140 | 13 | The tutorial provides some information about the examples described below.
|
paulb@144 | 14 | See the docs/tutorial.html file in the distribution for more details.
|
paulb@140 | 15 |
|
paulb@140 | 16 | Reference
|
paulb@140 | 17 | ---------
|
paulb@140 | 18 |
|
paulb@140 | 19 | A description of the different mechanisms provided by the pprocess module can
|
paulb@144 | 20 | be found in the reference document. See the docs/reference.html file in the
|
paulb@140 | 21 | distribution for more details.
|
paulb@140 | 22 |
|
paulb@22 | 23 | Quick Start
|
paulb@22 | 24 | -----------
|
paulb@22 | 25 |
|
paulb@105 | 26 | Try running the simple examples. For example:
|
paulb@68 | 27 |
|
paulb@100 | 28 | PYTHONPATH=. python examples/simple_create.py
|
paulb@105 | 29 |
|
paulb@105 | 30 | (These examples show in different ways how a limited number of processes can be
|
paulb@113 | 31 | used to perform a parallel computation. The simple.py, simple1.py, simple2.py
|
paulb@113 | 32 | and simple_map.py programs are sequential versions of the other programs.)
|
paulb@105 | 33 |
|
paulb@105 | 34 | The following table summarises the features used in the programs:
|
paulb@105 | 35 |
|
paulb@113 | 36 | Program (.py)         pmap  MakeParallel  manage  start  create  Map  Queue  Exchange
|
paulb@113 | 37 | -------------         ----  ------------  ------  -----  ------  ---  -----  --------
|
paulb@113 | 38 | simple_create_map                                        Yes     Yes
|
paulb@113 | 39 | simple_create_queue                                      Yes          Yes
|
paulb@113 | 40 | simple_create                                            Yes                 Yes
|
paulb@113 | 41 | simple_managed_map          Yes           Yes                    Yes
|
paulb@113 | 42 | simple_managed_queue        Yes           Yes                         Yes
|
paulb@113 | 43 | simple_managed              Yes           Yes                                Yes
|
paulb@113 | 44 | simple_pmap           Yes
|
paulb@113 | 45 | simple_start_queue          Yes                   Yes                 Yes
|
paulb@113 | 46 | simple_start                                      Yes                        Yes
|
paulb@68 | 47 |
|
paulb@105 | 48 | The simplest parallel program is simple_pmap.py which employs the pmap
|
paulb@105 | 49 | function resembling the built-in map function in Python.
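To give an idea of what a map-like function over a limited number of
processes involves, here is a rough standard-library sketch (Python 3). The
name pmap_sketch and its limit parameter are assumptions made for
illustration. This is not how pprocess.pmap is implemented, and the real
function's signature should be checked in docs/reference.html.

```python
import os
import pickle
import select
import socket


def pmap_sketch(func, sequence, limit=2):
    """Apply func to each item in a forked process, at most `limit` at a time."""
    items = list(sequence)
    results = [None] * len(items)
    pending = list(enumerate(items))  # work that has not been started yet
    pending.reverse()                 # so that pop() yields items in order
    running = {}                      # fd -> (index, pid, socket, chunks)
    poller = select.poll()

    def start(index, item):
        parent_end, child_end = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            # Child: compute one result, send it, and exit.
            parent_end.close()
            try:
                child_end.sendall(pickle.dumps(func(item)))
            finally:
                child_end.close()
                os._exit(0)
        child_end.close()
        poller.register(parent_end, select.POLLIN)
        running[parent_end.fileno()] = (index, pid, parent_end, [])

    while pending or running:
        # Top up to the process limit before waiting for results.
        while pending and len(running) < limit:
            start(*pending.pop())
        for fd, _event in poller.poll():
            index, pid, sock, chunks = running[fd]
            data = sock.recv(4096)
            if data:
                chunks.append(data)
                continue
            # End of stream: this child has finished, so store its result
            # in the position of the corresponding input item.
            poller.unregister(fd)
            sock.close()
            os.waitpid(pid, 0)
            results[index] = pickle.loads(b"".join(chunks))
            del running[fd]
    return results


def square(x):
    return x * x


if __name__ == "__main__":
    print(pmap_sketch(square, range(5), limit=2))
```

Results are stored by input position, so the returned list has the same order
as a sequential map even though the child processes may finish in any order.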
|
paulb@105 | 50 |
|
paulb@105 | 51 | Other simple programs are those employing the Queue class, together with those
|
paulb@105 | 52 | using the manage method which associates functions or callables with Queue or
|
paulb@105 | 53 | Exchange objects for convenient invocation of those functions and the
|
paulb@105 | 54 | management of their communications.
|
paulb@105 | 55 |
|
paulb@105 | 56 | The most technically involved program is simple_start.py which uses the
|
paulb@105 | 57 | Exchange class together with a calculation function which is aware of the
|
paulb@105 | 58 | parallel environment and which communicates over the supplied communications
|
paulb@105 | 59 | channel directly to the creating process.
|
paulb@105 | 60 |
|
paulb@105 | 61 | It should be noted that with the exception of simple_start.py, those examples
|
paulb@105 | 62 | employing calculation functions (as opposed to doing a calculation inline in a
|
paulb@105 | 63 | loop body) all use MakeParallel to make those functions parallel-aware, thus
|
paulb@105 | 64 | permitting the conversion of "normal" functions to a form usable in the
|
paulb@105 | 65 | parallel environment.
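The idea of making a "normal" function parallel-aware can be sketched as a
small wrapper which sends the function's result over a channel instead of
returning it. The class names below are hypothetical, the ListChannel
stand-in exists only so that the example runs without any processes, and the
argument order (channel first) is an assumption rather than a statement about
the real MakeParallel class.

```python
class MakeParallelSketch:
    """Wrap a normal function so that its result is sent over a channel."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kw):
        # Instead of returning the result, send it to whoever holds the
        # other end of the channel (assumed to provide a send method).
        channel.send(self.func(*args, **kw))


class ListChannel:
    """Hypothetical in-memory channel, used only for this illustration."""

    def __init__(self):
        self.sent = []

    def send(self, obj):
        self.sent.append(obj)


def add(a, b):
    # An ordinary function which knows nothing about channels or processes.
    return a + b


channel = ListChannel()
parallel_add = MakeParallelSketch(add)
parallel_add(channel, 2, 3)
print(channel.sent)
```

The wrapped function itself stays unchanged, which is what allows existing
calculation functions to be reused in the parallel environment.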
|
paulb@100 | 66 |
|
paulb@140 | 67 | Reusable Processes
|
paulb@140 | 68 | ------------------
|
paulb@140 | 69 |
|
paulb@119 | 70 | An additional example not listed above, simple_managed_map_reusable.py,
|
paulb@119 | 71 | employs the MakeReusable class instead of MakeParallel in order to demonstrate
|
paulb@140 | 72 | reusable processes and channels:
|
paulb@140 | 73 |
|
paulb@140 | 74 | PYTHONPATH=. python examples/simple_managed_map_reusable.py
|
paulb@140 | 75 |
|
paulb@140 | 76 | Persistent Processes
|
paulb@140 | 77 | --------------------
|
paulb@119 | 78 |
|
paulb@140 | 79 | A number of persistent variants of some of the above examples employ a
|
paulb@140 | 80 | persistent or background process which can be started by one process and
|
paulb@140 | 81 | contacted later by another in order to collect the results of a computation.
|
paulb@140 | 82 | For example:
|
paulb@140 | 83 |
|
paulb@140 | 84 | PYTHONPATH=. python examples/simple_persistent_managed.py --start
|
paulb@140 | 85 | PYTHONPATH=. python examples/simple_persistent_managed.py --reconnect
|
paulb@140 | 86 |
|
paulb@144 | 87 | PYTHONPATH=. python examples/simple_background_queue.py --start
|
paulb@144 | 88 | PYTHONPATH=. python examples/simple_background_queue.py --reconnect
|
paulb@100 | 89 |
|
paulb@148 | 90 | PYTHONPATH=. python examples/simple_persistent_queue.py --start
|
paulb@148 | 91 | PYTHONPATH=. python examples/simple_persistent_queue.py --reconnect
|
paulb@148 | 92 |
|
paulb@105 | 93 | Parallel Raytracing with PyGmy
|
paulb@105 | 94 | ------------------------------
|
paulb@105 | 95 |
|
paulb@100 | 96 | The PyGmy raytracer modified to use pprocess can be run to investigate the
|
paulb@105 | 97 | potential for speed increases in "real world" programs:
|
paulb@68 | 98 |
|
paulb@100 | 99 | cd examples/PyGmy
|
paulb@100 | 100 | PYTHONPATH=../..:. python scene.py
|
paulb@100 | 101 |
|
paulb@100 | 102 | (This should produce a file called test.tif - a TIFF file containing a
|
paulb@100 | 103 | raytraced scene image.)
|
paulb@100 | 104 |
|
paulb@105 | 105 | Test Programs
|
paulb@105 | 106 | -------------
|
paulb@105 | 107 |
|
paulb@100 | 108 | There are some elementary tests:
|
paulb@22 | 109 |
|
paulb@22 | 110 | PYTHONPATH=. python tests/create_loop.py
|
paulb@22 | 111 | PYTHONPATH=. python tests/start_loop.py
|
paulb@22 | 112 |
|
paulb@22 | 113 | (Simple loop demonstrations which use two different ways of creating and
|
paulb@22 | 114 | starting the parallel processes.)
|
paulb@22 | 115 |
|
paulb@36 | 116 | PYTHONPATH=. python tests/start_indexer.py <directory>
|
paulb@22 | 117 |
|
paulb@36 | 118 | (A text indexing demonstration, where <directory> should be a directory
|
paulb@36 | 119 | containing text files to be indexed, although HTML files will also work well
|
paulb@36 | 120 | enough. After indexing the files, a prompt will appear, words or word
|
paulb@36 | 121 | fragments can be entered, and matching words and their locations will be
|
paulb@36 | 122 | shown. Run the program without arguments to see more information.)
|
paulb@22 | 123 |
|
paulb@22 | 124 | Contact, Copyright and Licence Information
|
paulb@22 | 125 | ------------------------------------------
|
paulb@22 | 126 |
|
paulb@132 | 127 | The current Web page for pprocess at the time of release is:
|
paulb@132 | 128 |
|
paulb@132 | 129 | http://www.boddie.org.uk/python/pprocess.html
|
paulb@132 | 130 |
|
paulb@132 | 131 | The author can be contacted at the following e-mail address:
|
paulb@22 | 132 |
|
paulb@22 | 133 | paul@boddie.org.uk
|
paulb@22 | 134 |
|
paulb@22 | 135 | Copyright and licence information can be found in the docs directory - see
|
paulb@78 | 136 | docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.
|
paulb@22 | 137 |
|
paulb@48 | 138 | For the PyGmy raytracer example, different copyright and licence information
|
paulb@48 | 139 | is provided in the docs directory - see docs/COPYING-PyGmy.txt and
|
paulb@48 | 140 | docs/LICENCE-PyGmy.txt for more information.
|
paulb@48 | 141 |
|
paulb@22 | 142 | Dependencies
|
paulb@22 | 143 | ------------
|
paulb@22 | 144 |
|
paulb@22 | 145 | This software depends on standard library features which are stated as being
|
paulb@22 | 146 | available only on "UNIX"; it has only been tested on a GNU/Linux system.
|
paulb@22 | 147 |
|
paulb@144 | 148 | New in pprocess 0.4 (Changes since pprocess 0.3.1)
|
paulb@144 | 149 | --------------------------------------------------
|
paulb@135 | 150 |
|
paulb@140 | 151 | * Added support for persistent/background processes.
|
paulb@135 | 152 | * Added a utility function to detect and return the number of processor
|
paulb@135 | 153 | cores available.
|
paulb@137 | 154 | * Added missing documentation stylesheet.
|
paulb@135 | 155 |
|
paulb@131 | 156 | New in pprocess 0.3.1 (Changes since pprocess 0.3)
|
paulb@131 | 157 | --------------------------------------------------
|
paulb@131 | 158 |
|
paulb@131 | 159 | * Moved the reference material out of the module docstring and into a
|
paulb@131 | 160 | separate document, converting it to XHTML in the process.
|
paulb@131 | 161 | * Fixed the project name in the setup script.
|
paulb@131 | 162 |
|
paulb@126 | 163 | New in pprocess 0.3 (Changes since parallel 0.2.5)
|
paulb@100 | 164 | --------------------------------------------------
|
paulb@84 | 165 |
|
paulb@84 | 166 | * Added managed callables: wrappers around callables which cause them to be
|
paulb@84 | 167 | automatically managed by the exchange from which they were acquired.
|
paulb@84 | 168 | * Added MakeParallel: a wrapper instantiated around a normal function which
|
paulb@84 | 169 | sends the result of that function over the supplied channel when invoked.
|
paulb@119 | 170 | * Added MakeReusable: a wrapper like MakeParallel which can be used in
|
paulb@119 | 171 | conjunction with the newly-added reuse capability of the Exchange class in
|
paulb@119 | 172 | order to reuse processes and channels.
|
paulb@89 | 173 | * Added a Map class which attempts to emulate the built-in map function,
|
paulb@89 | 174 | along with a pmap function using this class.
|
paulb@100 | 175 | * Added a Queue class which provides a simpler iterator-style interface to
|
paulb@100 | 176 | data produced by created processes.
|
paulb@100 | 177 | * Added a create method to the Exchange class and an exit convenience
|
paulb@100 | 178 | function to the module.
|
paulb@100 | 179 | * Changed the Exchange implementation to not block when attempting to start
|
paulb@100 | 180 | new processes beyond the process limit: such requests are queued and
|
paulb@100 | 181 | performed as running processes are completed. This permits programs using
|
paulb@100 | 182 | the start method to proceed to consumption of results more quickly.
|
paulb@105 | 183 | * Extended and updated the examples. Added a tutorial.
|
paulb@100 | 184 | * Added Ubuntu Feisty (7.04) package support.
|
paulb@84 | 185 |
|
paulb@78 | 186 | New in parallel 0.2.5 (Changes since parallel 0.2.4)
|
paulb@78 | 187 | ----------------------------------------------------
|
paulb@78 | 188 |
|
paulb@78 | 189 | * Added a start method to the Exchange class for more convenient creation of
|
paulb@78 | 190 | processes.
|
paulb@78 | 191 | * Relicensed under the LGPL (version 3 or later) - this also fixes the
|
paulb@78 | 192 | contradictory situation where the GPL was stated in the pprocess module
|
paulb@78 | 193 | (which was not, in fact, the intention) and the LGPL was stated in the
|
paulb@78 | 194 | documentation.
|
paulb@78 | 195 |
|
paulb@73 | 196 | New in parallel 0.2.4 (Changes since parallel 0.2.3)
|
paulb@73 | 197 | ----------------------------------------------------
|
paulb@73 | 198 |
|
paulb@73 | 199 | * Set buffer sizes to zero for the file object wrappers around sockets: this
|
paulb@73 | 200 | may prevent deadlock issues.
|
paulb@73 | 201 |
|
paulb@68 | 202 | New in parallel 0.2.3 (Changes since parallel 0.2.2)
|
paulb@68 | 203 | ----------------------------------------------------
|
paulb@68 | 204 |
|
paulb@68 | 205 | * Added convenient message exchanges, offering methods handling common
|
paulb@68 | 206 | situations at the cost of having to define a subclass of Exchange.
|
paulb@68 | 207 | * Added a simple example of performing a parallel computation.
|
paulb@68 | 208 | * Improved the PyGmy raytracer example to use the newly added functionality.
|
paulb@68 | 209 |
|
paulb@55 | 210 | New in parallel 0.2.2 (Changes since parallel 0.2.1)
|
paulb@55 | 211 | ----------------------------------------------------
|
paulb@55 | 212 |
|
paulb@55 | 213 | * Changed the status testing in the Exchange class, potentially fixing the
|
paulb@55 | 214 | premature closure of channels before all data was read.
|
paulb@55 | 215 | * Fixed the PyGmy raytracer example's process accounting by relying on the
|
paulb@55 | 216 | possibly more reliable Exchange behaviour, whilst also preventing
|
paulb@55 | 217 | erroneous creation of "out of bounds" processes.
|
paulb@58 | 218 | * Added a removed attribute on the Exchange to record which channels were
|
paulb@58 | 219 | removed in the last call to the ready method.
|
paulb@55 | 220 |
|
paulb@48 | 221 | New in parallel 0.2.1 (Changes since parallel 0.2)
|
paulb@48 | 222 | --------------------------------------------------
|
paulb@48 | 223 |
|
paulb@48 | 224 | * Added a PyGmy raytracer example.
|
paulb@53 | 225 | * Updated copyright and licensing details (FSF address, additional works).
|
paulb@48 | 226 |
|
paulb@40 | 227 | New in parallel 0.2 (Changes since parallel 0.1)
|
paulb@40 | 228 | ------------------------------------------------
|
paulb@40 | 229 |
|
paulb@40 | 230 | * Changed the name of the included module from parallel to pprocess in order
|
paulb@40 | 231 | to avoid naming conflicts with PyParallel.
|
paulb@40 | 232 |
|
paulb@22 | 233 | Release Procedures
|
paulb@22 | 234 | ------------------
|
paulb@22 | 235 |
|
paulb@40 | 236 | Update the pprocess __version__ attribute.
|
paulb@22 | 237 | Change the version number and package filename/directory in the documentation.
|
paulb@22 | 238 | Update the release notes (see above).
|
paulb@22 | 239 | Check the release information in the PKG-INFO file.
|
paulb@22 | 240 | Tag, export.
|
paulb@22 | 241 | Archive, upload.
|
paulb@68 | 242 | Update PyPI.
|
paulb@26 | 243 |
|
paulb@26 | 244 | Making Packages
|
paulb@26 | 245 | ---------------
|
paulb@26 | 246 |
|
paulb@44 | 247 | To make Debian-based packages:
|
paulb@26 | 248 |
|
paulb@44 | 249 | 1. Create new package directories under packages if necessary.
|
paulb@26 | 250 | 2. Make a symbolic link in the distribution's root directory to keep the
|
paulb@26 | 251 | Debian tools happy:
|
paulb@26 | 252 |
|
paulb@44 | 253 | ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/
|
paulb@26 | 254 |
|
paulb@100 | 255 | Or:
|
paulb@100 | 256 |
|
paulb@100 | 257 | ln -s packages/ubuntu-feisty/python-pprocess/debian/
|
paulb@100 | 258 |
|
paulb@26 | 259 | 3. Run the package builder:
|
paulb@26 | 260 |
|
paulb@26 | 261 | dpkg-buildpackage -rfakeroot
|
paulb@26 | 262 |
|
paulb@26 | 263 | 4. Locate and tidy up the packages in the parent directory of the
|
paulb@26 | 264 | distribution's root directory.
|