Introduction
------------

The pprocess module provides elementary support for parallel programming in
Python using a fork-based process creation model in conjunction with a
channel-based communications model implemented using socketpair and poll. On
systems with multiple CPUs or multicore CPUs, processes should take advantage
of as many CPUs or cores as the operating system permits.

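As a rough illustration of the model described above, and not the module's own
code, the following sketch forks a child process, gives it one end of a
socketpair, and uses poll in the parent to wait for the result. The name
run_in_child is hypothetical, and the use of pickle to encode the result is an
assumption made for this sketch. It only works on systems providing fork.

```python
import os
import pickle
import select
import socket


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return its result (hypothetical helper)."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: compute, send the pickled result, then exit immediately.
        parent_sock.close()
        try:
            child_sock.sendall(pickle.dumps(func(*args)))
        finally:
            child_sock.close()
            os._exit(0)

    # Parent: keep only its own end of the channel and poll it for data.
    child_sock.close()
    poller = select.poll()
    poller.register(parent_sock, select.POLLIN)
    chunks = []
    finished = False
    while not finished:
        for _fd, _event in poller.poll():
            data = parent_sock.recv(4096)
            if data:
                chunks.append(data)
            else:
                finished = True  # empty read: the child closed its end
    parent_sock.close()
    os.waitpid(pid, 0)  # reap the child so that no zombie remains
    return pickle.loads(b"".join(chunks))
```

If the function fails in the child, nothing is sent and unpickling in the
parent raises an error; a real implementation would report such failures.
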
Quick Start
-----------

Try running the simple examples. For example:

PYTHONPATH=. python examples/simple_create.py

(These examples show in different ways how a limited number of processes can
be used to perform a parallel computation. The simple.py, simple1.py,
simple2.py and simple_map.py programs are sequential versions of the other
programs.)

The following table summarises the features used in the programs:

Program (.py)         pmap   MakeParallel   manage   start   create   Map   Queue   Exchange
-------------         ----   ------------   ------   -----   ------   ---   -----   --------
simple_create_map                                            Yes      Yes
simple_create_queue                                          Yes            Yes
simple_create                                                Yes                    Yes
simple_managed_map           Yes            Yes                       Yes
simple_managed_queue         Yes            Yes                             Yes
simple_managed               Yes            Yes                                     Yes
simple_pmap           Yes
simple_start_queue           Yes                     Yes                    Yes
simple_start                                         Yes                            Yes

The simplest parallel program is simple_pmap.py which employs the pmap
function resembling the built-in map function in Python.

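To give an idea of what a map-like parallel function involves, here is a
minimal sketch using only the standard library. It is not the pmap
implementation: fork_map, its limit parameter and the use of pickle are all
assumptions made for illustration. At most limit children run at once, and
results are returned in the order of the inputs.

```python
import os
import pickle
import select
import socket


def fork_map(func, items, limit=2):
    """Apply func to each item in forked children, at most `limit` at a time."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    items = list(items)
    results = [None] * len(items)
    pending = list(enumerate(items))[::-1]  # work not yet started
    active = {}                             # fd -> (index, socket, pid, chunks)
    poller = select.poll()

    def start(index, item):
        parent_sock, child_sock = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            # Child: compute one result, send it, and exit immediately.
            parent_sock.close()
            try:
                child_sock.sendall(pickle.dumps(func(item)))
            finally:
                child_sock.close()
                os._exit(0)
        child_sock.close()
        poller.register(parent_sock, select.POLLIN)
        active[parent_sock.fileno()] = (index, parent_sock, pid, [])

    while pending or active:
        # Top up to the process limit before waiting for results.
        while pending and len(active) < limit:
            start(*pending.pop())
        for fd, _event in poller.poll():
            index, sock, pid, chunks = active[fd]
            data = sock.recv(4096)
            if data:
                chunks.append(data)
                continue
            # Empty read: this child has finished and closed its end.
            poller.unregister(fd)
            sock.close()
            os.waitpid(pid, 0)
            del active[fd]
            results[index] = pickle.loads(b"".join(chunks))
    return results
```

Called as fork_map(func, items), the sketch should give the same list as
list(map(func, items)) for functions without side effects.
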
Other simple programs are those employing the Queue class, together with those
using the manage method which associates functions or callables with Queue or
Exchange objects for convenient invocation of those functions and the
management of their communications.

The most technically involved program is simple_start.py which uses the
Exchange class together with a calculation function which is aware of the
parallel environment and which communicates over the supplied communications
channel directly to the creating process.

It should be noted that with the exception of simple_start.py, those examples
employing calculation functions (as opposed to doing a calculation inline in a
loop body) all use MakeParallel to make those functions parallel-aware, thus
permitting the conversion of "normal" functions to a form usable in the
parallel environment.

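The idea behind such a wrapper can be sketched as follows. The class names
SendResult and ListChannel are hypothetical stand-ins invented for this
example; the sketch only assumes that a channel offers a send method, and it
should not be read as the MakeParallel source.

```python
class SendResult:
    """Hypothetical wrapper in the style of MakeParallel."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kwargs):
        # The channel comes first; the remaining arguments go to the
        # wrapped function, whose result is sent instead of returned.
        channel.send(self.func(*args, **kwargs))


class ListChannel:
    """Hypothetical in-memory channel which records what is sent."""

    def __init__(self):
        self.sent = []

    def send(self, obj):
        self.sent.append(obj)


def add(x, y):
    # A "normal" function with no knowledge of channels.
    return x + y


channel = ListChannel()
SendResult(add)(channel, 2, 3)
print(channel.sent)
```

The wrapped function stays unchanged and can still be called sequentially;
only the wrapper knows about the channel.
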
An additional example not listed above, simple_managed_map_reusable.py,
employs the MakeReusable class instead of MakeParallel in order to demonstrate
reusable processes and channels.

The tutorial provides some information about the examples: docs/tutorial.xhtml

Parallel Raytracing with PyGmy
------------------------------

The PyGmy raytracer modified to use pprocess can be run to investigate the
potential for speed increases in "real world" programs:

cd examples/PyGmy
PYTHONPATH=../..:. python scene.py

(This should produce a file called test.tif - a TIFF file containing a
raytraced scene image.)

Test Programs
-------------

There are some elementary tests:

PYTHONPATH=. python tests/create_loop.py
PYTHONPATH=. python tests/start_loop.py

(Simple loop demonstrations which use two different ways of creating and
starting the parallel processes.)

PYTHONPATH=. python tests/start_indexer.py <directory>

(A text indexing demonstration, where <directory> should be a directory
containing text files to be indexed, although HTML files will also work well
enough. After the files have been indexed, a prompt appears at which words or
word fragments can be entered; matching words and their locations are then
shown. Run the program without arguments to see more information.)

Contact, Copyright and Licence Information
------------------------------------------

The current Web page for pprocess at the time of release is:

http://www.boddie.org.uk/python/pprocess.html

The author can be contacted at the following e-mail address:

paul@boddie.org.uk

Copyright and licence information can be found in the docs directory - see
docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.

For the PyGmy raytracer example, different copyright and licence information
is provided in the docs directory - see docs/COPYING-PyGmy.txt and
docs/LICENCE-PyGmy.txt for more information.

Dependencies
------------

This software depends on standard library features which are stated as being
available only on "UNIX"; it has only been tested on a GNU/Linux system.

New in pprocess 0.3.1 (Changes since pprocess 0.3)
--------------------------------------------------

  * Moved the reference material out of the module docstring and into a
    separate document, converting it to XHTML in the process.
  * Fixed the project name in the setup script.

New in pprocess 0.3 (Changes since parallel 0.2.5)
--------------------------------------------------

  * Added managed callables: wrappers around callables which cause them to be
    automatically managed by the exchange from which they were acquired.
  * Added MakeParallel: a wrapper instantiated around a normal function which
    sends the result of that function over the supplied channel when invoked.
  * Added MakeReusable: a wrapper like MakeParallel which can be used in
    conjunction with the newly-added reuse capability of the Exchange class in
    order to reuse processes and channels.
  * Added a Map class which attempts to emulate the built-in map function,
    along with a pmap function using this class.
  * Added a Queue class which provides a simpler iterator-style interface to
    data produced by created processes.
  * Added a create method to the Exchange class and an exit convenience
    function to the module.
  * Changed the Exchange implementation to not block when attempting to start
    new processes beyond the process limit: such requests are queued and
    performed as running processes are completed. This permits programs using
    the start method to proceed to consumption of results more quickly.
  * Extended and updated the examples. Added a tutorial.
  * Added Ubuntu Feisty (7.04) package support.

New in parallel 0.2.5 (Changes since parallel 0.2.4)
----------------------------------------------------

  * Added a start method to the Exchange class for more convenient creation of
    processes.
  * Relicensed under the LGPL (version 3 or later) - this also fixes the
    contradictory situation where the GPL was stated in the pprocess module
    (which was not, in fact, the intention) and the LGPL was stated in the
    documentation.

New in parallel 0.2.4 (Changes since parallel 0.2.3)
----------------------------------------------------

  * Set buffer sizes to zero for the file object wrappers around sockets: this
    may prevent deadlock issues.

New in parallel 0.2.3 (Changes since parallel 0.2.2)
----------------------------------------------------

  * Added convenient message exchanges, offering methods handling common
    situations at the cost of having to define a subclass of Exchange.
  * Added a simple example of performing a parallel computation.
  * Improved the PyGmy raytracer example to use the newly added functionality.

New in parallel 0.2.2 (Changes since parallel 0.2.1)
----------------------------------------------------

  * Changed the status testing in the Exchange class, potentially fixing the
    premature closure of channels before all data was read.
  * Fixed the PyGmy raytracer example's process accounting by relying on the
    possibly more reliable Exchange behaviour, whilst also preventing
    erroneous creation of "out of bounds" processes.
  * Added a removed attribute on the Exchange to record which channels were
    removed in the last call to the ready method.

New in parallel 0.2.1 (Changes since parallel 0.2)
--------------------------------------------------

  * Added a PyGmy raytracer example.
  * Updated copyright and licensing details (FSF address, additional works).

New in parallel 0.2 (Changes since parallel 0.1)
------------------------------------------------

  * Changed the name of the included module from parallel to pprocess in order
    to avoid naming conflicts with PyParallel.

Release Procedures
------------------

Update the pprocess __version__ attribute.
Change the version number and package filename/directory in the documentation.
Update the release notes (see above).
Check the release information in the PKG-INFO file.
Tag, export.
Archive, upload.
Update PyPI.

Making Packages
---------------

To make Debian-based packages:

  1. Create new package directories under packages if necessary.
  2. Make a symbolic link in the distribution's root directory to keep the
     Debian tools happy:

     ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/

     Or:

     ln -s packages/ubuntu-feisty/python-pprocess/debian/

  3. Run the package builder:

     dpkg-buildpackage -rfakeroot

  4. Locate and tidy up the packages in the parent directory of the
     distribution's root directory.