paul@263 | 1 | Optimisations
|
paul@263 | 2 | =============
|
paul@263 | 3 |
|
paul@263 | 4 | Some optimisations around constant objects might be possible; these depend on
|
paul@263 | 5 | the following:
|
paul@263 | 6 |
|
paul@263 | 7 | * Reliable tracking of assignments: where assignment operations occur, the
|
paul@263 | 8 | target of the assignment should be determined if any hope of optimisation
|
paul@263 | 9 | is to be maintained. Where no guarantees can be made about the target of
|
paul@263 | 10 | an assignment, no assignment-related information should be written to
|
paul@263 | 11 | potential targets.
|
paul@263 | 12 |
|
paul@263 | 13 | * Objects acting as "containers" of attributes must be regarded as "safe":
|
paul@263 | 14 | where assignments are recorded as occurring on an attribute, it must be
|
paul@263 | 15 | guaranteed that no other unforeseen ways exist to assign to such
|
paul@263 | 16 | attributes.
|
paul@263 | 17 |
|
paul@263 | 18 | The discussion below presents certain rules which must be imposed to uphold
|
paul@263 | 19 | the above requirements.
|
paul@263 | 20 |
|
paul@263 | 21 | Safe Containers
|
paul@263 | 22 | ---------------
|
paul@263 | 23 |
|
paul@263 | 24 | Where attributes of modules, classes and instances are only set once and are
|
paul@263 | 25 | effectively constant, it should be possible to circumvent the attribute lookup
|
paul@263 | 26 | mechanism and to directly reference the attribute value. This technique may
|
paul@263 | 27 | only be considered applicable for the following "container" objects, subject
|
paul@263 | 28 | to the noted constraints:
|
paul@263 | 29 |
|
paul@263 | 30 | 1. For modules, "safety" is enforced by ensuring that assignments to module
|
paul@263 | 31 | attributes are only permitted within the module itself either at the
|
paul@263 | 32 | top-level or via names declared as globals. Thus, the following would not
|
paul@263 | 33 | be permitted:
|
paul@263 | 34 |
|
paul@263 | 35 | another_module.this_module.attr = value
|
paul@263 | 36 |
|
paul@263 | 37 | In the above, this_module is a reference to the current module.
|
paul@263 | 38 |
|
paul@557 | 39 | Where assignments to module attributes are permitted from outside a
|
paul@557 | 40 | module, modules can no longer be relied upon to provide constant attribute
|
paul@557 | 41 | values, even if a cursory inspection of module-level code would suggest
|
paul@557 | 42 | that some attributes are effectively constant.
|
paul@557 | 43 |
|
paul@263 | 44 | 2. For classes, "safety" is enforced by ensuring that assignments to class
|
paul@263 | 45 | attributes are only permitted within the class definition, outside
|
paul@263 | 46 | methods. This would mean that classes would be "sealed" at definition time
|
paul@263 | 47 | (like functions).
|
paul@263 | 48 |
|
paul@263 | 49 | Unlike the property of function locals that they may only sensibly be accessed
|
paul@263 | 50 | within the function in which they reside, these cases demand additional
|
paul@263 | 51 | controls or assumptions on or about access to the stored data. Meanwhile, it
|
paul@263 | 52 | would be difficult to detect eligible attributes on arbitrary instances due to
|
paul@263 | 53 | the need for some kind of type inference or abstract execution.
|
paul@263 | 54 |
|
paul@263 | 55 | Constant Attributes
|
paul@263 | 56 | -------------------
|
paul@263 | 57 |
|
paul@263 | 58 | When accessed via "safe containers", as described above, any attribute with
|
paul@263 | 59 | only one recorded assignment on it can be considered a constant attribute and
|
paul@263 | 60 | this eligible for optimisation, the consequence of which would be the
|
paul@263 | 61 | replacement of a LoadAttrIndex instruction (which needs to look up an
|
paul@263 | 62 | attribute using the run-time details of the "container" and the compile-time
|
paul@263 | 63 | details of the attribute) with a LoadAttr instruction.
|
paul@263 | 64 |
|
paul@263 | 65 | However, some restrictions exist on assignment operations which may be
|
paul@263 | 66 | regarded to cause only one assignment in the lifetime of a program:
|
paul@263 | 67 |
|
paul@263 | 68 | 1. For module attributes, only assignments at the top-level outside loop
|
paul@263 | 69 | statements can be reliably assumed to cause only a single assignment.
|
paul@263 | 70 |
|
paul@263 | 71 | 2. For class attributes, only assignments at the top-level within class
|
paul@263 | 72 | definitions and outside loop statements can be reliably assumed to cause
|
paul@263 | 73 | only a single assignment.
|
paul@263 | 74 |
|
paul@263 | 75 | All assignments satisfying the "safe container" requirements, but not the
|
paul@263 | 76 | requirements immediately above, should each be recorded as causing at least
|
paul@263 | 77 | one assignment.
|
paul@263 | 78 |
|
paul@263 | 79 | Additional Controls
|
paul@263 | 80 | -------------------
|
paul@263 | 81 |
|
paul@263 | 82 | For the above cases for "container" objects, the following controls would need
|
paul@263 | 83 | to apply:
|
paul@263 | 84 |
|
paul@263 | 85 | 1. Modules would need to be immutable after initialisation. However, during
|
paul@263 | 86 | initialisation, there remains a possibility of another module attempting
|
paul@263 | 87 | to access the original module. For example, if ppp/__init__.py contained
|
paul@263 | 88 | the following...
|
paul@263 | 89 |
|
paul@263 | 90 | x = 1
|
paul@263 | 91 | import ppp.qqq
|
paul@263 | 92 | print x
|
paul@263 | 93 |
|
paul@263 | 94 | ...and if ppp/qqq.py contained the following...
|
paul@263 | 95 |
|
paul@263 | 96 | import ppp
|
paul@263 | 97 | ppp.x = 2
|
paul@263 | 98 |
|
paul@263 | 99 | ...then the value 2 would be printed. Since modules are objects which are
|
paul@263 | 100 | registered globally in a program, it would be possible to set attributes
|
paul@263 | 101 | in the above way.
|
paul@263 | 102 |
|
paul@263 | 103 | 2. Classes would need to be immutable after initialisation. However, since
|
paul@263 | 104 | classes are objects, any reference to a class after initialisation could
|
paul@263 | 105 | be used to set attributes on the class.
|
paul@263 | 106 |
|
paul@263 | 107 | Solutions:
|
paul@263 | 108 |
|
paul@263 | 109 | 1. Insist on global scope for module attribute assignments.
|
paul@263 | 110 |
|
paul@263 | 111 | 2. Insist on local scope within classes.
|
paul@263 | 112 |
|
paul@263 | 113 | Both of the above measures need to be enforced at run-time, since an arbitrary
|
paul@263 | 114 | attribute assignment could be attempted on any kind of object, yet to uphold
|
paul@263 | 115 | the properties of "safe containers", attempts to change attributes of such
|
paul@263 | 116 | objects should be denied. Since foreseen attribute assignment operations have
|
paul@263 | 117 | certain properties detectable at compile-time, it could be appropriate to
|
paul@263 | 118 | generate special instructions (or modified instructions) during the
|
paul@263 | 119 | initialisation of modules and classes for such foreseen assignments, whilst
|
paul@263 | 120 | employing normal attribute assignment operations in all other cases. Indeed,
|
paul@263 | 121 | the StoreAttr instruction, which is used to set attributes in "safe
|
paul@263 | 122 | containers" would be used exclusively for this purpose; the StoreAttrIndex
|
paul@263 | 123 | instruction would be used exclusively for all other attribute assignments.
|
paul@263 | 124 |
|
paul@263 | 125 | To ensure the "sealing" of modules and classes, entries in the attribute
|
paul@263 | 126 | lookup table would encode whether a class or module is being accessed, so
|
paul@263 | 127 | that the StoreAttrIndex instruction could reject such accesses.
|
paul@263 | 128 |
|
paul@263 | 129 | Constant Attribute Values
|
paul@263 | 130 | -------------------------
|
paul@263 | 131 |
|
paul@263 | 132 | Where an attribute value is itself regarded as constant, is a "safe container"
|
paul@263 | 133 | and is used in an operation accessing its own attributes, the value can be
|
paul@263 | 134 | directly inspected for optimisations or employed in the generated code. For
|
paul@263 | 135 | the attribute values themselves, only objects of a constant nature may be
|
paul@263 | 136 | considered suitable for this particular optimisation:
|
paul@263 | 137 |
|
paul@263 | 138 | * Classes
|
paul@263 | 139 | * Modules
|
paul@263 | 140 | * Instances defined as constant literals
|
paul@263 | 141 |
|
paul@263 | 142 | This is because arbitrary objects (such as most instances) have no
|
paul@263 | 143 | well-defined form before run-time and cannot be investigated further at
|
paul@263 | 144 | compile-time or have a representation inserted into the generated code.
|
paul@263 | 145 |
|
paul@263 | 146 | Class Attributes and Access via Instances
|
paul@263 | 147 | -----------------------------------------
|
paul@263 | 148 |
|
paul@263 | 149 | Unlike module attributes, class attributes can be accessed in a number of
|
paul@263 | 150 | different ways:
|
paul@263 | 151 |
|
paul@263 | 152 | * Using the class itself:
|
paul@263 | 153 |
|
paul@263 | 154 | C.x = 123
|
paul@263 | 155 | cls = C; cls.x = 234
|
paul@263 | 156 |
|
paul@263 | 157 | * Using a subclass of the class (for reading attributes):
|
paul@263 | 158 |
|
paul@263 | 159 | class D(C):
|
paul@263 | 160 | pass
|
paul@263 | 161 | D.x # setting D.x would populate D, not C
|
paul@263 | 162 |
|
paul@263 | 163 | * Using instances of the class or a subclass of the class (for reading
|
paul@263 | 164 | attributes):
|
paul@263 | 165 |
|
paul@263 | 166 | c = C()
|
paul@263 | 167 | c.x # setting c.x would populate c, not C
|
paul@263 | 168 |
|
paul@263 | 169 | Since assignments are only achieved using direct references to the class, and
|
paul@263 | 170 | since class attributes should be defined only within the class initialisation
|
paul@263 | 171 | process, the properties of class attributes should be consistent with those
|
paul@263 | 172 | desired.
|
paul@263 | 173 |
|
paul@263 | 174 | Method Access via Instances
|
paul@263 | 175 | ---------------------------
|
paul@263 | 176 |
|
paul@263 | 177 | It is desirable to optimise method access, even though most method calls are
|
paul@263 | 178 | likely to occur via instances. It is possible, given the properties of methods
|
paul@263 | 179 | as class attributes to investigate the kind of instance that the self
|
paul@263 | 180 | parameter/local refers to within each method: it should be an instance either
|
paul@263 | 181 | of the class in which the method is defined or a compatible class, although
|
paul@263 | 182 | situations exist where this might not be the case:
|
paul@263 | 183 |
|
paul@263 | 184 | * Explicit invocation of a method:
|
paul@263 | 185 |
|
paul@263 | 186 | d = D() # D is not related to C
|
paul@263 | 187 | C.f(d) # calling f(self) in C
|
paul@263 | 188 |
|
paul@263 | 189 | If blatant usage of incompatible instances were somehow disallowed, it would
|
paul@263 | 190 | still be necessary to investigate the properties of an instance's class and
|
paul@263 | 191 | its relationship with other classes. Consider the following example:
|
paul@263 | 192 |
|
paul@263 | 193 | class A:
|
paul@263 | 194 | def f(self): ...
|
paul@263 | 195 |
|
paul@263 | 196 | class B:
|
paul@263 | 197 | def f(self): ...
|
paul@263 | 198 | def g(self):
|
paul@263 | 199 | self.f()
|
paul@263 | 200 |
|
paul@263 | 201 | class C(A, B):
|
paul@263 | 202 | pass
|
paul@263 | 203 |
|
paul@263 | 204 | Here, instances of B passed into the method B.g could be assumed to provide
|
paul@263 | 205 | access to B.f when self.f is resolved at compile-time. However, instances of C
|
paul@263 | 206 | passed into B.g would instead provide access to A.f when self.f is resolved at
|
paul@263 | 207 | compile-time (since the method resolution order is C, A, B instead of just B).
|
paul@263 | 208 |
|
paul@263 | 209 | One solution might be to specialise methods for each instance type, but this
|
paul@263 | 210 | could be costly. Another less ambitious solution might only involve the
|
paul@263 | 211 | optimisation of such internal method calls if an unambiguous target can be
|
paul@263 | 212 | resolved.
|
paul@263 | 213 |
|
paul@263 | 214 | Narrowing Type Candidates using Attribute Names
|
paul@263 | 215 | -----------------------------------------------
|
paul@263 | 216 |
|
paul@263 | 217 | Within functions, it might be useful to record attribute accesses on local
|
paul@263 | 218 | names and to collect the names in order to help resolve targets in the
|
paul@263 | 219 | generated code. For example:
|
paul@263 | 220 |
|
paul@263 | 221 | def f(x, y):
|
paul@263 | 222 | x.method(y)
|
paul@263 | 223 | if x.something:
|
paul@263 | 224 | ...
|
paul@263 | 225 | return x.attr
|
paul@263 | 226 |
|
paul@263 | 227 | Here, the object referenced via x should have "method", "something" and "attr"
|
paul@263 | 228 | as attributes. We could impose a test for a type providing these attributes at
|
paul@263 | 229 | the earliest opportunity and then specialise the attribute accesses, perhaps
|
paul@263 | 230 | also invocations, in the generated code. AttributeError occurrences would need
|
paul@263 | 231 | to be considered, however, potentially disqualifying certain attributes from
|
paul@271 | 232 | any optimisations, and control flow would also need to be considered. The
|
paul@271 | 233 | result would resemble the following:
|
paul@271 | 234 |
|
paul@271 | 235 | def f(x, y):
|
paul@285 | 236 | if not isinstance(x, C) and x is not C:
|
paul@271 | 237 | raise TypeError
|
paul@271 | 238 | x.method(y)
|
paul@271 | 239 | if x.something:
|
paul@271 | 240 | ...
|
paul@271 | 241 | return x.attr
|
paul@263 | 242 |
|
paul@276 | 243 | With more specific information about the attributes used with a given name,
|
paul@276 | 244 | combinations of attributes can be provided instead of single attributes when
|
paul@276 | 245 | recording attribute usage. For example, from the above:
|
paul@276 | 246 |
|
paul@276 | 247 | x uses method -> indicates potential usage of C.method, D1.method, ...
|
paul@276 | 248 | x uses something -> indicates potential usage of C.something,
|
paul@276 | 249 | D2.something, ...
|
paul@276 | 250 | x uses attr -> indicates potential usage of C.attr, D3.attr, ...
|
paul@276 | 251 |
|
paul@276 | 252 | These unspecific declarations of attribute usage can be replaced with the
|
paul@276 | 253 | following:
|
paul@276 | 254 |
|
paul@276 | 255 | x uses method, something, attr -> indicates potential usage of C, D4, ...
|
paul@276 | 256 | (each providing all of the stated
|
paul@276 | 257 | attributes)
|
paul@276 | 258 |
|
paul@276 | 259 | Reducing Object Table Scope
|
paul@276 | 260 | ---------------------------
|
paul@276 | 261 |
|
paul@276 | 262 | Where attributes may be used in a program but never accessed via the object
|
paul@276 | 263 | table-dependent instructions, such attributes could be omitted from the object
|
paul@276 | 264 | table.
|
paul@276 | 265 |
|
paul@263 | 266 | Implemented Optimisation Types
|
paul@263 | 267 | ==============================
|
paul@263 | 268 |
|
paul@141 | 269 | Optimisation Prerequisites and Effect
|
paul@141 | 270 | ------------ ------------------------
|
paul@141 | 271 |
|
paul@141 | 272 | constant_storage value instruction references a constant;
|
paul@156 | 273 | (guidance) storage instruction references a constant;
|
paul@156 | 274 | | indicate whether both instructions satisfy the
|
paul@450 | 275 | | preconditions and should be removed/omitted
|
paul@141 | 276 |
|
paul@155 | 277 | constant_accessor value instruction references a constant;
|
paul@156 | 278 | (guidance) | target name provided (for use in producing an
|
paul@156 | 279 | | address access instruction)
|
paul@155 | 280 |
|
paul@141 | 281 | known_target value instruction references a constant;
|
paul@156 | 282 | (guidance) | target and context are provided (no instructions
|
paul@156 | 283 | | are removed)
|
paul@141 | 284 |
|
paul@141 | 285 | self_access value instruction references "self" in a method;
|
paul@155 | 286 | (guidance) specified attribute name always has the same
|
paul@155 | 287 | position;
|
paul@156 | 288 | | indicate whether an appropriate instruction can
|
paul@156 | 289 | | be generated for the access
|
paul@141 | 290 |
|
paul@141 | 291 | temp_storage value instruction is a simple input operation;
|
paul@142 | 292 | (elimination) value instruction is the last instruction;
|
paul@156 | 293 | (guidance) | remove the value instruction, provide the value
|
paul@156 | 294 | | instruction in place of a temporary storage
|
paul@156 | 295 | | reference
|
paul@141 | 296 |
|
paul@141 | 297 | no_operations input to the current instruction loads from the
|
paul@156 | 298 | (guidance) destination of the current instruction;
|
paul@156 | 299 | | indicate that the current instruction should be
|
paul@156 | 300 | | omitted
|
paul@141 | 301 |
|
paul@141 | 302 | unused_results value instruction is a simple input operation;
|
paul@142 | 303 | (elimination) value instruction is the final instruction of a
|
paul@141 | 304 | discarded expression;
|
paul@156 | 305 | | remove the value instruction
|
paul@253 | 306 |
|
paul@253 | 307 | unused_objects attribute is defined using a name which is not
|
paul@253 | 308 | (inspection) used in an active region of the program or is
|
paul@253 | 309 | defined within a class which is not used by
|
paul@253 | 310 | the program, object is unambiguously
|
paul@253 | 311 | referenced by such attributes
|
paul@253 | 312 | | remove such attributes and objects
|