1.1 --- a/docs/optimisations.txt Wed Sep 30 00:53:07 2009 +0200
1.2 +++ b/docs/optimisations.txt Sat Oct 10 01:52:27 2009 +0200
1.3 @@ -1,3 +1,234 @@
1.4 +Optimisations
1.5 +=============
1.6 +
1.7 +Some optimisations around constant objects might be possible; these depend on
1.8 +the following:
1.9 +
1.10 + * Reliable tracking of assignments: where assignment operations occur, the
1.11 + target of the assignment should be determined if any hope of optimisation
1.12 + is to be maintained. Where no guarantees can be made about the target of
1.13 + an assignment, no assignment-related information should be written to
1.14 + potential targets.
1.15 +
1.16 + * Objects acting as "containers" of attributes must be regarded as "safe":
1.17 + where assignments are recorded as occurring on an attribute, it must be
1.18 + guaranteed that no other unforeseen ways exist to assign to such
1.19 + attributes.
1.20 +
1.21 +The discussion below presents certain rules which must be imposed to uphold
1.22 +the above requirements.
1.23 +
1.24 +Safe Containers
1.25 +---------------
1.26 +
1.27 +Where attributes of modules, classes and instances are only set once and are
1.28 +effectively constant, it should be possible to circumvent the attribute lookup
1.29 +mechanism and to directly reference the attribute value. This technique may
1.30 +only be considered applicable for the following "container" objects, subject
1.31 +to the noted constraints:
1.32 +
1.33 + 1. For modules, "safety" is enforced by ensuring that assignments to module
1.34 + attributes are only permitted within the module itself either at the
1.35 + top-level or via names declared as globals. Thus, the following would not
1.36 + be permitted:
1.37 +
1.38 + another_module.this_module.attr = value
1.39 +
1.40 + In the above, this_module is a reference to the current module.
1.41 +
1.42 + 2. For classes, "safety" is enforced by ensuring that assignments to class
1.43 + attributes are only permitted within the class definition, outside
1.44 + methods. This would mean that classes would be "sealed" at definition time
1.45 + (like functions).
1.46 +
1.47 +Unlike the property of function locals that they may only sensibly be accessed
1.48 +within the function in which they reside, these cases demand additional
1.49 +controls or assumptions on or about access to the stored data. Meanwhile, it
1.50 +would be difficult to detect eligible attributes on arbitrary instances due to
1.51 +the need for some kind of type inference or abstract execution.
1.52 +
1.53 +Constant Attributes
1.54 +-------------------
1.55 +
1.56 +When accessed via "safe containers", as described above, any attribute with
1.57 +only one recorded assignment on it can be considered a constant attribute and
1.58 +this eligible for optimisation, the consequence of which would be the
1.59 +replacement of a LoadAttrIndex instruction (which needs to look up an
1.60 +attribute using the run-time details of the "container" and the compile-time
1.61 +details of the attribute) with a LoadAttr instruction.
1.62 +
1.63 +However, some restrictions exist on assignment operations which may be
1.64 +regarded to cause only one assignment in the lifetime of a program:
1.65 +
1.66 + 1. For module attributes, only assignments at the top-level outside loop
1.67 + statements can be reliably assumed to cause only a single assignment.
1.68 +
1.69 + 2. For class attributes, only assignments at the top-level within class
1.70 + definitions and outside loop statements can be reliably assumed to cause
1.71 + only a single assignment.
1.72 +
1.73 +All assignments satisfying the "safe container" requirements, but not the
1.74 +requirements immediately above, should each be recorded as causing at least
1.75 +one assignment.
1.76 +
1.77 +Additional Controls
1.78 +-------------------
1.79 +
1.80 +For the above cases for "container" objects, the following controls would need
1.81 +to apply:
1.82 +
1.83 + 1. Modules would need to be immutable after initialisation. However, during
1.84 + initialisation, there remains a possibility of another module attempting
1.85 + to access the original module. For example, if ppp/__init__.py contained
1.86 + the following...
1.87 +
1.88 + x = 1
1.89 + import ppp.qqq
1.90 + print x
1.91 +
1.92 + ...and if ppp/qqq.py contained the following...
1.93 +
1.94 + import ppp
1.95 + ppp.x = 2
1.96 +
1.97 + ...then the value 2 would be printed. Since modules are objects which are
1.98 + registered globally in a program, it would be possible to set attributes
1.99 + in the above way.
1.100 +
1.101 + 2. Classes would need to be immutable after initialisation. However, since
1.102 + classes are objects, any reference to a class after initialisation could
1.103 + be used to set attributes on the class.
1.104 +
1.105 +Solutions:
1.106 +
1.107 + 1. Insist on global scope for module attribute assignments.
1.108 +
1.109 + 2. Insist on local scope within classes.
1.110 +
1.111 +Both of the above measures need to be enforced at run-time, since an arbitrary
1.112 +attribute assignment could be attempted on any kind of object, yet to uphold
1.113 +the properties of "safe containers", attempts to change attributes of such
1.114 +objects should be denied. Since foreseen attribute assignment operations have
1.115 +certain properties detectable at compile-time, it could be appropriate to
1.116 +generate special instructions (or modified instructions) during the
1.117 +initialisation of modules and classes for such foreseen assignments, whilst
1.118 +employing normal attribute assignment operations in all other cases. Indeed,
1.119 +the StoreAttr instruction, which is used to set attributes in "safe
1.120 +containers" would be used exclusively for this purpose; the StoreAttrIndex
1.121 +instruction would be used exclusively for all other attribute assignments.
1.122 +
1.123 +To ensure the "sealing" of modules and classes, entries in the attribute
1.124 +lookup table would encode whether a class or module is being accessed, so
1.125 +that the StoreAttrIndex instruction could reject such accesses.
1.126 +
1.127 +Constant Attribute Values
1.128 +-------------------------
1.129 +
1.130 +Where an attribute value is itself regarded as constant, is a "safe container"
1.131 +and is used in an operation accessing its own attributes, the value can be
1.132 +directly inspected for optimisations or employed in the generated code. For
1.133 +the attribute values themselves, only objects of a constant nature may be
1.134 +considered suitable for this particular optimisation:
1.135 +
1.136 + * Classes
1.137 + * Modules
1.138 + * Instances defined as constant literals
1.139 +
1.140 +This is because arbitrary objects (such as most instances) have no
1.141 +well-defined form before run-time and cannot be investigated further at
1.142 +compile-time or have a representation inserted into the generated code.
1.143 +
1.144 +Class Attributes and Access via Instances
1.145 +-----------------------------------------
1.146 +
1.147 +Unlike module attributes, class attributes can be accessed in a number of
1.148 +different ways:
1.149 +
1.150 + * Using the class itself:
1.151 +
1.152 + C.x = 123
1.153 + cls = C; cls.x = 234
1.154 +
1.155 + * Using a subclass of the class (for reading attributes):
1.156 +
1.157 + class D(C):
1.158 + pass
1.159 + D.x # setting D.x would populate D, not C
1.160 +
1.161 + * Using instances of the class or a subclass of the class (for reading
1.162 + attributes):
1.163 +
1.164 + c = C()
1.165 + c.x # setting c.x would populate c, not C
1.166 +
1.167 +Since assignments are only achieved using direct references to the class, and
1.168 +since class attributes should be defined only within the class initialisation
1.169 +process, the properties of class attributes should be consistent with those
1.170 +desired.
1.171 +
1.172 +Method Access via Instances
1.173 +---------------------------
1.174 +
1.175 +It is desirable to optimise method access, even though most method calls are
1.176 +likely to occur via instances. It is possible, given the properties of methods
1.177 +as class attributes to investigate the kind of instance that the self
1.178 +parameter/local refers to within each method: it should be an instance either
1.179 +of the class in which the method is defined or a compatible class, although
1.180 +situations exist where this might not be the case:
1.181 +
1.182 + * Explicit invocation of a method:
1.183 +
1.184 + d = D() # D is not related to C
1.185 + C.f(d) # calling f(self) in C
1.186 +
1.187 +If blatant usage of incompatible instances were somehow disallowed, it would
1.188 +still be necessary to investigate the properties of an instance's class and
1.189 +its relationship with other classes. Consider the following example:
1.190 +
1.191 + class A:
1.192 + def f(self): ...
1.193 +
1.194 + class B:
1.195 + def f(self): ...
1.196 + def g(self):
1.197 + self.f()
1.198 +
1.199 + class C(A, B):
1.200 + pass
1.201 +
1.202 +Here, instances of B passed into the method B.g could be assumed to provide
1.203 +access to B.f when self.f is resolved at compile-time. However, instances of C
1.204 +passed into B.g would instead provide access to A.f when self.f is resolved at
1.205 +compile-time (since the method resolution order is C, A, B instead of just B).
1.206 +
1.207 +One solution might be to specialise methods for each instance type, but this
1.208 +could be costly. Another less ambitious solution might only involve the
1.209 +optimisation of such internal method calls if an unambiguous target can be
1.210 +resolved.
1.211 +
1.212 +Narrowing Type Candidates using Attribute Names
1.213 +-----------------------------------------------
1.214 +
1.215 +Within functions, it might be useful to record attribute accesses on local
1.216 +names and to collect the names in order to help resolve targets in the
1.217 +generated code. For example:
1.218 +
1.219 + def f(x, y):
1.220 + x.method(y)
1.221 + if x.something:
1.222 + ...
1.223 + return x.attr
1.224 +
1.225 +Here, the object referenced via x should have "method", "something" and "attr"
1.226 +as attributes. We could impose a test for a type providing these attributes at
1.227 +the earliest opportunity and then specialise the attribute accesses, perhaps
1.228 +also invocations, in the generated code. AttributeError occurrences would need
1.229 +to be considered, however, potentially disqualifying certain attributes from
1.230 +any optimisations, and control flow would also need to be considered.
1.231 +
1.232 +Implemented Optimisation Types
1.233 +==============================
1.234 +
1.235 Optimisation Prerequisites and Effect
1.236 ------------ ------------------------
1.237