micropython

docs/optimisations.txt

510:41b26d6cd7ea
2012-05-21 Paul Boddie Expanded the documentation, adding prerequisites, reports and roadmap sections. Updated the copyright information.
     1 Optimisations
     2 =============
     3 
     4 Some optimisations around constant objects might be possible; these depend on
     5 the following:
     6 
     7   * Reliable tracking of assignments: where assignment operations occur, the
     8     target of the assignment should be determined if any hope of optimisation
     9     is to be maintained. Where no guarantees can be made about the target of
    10     an assignment, no assignment-related information should be written to
    11     potential targets.
    12 
    13   * Objects acting as "containers" of attributes must be regarded as "safe":
    14     where assignments are recorded as occurring on an attribute, it must be
    15     guaranteed that no other unforeseen ways exist to assign to such
    16     attributes.
    17 
    18 The discussion below presents certain rules which must be imposed to uphold
    19 the above requirements.
    20 
    21 Safe Containers
    22 ---------------
    23 
    24 Where attributes of modules, classes and instances are only set once and are
    25 effectively constant, it should be possible to circumvent the attribute lookup
    26 mechanism and to directly reference the attribute value. This technique may
    27 only be considered applicable for the following "container" objects, subject
    28 to the noted constraints:
    29 
    30  1. For modules, "safety" is enforced by ensuring that assignments to module
    31     attributes are only permitted within the module itself either at the
    32     top-level or via names declared as globals. Thus, the following would not
    33     be permitted:
    34 
    35     another_module.this_module.attr = value
    36 
    37     In the above, this_module is a reference to the current module.
    38 
    39  2. For classes, "safety" is enforced by ensuring that assignments to class
    40     attributes are only permitted within the class definition, outside
    41     methods. This would mean that classes would be "sealed" at definition time
    42     (like functions).
    43 
    44 Unlike the property of function locals that they may only sensibly be accessed
    45 within the function in which they reside, these cases demand additional
    46 controls or assumptions on or about access to the stored data. Meanwhile, it
    47 would be difficult to detect eligible attributes on arbitrary instances due to
    48 the need for some kind of type inference or abstract execution.
    49 
    50 Constant Attributes
    51 -------------------
    52 
    53 When accessed via "safe containers", as described above, any attribute with
    54 only one recorded assignment on it can be considered a constant attribute and
    55 this eligible for optimisation, the consequence of which would be the
    56 replacement of a LoadAttrIndex instruction (which needs to look up an
    57 attribute using the run-time details of the "container" and the compile-time
    58 details of the attribute) with a LoadAttr instruction.
    59 
    60 However, some restrictions exist on assignment operations which may be
    61 regarded to cause only one assignment in the lifetime of a program:
    62 
    63  1. For module attributes, only assignments at the top-level outside loop
    64     statements can be reliably assumed to cause only a single assignment.
    65 
    66  2. For class attributes, only assignments at the top-level within class
    67     definitions and outside loop statements can be reliably assumed to cause
    68     only a single assignment.
    69 
    70 All assignments satisfying the "safe container" requirements, but not the
    71 requirements immediately above, should each be recorded as causing at least
    72 one assignment.
    73 
    74 Additional Controls
    75 -------------------
    76 
    77 For the above cases for "container" objects, the following controls would need
    78 to apply:
    79 
    80  1. Modules would need to be immutable after initialisation. However, during
    81     initialisation, there remains a possibility of another module attempting
    82     to access the original module. For example, if ppp/__init__.py contained
    83     the following...
    84 
    85     x = 1
    86     import ppp.qqq
    87     print x
    88 
    89     ...and if ppp/qqq.py contained the following...
    90 
    91     import ppp
    92     ppp.x = 2
    93 
    94     ...then the value 2 would be printed. Since modules are objects which are
    95     registered globally in a program, it would be possible to set attributes
    96     in the above way.
    97 
    98  2. Classes would need to be immutable after initialisation. However, since
    99     classes are objects, any reference to a class after initialisation could
   100     be used to set attributes on the class.
   101 
   102 Solutions:
   103 
   104  1. Insist on global scope for module attribute assignments.
   105 
   106  2. Insist on local scope within classes.
   107 
   108 Both of the above measures need to be enforced at run-time, since an arbitrary
   109 attribute assignment could be attempted on any kind of object, yet to uphold
   110 the properties of "safe containers", attempts to change attributes of such
   111 objects should be denied. Since foreseen attribute assignment operations have
   112 certain properties detectable at compile-time, it could be appropriate to
   113 generate special instructions (or modified instructions) during the
   114 initialisation of modules and classes for such foreseen assignments, whilst
   115 employing normal attribute assignment operations in all other cases. Indeed,
   116 the StoreAttr instruction, which is used to set attributes in "safe
   117 containers" would be used exclusively for this purpose; the StoreAttrIndex
   118 instruction would be used exclusively for all other attribute assignments.
   119 
   120 To ensure the "sealing" of modules and classes, entries in the attribute
   121 lookup table would encode whether a class or module is being accessed, so
   122 that the StoreAttrIndex instruction could reject such accesses.
   123 
   124 Constant Attribute Values
   125 -------------------------
   126 
   127 Where an attribute value is itself regarded as constant, is a "safe container"
   128 and is used in an operation accessing its own attributes, the value can be
   129 directly inspected for optimisations or employed in the generated code. For
   130 the attribute values themselves, only objects of a constant nature may be
   131 considered suitable for this particular optimisation:
   132 
   133   * Classes
   134   * Modules
   135   * Instances defined as constant literals
   136 
   137 This is because arbitrary objects (such as most instances) have no
   138 well-defined form before run-time and cannot be investigated further at
   139 compile-time or have a representation inserted into the generated code.
   140 
   141 Class Attributes and Access via Instances
   142 -----------------------------------------
   143 
   144 Unlike module attributes, class attributes can be accessed in a number of
   145 different ways:
   146 
   147   * Using the class itself:
   148 
   149     C.x = 123
   150     cls = C; cls.x = 234
   151 
   152   * Using a subclass of the class (for reading attributes):
   153 
   154     class D(C):
   155         pass
   156     D.x # setting D.x would populate D, not C
   157 
   158   * Using instances of the class or a subclass of the class (for reading
   159     attributes):
   160 
   161     c = C()
   162     c.x # setting c.x would populate c, not C
   163 
   164 Since assignments are only achieved using direct references to the class, and
   165 since class attributes should be defined only within the class initialisation
   166 process, the properties of class attributes should be consistent with those
   167 desired.
   168 
   169 Method Access via Instances
   170 ---------------------------
   171 
   172 It is desirable to optimise method access, even though most method calls are
   173 likely to occur via instances. It is possible, given the properties of methods
   174 as class attributes to investigate the kind of instance that the self
   175 parameter/local refers to within each method: it should be an instance either
   176 of the class in which the method is defined or a compatible class, although
   177 situations exist where this might not be the case:
   178 
   179   * Explicit invocation of a method:
   180 
   181     d = D() # D is not related to C
   182     C.f(d) # calling f(self) in C
   183 
   184 If blatant usage of incompatible instances were somehow disallowed, it would
   185 still be necessary to investigate the properties of an instance's class and
   186 its relationship with other classes. Consider the following example:
   187 
   188     class A:
   189         def f(self): ...
   190 
   191     class B:
   192         def f(self): ...
   193         def g(self):
   194             self.f()
   195 
   196     class C(A, B):
   197         pass
   198 
   199 Here, instances of B passed into the method B.g could be assumed to provide
   200 access to B.f when self.f is resolved at compile-time. However, instances of C
   201 passed into B.g would instead provide access to A.f when self.f is resolved at
   202 compile-time (since the method resolution order is C, A, B instead of just B).
   203 
   204 One solution might be to specialise methods for each instance type, but this
   205 could be costly. Another less ambitious solution might only involve the
   206 optimisation of such internal method calls if an unambiguous target can be
   207 resolved.
   208 
   209 Narrowing Type Candidates using Attribute Names
   210 -----------------------------------------------
   211 
   212 Within functions, it might be useful to record attribute accesses on local
   213 names and to collect the names in order to help resolve targets in the
   214 generated code. For example:
   215 
   216     def f(x, y):
   217         x.method(y)
   218         if x.something:
   219             ...
   220         return x.attr
   221 
   222 Here, the object referenced via x should have "method", "something" and "attr"
   223 as attributes. We could impose a test for a type providing these attributes at
   224 the earliest opportunity and then specialise the attribute accesses, perhaps
   225 also invocations, in the generated code. AttributeError occurrences would need
   226 to be considered, however, potentially disqualifying certain attributes from
   227 any optimisations, and control flow would also need to be considered. The
   228 result would resemble the following:
   229 
   230     def f(x, y):
   231         if not isinstance(x, C) and x is not C:
   232             raise TypeError
   233         x.method(y)
   234         if x.something:
   235             ...
   236         return x.attr
   237 
   238 With more specific information about the attributes used with a given name,
   239 combinations of attributes can be provided instead of single attributes when
   240 recording attribute usage. For example, from the above:
   241 
   242     x uses method    -> indicates potential usage of C.method, D1.method, ...
   243     x uses something -> indicates potential usage of C.something,
   244                         D2.something, ...
   245     x uses attr      -> indicates potential usage of C.attr, D3.attr, ...
   246 
   247 These unspecific declarations of attribute usage can be replaced with the
   248 following:
   249 
   250     x uses method, something, attr -> indicates potential usage of C, D4, ...
   251                                       (each providing all of the stated
   252                                       attributes)
   253 
   254 Reducing Object Table Scope
   255 ---------------------------
   256 
   257 Where attributes may be used in a program but never accessed via the object
   258 table-dependent instructions, such attributes could be omitted from the object
   259 table.
   260 
   261 Implemented Optimisation Types
   262 ==============================
   263 
   264 Optimisation                    Prerequisites and Effect
   265 ------------                    ------------------------
   266 
   267 constant_storage                value instruction references a constant;
   268 (guidance)                      storage instruction references a constant;
   269                               | indicate whether both instructions satisfy the
   270                               | preconditions and should be removed/omitted
   271 
   272 constant_accessor               value instruction references a constant;
   273 (guidance)                    | target name provided (for use in producing an
   274                               | address access instruction)
   275 
   276 known_target                    value instruction references a constant;
   277 (guidance)                    | target and context are provided (no instructions
   278                               | are removed)
   279 
   280 self_access                     value instruction references "self" in a method;
   281 (guidance)                      specified attribute name always has the same
   282                                 position;
   283                               | indicate whether an appropriate instruction can
   284                               | be generated for the access
   285 
   286 temp_storage                    value instruction is a simple input operation;
   287 (elimination)                   value instruction is the last instruction;
   288 (guidance)                    | remove the value instruction, provide the value
   289                               | instruction in place of a temporary storage
   290                               | reference
   291 
   292 no_operations                   input to the current instruction loads from the
   293 (guidance)                      destination of the current instruction;
   294                               | indicate that the current instruction should be
   295                               | omitted
   296 
   297 unused_results                  value instruction is a simple input operation;
   298 (elimination)                   value instruction is the final instruction of a
   299                                 discarded expression;
   300                               | remove the value instruction
   301 
   302 unused_objects                  attribute is defined using a name which is not
   303 (inspection)                    used in an active region of the program or is
   304                                 defined within a class which is not used by
   305                                 the program, object is unambiguously
   306                                 referenced by such attributes
   307                               | remove such attributes and objects