1 Concepts
2 ========
3
4 This document describes the underlying concepts employed in micropython.
5
6 * Namespaces and attribute definition
7 * Attribute usage observations
8 * Active code deduction
9 * Imports and circular import detection
10 * Contexts and values
11 * Tables, attributes and lookups
12 * Parameters and lookups
13
14 Namespaces and Attribute Definition
15 ===================================
16
17 Namespaces are any objects which can retain attributes.
18
19 * Module attributes are defined either at the module level or by global
20 statements.
21 * Class attributes are defined only within class statements.
22 * Instance attributes are defined only by assignments to attributes of self
23 or tentatively as references to attributes of self.
24 * Locals are effectively attributes of the local namespace and are not
25 accessible externally (and thus cannot support closures).
26
27 These restrictions apply because such attributes are thus explicitly declared,
28 permitting the use of tables (described below). Module and class attributes
29 can also be finalised in this way in order to permit certain optimisations.
30
31 Rebinding of attributes outside classes and modules can be allowed if
32 attribute usage observations are being used to detect such external
33 modifications to such objects. Without such observations, such rebinding
34 should be forbidden since apparently constant attributes might be modified in
35 a running program, but code may have been generated that provides specific
36 objects for those attributes under the assumption that they will not be
37 changed.
38
39 Observations during initial program inspection populate namespaces in a
40 simplistic fashion. If a loop is active, definitions record that the value of
41 a name can be set potentially many times. In local namespaces, definitions are
42 also recorded using the mechanisms employed to track attribute usage, and such
43 observations may provide a more sophisticated view of the potential values of
44 local names.
45
46 See rejected.txt for complicating mechanisms which could be applied to
47 mitigate the effects of these restrictions on optimisations.
48
49 Attribute Usage Observations
50 ============================
51
52 Within a scope, a name may be used in conjunction with attribute names in
53 order to access attributes on objects referenced by the name. However, such
54 observations can only be regarded as reliable if the object referenced is not
55 changed independently by some other mechanism or activity.
56
57 With conventional functions and methods, any locally defined names can be
58 considered private to that scope and thus immune to independent modification,
59 at least within reasonable features of the language. Although stack
60 modification may be possible, it seems appropriate to reject such features,
61 especially since they lend themselves to unmaintainable programs.
62
63 For names defined during the initialisation of a class, since the class itself
64 cannot be referenced by name until its declaration has been completely
65 evaluated, no independent modification can occur from outside the class scope.
66
67 For names defined during the initialisation of a module, global declarations
68 in functions permit the rebinding of global variables and since functions may
69 be invoked during module initialisation, independent modification can
70 potentially occur if any functions are called.
71
72 Module globals can be accessed from other modules that can refer to a module
73 by its name. Initially, an import is required to make a module available for
74 modification, but there is no restriction on whether a module has been
75 completely imported (and thus defined) before an import statement can make it
76 available to other modules. Consider the following package root definition:
77
78 # Module changed:
79 def f():
80 import changed.modifier
81 x = 123
82 f()
83
84 # Module changed.modifier:
85 import changed
86 changed.x = 456
87
88 Here, an import of changed will initially set x to 123, but then the function
89 f will be invoked and cause the changed.modifier module to be imported. Since
90 the changed module is already being imported, the import statement will not
91 try to perform the import operation again, but it will make the partially
92 defined module available for access. Thus, the changed.modifier module will
93 then set x to 456, and independent modification of the changed namespace will
94 have been performed.
95
96 In conclusion, module globals cannot therefore be regarded as immune to
97 operations that would disrupt usage observations. Consequently, only locals
98 and class definition "locals" can be reliably employed in attribute usage
99 observations.
100
101 Active Code Deduction
102 =====================
103
104 With information about the attributes provided by each object, it becomes
105 possible to deduce which code is active and which code is unused in a program.
106 This process involves the collection of attribute references, the
107 identification of currently unused objects, and the propagation of this new
108 object information to parts of the program where attribute usage may now
109 involve such objects.
110
111 Imports and Circular Import Detection
112 =====================================
113
114 The matter of whether any given module is potentially modified by another
115 module before it has been completely imported can be addressed by separating
116 the import process into distinct phases:
117
118 1. Discovery/loading
119 2. Parsing/structure processing
120 3. Completion/code processing
121
122 Upon discovering a module, a test is made to determine whether it is already
123 being imported; if not, it is then loaded and its structure inspected to
124 determine whether it may import other modules, which will then in turn be
125 discovered and loaded. Once no more modules can be loaded, they will then be
126 completed by undergoing the more thorough systematic processing of each
127 module's contents, defining program units and requesting the completion of
128 other modules when import statements are encountered.
129
130 The motivation for such a multi-phase approach is to detect circular imports
131 in the structure processing phase before modules are populated and deductions
132 made about their objects' behaviour. Thus, globals belonging to a module known
133 to be imported in a circular fashion will already be regarded as potentially
134 modifiable by other modules and attribute usage observations will not be
135 recorded.
136
137 Contexts and Values
138 ===================
139
140 Values are used as the common reference representation in micropython: as
141 stored representations of attributes (of classes, instances, modules, and
142 other objects supporting attribute-like entities) as well as the stored values
143 associated with names in functions and methods.
144
145 Unlike other implementations, micropython does not create things like bound
146 method objects for individual instances. Instead, all objects are referenced
147 using a context, reference pair:
148
149 Value Layout
150 ------------
151
152 0 1
153 context object
154 reference reference
155
156 Specific implementations might reverse this ordering for optimisation
157 purposes.
158
159 Rationale
160 ---------
161
162 To reduce the number of created objects whilst retaining the ability to
163 support bound method invocations. The context indicates the context in which
164 an invocation is performed, typically the owner of the method.
165
166 Usage
167 -----
168
169 The context may be inserted as the first argument when a value is involved in
170 an invocation. This argument may then be omitted from the invocation if its
171 usage is not appropriate.
172
173 See invocation.txt for details.
174
175 Context Value Types
176 -------------------
177
178 The following types of context value exist:
179
180 Type Usage Transformations
181 ---- ----- ---------------
182
183 Replaceable With functions (not methods) May be replaced with an
184 instance or a class when a
185 value is stored on an
186 instance or class
187
188 Placeholder With classes May not be replaced
189
190 Instance With instances (and constants) May not be replaced
191 or functions as methods
192
193 Class With functions as methods May be replaced when a
194 value is loaded from a
195 class attribute via an
196 instance
197
198 Contexts in Acquired Values
199 ---------------------------
200
201 There are four classes of instructions which provide values:
202
203 Instruction Purpose Context Operations
204 ----------- ------- ------------------
205
206 1) LoadConst Load module, constant Use loaded object with itself
207 as context
208
209 2) LoadFunction Load function Combine replaceable context
210 with loaded object
211
212 3) LoadClass Load class Combine placeholder context
213 with loaded object
214
215 4) LoadAddress* Load attribute from Preserve or override stored
216 LoadAttr* class, module, context (as described in
217 instance assignment.txt)
218
219 In order to comply with traditional Python behaviour, contexts may or may not
220 represent the object from which an attribute has been acquired.
221
222 See assignment.txt for details.
223
224 Contexts in Stored Values
225 -------------------------
226
227 There are two classes of instruction for storing values:
228
229 Instruction Purpose Context Operations
230 ----------- ------- ------------------
231
232 1) StoreAddress Store attribute in a Preserve context; note that no
233 known object test for class attribute
234 assignment should be necessary
235 since this instruction should only
236 be generated for module globals
237
238 StoreAttr Store attribute in an Preserve context; note that no
239 instance test for class attribute
240 assignment should be necessary
241 since this instruction should only
242 be generated for self accesses
243
244 StoreAttrIndex Store attribute in an Preserve context; since the index
245 unknown object lookup could yield a class
246 attribute, a test of the nature of
247 the nature of the structure is
248 necessary in order to prevent
249 assignments to classes
250
251 2) StoreAddressContext Store attribute in a Override context if appropriate;
252 known object if the value has a replaceable
253 context, permit the target to
254 take ownership of the value
255
256 See assignment.txt for details.
257
258 Tables, Attributes and Lookups
259 ==============================
260
261 Attribute lookups, where the exact location of an object attribute is deduced,
262 are performed differently in micropython than in other implementations.
263 Instead of providing attribute dictionaries, in which attributes are found,
264 attributes are located at fixed places in object structures (described in
265 lowlevel.txt) and their locations are stored using a special representation
266 known as a table.
267
268 For a given program, a table can be considered as being like a matrix mapping
269 classes to attribute names. For example:
270
271 class A:
272 # instances have attributes x, y
273
274 class B(A):
275 # introduces attribute z for instances
276
277 class C:
278 # instances have attributes a, b, z
279
280 This would provide the following table, referred to as an object table in the
281 context of classes and instances:
282
283 Class/attr a b x y z
284
285 A 1 2
286 B 1 2 3
287 C 1 2 3
288
289 A limitation of this representation is that instance attributes may not shadow
290 class attributes: if an attribute with a given name is not defined on an
291 instance, an attribute with the same name cannot be provided by the class of
292 the instance or any superclass of the instance's class. This impacts the
293 provision of the __class__ attribute, as described below.
294
295 The table can be compacted using a representation known as a displacement
296 list (referred to as an object list in this context):
297
298 Classes with attribute offsets
299
300 classcode A
301 attrcode a b x y z
302
303 B
304 a b x y z
305
306 C
307 a b x y z
308
309 List . . 1 2 1 2 3 1 2 . . 3
310
311 Here, the classcode refers to the offset in the list at which a class's
312 attributes are defined, whereas the attrcode defines the offset within a
313 region of attributes corresponding to a single attribute of a given name.
314
315 Attribute Locations
316 -------------------
317
318 The locations stored in table/list elements are generally for instance
319 attributes relative to the location of the instance, whereas those for class
320 attributes and module attributes are generally absolute addresses. Thus, each
321 occupied table cell has the following structure:
322
323 attrcode, uses-absolute-address, address (or location)
324
325 This could be given instead as follows:
326
327 attrcode, is-class-or-module-attribute, location
328
329 Since uses-absolute-address corresponds to is-class-or-module-attribute, and
330 since there is a need to test for classes and modules to prevent assignment to
331 attributes of such objects, this particular information is always required.
332
333 The __class__ Attribute
334 -----------------------
335
336 In Python 2.x, at least with new-style classes, instances have __class__
337 attributes which indicate the class from which they have been instantiated,
338 whereas classes have __class__ attributes which reference the type class.
339 With the object table, it is not possible to provide absolute addresses which
340 can be used for both classes and instances, since this would result in classes
341 and instances having the same class, and thus the class of a class would be
342 the class itself.
343
344 One solution is to use object-relative values in the table so that referencing
345 the __class__ attribute of an instance produces a value which can be combined
346 with an instance's address to yield the address of the attribute, which itself
347 refers to the instance's class, whereas referencing the __class__ attribute of
348 a class produces a similar object-relative value that is combined with the
349 class's address to yield the address of the attribute, which itself refers to
350 the special type class.
351
352 Obviously, the above solution requires both classes and instances to retain an
353 attribute location specifically to hold the value appropriate for each object
354 type, whereas a scheme which omits the __class__ attribute on classes would be
355 able to employ an absolute address in the table and maintain only a single
356 address to refer to the class for all instances. The only problem with not
357 providing a sensible __class__ attribute entry for classes would be the need
358 for special treatment of __class__ to prevent inappropriate consultation of
359 the table for classes.
360
361 Comparing Tables as Matrices with Displacement Lists
362 ----------------------------------------------------
363
364 Although displacement lists can provide reasonable levels of compaction for
365 attribute data, the element size is larger than that required for a simple
366 matrix: the attribute code (attrcode) need not be stored since each element
367 unambiguously refers to the availability of an attribute for a particular
368 class or instance of that class, and so the data at a given element need not
369 be tested for relevance to a given attribute access operation.
370
371 Given a program with 20 object types and 100 attribute types, a matrix would
372 occupy the following amount of space:
373
374 number of object types * number of attribute types * element size
375 = 20 * 100 * 1 (assuming that a single location is sufficient for an element)
376 = 2000
377
378 In contrast, given a compaction to 40% of the matrix size (without considering
379 element size) in a displacement list, the amount of space would be as follows:
380
381 number of elements * element size
382 = 40% * (20 * 100) * 2 (assuming that one additional location is required)
383 = 1600
384
385 Consequently, the principal overhead of using a displacement list is likely to
386 be in the need to check element relevance when retrieving values from such a
387 list.
388
389 Parameters and Lookups
390 ======================
391
392 Since Python supports keyword arguments when making invocations, it becomes
393 necessary to record the parameter names associated with each function or
394 method. Just as object tables record attributes positions on classes and
395 instances, parameter tables record parameter positions in function or method
396 parameter lists.
397
398 For a given program, a parameter table can be considered as being like a
399 matrix mapping functions/methods to parameter names. For example:
400
401 def f(x, y, z):
402 pass
403
404 def g(a, b, c):
405 pass
406
407 def h(a, x):
408 pass
409
410 This would provide the following table, referred to as a parameter table in
411 the context of functions and methods:
412
413 Function/param a b c x y z
414
415 f 1 2 3
416 g 1 2 3
417 h 1 2
418
419 Confusion can occur when functions are adopted as methods, since the context
420 then occupies the first slot in the invocation frame:
421
422 def f(x, y, z):
423 pass
424
425 f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
426 -> f(1, 2, 3)
427
428 class C:
429 f = f
430
431 def g(x, y, z):
432 pass
433
434 c = C()
435
436 c.f(y=2, z=3) -> f(<context>, 2, 3)
437 c.g(y=2, z=3) -> C.g(<context>, 2, 3)
438
439 Just as with parameter tables, a displacement list can be prepared from a
440 parameter table:
441
442 Functions with parameter (attribute) offsets
443
444 funccode f
445 attrcode a b c x y z
446
447 g
448 a b c x y z
449
450 h
451 a b c x y z
452
453 List . . . 1 2 3 1 2 3 1 . . 2 . .
454
455 Here, the funccode refers to the offset in the list at which a function's
456 parameters are defined, whereas the attrcode defines the offset within a
457 region of attributes corresponding to a single parameter of a given name.