1 Concepts
2 ========
3
4 This document describes the underlying concepts employed in micropython.
5
6 * Namespaces and attribute definition
7 * Attribute usage observations
8 * Imports and circular import detection
9 * Contexts and values
10 * Tables, attributes and lookups
11 * Parameters and lookups
12 * List and tuple representations
13
14 Namespaces and Attribute Definition
15 ===================================
16
17 Namespaces are any objects which can retain attributes.
18
19 * Module attributes are defined either at the module level or by global
20 statements.
21 * Class attributes are defined only within class statements.
22 * Instance attributes are defined only by assignments to attributes of self
23 or tentatively as references to attributes of self.
24 * Locals are effectively attributes of the local namespace and are not
25 accessible externally (and thus cannot support closures).
26
27 These restrictions apply because such attributes are thus explicitly declared,
28 permitting the use of tables (described below). Module and class attributes
29 can also be finalised in this way in order to permit certain optimisations.
30
31 Rebinding of attributes outside classes and modules can be allowed if
32 attribute usage observations are being used to detect such external
33 modifications to such objects. Without such observations, such rebinding
34 should be forbidden since apparently constant attributes might be modified in
35 a running program, but code may have been generated that provides specific
36 objects for those attributes under the assumption that they will not be
37 changed.
38
39 Observations during initial program inspection populate namespaces in a
40 simplistic fashion. If a loop is active, definitions record that the value of
41 a name can be set potentially many times. In local namespaces, definitions are
42 also recorded using the mechanisms employed to track attribute usage, and such
43 observations may provide a more sophisticated view of the potential values of
44 local names.
45
46 See rejected.txt for complicating mechanisms which could be applied to
47 mitigate the effects of these restrictions on optimisations.
48
49 Attribute Usage Observations
50 ============================
51
52 Within a scope, a name may be used in conjunction with attribute names in
53 order to access attributes on objects referenced by the name. However, such
54 observations can only be regarded as reliable if the object referenced is not
55 changed independently by some other mechanism or activity.
56
57 With conventional functions and methods, any locally defined names can be
58 considered private to that scope and thus immune to independent modification,
59 at least within reasonable features of the language. Although stack
60 modification may be possible, it seems appropriate to reject such features,
61 especially since they lend themselves to unmaintainable programs.
62
63 For names defined during the initialisation of a class, since the class itself
64 cannot be referenced by name until its declaration has been completely
65 evaluated, no independent modification can occur from outside the class scope.
66
67 For names defined during the initialisation of a module, global declarations
68 in functions permit the rebinding of global variables and since functions may
69 be invoked during module initialisation, independent modification can
70 potentially occur if any functions are called.
71
72 Module globals can be accessed from other modules that can refer to a module
73 by its name. Initially, an import is required to make a module available for
74 modification, but there is no restriction on whether a module has been
75 completely imported (and thus defined) before an import statement can make it
76 available to other modules. Consider the following package root definition:
77
78 # Module changed:
79 def f():
80 import changed.modifier
81 x = 123
82 f()
83
84 # Module changed.modifier:
85 import changed
86 changed.x = 456
87
88 Here, an import of changed will initially set x to 123, but then the function
89 f will be invoked and cause the changed.modifier module to be imported. Since
90 the changed module is already being imported, the import statement will not
91 try to perform the import operation again, but it will make the partially
92 defined module available for access. Thus, the changed.modifier module will
93 then set x to 456, and independent modification of the changed namespace will
94 have been performed.
95
96 In conclusion, module globals cannot therefore be regarded as immune to
97 operations that would disrupt usage observations. Consequently, only locals
98 and class definition "locals" can be reliably employed in attribute usage
99 observations.
100
101 Imports and Circular Import Detection
102 =====================================
103
104 The matter of whether any given module is potentially modified by another
105 module before it has been completely imported can be addressed by separating
106 the import process into distinct phases:
107
108 1. Discovery/loading
109 2. Parsing/structure processing
110 3. Completion/code processing
111
112 Upon discovering a module, a test is made to determine whether it is already
113 being imported; if not, it is then loaded and its structure inspected to
114 determine whether it may import other modules, which will then in turn be
115 discovered and loaded. Once no more modules can be loaded, they will then be
116 completed by undergoing the more thorough systematic processing of each
117 module's contents, defining program units and requesting the completion of
118 other modules when import statements are encountered.
119
120 The motivation for such a multi-phase approach is to detect circular imports
121 in the structure processing phase before modules are populated and deductions
122 made about their objects' behaviour. Thus, globals belonging to a module known
123 to be imported in a circular fashion will already be regarded as potentially
124 modifiable by other modules and attribute usage observations will not be
125 recorded.
126
127 Contexts and Values
128 ===================
129
130 Values are used as the common reference representation in micropython: as
131 stored representations of attributes (of classes, instances, modules, and
132 other objects supporting attribute-like entities) as well as the stored values
133 associated with names in functions and methods.
134
135 Unlike other implementations, micropython does not create things like bound
136 method objects for individual instances. Instead, all objects are referenced
137 using a context, reference pair:
138
139 Value Layout
140 ------------
141
142 0 1
143 context object
144 reference reference
145
146 Specific implementations might reverse this ordering for optimisation
147 purposes.
148
149 Rationale
150 ---------
151
152 To reduce the number of created objects whilst retaining the ability to
153 support bound method invocations. The context indicates the context in which
154 an invocation is performed, typically the owner of the method.
155
156 Usage
157 -----
158
159 The context may be inserted as the first argument when a value is involved in
160 an invocation. This argument may then be omitted from the invocation if its
161 usage is not appropriate.
162
163 See invocation.txt for details.
164
165 Context Value Types
166 -------------------
167
168 The following types of context value exist:
169
170 Type Usage Transformations
171 ---- ----- ---------------
172
173 Replaceable With functions (not methods) May be replaced with an
174 instance or a class when a
175 value is stored on an
176 instance or class
177
178 Placeholder With classes May not be replaced
179
180 Instance With instances (and constants) May not be replaced
181 or functions as methods
182
183 Class With functions as methods May be replaced when a
184 value is loaded from a
185 class attribute via an
186 instance
187
188 Contexts in Acquired Values
189 ---------------------------
190
191 There are four classes of instructions which provide values:
192
193 Instruction Purpose Context Operations
194 ----------- ------- ------------------
195
196 1) LoadConst Load module, constant Use loaded object with itself
197 as context
198
199 2) LoadFunction Load function Combine replaceable context
200 with loaded object
201
202 3) LoadClass Load class Combine placeholder context
203 with loaded object
204
205 4) LoadAddress* Load attribute from Preserve or override stored
206 LoadAttr* class, module, context (as described in
207 instance assignment.txt)
208
209 In order to comply with traditional Python behaviour, contexts may or may not
210 represent the object from which an attribute has been acquired.
211
212 See assignment.txt for details.
213
214 Contexts in Stored Values
215 -------------------------
216
217 There are two classes of instruction for storing values:
218
219 Instruction Purpose Context Operations
220 ----------- ------- ------------------
221
222 1) StoreAddress Store attribute in a Preserve context; note that no
223 known object test for class attribute
224 assignment should be necessary
225 since this instruction should only
226 be generated for module globals
227
228 StoreAttr Store attribute in an Preserve context; note that no
229 instance test for class attribute
230 assignment should be necessary
231 since this instruction should only
232 be generated for self accesses
233
234 StoreAttrIndex Store attribute in an Preserve context; since the index
235 unknown object lookup could yield a class
236 attribute, a test of the nature of
237 the nature of the structure is
238 necessary in order to prevent
239 assignments to classes
240
241 2) StoreAddressContext Store attribute in a Override context if appropriate;
242 known object if the value has a replaceable
243 context, permit the target to
244 take ownership of the value
245
246 See assignment.txt for details.
247
248 Tables, Attributes and Lookups
249 ==============================
250
251 Attribute lookups, where the exact location of an object attribute is deduced,
252 are performed differently in micropython than in other implementations.
253 Instead of providing attribute dictionaries, in which attributes are found,
254 attributes are located at fixed places in object structures (described in
255 lowlevel.txt) and their locations are stored using a special representation
256 known as a table.
257
258 For a given program, a table can be considered as being like a matrix mapping
259 classes to attribute names. For example:
260
261 class A:
262 # instances have attributes x, y
263
264 class B(A):
265 # introduces attribute z for instances
266
267 class C:
268 # instances have attributes a, b, z
269
270 This would provide the following table, referred to as an object table in the
271 context of classes and instances:
272
273 Class/attr a b x y z
274
275 A 1 2
276 B 1 2 3
277 C 1 2 3
278
279 A limitation of this representation is that instance attributes may not shadow
280 class attributes: if an attribute with a given name is not defined on an
281 instance, an attribute with the same name cannot be provided by the class of
282 the instance or any superclass of the instance's class. This impacts the
283 provision of the __class__ attribute, as described below.
284
285 The table can be compacted using a representation known as a displacement
286 list (referred to as an object list in this context):
287
288 Classes with attribute offsets
289
290 classcode A
291 attrcode a b x y z
292
293 B
294 a b x y z
295
296 C
297 a b x y z
298
299 List . . 1 2 1 2 3 1 2 . . 3
300
301 Here, the classcode refers to the offset in the list at which a class's
302 attributes are defined, whereas the attrcode defines the offset within a
303 region of attributes corresponding to a single attribute of a given name.
304
305 Attribute Locations
306 -------------------
307
308 The locations stored in table/list elements are generally for instance
309 attributes relative to the location of the instance, whereas those for class
310 attributes and module attributes are generally absolute addresses. Thus, each
311 occupied table cell has the following structure:
312
313 attrcode, uses-absolute-address, address (or location)
314
315 This could be given instead as follows:
316
317 attrcode, is-class-or-module-attribute, location
318
319 Since uses-absolute-address corresponds to is-class-or-module-attribute, and
320 since there is a need to test for classes and modules to prevent assignment to
321 attributes of such objects, this particular information is always required.
322
323 The __class__ Attribute
324 -----------------------
325
326 In Python 2.x, at least with new-style classes, instances have __class__
327 attributes which indicate the class from which they have been instantiated,
328 whereas classes have __class__ attributes which reference the type class.
329 With the object table, it is not possible to provide absolute addresses which
330 can be used for both classes and instances, since this would result in classes
331 and instances having the same class, and thus the class of a class would be
332 the class itself.
333
334 One solution is to use object-relative values in the table so that referencing
335 the __class__ attribute of an instance produces a value which can be combined
336 with an instance's address to yield the address of the attribute, which itself
337 refers to the instance's class, whereas referencing the __class__ attribute of
338 a class produces a similar object-relative value that is combined with the
339 class's address to yield the address of the attribute, which itself refers to
340 the special type class.
341
342 Obviously, the above solution requires both classes and instances to retain an
343 attribute location specifically to hold the value appropriate for each object
344 type, whereas a scheme which omits the __class__ attribute on classes would be
345 able to employ an absolute address in the table and maintain only a single
346 address to refer to the class for all instances. The only problem with not
347 providing a sensible __class__ attribute entry for classes would be the need
348 for special treatment of __class__ to prevent inappropriate consultation of
349 the table for classes.
350
351 Comparing Tables as Matrices with Displacement Lists
352 ----------------------------------------------------
353
354 Although displacement lists can provide reasonable levels of compaction for
355 attribute data, the element size is larger than that required for a simple
356 matrix: the attribute code (attrcode) need not be stored since each element
357 unambiguously refers to the availability of an attribute for a particular
358 class or instance of that class, and so the data at a given element need not
359 be tested for relevance to a given attribute access operation.
360
361 Given a program with 20 object types and 100 attribute types, a matrix would
362 occupy the following amount of space:
363
364 number of object types * number of attribute types * element size
365 = 20 * 100 * 1 (assuming that a single location is sufficient for an element)
366 = 2000
367
368 In contrast, given a compaction to 40% of the matrix size (without considering
369 element size) in a displacement list, the amount of space would be as follows:
370
371 number of elements * element size
372 = 40% * (20 * 100) * 2 (assuming that one additional location is required)
373 = 1600
374
375 Consequently, the principal overhead of using a displacement list is likely to
376 be in the need to check element relevance when retrieving values from such a
377 list.
378
379 Parameters and Lookups
380 ======================
381
382 Since Python supports keyword arguments when making invocations, it becomes
383 necessary to record the parameter names associated with each function or
384 method. Just as object tables record attributes positions on classes and
385 instances, parameter tables record parameter positions in function or method
386 parameter lists.
387
388 For a given program, a parameter table can be considered as being like a
389 matrix mapping functions/methods to parameter names. For example:
390
391 def f(x, y, z):
392 pass
393
394 def g(a, b, c):
395 pass
396
397 def h(a, x):
398 pass
399
400 This would provide the following table, referred to as a parameter table in
401 the context of functions and methods:
402
403 Function/param a b c x y z
404
405 f 1 2 3
406 g 1 2 3
407 h 1 2
408
409 Confusion can occur when functions are adopted as methods, since the context
410 then occupies the first slot in the invocation frame:
411
412 def f(x, y, z):
413 pass
414
415 f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
416 -> f(1, 2, 3)
417
418 class C:
419 f = f
420
421 def g(x, y, z):
422 pass
423
424 c = C()
425
426 c.f(y=2, z=3) -> f(<context>, 2, 3)
427 c.g(y=2, z=3) -> C.g(<context>, 2, 3)
428
429 Just as with parameter tables, a displacement list can be prepared from a
430 parameter table:
431
432 Functions with parameter (attribute) offsets
433
434 funccode f
435 attrcode a b c x y z
436
437 g
438 a b c x y z
439
440 h
441 a b c x y z
442
443 List . . . 1 2 3 1 2 3 1 . . 2 . .
444
445 Here, the funccode refers to the offset in the list at which a function's
446 parameters are defined, whereas the attrcode defines the offset within a
447 region of attributes corresponding to a single parameter of a given name.