1 Concepts
2 ========
3
4 This document describes the underlying concepts employed in micropython.
5
6 * Namespaces and attribute definition
7 * Contexts and values
8 * Tables, attributes and lookups
9 * Objects and structures
10 * Parameters and lookups
11 * Instantiation
12 * Register usage
13 * List and tuple representations
14
15 Namespaces and Attribute Definition
16 ===================================
17
18 Namespaces are any objects which can retain attributes.
19
20 * Module attributes are defined either at the module level or by global
21 statements.
22 * Class attributes are defined only within class statements.
23 * Instance attributes are defined only by assignments to attributes of self
24 within __init__ methods.
25
26 These restrictions apply because such attributes are thus explicitly declared,
27 permitting the use of tables (described below). Module and class attributes
28 can also be finalised in this way in order to permit certain optimisations.
29
30 An additional restriction required for the current implementation of tables
31 (as described below) applies to class definitions: each class must be defined
32 using a unique name; repeated definition of classes having the same name is
33 thus not permitted. This restriction arises from the use of the "full name" of
34 a class as a key to the object table, where the full name is a qualified path
35 via the module hierarchy ending with the name of the class.
36
37 See rejected.txt for complicating mechanisms which could be applied to
38 mitigate the effects of these restrictions on optimisations.
39
40 Contexts and Values
41 ===================
42
43 Values are used as the common reference representation in micropython: as
44 stored representations of attributes (of classes, instances, modules, and
45 other objects supporting attribute-like entities) as well as the stored values
46 associated with names in functions and methods.
47
48 Unlike other implementations, micropython does not create things like bound
49 method objects for individual instances. Instead, all objects are referenced
50 using a context, reference pair:
51
52 Value Layout
53 ------------
54
55 0 1
56 context object
57 reference reference
58
59 Specific implementations might reverse this ordering for optimisation
60 purposes.
61
62 Rationale
63 ---------
64
65 To reduce the number of created objects whilst retaining the ability to
66 support bound method invocations. The context indicates the context in which
67 an invocation is performed, typically the owner of the method.
68
69 Usage
70 -----
71
72 The context may be inserted as the first argument when a value is involved in
73 an invocation. This argument may then be omitted from the invocation if its
74 usage is not appropriate.
75
76 See invocation.txt for details.
77
78 Context Value Types
79 -------------------
80
81 The following types of context value exist:
82
83 Type Usage Transformations
84 ---- ----- ---------------
85
86 Replaceable With functions (not methods) May be replaced with an
87 instance or a class when a
88 value is stored on an
89 instance or class
90
91 Placeholder With classes May not be replaced
92
93 Instance With instances (and constants) May not be replaced
94 or functions as methods
95
96 Class With functions as methods May be replaced when a
97 value is loaded from a
98 class attribute via an
99 instance
100
101 Contexts in Acquired Values
102 ---------------------------
103
104 There are four classes of instructions which provide values:
105
106 Instruction Purpose Context Operations
107 ----------- ------- ------------------
108
109 1) LoadConst Load module, constant Use loaded object with itself
110 as context
111
112 2) LoadFunction Load function Combine replaceable context
113 with loaded object
114
115 3) LoadClass Load class Combine placeholder context
116 with loaded object
117
118 4) LoadAddress* Load attribute from Preserve or override stored
119 LoadAttr* class, module, context (as described in
120 instance assignment.txt)
121
122 In order to comply with traditional Python behaviour, contexts may or may not
123 represent the object from which an attribute has been acquired.
124
125 See assignment.txt for details.
126
127 Contexts in Stored Values
128 -------------------------
129
130 There are two classes of instruction for storing values:
131
132 Instruction Purpose Context Operations
133 ----------- ------- ------------------
134
135 1) StoreAddress Store attribute in a Preserve context; note that no
136 known object test for class attribute
137 assignment should be necessary
138 since this instruction should only
139 be generated for module globals
140
141 StoreAttr Store attribute in an Preserve context; note that no
142 instance test for class attribute
143 assignment should be necessary
144 since this instruction should only
145 be generated for self accesses
146
147 StoreAttrIndex Store attribute in an Preserve context; since the index
148 unknown object lookup could yield a class
149 attribute, a test of the nature of
150 the nature of the structure is
151 necessary in order to prevent
152 assignments to classes
153
154 2) StoreAddressContext Store attribute in a Override context if appropriate;
155 known object if the value has a replaceable
156 context, permit the target to
157 take ownership of the value
158
159 See assignment.txt for details.
160
161 Tables, Attributes and Lookups
162 ==============================
163
164 Attribute lookups, where the exact location of an object attribute is deduced,
165 are performed differently in micropython than in other implementations.
166 Instead of providing attribute dictionaries, in which attributes are found,
167 attributes are located at fixed places in object structures (described below)
168 and their locations are stored using a special representation known as a
169 table.
170
171 For a given program, a table can be considered as being like a matrix mapping
172 classes to attribute names. For example:
173
174 class A:
175 # instances have attributes x, y
176
177 class B(A):
178 # introduces attribute z for instances
179
180 class C:
181 # instances have attributes a, b, z
182
183 This would provide the following table, referred to as an object table in the
184 context of classes and instances:
185
186 Class/attr a b x y z
187
188 A 1 2
189 B 1 2 3
190 C 1 2 3
191
192 A limitation of this representation is that instance attributes may not shadow
193 class attributes: if an attribute with a given name is not defined on an
194 instance, an attribute with the same name cannot be provided by the class of
195 the instance or any superclass of the instance's class.
196
197 The table can be compacted using a representation known as a displacement
198 list (referred to as an object list in this context):
199
200 Classes with attribute offsets
201
202 classcode A
203 attrcode a b x y z
204
205 B
206 a b x y z
207
208 C
209 a b x y z
210
211 List . . 1 2 1 2 3 1 2 . . 3
212
213 Here, the classcode refers to the offset in the list at which a class's
214 attributes are defined, whereas the attrcode defines the offset within a
215 region of attributes corresponding to a single attribute of a given name.
216
217 Attribute Locations
218 -------------------
219
220 The locations stored in table/list elements are for instance attributes
221 relative to the location of the instance, whereas those for class attributes
222 and modules are absolute addresses (although these could also be changed to
223 object-relative locations). Thus, each occupied table cell has the following
224 structure:
225
226 attrcode, uses-absolute-address, address (or location)
227
228 Objects and Structures
229 ======================
230
231 As well as references, micropython needs to have actual objects to refer to.
232 Since classes, functions and instances are all objects, it is desirable that
233 certain common features and operations are supported in the same way for all
234 of these things. To permit this, a common data structure format is used.
235
236 Header.................................................... Attributes.................
237
238 Identifier Identifier Address Identifier Size Object Object ...
239
240 0 1 2 3 4 5 6 7
241 classcode attrcode/ invocation funccode size __class__ attribute ...
242 instance reference reference reference
243 status
244
245 Classcode
246 ---------
247
248 Used in attribute lookup.
249
250 Here, the classcode refers to the attribute lookup table for the object (as
251 described above). Classes and instances share the same classcode, and their
252 structures reflect this. Functions all belong to the same type and thus employ
253 the classcode for the function built-in type, whereas modules have distinct
254 types since they must support different sets of attributes.
255
256 Attrcode
257 --------
258
259 Used to test instances for membership of classes (or descendants of classes).
260
261 Since, in traditional Python, classes are only ever instances of some generic
262 built-in type, support for testing such a relationship directly has been
263 removed and the attrcode is not specified for classes: the presence of an
264 attrcode indicates that a given object is an instance. In addition, support
265 has also been removed for testing modules in the same way, meaning that the
266 attrcode is also not specified for modules.
267
268 See the "Testing Instance Compatibility with Classes (Attrcode)" section below
269 for details of attrcodes.
270
271 Invocation Reference
272 --------------------
273
274 Used when an object is called.
275
276 This is the address of the code to be executed when an invocation is performed
277 on the object.
278
279 Funccode
280 --------
281
282 Used to look up argument positions by name.
283
284 The strategy with keyword arguments in micropython is to attempt to position
285 such arguments in the invocation frame as it is being constructed.
286
287 See the "Parameters and Lookups" section for more information.
288
289 Size
290 ----
291
292 Used to indicate the size of an object including attributes.
293
294 Attributes
295 ----------
296
297 For classes, modules and instances, the attributes in the structure correspond
298 to the attributes of each kind of object. For functions, however, the
299 attributes in the structure correspond to the default arguments for each
300 function, if any.
301
302 Structure Types
303 ---------------
304
305 Class C:
306
307 0 1 2 3 4 5 6 7
308 classcode (unused) __new__ funccode size class type attribute ...
309 for C reference for reference reference
310 instantiator
311
312 Instance of C:
313
314 0 1 2 3 4 5 6 7
315 classcode attrcode C.__call__ funccode size class C attribute ...
316 for C for C reference for reference reference
317 (if exists) C.__call__
318
319 Function f:
320
321 0 1 2 3 4 5 6 7
322 classcode attrcode code funccode size class attribute ...
323 for for reference function (default)
324 function function reference reference
325
326 Module m:
327
328 0 1 2 3 4 5 6 7
329 classcode attrcode (unused) (unused) (unused) module type attribute ...
330 for m for m reference (global)
331 reference
332
333 The __class__ Attribute
334 -----------------------
335
336 All objects support the __class__ attribute and this is illustrated above with
337 the first attribute.
338
339 Class: refers to the type class (type.__class__ also refers to the type class)
340 Function: refers to the function class
341 Instance: refers to the class instantiated to make the object
342
343 Lists and Tuples
344 ----------------
345
346 The built-in list and tuple sequences employ variable length structures using
347 the attribute locations to store their elements, where each element is a
348 reference to a separately stored object.
349
350 Testing Instance Compatibility with Classes (Attrcode)
351 ------------------------------------------------------
352
353 Although it would be possible to have a data structure mapping classes to
354 compatible classes, such as a matrix indicating the subclasses (or
355 superclasses) of each class, the need to retain the key to such a data
356 structure for each class might introduce a noticeable overhead.
357
358 Instead of having a separate structure, descendant classes of each class are
359 inserted as special attributes into the object table. This requires an extra
360 key to be retained, since each class must provide its own attribute code such
361 that upon an instance/class compatibility test, the code may be obtained and
362 used in the object table.
363
364 Invocation and Code References
365 ------------------------------
366
367 Modules: there is no meaningful invocation reference since modules cannot be
368 explicitly called.
369
370 Functions: a simple code reference is employed pointing to code implementing
371 the function. Note that the function locals are completely distinct from this
372 structure and are not comparable to attributes. Instead, attributes are
373 reserved for default parameter values, although they do not appear in the
374 object table described above, appearing instead in a separate parameter table
375 described below.
376
377 Classes: given that classes must be invoked in order to create instances, a
378 reference must be provided in class structures. However, this reference does
379 not point directly at the __init__ method of the class. Instead, the
380 referenced code belongs to a special initialiser function, __new__, consisting
381 of the following instructions:
382
383 create instance for C
384 call C.__init__(instance, ...)
385 return instance
386
387 Instances: each instance employs a reference to any __call__ method defined in
388 the class hierarchy for the instance, thus maintaining its callable nature.
389
390 Both classes and modules may contain code in their definitions - the former in
391 the "body" of the class, potentially defining attributes, and the latter as
392 the "top-level" code in the module, potentially defining attributes/globals -
393 but this code is not associated with any invocation target. It is thus
394 generated in order of appearance and is not referenced externally.
395
396 Invocation Operation
397 --------------------
398
399 Consequently, regardless of the object an invocation is always done as
400 follows:
401
402 get invocation reference from the header
403 jump to reference
404
405 Additional preparation is necessary before the above code: positional
406 arguments must be saved in the invocation frame, and keyword arguments must be
407 resolved and saved to the appropriate position in the invocation frame.
408
409 See invocation.txt for details.
410
411 Parameters and Lookups
412 ======================
413
414 Since Python supports keyword arguments when making invocations, it becomes
415 necessary to record the parameter names associated with each function or
416 method. Just as object tables record attributes positions on classes and
417 instances, parameter tables record parameter positions in function or method
418 parameter lists.
419
420 For a given program, a parameter table can be considered as being like a
421 matrix mapping functions/methods to parameter names. For example:
422
423 def f(x, y, z):
424 pass
425
426 def g(a, b, c):
427 pass
428
429 def h(a, x):
430 pass
431
432 This would provide the following table, referred to as a parameter table in
433 the context of functions and methods:
434
435 Function/param a b c x y z
436
437 f 1 2 3
438 g 1 2 3
439 h 1 2
440
441 Confusion can occur when functions are adopted as methods, since the context
442 then occupies the first slot in the invocation frame:
443
444 def f(x, y, z):
445 pass
446
447 f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
448 -> f(1, 2, 3)
449
450 class C:
451 f = f
452
453 def g(x, y, z):
454 pass
455
456 c = C()
457
458 c.f(y=2, z=3) -> f(<context>, 2, 3)
459 c.g(y=2, z=3) -> C.g(<context>, 2, 3)
460
461 Just as with parameter tables, a displacement list can be prepared from a
462 parameter table:
463
464 Functions with parameter (attribute) offsets
465
466 funccode f
467 attrcode a b c x y z
468
469 g
470 a b c x y z
471
472 h
473 a b c x y z
474
475 List . . . 1 2 3 1 2 3 1 . . 2 . .
476
477 Here, the funccode refers to the offset in the list at which a function's
478 parameters are defined, whereas the attrcode defines the offset within a
479 region of attributes corresponding to a single parameter of a given name.
480
481 Instantiation
482 =============
483
484 When instantiating classes, memory must be reserved for the header of the
485 resulting instance, along with locations for the attributes of the instance.
486 Since the instance header contains data common to all instances of a class, a
487 template header is copied to the start of the newly reserved memory region.
488
489 Register Usage
490 ==============
491
492 During code generation, much of the evaluation produces results which are
493 implicitly recorded in the "active value" register, and various instructions
494 will consume the active value. In addition, some instructions will consume a
495 separate "active source value" from a register, typically those which are
496 assigning the result of an expression to an assignment target.
497
498 Since values often need to be retained for later use, a set of temporary
499 storage locations are typically employed. However, optimisations may reduce
500 the need to use such temporary storage where instructions which provide the
501 "active value" can be re-executed and will produce the same result.
502
503 List and Tuple Representations
504 ==============================
505
506 Since tuples have a fixed size, the representation of a tuple instance is
507 merely a header describing the size of the entire object, together with a
508 sequence of references to the object "stored" at each position in the
509 structure. Such references consist of the usual context and reference pair.
510
511 Lists, however, have a variable size and must be accessible via an unchanging
512 location even as more memory is allocated elsewhere to accommodate the
513 contents of the list. Consequently, the representation must resemble the
514 following:
515
516 Structure header for list (size == header plus special attribute)
517 Special attribute referencing the underlying sequence
518
519 The underlying sequence has a fixed size, like a tuple, but may contain fewer
520 elements than the size of the sequence permits:
521
522 Special header indicating the current size and allocated size
523 Element
524 ... <-- current size
525 (Unused space)
526 ... <-- allocated size
527
528 This representation permits the allocation of a new sequence when space is
529 exhausted in an existing sequence, with the new sequence address stored in the
530 main list structure. Since access to the contents of the list must go through
531 the main list structure, underlying allocation activities may take place
532 without the users of a list having to be aware of such activities.