Low-level Implementation Details
================================

Although micropython delegates the generation of low-level program code and
data to syspython, various considerations of how an eventual program might be
structured have been used to inform the way micropython represents the details
of a program. This document describes these considerations and indicates how
syspython or other technologies might represent a working program.

Objects and Structures
======================

As well as references, micropython needs to have actual objects to refer to.
Since classes, functions and instances are all objects, it is desirable that
certain common features and operations are supported in the same way for all
of these things. To permit this, a common data structure format is used.

  Header....................................................  Attributes.................

  Identifier  Identifier  Address     Identifier  Size        Object      ...

  0           1           2           3           4           5           6
  classcode   attrcode/   invocation  funccode    size        attribute   ...
              instance    reference                           reference
              status
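The layout above can be modelled as a flat sequence of words. The following
is a minimal sketch in Python, not the representation that syspython emits:
the constant names and the make_structure and get_attribute helpers are
hypothetical, and it assumes that the size word counts the header as well as
the attributes.

```python
# Positions of the header fields, following the layout table above.
CLASSCODE, ATTRCODE, INVOCATION, FUNCCODE, SIZE = range(5)
HEADER_SIZE = 5  # attributes start at position 5

def make_structure(classcode, attrcode, invocation, funccode, attributes):
    """Build a flat structure: five header words, then attribute references."""
    attributes = list(attributes)
    # Assumption: the size word counts the header as well as the attributes.
    size = HEADER_SIZE + len(attributes)
    return [classcode, attrcode, invocation, funccode, size] + attributes

def get_attribute(structure, index):
    """Return the attribute reference stored at the given attribute index."""
    return structure[HEADER_SIZE + index]

# Illustrative values only; real codes are assigned by the compiler.
obj = make_structure(classcode=7, attrcode=3, invocation=None, funccode=None,
                     attributes=["x-ref", "y-ref"])
```

Because every kind of object shares the header positions, code that reads a
classcode or an invocation reference need not know what kind of object it is
handling.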

Classcode
---------

Used in attribute lookup.

Here, the classcode refers to the attribute lookup table for the object (as
described in concepts.txt). Classes and instances share the same classcode,
and their structures reflect this. Functions all belong to the same type and
thus employ the classcode for the function built-in type, whereas modules have
distinct types since they must support different sets of attributes.

Attrcode
--------

Used to test instances for membership of classes (or descendants of classes).

Since, in traditional Python, classes are only ever instances of some generic
built-in type, support for testing such a relationship directly has been
removed and the attrcode is not specified for classes: the presence of an
attrcode indicates that a given object is an instance. Support has also been
removed for testing modules in the same way, meaning that the attrcode is not
specified for modules either.

See the "Testing Instance Compatibility with Classes (Attrcode)" section below
for details of attrcodes.
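Since only instances carry an attrcode, the instance status of an object can
be read directly from its header. A minimal sketch follows, assuming that
position 1 holds None when no attrcode is specified; that encoding, the
is_instance helper and the example values are assumptions made for
illustration.

```python
ATTRCODE = 1  # header position of the attrcode / instance status

def is_instance(structure):
    """Return True if the header carries an attrcode, marking an instance."""
    return structure[ATTRCODE] is not None

# Illustrative structures; a class and its instances share classcode 10.
class_C = [10, None, "C.__new__", 4, 6, "attr-ref"]   # attrcode unused
instance_of_C = [10, 21, None, None, 6, "attr-ref"]   # attrcode for C
```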

Invocation Reference
--------------------

Used when an object is called.

This is the address of the code to be executed when an invocation is performed
on the object.

Funccode
--------

Used to look up argument positions by name.

The strategy with keyword arguments in micropython is to attempt to position
such arguments in the invocation frame as it is being constructed.

See the "Parameters and Lookups" section for more information.
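To illustrate the positioning of keyword arguments while a frame is being
built, the sketch below uses a dictionary keyed by funccode and parameter
name. The table contents, the codes and the build_frame helper are all
hypothetical; the actual parameter table encoding is described in
concepts.txt and may well differ.

```python
# Hypothetical parameter table: (funccode, parameter name) -> frame position.
PARAMETER_TABLE = {
    (4, "a"): 0,
    (4, "b"): 1,
    (4, "c"): 2,
}

def build_frame(funccode, nparams, args, kwargs):
    """Place positional arguments first, then keyword arguments by lookup."""
    frame = [None] * nparams
    filled = [False] * nparams
    for position, value in enumerate(args):
        frame[position] = value
        filled[position] = True
    for name, value in kwargs.items():
        try:
            position = PARAMETER_TABLE[(funccode, name)]
        except KeyError:
            raise TypeError("unexpected keyword argument %r" % name)
        if filled[position]:
            raise TypeError("multiple values for argument %r" % name)
        frame[position] = value
        filled[position] = True
    return frame

# Equivalent to calling f(1, c=3, b=2) for a function f(a, b, c).
frame = build_frame(4, 3, args=(1,), kwargs={"c": 3, "b": 2})
```

The callee then finds every argument at a fixed frame position, regardless of
whether the caller supplied it by position or by name.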

Size
----

Used to indicate the size of an object including attributes.

Attributes
----------

For classes, modules and instances, the attributes in the structure correspond
to the attributes of each kind of object. For functions, however, the
attributes in the structure correspond to the default arguments for each
function, if any.

Structure Types
---------------

Class C:

    0           1           2           3           4           5           6
    classcode   (unused)    __new__     funccode    size        attribute   ...
    for C                   reference   for                     reference
                                        instantiator

Instance of C:

    0           1           2           3           4           5           6
    classcode   attrcode    C.__call__  funccode    size        attribute   ...
    for C       for C       reference   for                     reference
                            (if exists) C.__call__

Function f:

    0           1           2           3           4           5           6
    classcode   attrcode    code        funccode    size        attribute   ...
    for         for         reference                           (default)
    function    function                                        reference

Module m:

    0           1           2           3           4           5           6
    classcode   attrcode    (unused)    (unused)    (unused)    attribute   ...
    for m       for m                                           (global)
                                                                reference

The __class__ Attribute
-----------------------

All objects should support the __class__ attribute, and in most cases this is
done using the object table, yielding a common address for all instances of a
given class.

    Function: refers to the function class
    Instance: refers to the class instantiated to make the object

The object table cannot support two definitions simultaneously for both
instances and their classes. Consequently, __class__ access on classes must be
tested for and a special result returned.

    Class: refers to the type class (type.__class__ also refers to the type class)

For convenience, the first attribute of a class will be the common __class__
attribute for all its instances. As noted above, direct access to this
attribute will not be possible for classes, and a constant result will be
returned instead.
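The special case for classes might be sketched as follows. This is only an
illustration: the object table is reduced to a dictionary keyed by classcode,
the absence of an attrcode is encoded as None, modules are left out, and the
load_class helper and all codes are hypothetical.

```python
CLASSCODE, ATTRCODE = 0, 1
TYPE_CLASS = "type"  # stand-in for the address of the type class

# Hypothetical object table: classcode -> {attribute name: address}.
OBJECT_TABLE = {
    10: {"__class__": "C"},         # entry shared by C and its instances
    2: {"__class__": "function"},   # entry for the function built-in type
}

def load_class(structure):
    """Return the value of __class__ for a class, instance or function."""
    if structure[ATTRCODE] is None:
        # No attrcode: the object is a class, so a constant result is used.
        return TYPE_CLASS
    # Otherwise the object table yields the address common to all instances.
    return OBJECT_TABLE[structure[CLASSCODE]]["__class__"]

class_C = [10, None, "C.__new__", 4, 6, "C"]
instance_of_C = [10, 21, None, None, 5]
function_f = [2, 5, "f-code", 4, 5]
```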

Lists and Tuples
----------------

The built-in list and tuple sequences employ variable length structures using
the attribute locations to store their elements, where each element is a
reference to a separately stored object.

Testing Instance Compatibility with Classes (Attrcode)
------------------------------------------------------

Although it would be possible to have a data structure mapping classes to
compatible classes, such as a matrix indicating the subclasses (or
superclasses) of each class, the need to retain the key to such a data
structure for each class might introduce a noticeable overhead.

Instead of having a separate structure, descendant classes of each class are
inserted as special attributes into the object table. This requires an extra
key to be retained, since each class must provide its own attribute code such
that upon an instance/class compatibility test, the code may be obtained and
used in the object table.
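A compatibility test along these lines might be sketched as shown below. The
object table is simplified to a mapping from classcode to the set of
attrcodes present in that row; the class names D and E, the codes and the
is_compatible helper are hypothetical and serve only to show the lookup.

```python
CLASSCODE, ATTRCODE = 0, 1

# Hypothetical codes: each class provides an attrcode of its own.
ATTRCODE_C, ATTRCODE_D, ATTRCODE_E = 21, 22, 23

# Simplified object table rows: classcode -> attrcodes present in the row.
# D derives from C, so the row for D also holds the attrcode of C.
OBJECT_TABLE = {
    10: {ATTRCODE_C},               # row for C
    11: {ATTRCODE_C, ATTRCODE_D},   # row for D, a descendant of C
    12: {ATTRCODE_E},               # row for E, unrelated to C
}

def is_compatible(instance, class_attrcode):
    """Test whether an instance belongs to a class or to a descendant of it."""
    return class_attrcode in OBJECT_TABLE[instance[CLASSCODE]]

instance_of_D = [11, ATTRCODE_D, None, None, 5]
instance_of_E = [12, ATTRCODE_E, None, None, 5]
```

The test costs the same as an ordinary attribute lookup, which is the reason
for reusing the object table rather than maintaining a separate matrix.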

Invocation and Code References
------------------------------

Modules: there is no meaningful invocation reference since modules cannot be
explicitly called.

Functions: a simple code reference is employed pointing to code implementing
the function. Note that the function locals are completely distinct from this
structure and are not comparable to attributes. Instead, attributes are
reserved for default parameter values, although they do not appear in the
object table described above, appearing instead in a separate parameter table
described in concepts.txt.

Classes: given that classes must be invoked in order to create instances, a
reference must be provided in class structures. However, this reference does
not point directly at the __init__ method of the class. Instead, the
referenced code belongs to a special initialiser function, __new__, consisting
of the following instructions:

    create instance for C
    call C.__init__(instance, ...)
    return instance

Instances: each instance employs a reference to any __call__ method defined in
the class hierarchy for the instance, thus maintaining its callable nature.

Both classes and modules may contain code in their definitions - the former in
the "body" of the class, potentially defining attributes, and the latter as
the "top-level" code in the module, potentially defining attributes/globals -
but this code is not associated with any invocation target. It is thus
generated in order of appearance and is not referenced externally.

Invocation Operation
--------------------

Consequently, regardless of the kind of object involved, an invocation is
always performed as follows:

    get invocation reference from the header
    jump to reference

Additional preparation is necessary before the above code: positional
arguments must be saved in the invocation frame, and keyword arguments must be
resolved and saved to the appropriate position in the invocation frame.

See invocation.txt for details.
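The uniform nature of the operation can be shown with a small sketch in which
Python callables stand in for code addresses. The invoke helper, the example
structures and the TypeError raised for a missing reference are assumptions
made for illustration and do not describe the generated code.

```python
INVOCATION = 2  # header position of the invocation reference

def invoke(structure, frame):
    """Fetch the invocation reference from the header and transfer control."""
    target = structure[INVOCATION]
    if target is None:
        # Assumption: an unused reference means the object cannot be called.
        raise TypeError("object is not callable")
    return target(frame)

def f_code(frame):
    # Stand-in for the code of a function f(a, b).
    a, b = frame
    return a + b

function_f = [2, 5, f_code, 4, 5]
module_m = [30, 31, None, None, None]

# The frame is assumed to have been prepared already.
result = invoke(function_f, [1, 2])
```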

Instantiation
=============

When instantiating classes, memory must be reserved for the header of the
resulting instance, along with locations for the attributes of the instance.
Since the instance header contains data common to all instances of a class, a
template header is copied to the start of the newly reserved memory region.
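Combined with the instantiator steps given earlier (create the instance, call
__init__, return the instance), copying a template header might look like the
sketch below. The make_instantiator helper, the template values and the
stand-in for C.__init__ are hypothetical.

```python
HEADER_SIZE = 5

def make_instantiator(template_header, nattributes, init):
    """Return code playing the role of __new__ for one class."""
    def instantiator(*args):
        # Reserve memory: copy the template header, then add attribute slots.
        instance = list(template_header) + [None] * nattributes
        # Call C.__init__(instance, ...), then return the instance.
        init(instance, *args)
        return instance
    return instantiator

def C_init(instance, x):
    # Stand-in for C.__init__: store x in the first attribute location.
    instance[HEADER_SIZE] = x

# Template header for instances of C: classcode, attrcode, invocation
# reference, funccode and size (header plus one attribute).
TEMPLATE_C = [10, 21, None, None, 6]
C_new = make_instantiator(TEMPLATE_C, 1, C_init)
a, b = C_new(1), C_new(2)
```

Each instance receives its own copy of the header, so the template itself is
never modified by instantiation.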

List and Tuple Representations
==============================

Since tuples have a fixed size, the representation of a tuple instance is
merely a header describing the size of the entire object, together with a
sequence of references to the object "stored" at each position in the
structure. Such references consist of the usual context and reference pair.

Lists, however, have a variable size and must be accessible via an unchanging
location even as more memory is allocated elsewhere to accommodate the
contents of the list. Consequently, the representation must resemble the
following:

    Structure header for list (size == header plus special attribute)
    Special attribute referencing the underlying sequence

The underlying sequence has a fixed size, like a tuple, but may contain fewer
elements than the size of the sequence permits:

    Special header indicating the current size and allocated size
    Element
    ...             <-- current size
    (Unused space)
    ...             <-- allocated size

This representation permits the allocation of a new sequence when space is
exhausted in an existing sequence, with the new sequence address stored in the
main list structure. Since access to the contents of the list must go through
the main list structure, underlying allocation activities may take place
without the users of a list having to be aware of such activities.
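The indirection through the main list structure can be illustrated with the
following sketch. The Sequence and List class names are hypothetical, and the
doubling growth policy is an assumption, since no policy is stated here.

```python
class Sequence:
    """Fixed-size underlying sequence with current and allocated sizes."""
    def __init__(self, allocated):
        self.current = 0
        self.elements = [None] * allocated

    @property
    def allocated(self):
        return len(self.elements)

class List:
    """Main list structure: holds only a reference to the sequence."""
    def __init__(self, allocated=4):
        self.sequence = Sequence(allocated)

    def append(self, value):
        seq = self.sequence
        if seq.current == seq.allocated:
            # Space exhausted: allocate a larger sequence and copy elements.
            # Doubling is an assumption made for this sketch.
            new = Sequence(max(1, seq.allocated * 2))
            new.elements[:seq.current] = seq.elements
            new.current = seq.current
            # Only the main structure learns the new sequence address.
            self.sequence = seq = new
        seq.elements[seq.current] = value
        seq.current += 1

    def __getitem__(self, index):
        seq = self.sequence
        if not 0 <= index < seq.current:
            raise IndexError("list index out of range")
        return seq.elements[index]

    def __len__(self):
        return self.sequence.current

items = List(allocated=2)
for i in range(5):
    items.append(i)
```

Users holding a reference to items are unaffected by the two reallocations
that occur here, since they never hold the address of the underlying sequence.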