paul@766 | 1 | Low-level Implementation Details
|
paul@766 | 2 | ================================
|
paul@766 | 3 |
|
paul@766 | 4 | Although micropython delegates the generation of low-level program code and
|
paul@766 | 5 | data to syspython, various considerations of how an eventual program might be
|
paul@766 | 6 | structured have been used to inform the way micropython represents the details
|
paul@766 | 7 | of a program. This document describes these considerations and indicates how
|
paul@766 | 8 | syspython or other technologies might represent a working program.
|
paul@766 | 9 |
|
paul@768 | 10 | * Objects and structures
|
paul@768 | 11 | * Instantiation
|
paul@768 | 12 | * List and tuple representations
|
paul@768 | 13 | * Instruction evaluation model
|
paul@768 | 14 |
|
paul@766 | 15 | Objects and Structures
|
paul@766 | 16 | ======================
|
paul@766 | 17 |
|
paul@766 | 18 | As well as references, micropython needs to have actual objects to refer to.
|
paul@766 | 19 | Since classes, functions and instances are all objects, it is desirable that
|
paul@766 | 20 | certain common features and operations are supported in the same way for all
|
paul@766 | 21 | of these things. To permit this, a common data structure format is used.
|
paul@766 | 22 |
|
paul@766 | 23 | Header.................................................... Attributes.................
|
paul@766 | 24 |
|
paul@766 | 25 | Identifier Identifier Address Identifier Size Object ...
|
paul@766 | 26 |
|
paul@766 | 27 | 0 1 2 3 4 5 6
|
paul@766 | 28 | classcode attrcode/ invocation funccode size attribute ...
|
paul@766 | 29 | instance reference reference
|
paul@766 | 30 | status
|
paul@766 | 31 |
|
paul@766 | 32 | Classcode
|
paul@766 | 33 | ---------
|
paul@766 | 34 |
|
paul@766 | 35 | Used in attribute lookup.
|
paul@766 | 36 |
|
paul@766 | 37 | Here, the classcode refers to the attribute lookup table for the object (as
|
paul@766 | 38 | described in concepts.txt). Classes and instances share the same classcode,
|
paul@766 | 39 | and their structures reflect this. Functions all belong to the same type and
|
paul@766 | 40 | thus employ the classcode for the function built-in type, whereas modules have
|
paul@766 | 41 | distinct types since they must support different sets of attributes.
|
paul@766 | 42 |
|
paul@766 | 43 | Attrcode
|
paul@766 | 44 | --------
|
paul@766 | 45 |
|
paul@766 | 46 | Used to test instances for membership of classes (or descendants of classes).
|
paul@766 | 47 |
|
paul@766 | 48 | Since, in traditional Python, classes are only ever instances of some generic
|
paul@766 | 49 | built-in type, support for testing such a relationship directly has been
|
paul@766 | 50 | removed and the attrcode is not specified for classes: the presence of an
|
paul@766 | 51 | attrcode indicates that a given object is an instance. In addition, support
|
paul@766 | 52 | has also been removed for testing modules in the same way, meaning that the
|
paul@766 | 53 | attrcode is also not specified for modules.
|
paul@766 | 54 |
|
paul@766 | 55 | See the "Testing Instance Compatibility with Classes (Attrcode)" section below
|
paul@766 | 56 | for details of attrcodes.
|
paul@766 | 57 |
|
paul@766 | 58 | Invocation Reference
|
paul@766 | 59 | --------------------
|
paul@766 | 60 |
|
paul@766 | 61 | Used when an object is called.
|
paul@766 | 62 |
|
paul@766 | 63 | This is the address of the code to be executed when an invocation is performed
|
paul@766 | 64 | on the object.
|
paul@766 | 65 |
|
paul@766 | 66 | Funccode
|
paul@766 | 67 | --------
|
paul@766 | 68 |
|
paul@766 | 69 | Used to look up argument positions by name.
|
paul@766 | 70 |
|
paul@766 | 71 | The strategy with keyword arguments in micropython is to attempt to position
|
paul@766 | 72 | such arguments in the invocation frame as it is being constructed.
|
paul@766 | 73 |
|
paul@766 | 74 | See the "Parameters and Lookups" section for more information.
|
paul@766 | 75 |
|
paul@766 | 76 | Size
|
paul@766 | 77 | ----
|
paul@766 | 78 |
|
paul@766 | 79 | Used to indicate the size of an object including attributes.
|
paul@766 | 80 |
|
paul@766 | 81 | Attributes
|
paul@766 | 82 | ----------
|
paul@766 | 83 |
|
paul@766 | 84 | For classes, modules and instances, the attributes in the structure correspond
|
paul@766 | 85 | to the attributes of each kind of object. For functions, however, the
|
paul@766 | 86 | attributes in the structure correspond to the default arguments for each
|
paul@766 | 87 | function, if any.
|
paul@766 | 88 |
|
paul@766 | 89 | Structure Types
|
paul@766 | 90 | ---------------
|
paul@766 | 91 |
|
paul@766 | 92 | Class C:
|
paul@766 | 93 |
|
paul@766 | 94 | 0 1 2 3 4 5 6
|
paul@766 | 95 | classcode (unused) __new__ funccode size attribute ...
|
paul@766 | 96 | for C reference for reference
|
paul@766 | 97 | instantiator
|
paul@766 | 98 |
|
paul@766 | 99 | Instance of C:
|
paul@766 | 100 |
|
paul@766 | 101 | 0 1 2 3 4 5 6
|
paul@766 | 102 | classcode attrcode C.__call__ funccode size attribute ...
|
paul@766 | 103 | for C for C reference for reference
|
paul@766 | 104 | (if exists) C.__call__
|
paul@766 | 105 |
|
paul@766 | 106 | Function f:
|
paul@766 | 107 |
|
paul@766 | 108 | 0 1 2 3 4 5 6
|
paul@766 | 109 | classcode attrcode code funccode size attribute ...
|
paul@766 | 110 | for for reference (default)
|
paul@766 | 111 | function function reference
|
paul@766 | 112 |
|
paul@766 | 113 | Module m:
|
paul@766 | 114 |
|
paul@766 | 115 | 0 1 2 3 4 5 6
|
paul@766 | 116 | classcode attrcode (unused) (unused) (unused) attribute ...
|
paul@766 | 117 | for m for m (global)
|
paul@766 | 118 | reference
|
paul@766 | 119 |
|
paul@766 | 120 | The __class__ Attribute
|
paul@766 | 121 | -----------------------
|
paul@766 | 122 |
|
paul@766 | 123 | All objects should support the __class__ attribute, and in most cases this is
|
paul@766 | 124 | done using the object table, yielding a common address for all instances of a
|
paul@766 | 125 | given class.
|
paul@766 | 126 |
|
paul@766 | 127 | Function: refers to the function class
|
paul@766 | 128 | Instance: refers to the class instantiated to make the object
|
paul@766 | 129 |
|
paul@766 | 130 | The object table cannot support two definitions simultaneously for both
|
paul@766 | 131 | instances and their classes. Consequently, __class__ access on classes must be
|
paul@766 | 132 | tested for and a special result returned.
|
paul@766 | 133 |
|
paul@766 | 134 | Class: refers to the type class (type.__class__ also refers to the type class)
|
paul@766 | 135 |
|
paul@766 | 136 | For convenience, the first attribute of a class will be the common __class__
|
paul@766 | 137 | attribute for all its instances. As noted above, direct access to this
|
paul@766 | 138 | attribute will not be possible for classes, and a constant result will be
|
paul@766 | 139 | returned instead.
|
paul@766 | 140 |
|
paul@766 | 141 | Lists and Tuples
|
paul@766 | 142 | ----------------
|
paul@766 | 143 |
|
paul@766 | 144 | The built-in list and tuple sequences employ variable length structures using
|
paul@766 | 145 | the attribute locations to store their elements, where each element is a
|
paul@766 | 146 | reference to a separately stored object.
|
paul@766 | 147 |
|
paul@766 | 148 | Testing Instance Compatibility with Classes (Attrcode)
|
paul@766 | 149 | ------------------------------------------------------
|
paul@766 | 150 |
|
paul@766 | 151 | Although it would be possible to have a data structure mapping classes to
|
paul@766 | 152 | compatible classes, such as a matrix indicating the subclasses (or
|
paul@766 | 153 | superclasses) of each class, the need to retain the key to such a data
|
paul@766 | 154 | structure for each class might introduce a noticeable overhead.
|
paul@766 | 155 |
|
paul@766 | 156 | Instead of having a separate structure, descendant classes of each class are
|
paul@766 | 157 | inserted as special attributes into the object table. This requires an extra
|
paul@766 | 158 | key to be retained, since each class must provide its own attribute code such
|
paul@766 | 159 | that upon an instance/class compatibility test, the code may be obtained and
|
paul@766 | 160 | used in the object table.
|
paul@766 | 161 |
|
paul@766 | 162 | Invocation and Code References
|
paul@766 | 163 | ------------------------------
|
paul@766 | 164 |
|
paul@766 | 165 | Modules: there is no meaningful invocation reference since modules cannot be
|
paul@766 | 166 | explicitly called.
|
paul@766 | 167 |
|
paul@766 | 168 | Functions: a simple code reference is employed pointing to code implementing
|
paul@766 | 169 | the function. Note that the function locals are completely distinct from this
|
paul@766 | 170 | structure and are not comparable to attributes. Instead, attributes are
|
paul@766 | 171 | reserved for default parameter values, although they do not appear in the
|
paul@766 | 172 | object table described above, appearing instead in a separate parameter table
|
paul@766 | 173 | described in concepts.txt.
|
paul@766 | 174 |
|
paul@766 | 175 | Classes: given that classes must be invoked in order to create instances, a
|
paul@766 | 176 | reference must be provided in class structures. However, this reference does
|
paul@766 | 177 | not point directly at the __init__ method of the class. Instead, the
|
paul@766 | 178 | referenced code belongs to a special initialiser function, __new__, consisting
|
paul@766 | 179 | of the following instructions:
|
paul@766 | 180 |
|
paul@766 | 181 | create instance for C
|
paul@766 | 182 | call C.__init__(instance, ...)
|
paul@766 | 183 | return instance
|
paul@766 | 184 |
|
paul@766 | 185 | Instances: each instance employs a reference to any __call__ method defined in
|
paul@766 | 186 | the class hierarchy for the instance, thus maintaining its callable nature.
|
paul@766 | 187 |
|
paul@766 | 188 | Both classes and modules may contain code in their definitions - the former in
|
paul@766 | 189 | the "body" of the class, potentially defining attributes, and the latter as
|
paul@766 | 190 | the "top-level" code in the module, potentially defining attributes/globals -
|
paul@766 | 191 | but this code is not associated with any invocation target. It is thus
|
paul@766 | 192 | generated in order of appearance and is not referenced externally.
|
paul@766 | 193 |
|
paul@766 | 194 | Invocation Operation
|
paul@766 | 195 | --------------------
|
paul@766 | 196 |
|
paul@766 | 197 | Consequently, regardless of the object an invocation is always done as
|
paul@766 | 198 | follows:
|
paul@766 | 199 |
|
paul@766 | 200 | get invocation reference from the header
|
paul@766 | 201 | jump to reference
|
paul@766 | 202 |
|
paul@766 | 203 | Additional preparation is necessary before the above code: positional
|
paul@766 | 204 | arguments must be saved in the invocation frame, and keyword arguments must be
|
paul@766 | 205 | resolved and saved to the appropriate position in the invocation frame.
|
paul@766 | 206 |
|
paul@766 | 207 | See invocation.txt for details.
|
paul@766 | 208 |
|
paul@766 | 209 | Instantiation
|
paul@766 | 210 | =============
|
paul@766 | 211 |
|
paul@766 | 212 | When instantiating classes, memory must be reserved for the header of the
|
paul@766 | 213 | resulting instance, along with locations for the attributes of the instance.
|
paul@766 | 214 | Since the instance header contains data common to all instances of a class, a
|
paul@766 | 215 | template header is copied to the start of the newly reserved memory region.
|
paul@766 | 216 |
|
paul@766 | 217 | List and Tuple Representations
|
paul@766 | 218 | ==============================
|
paul@766 | 219 |
|
paul@766 | 220 | Since tuples have a fixed size, the representation of a tuple instance is
|
paul@766 | 221 | merely a header describing the size of the entire object, together with a
|
paul@766 | 222 | sequence of references to the object "stored" at each position in the
|
paul@766 | 223 | structure. Such references consist of the usual context and reference pair.
|
paul@766 | 224 |
|
paul@766 | 225 | Lists, however, have a variable size and must be accessible via an unchanging
|
paul@766 | 226 | location even as more memory is allocated elsewhere to accommodate the
|
paul@766 | 227 | contents of the list. Consequently, the representation must resemble the
|
paul@766 | 228 | following:
|
paul@766 | 229 |
|
paul@766 | 230 | Structure header for list (size == header plus special attribute)
|
paul@766 | 231 | Special attribute referencing the underlying sequence
|
paul@766 | 232 |
|
paul@766 | 233 | The underlying sequence has a fixed size, like a tuple, but may contain fewer
|
paul@766 | 234 | elements than the size of the sequence permits:
|
paul@766 | 235 |
|
paul@766 | 236 | Special header indicating the current size and allocated size
|
paul@766 | 237 | Element
|
paul@766 | 238 | ... <-- current size
|
paul@766 | 239 | (Unused space)
|
paul@766 | 240 | ... <-- allocated size
|
paul@766 | 241 |
|
paul@766 | 242 | This representation permits the allocation of a new sequence when space is
|
paul@766 | 243 | exhausted in an existing sequence, with the new sequence address stored in the
|
paul@766 | 244 | main list structure. Since access to the contents of the list must go through
|
paul@766 | 245 | the main list structure, underlying allocation activities may take place
|
paul@766 | 246 | without the users of a list having to be aware of such activities.
|
paul@768 | 247 |
|
paul@768 | 248 | Instruction Evaluation Model
|
paul@768 | 249 | ============================
|
paul@768 | 250 |
|
paul@768 | 251 | Programs use a value stack containing local and temporary storage. A value
|
paul@768 | 252 | stack pointer indicates the top of this stack. In addition, a separate stack
|
paul@768 | 253 | is used to record the invocation frames. All stack pointers refer to the next
|
paul@768 | 254 | address to be used by the stack, not the address of the uppermost element.
|
paul@768 | 255 |
|
paul@768 | 256 | Frame Stack Value Stack
|
paul@768 | 257 | ----------- ----------- Address of Callable
|
paul@768 | 258 | -------------------
|
paul@768 | 259 | previous ...
|
paul@768 | 260 | current ------> callable -----> identifier
|
paul@768 | 261 | arg1 reference to code
|
paul@768 | 262 | arg2
|
paul@768 | 263 | arg3
|
paul@768 | 264 | local4
|
paul@768 | 265 | local5
|
paul@768 | 266 | ...
|
paul@768 | 267 |
|
paul@768 | 268 | Unlike the CPython virtual machine, programs do not use a value stack
|
paul@768 | 269 | containing the results of expression evaluations. Instead, temporary storage
|
paul@768 | 270 | is statically allocated and employed by instructions.
|
paul@768 | 271 |
|
paul@768 | 272 | Loading local names and temporary values is a matter of performing
|
paul@768 | 273 | frame-relative accesses to the value stack.
|
paul@768 | 274 |
|
paul@768 | 275 | Invocations and Argument Evaluation
|
paul@768 | 276 | -----------------------------------
|
paul@768 | 277 |
|
paul@768 | 278 | When preparing for an invocation, the caller first sets the invocation frame
|
paul@768 | 279 | pointer. A number of locations for the arguments required by the invocation
|
paul@768 | 280 | are then reserved. With a frame to prepare, positional arguments are added to
|
paul@768 | 281 | the frame in order such that the first argument positions are filled. The
|
paul@768 | 282 | names of keyword arguments are used (in the form of table index values) to
|
paul@768 | 283 | consult the parameter table and to find the frame location in which such
|
paul@768 | 284 | arguments are to be stored.
|
paul@768 | 285 |
|
paul@768 | 286 | fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
|
paul@768 | 287 |
|
paul@768 | 288 | Frame
|
paul@768 | 289 | -----
|
paul@768 | 290 |
|
paul@768 | 291 | a a a a
|
paul@768 | 292 | b b b b
|
paul@768 | 293 | ___ ___ ___ --> 3
|
paul@768 | 294 | ___ --> 1 1 | 1
|
paul@768 | 295 | ___ | ___ --> 2 | 2
|
paul@768 | 296 | | | |
|
paul@768 | 297 | 1 ----------- 2 ----------- 3 -----------
|
paul@768 | 298 |
|
paul@768 | 299 | Conceptually, the frame can be considered as a collection of attributes, as
|
paul@768 | 300 | seen in other kinds of structures:
|
paul@768 | 301 |
|
paul@768 | 302 | Frame for invocation of fn:
|
paul@768 | 303 |
|
paul@768 | 304 | 0 1 2 3 4 5
|
paul@768 | 305 | code a b c d e
|
paul@768 | 306 | reference
|
paul@768 | 307 |
|
paul@768 | 308 | Where arguments are specified positionally, such "attributes" are set in
|
paul@768 | 309 | order. Keyword arguments are set using a name-to-position attribute-like
|
paul@768 | 310 | mapping, where the position of each argument is discovered using the parameter
|
paul@768 | 311 | table.
|
paul@768 | 312 |
|
paul@768 | 313 | Method Invocations
|
paul@768 | 314 | ------------------
|
paul@768 | 315 |
|
paul@768 | 316 | Method invocations incorporate an implicit first argument which is obtained
|
paul@768 | 317 | from the context of the method:
|
paul@768 | 318 |
|
paul@768 | 319 | method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
|
paul@768 | 320 |
|
paul@768 | 321 | Frame
|
paul@768 | 322 | -----
|
paul@768 | 323 |
|
paul@768 | 324 | context of method
|
paul@768 | 325 | a
|
paul@768 | 326 | b
|
paul@768 | 327 | 3
|
paul@768 | 328 | 1
|
paul@768 | 329 | 2
|
paul@768 | 330 |
|
paul@768 | 331 | Although it could be possible to permit any object to be provided as the first
|
paul@768 | 332 | argument, in order to optimise instance attribute access in methods, we should
|
paul@768 | 333 | seek to restrict the object type.
|
paul@768 | 334 |
|
paul@768 | 335 | Verifying Supplied Arguments
|
paul@768 | 336 | ----------------------------
|
paul@768 | 337 |
|
paul@768 | 338 | In order to ensure a correct invocation, it is also necessary to check the
|
paul@768 | 339 | number of supplied arguments. If the target of the invocation is known at
|
paul@768 | 340 | compile-time, no additional instructions need to be emitted; otherwise, the
|
paul@768 | 341 | generated code must test for the following situations:
|
paul@768 | 342 |
|
paul@768 | 343 | 1. That the number of supplied arguments is equal to the number of expected
|
paul@768 | 344 | parameters.
|
paul@768 | 345 |
|
paul@768 | 346 | 2. That no keyword argument overwrites an existing positional parameter.
|
paul@768 | 347 |
|
paul@768 | 348 | Default Arguments
|
paul@768 | 349 | -----------------
|
paul@768 | 350 |
|
paul@768 | 351 | Some arguments may have default values which are used if no value is provided
|
paul@768 | 352 | in an invocation. Such defaults are initialised when the function itself is
|
paul@768 | 353 | initialised, and are used to fill in any invocation frames that are known at
|
paul@768 | 354 | compile-time.
|
paul@768 | 355 |
|
paul@768 | 356 | To obtain the default values, a reference to the function object is required.
|
paul@768 | 357 | This can be provided either in an additional frame location or in a special
|
paul@768 | 358 | register.
|
paul@768 | 359 |
|
paul@768 | 360 | Tuples, Frames and Allocation
|
paul@768 | 361 | -----------------------------
|
paul@768 | 362 |
|
paul@768 | 363 | Using the approach where arguments are treated like attributes in some kind of
|
paul@768 | 364 | structure, we could choose to allocate frames in places other than a stack.
|
paul@768 | 365 | This would produce something somewhat similar to a plain tuple object.
|