1.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000
1.2 +++ b/docs/lowlevel.txt Sun Dec 08 17:31:37 2013 +0100
1.3 @@ -0,0 +1,241 @@
1.4 +Low-level Implementation Details
1.5 +================================
1.6 +
1.7 +Although micropython delegates the generation of low-level program code and
1.8 +data to syspython, various considerations of how an eventual program might be
1.9 +structured have been used to inform the way micropython represents the details
1.10 +of a program. This document describes these considerations and indicates how
1.11 +syspython or other technologies might represent a working program.
1.12 +
1.13 +Objects and Structures
1.14 +======================
1.15 +
1.16 +As well as references, micropython needs to have actual objects to refer to.
1.17 +Since classes, functions and instances are all objects, it is desirable that
1.18 +certain common features and operations are supported in the same way for all
1.19 +of these things. To permit this, a common data structure format is used.
1.20 +
1.21 +    Header....................................................  Attributes.................
1.22 +
1.23 +    Identifier  Identifier  Address     Identifier  Size        Object      ...
1.24 +
1.25 +    0           1           2           3           4           5           6
1.26 +    classcode   attrcode/   invocation  funccode    size        attribute   ...
1.27 +                instance    reference                           reference
1.28 +                status
1.29 +
1.30 +Classcode
1.31 +---------
1.32 +
1.33 +Used in attribute lookup.
1.34 +
1.35 +Here, the classcode refers to the attribute lookup table for the object (as
1.36 +described in concepts.txt). Classes and instances share the same classcode,
1.37 +and their structures reflect this. Functions all belong to the same type and
1.38 +thus employ the classcode for the function built-in type, whereas modules have
1.39 +distinct types since they must support different sets of attributes.
1.40 +
1.41 +Attrcode
1.42 +--------
1.43 +
1.44 +Used to test instances for membership of classes (or descendants of classes).
1.45 +
1.46 +Since, in traditional Python, classes are only ever instances of some generic
1.47 +built-in type, support for testing such a relationship directly has been
1.48 +removed and the attrcode is not specified for classes: the presence of an
1.49 +attrcode indicates that a given object is an instance. In addition, support
1.50 +has also been removed for testing modules in the same way, meaning that the
1.51 +attrcode is also not specified for modules.
1.52 +
1.53 +See the "Testing Instance Compatibility with Classes (Attrcode)" section below
1.54 +for details of attrcodes.
1.55 +
1.56 +Invocation Reference
1.57 +--------------------
1.58 +
1.59 +Used when an object is called.
1.60 +
1.61 +This is the address of the code to be executed when an invocation is performed
1.62 +on the object.
1.63 +
1.64 +Funccode
1.65 +--------
1.66 +
1.67 +Used to look up argument positions by name.
1.68 +
1.69 +The strategy with keyword arguments in micropython is to attempt to position
1.70 +such arguments in the invocation frame as it is being constructed.
1.71 +
1.72 +See the "Parameters and Lookups" section for more information.
1.73 +
1.74 +Size
1.75 +----
1.76 +
1.77 +Used to indicate the size of an object including attributes.
1.78 +
1.79 +Attributes
1.80 +----------
1.81 +
1.82 +For classes, modules and instances, the attributes in the structure correspond
1.83 +to the attributes of each kind of object. For functions, however, the
1.84 +attributes in the structure correspond to the default arguments for each
1.85 +function, if any.
1.86 +
1.87 +Structure Types
1.88 +---------------
1.89 +
1.90 +Class C:
1.91 +
1.92 +    0           1           2           3           4           5           6
1.93 +    classcode   (unused)    __new__     funccode    size        attribute   ...
1.94 +    for C                   reference   for                     reference
1.95 +                                        instantiator
1.96 +
1.97 +Instance of C:
1.98 +
1.99 +    0           1           2           3           4           5           6
1.100 +    classcode   attrcode    C.__call__  funccode    size        attribute   ...
1.101 +    for C       for C       reference   for                     reference
1.102 +                            (if exists) C.__call__
1.103 +
1.104 +Function f:
1.105 +
1.106 +    0           1           2           3           4           5           6
1.107 +    classcode   attrcode    code        funccode    size        attribute   ...
1.108 +    for         for         reference                           (default)
1.109 +    function    function                                        reference
1.110 +
1.111 +Module m:
1.112 +
1.113 +    0           1           2           3           4           5           6
1.114 +    classcode   attrcode    (unused)    (unused)    (unused)    attribute   ...
1.115 +    for m       for m                                           (global)
1.116 +                                                                reference
1.117 +
1.118 +The __class__ Attribute
1.119 +-----------------------
1.120 +
1.121 +All objects should support the __class__ attribute, and in most cases this is
1.122 +done using the object table, yielding a common address for all instances of a
1.123 +given class.
1.124 +
1.125 +Function: refers to the function class
1.126 +Instance: refers to the class instantiated to make the object
1.127 +
1.128 +The object table cannot hold two definitions of __class__ at once, one for
1.129 +instances and one for their classes. Consequently, __class__ access on classes
1.130 +must be tested for and a special result returned.
1.131 +
1.132 +Class: refers to the type class (type.__class__ also refers to the type class)
1.133 +
1.134 +For convenience, the first attribute of a class will be the common __class__
1.135 +attribute for all its instances. As noted above, direct access to this
1.136 +attribute will not be possible for classes, and a constant result will be
1.137 +returned instead.
1.138 +
1.139 +Lists and Tuples
1.140 +----------------
1.141 +
1.142 +The built-in list and tuple sequences employ variable length structures using
1.143 +the attribute locations to store their elements, where each element is a
1.144 +reference to a separately stored object.
1.145 +
1.146 +Testing Instance Compatibility with Classes (Attrcode)
1.147 +------------------------------------------------------
1.148 +
1.149 +Although it would be possible to have a data structure mapping classes to
1.150 +compatible classes, such as a matrix indicating the subclasses (or
1.151 +superclasses) of each class, the need to retain the key to such a data
1.152 +structure for each class might introduce a noticeable overhead.
1.153 +
1.154 +Instead of having a separate structure, descendant classes of each class are
1.155 +inserted as special attributes into the object table. This requires an extra
1.156 +key to be retained, since each class must provide its own attribute code (the
1.157 +attrcode) so that, upon an instance/class compatibility test, the code may be
1.158 +obtained and used in the object table.
1.159 +
1.160 +Invocation and Code References
1.161 +------------------------------
1.162 +
1.163 +Modules: there is no meaningful invocation reference since modules cannot be
1.164 +explicitly called.
1.165 +
1.166 +Functions: a simple code reference is employed pointing to code implementing
1.167 +the function. Note that the function locals are completely distinct from this
1.168 +structure and are not comparable to attributes. Instead, attributes are
1.169 +reserved for default parameter values, although they do not appear in the
1.170 +object table described above, appearing instead in a separate parameter table
1.171 +described in concepts.txt.
1.172 +
1.173 +Classes: given that classes must be invoked in order to create instances, a
1.174 +reference must be provided in class structures. However, this reference does
1.175 +not point directly at the __init__ method of the class. Instead, the
1.176 +referenced code belongs to a special instantiator function, __new__, consisting
1.177 +of the following instructions:
1.178 +
1.179 + create instance for C
1.180 + call C.__init__(instance, ...)
1.181 + return instance
1.182 +
1.183 +Instances: each instance employs a reference to any __call__ method defined in
1.184 +the class hierarchy for the instance, thus maintaining its callable nature.
1.185 +
1.186 +Both classes and modules may contain code in their definitions - the former in
1.187 +the "body" of the class, potentially defining attributes, and the latter as
1.188 +the "top-level" code in the module, potentially defining attributes/globals -
1.189 +but this code is not associated with any invocation target. It is thus
1.190 +generated in order of appearance and is not referenced externally.
1.191 +
1.192 +Invocation Operation
1.193 +--------------------
1.194 +
1.195 +Consequently, regardless of the kind of object, an invocation is always done
1.196 +as follows:
1.197 +
1.198 + get invocation reference from the header
1.199 + jump to reference
1.200 +
1.201 +Additional preparation is necessary before the above code is executed:
1.202 +positional arguments must be saved in the invocation frame, and keyword
1.203 +arguments must be resolved and saved to the appropriate positions in that frame.
1.204 +
1.205 +See invocation.txt for details.
1.206 +
1.207 +Instantiation
1.208 +=============
1.209 +
1.210 +When instantiating classes, memory must be reserved for the header of the
1.211 +resulting instance, along with locations for the attributes of the instance.
1.212 +Since the instance header contains data common to all instances of a class, a
1.213 +template header is copied to the start of the newly reserved memory region.
1.214 +
1.215 +List and Tuple Representations
1.216 +==============================
1.217 +
1.218 +Since tuples have a fixed size, the representation of a tuple instance is
1.219 +merely a header describing the size of the entire object, together with a
1.220 +sequence of references to the objects "stored" at each position in the
1.221 +structure. Such references consist of the usual context and reference pair.
1.222 +
1.223 +Lists, however, have a variable size and must be accessible via an unchanging
1.224 +location even as more memory is allocated elsewhere to accommodate the
1.225 +contents of the list. Consequently, the representation must resemble the
1.226 +following:
1.227 +
1.228 + Structure header for list (size == header plus special attribute)
1.229 + Special attribute referencing the underlying sequence
1.230 +
1.231 +The underlying sequence has a fixed size, like a tuple, but may contain fewer
1.232 +elements than the size of the sequence permits:
1.233 +
1.234 +    Special header indicating the current size and allocated size
1.235 +    Element
1.236 +    ...             <-- current size
1.237 +    (Unused space)
1.238 +    ...             <-- allocated size
1.239 +
1.240 +This representation permits the allocation of a new sequence when space is
1.241 +exhausted in an existing sequence, with the new sequence address stored in the
1.242 +main list structure. Since access to the contents of the list must go through
1.243 +the main list structure, underlying allocation activities may take place
1.244 +without the users of a list having to be aware of such activities.