micropython

Annotated README.txt

65:24df65a47aa8
2008-04-06 Paul Boddie Made optimised attribute access via self optional. Renamed an example.
paul@15 1
Namespace Definition
paul@15 2
====================
paul@15 3
paul@15 4
Module attributes are defined either at the module level or by global
paul@15 5
statements.
paul@15 6
paul@15 7
Class attributes are defined only within class statements.
paul@15 8
paul@15 9
Instance attributes are defined only by assignments to attributes of self
paul@15 10
within __init__ methods.
paul@15 11
paul@42 12
Potential Restrictions
paul@42 13
----------------------
paul@42 14
paul@42 15
Names of classes and functions could be restricted to only refer to those
paul@42 16
objects within the same namespace. If redefinition were to occur, or if
paul@42 17
multiple possibilities were present, these restrictions could be moderated as
paul@42 18
follows:
paul@42 19
paul@42 20
  * Classes assigned to the same name could provide the union of their
paul@42 21
    attributes. This would, however, cause a potential collision of attribute
paul@42 22
    definitions such as methods.
paul@42 23
paul@42 24
  * Functions, if they share compatible signatures, could share parameter list
paul@42 25
    definitions.
paul@42 26
paul@11 27
Data Structures
paul@11 28
===============
paul@11 29
paul@45 30
The fundamental "value type" is a pair of references: one pointing to the
paul@45 31
referenced object represented by the interchangeable value; one referring to
paul@45 32
the context of the referenced object, typically the object through which the
paul@45 33
referenced object was acquired as an attribute.A
paul@45 34
paul@45 35
Value Layout
paul@45 36
------------
paul@45 37
paul@45 38
    0           1
paul@45 39
    object      context
paul@45 40
    reference   reference
paul@45 41
paul@52 42
Acquiring Values
paul@52 43
----------------
paul@52 44
paul@52 45
Values are acquired through name lookups and attribute access, yielding
paul@52 46
the appropriate object reference together with a context reference as
paul@52 47
indicated in the following table:
paul@52 48
paul@52 49
    Type of Access          Context     Notes
paul@52 50
    --------------          -------     -----
paul@52 51
paul@52 52
    Local name              Preserved   Functions provide no context
paul@52 53
paul@52 54
    Global name             Preserved   Modules provide no context
paul@52 55
paul@52 56
    Class-originating       Accessor    Methods acquire the context of their
paul@52 57
    attribute                 -or-      accessor if an instance...
paul@52 58
                            Preserved   or retain the original context if the
paul@52 59
                                        accessor is a class
paul@52 60
paul@52 61
    Instance-originating    Preserved   Methods retain their original context
paul@52 62
    attribute
paul@52 63
paul@53 64
There may be some scope for simplifying the above, to the detriment of Python
paul@52 65
compatibility, since the unbound vs. bound methods situation can be confusing.
paul@52 66
paul@45 67
Objects
paul@45 68
-------
paul@45 69
paul@11 70
Since classes, functions and instances are all "objects", each must support
paul@11 71
certain features and operations in the same way.
paul@11 72
paul@11 73
The __class__ Attribute
paul@11 74
-----------------------
paul@11 75
paul@11 76
All objects support the __class__ attribute:
paul@11 77
paul@11 78
Class: refers to the type class (type.__class__ also refers to the type class)
paul@11 79
Function: refers to the function class
paul@11 80
Instance: refers to the class instantiated to make the object
paul@11 81
paul@11 82
Invocation
paul@11 83
----------
paul@11 84
paul@11 85
The following actions need to be supported:
paul@11 86
paul@11 87
Class: create instance, call __init__ with instance, return object
paul@11 88
Function: call function body, return result
paul@11 89
Instance: call __call__ method, return result
paul@11 90
paul@11 91
Structure Layout
paul@11 92
----------------
paul@11 93
paul@11 94
A suitable structure layout might be something like this:
paul@11 95
paul@59 96
    Identifier  Address     Type        Object      ...
paul@59 97
paul@11 98
    0           1           2           3           4
paul@11 99
    classcode   invocation  __class__   attribute   ...
paul@11 100
                reference   reference   reference
paul@11 101
paul@11 102
Here, the classcode refers to the attribute lookup table for the object. Since
paul@11 103
classes and instances share the same classcode, they might resemble the
paul@11 104
following:
paul@11 105
paul@11 106
Class C:
paul@11 107
paul@11 108
    0           1           2           3           4
paul@11 109
    code for C  __new__     class type  attribute   ...
paul@11 110
                reference   reference   reference
paul@11 111
paul@11 112
Instance of C:
paul@11 113
paul@11 114
    0           1           2           3           4
paul@11 115
    code for C  C.__call__  class C     attribute   ...
paul@11 116
                reference   reference   reference
paul@11 117
                (if exists)
paul@11 118
paul@11 119
The __new__ reference would lead to code consisting of the following
paul@11 120
instructions:
paul@11 121
paul@11 122
    create instance for C
paul@11 123
    call C.__init__(instance, ...)
paul@11 124
    return instance
paul@11 125
paul@11 126
If C has a __call__ attribute, the invocation "slot" of C instances would
paul@11 127
refer to the same thing as C.__call__.
paul@11 128
paul@11 129
For functions, the same general layout applies:
paul@11 130
paul@11 131
Function f:
paul@11 132
paul@11 133
    0           1           2           3           4
paul@11 134
    code for    code        class       attribute   ...
paul@11 135
    function    reference   function    reference
paul@11 136
                            reference
paul@11 137
paul@37 138
Here, the code reference would lead to code for the function. Note that the
paul@37 139
function locals are completely distinct from this structure and are not
paul@37 140
comparable to attributes.
paul@37 141
paul@38 142
For modules, there is no meaningful invocation reference:
paul@37 143
paul@37 144
Module m:
paul@37 145
paul@37 146
    0           1           2           3           4
paul@38 147
    code for m  (unused)    module type attribute   ...
paul@38 148
                            reference   (global)
paul@37 149
                                        reference
paul@11 150
paul@38 151
Both classes and modules have code in their definitions, but this would be
paul@38 152
generated in order and not referenced externally.
paul@38 153
paul@11 154
Invocation Operation
paul@11 155
--------------------
paul@11 156
paul@11 157
Consequently, regardless of the object an invocation is always done as
paul@11 158
follows:
paul@11 159
paul@11 160
    get invocation reference (at object+1)
paul@11 161
    jump to reference
paul@11 162
paul@11 163
Additional preparation is necessary before the above code: positional
paul@11 164
arguments must be saved to the parameter stack, and keyword arguments must be
paul@11 165
resolved and saved to the appropriate position in the parameter stack.
paul@11 166
paul@11 167
Attribute Operations
paul@11 168
--------------------
paul@11 169
paul@53 170
Attribute access needs to go through the attribute lookup table. Some
paul@53 171
optimisations are possible and are described in the appropriate section.
paul@53 172
paul@53 173
One important aspect of attribute access is the appropriate setting of the
paul@53 174
context in the acquired attribute value. From the table describing the
paul@53 175
acquisition of values, it is clear that the principal exception is that where
paul@53 176
a class-originating attribute is accessed on an instance. Consequently, the
paul@53 177
following algorithm could be employed once an attribute has been located:
paul@53 178
paul@53 179
 1. If the attribute's context is a special value, indicating that it should
paul@53 180
    be replaced upon instance access, then proceed to the next step;
paul@53 181
    otherwise, acquire both the context and the object as they are.
paul@53 182
paul@53 183
 2. If the accessor is an instance, use that as the value's context, acquiring
paul@53 184
    only the object from the attribute.
paul@53 185
paul@53 186
Where accesses can be determined ahead of time (as discussed in the
paul@53 187
optimisations section), the above algorithm may not necessarily be employed in
paul@53 188
the generated code for some accesses.
paul@21 189
paul@21 190
Instruction Evaluation Model
paul@21 191
============================
paul@21 192
paul@21 193
Programs use a value stack where evaluated instructions may save their
paul@21 194
results. A value stack pointer indicates the top of this stack. In addition, a
paul@21 195
separate stack is used to record the invocation frames. All stack pointers
paul@21 196
refer to the next address to be used by the stack, not the address of the
paul@21 197
uppermost element.
paul@21 198
paul@21 199
    Frame Stack     Value Stack
paul@21 200
    -----------     -----------     Address of Callable
paul@21 201
                                    -------------------
paul@21 202
    previous        ...
paul@21 203
    current ------> callable -----> identifier
paul@21 204
                    arg1            reference to code
paul@21 205
                    arg2
paul@21 206
                    arg3
paul@21 207
                    local4
paul@21 208
                    local5
paul@21 209
                    ...
paul@21 210
paul@21 211
Loading local names is a matter of performing frame-relative accesses to the
paul@21 212
value stack.
paul@21 213
paul@21 214
Invocations and Argument Evaluation
paul@21 215
-----------------------------------
paul@21 216
paul@21 217
When preparing for an invocation, the caller first sets the invocation frame
paul@21 218
pointer. Then, positional arguments are added to the stack such that the first
paul@21 219
argument positions are filled. A number of stack locations for the remaining
paul@21 220
arguments specified in the program are then reserved. The names of keyword
paul@21 221
arguments are used (in the form of table index values) to consult the
paul@21 222
parameter table and to find the location in which such arguments are to be
paul@21 223
stored.
paul@21 224
paul@21 225
    fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
paul@21 226
paul@21 227
    Value Stack
paul@21 228
    -----------
paul@21 229
paul@21 230
    ...             ...             ...             ...
paul@21 231
    fn              fn              fn              fn
paul@21 232
    a               a               a               a
paul@21 233
    b               b               b               b
paul@21 234
    ___             ___             ___         --> 3
paul@21 235
    ___         --> 1               1           |   1
paul@21 236
    ___         |   ___         --> 2           |   2
paul@21 237
    1 -----------   2 -----------   3 -----------
paul@21 238
paul@21 239
Conceptually, the frame can be considered as a collection of attributes, as
paul@21 240
seen in other kinds of structures:
paul@21 241
paul@21 242
Frame for invocation of fn:
paul@21 243
paul@21 244
    0           1           2           3           4           5
paul@21 245
    code        a           b           c           d           e
paul@21 246
    reference
paul@21 247
paul@21 248
However, where arguments are specified positionally, such "attributes" are not
paul@21 249
set using a comparable approach to that employed with other structures.
paul@21 250
Keyword arguments are set using an attribute-like mechanism, though, where the
paul@21 251
position of each argument discovered using the parameter table.
paul@21 252
paul@45 253
Method invocations incorporate an implicit first argument which is obtained
paul@45 254
from the context of the method:
paul@45 255
paul@45 256
    method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
paul@45 257
paul@45 258
    Value Stack
paul@45 259
    -----------
paul@45 260
paul@45 261
    ...
paul@45 262
    method
paul@45 263
    context of method
paul@45 264
    a
paul@45 265
    b
paul@45 266
    3
paul@45 267
    1
paul@45 268
    2
paul@45 269
paul@45 270
Although it could be possible to permit any object to be provided as the first
paul@45 271
argument, in order to optimise instance attribute access in methods, we should
paul@45 272
seek to restrict the object type.
paul@45 273
paul@58 274
Verifying Supplied Arguments
paul@58 275
----------------------------
paul@58 276
paul@58 277
In order to ensure a correct invocation, it is also necessary to check the
paul@58 278
number of supplied arguments. If the target of the invocation is known at
paul@58 279
compile-time, no additional instructions need to be emitted; otherwise, the
paul@58 280
generated code must test for the following situations:
paul@58 281
paul@58 282
 1. That the number of supplied arguments is equal to the number of expected
paul@58 283
    parameters.
paul@58 284
paul@58 285
 2. That no keyword argument overwrites an existing positional parameter.
paul@58 286
paul@21 287
Tuples, Frames and Allocation
paul@21 288
-----------------------------
paul@21 289
paul@21 290
Using the approach where arguments are treated like attributes in some kind of
paul@21 291
structure, we could choose to allocate frames in places other than a stack.
paul@21 292
This would produce something somewhat similar to a plain tuple object.
paul@23 293
paul@23 294
Optimisations
paul@23 295
=============
paul@23 296
paul@29 297
Some optimisations around constant objects might be possible; these depend on
paul@29 298
the following:
paul@29 299
paul@29 300
  * Reliable tracking of assignments: where assignment operations occur, the
paul@29 301
    target of the assignment should be determined if any hope of optimisation
paul@29 302
    is to be maintained. Where no guarantees can be made about the target of
paul@29 303
    an assignment, no assignment-related information should be written to
paul@29 304
    potential targets.
paul@29 305
paul@29 306
  * Objects acting as "containers" of attributes must be regarded as "safe":
paul@29 307
    where assignments are recorded as occurring on an attribute, it must be
paul@29 308
    guaranteed that no other unforeseen ways exist to assign to such
paul@29 309
    attributes.
paul@29 310
paul@29 311
The discussion below presents certain rules which must be imposed to uphold
paul@29 312
the above requirements.
paul@29 313
paul@30 314
Safe Containers
paul@30 315
---------------
paul@28 316
paul@23 317
Where attributes of modules, classes and instances are only set once and are
paul@23 318
effectively constant, it should be possible to circumvent the attribute lookup
paul@28 319
mechanism and to directly reference the attribute value. This technique may
paul@30 320
only be considered applicable for the following "container" objects, subject
paul@30 321
to the noted constraints:
paul@28 322
paul@30 323
 1. For modules, "safety" is enforced by ensuring that assignments to module
paul@30 324
    attributes are only permitted within the module itself either at the
paul@30 325
    top-level or via names declared as globals. Thus, the following would not
paul@30 326
    be permitted:
paul@28 327
paul@28 328
    another_module.this_module.attr = value
paul@28 329
paul@29 330
    In the above, this_module is a reference to the current module.
paul@28 331
paul@30 332
 2. For classes, "safety" is enforced by ensuring that assignments to class
paul@30 333
    attributes are only permitted within the class definition, outside
paul@30 334
    methods. This would mean that classes would be "sealed" at definition time
paul@30 335
    (like functions).
paul@28 336
paul@28 337
Unlike the property of function locals that they may only sensibly be accessed
paul@28 338
within the function in which they reside, these cases demand additional
paul@28 339
controls or assumptions on or about access to the stored data. Meanwhile, it
paul@28 340
would be difficult to detect eligible attributes on arbitrary instances due to
paul@28 341
the need for some kind of type inference or abstract execution.
paul@28 342
paul@30 343
Constant Attributes
paul@30 344
-------------------
paul@30 345
paul@30 346
When accessed via "safe containers", as described above, any attribute with
paul@30 347
only one recorded assignment on it can be considered a constant attribute and
paul@30 348
this eligible for optimisation, the consequence of which would be the
paul@30 349
replacement of a LoadAttrIndex instruction (which needs to look up an
paul@30 350
attribute using the run-time details of the "container" and the compile-time
paul@30 351
details of the attribute) with a LoadAttr instruction.
paul@30 352
paul@30 353
However, some restrictions exist on assignment operations which may be
paul@30 354
regarded to cause only one assignment in the lifetime of a program:
paul@30 355
paul@30 356
 1. For module attributes, only assignments at the top-level outside loop
paul@30 357
    statements can be reliably assumed to cause only a single assignment.
paul@30 358
paul@30 359
 2. For class attributes, only assignments at the top-level within class
paul@30 360
    definitions and outside loop statements can be reliably assumed to cause
paul@30 361
    only a single assignment.
paul@30 362
paul@30 363
All assignments satisfying the "safe container" requirements, but not the
paul@30 364
requirements immediately above, should each be recorded as causing at least
paul@30 365
one assignment.
paul@28 366
paul@29 367
Additional Controls
paul@29 368
-------------------
paul@29 369
paul@29 370
For the above cases for "container" objects, the following controls would need
paul@29 371
to apply:
paul@29 372
paul@29 373
 1. Modules would need to be immutable after initialisation. However, during
paul@29 374
    initialisation, there remains a possibility of another module attempting
paul@29 375
    to access the original module. For example, if ppp/__init__.py contained
paul@29 376
    the following...
paul@29 377
paul@29 378
    x = 1
paul@29 379
    import ppp.qqq
paul@29 380
    print x
paul@29 381
paul@29 382
    ...and if ppp/qqq.py contained the following...
paul@29 383
paul@29 384
    import ppp
paul@29 385
    ppp.x = 2
paul@29 386
paul@29 387
    ...then the value 2 would be printed. Since modules are objects which are
paul@29 388
    registered globally in a program, it would be possible to set attributes
paul@29 389
    in the above way.
paul@29 390
paul@29 391
 2. Classes would need to be immutable after initialisation. However, since
paul@29 392
    classes are objects, any reference to a class after initialisation could
paul@29 393
    be used to set attributes on the class.
paul@29 394
paul@29 395
Solutions:
paul@29 396
paul@29 397
 1. Insist on global scope for module attribute assignments.
paul@29 398
paul@29 399
 2. Insist on local scope within classes.
paul@29 400
paul@29 401
Both of the above measures need to be enforced at run-time, since an arbitrary
paul@29 402
attribute assignment could be attempted on any kind of object, yet to uphold
paul@29 403
the properties of "safe containers", attempts to change attributes of such
paul@29 404
objects should be denied. Since foreseen attribute assignment operations have
paul@29 405
certain properties detectable at compile-time, it could be appropriate to
paul@29 406
generate special instructions (or modified instructions) during the
paul@29 407
initialisation of modules and classes for such foreseen assignments, whilst
paul@29 408
employing normal attribute assignment operations in all other cases. Indeed,
paul@29 409
the StoreAttr instruction, which is used to set attributes in "safe
paul@29 410
containers" would be used exclusively for this purpose; the StoreAttrIndex
paul@29 411
instruction would be used exclusively for all other attribute assignments.
paul@29 412
paul@43 413
To ensure the "sealing" of modules and classes, entries in the attribute
paul@43 414
lookup table would encode whether a class or module is being accessed, so
paul@43 415
that the StoreAttrIndex instruction could reject such accesses.
paul@43 416
paul@28 417
Constant Attribute Values
paul@28 418
-------------------------
paul@28 419
paul@29 420
Where an attribute value is itself regarded as constant, is a "safe container"
paul@29 421
and is used in an operation accessing its own attributes, the value can be
paul@29 422
directly inspected for optimisations or employed in the generated code. For
paul@29 423
the attribute values themselves, only objects of a constant nature may be
paul@28 424
considered suitable for this particular optimisation:
paul@28 425
paul@28 426
  * Classes
paul@28 427
  * Modules
paul@28 428
  * Instances defined as constant literals
paul@28 429
paul@28 430
This is because arbitrary objects (such as most instances) have no
paul@28 431
well-defined form before run-time and cannot be investigated further at
paul@28 432
compile-time or have a representation inserted into the generated code.
paul@29 433
paul@29 434
Class Attributes and Access via Instances
paul@29 435
-----------------------------------------
paul@29 436
paul@29 437
Unlike module attributes, class attributes can be accessed in a number of
paul@29 438
different ways:
paul@29 439
paul@29 440
  * Using the class itself:
paul@29 441
paul@29 442
    C.x = 123
paul@29 443
    cls = C; cls.x = 234
paul@29 444
paul@29 445
  * Using a subclass of the class (for reading attributes):
paul@29 446
paul@29 447
    class D(C):
paul@29 448
        pass
paul@29 449
    D.x # setting D.x would populate D, not C
paul@29 450
paul@29 451
  * Using instances of the class or a subclass of the class (for reading
paul@29 452
    attributes):
paul@29 453
paul@29 454
    c = C()
paul@29 455
    c.x # setting c.x would populate c, not C
paul@29 456
paul@29 457
Since assignments are only achieved using direct references to the class, and
paul@29 458
since class attributes should be defined only within the class initialisation
paul@29 459
process, the properties of class attributes should be consistent with those
paul@29 460
desired.
paul@29 461
paul@29 462
Method Access via Instances
paul@29 463
---------------------------
paul@29 464
paul@29 465
It is desirable to optimise method access, even though most method calls are
paul@29 466
likely to occur via instances. It is possible, given the properties of methods
paul@29 467
as class attributes to investigate the kind of instance that the self
paul@29 468
parameter/local refers to within each method: it should be an instance either
paul@29 469
of the class in which the method is defined or a compatible class, although
paul@29 470
situations exist where this might not be the case:
paul@29 471
paul@29 472
  * Explicit invocation of a method:
paul@29 473
paul@29 474
    d = D() # D is not related to C
paul@29 475
    C.f(d) # calling f(self) in C
paul@29 476
paul@30 477
If blatant usage of incompatible instances were somehow disallowed, it would
paul@30 478
still be necessary to investigate the properties of an instance's class and
paul@30 479
its relationship with other classes. Consider the following example:
paul@30 480
paul@30 481
    class A:
paul@30 482
        def f(self): ...
paul@30 483
paul@30 484
    class B:
paul@30 485
        def f(self): ...
paul@30 486
        def g(self):
paul@30 487
            self.f()
paul@30 488
paul@30 489
    class C(A, B):
paul@30 490
        pass
paul@30 491
paul@30 492
Here, instances of B passed into the method B.g could be assumed to provide
paul@30 493
access to B.f when self.f is resolved at compile-time. However, instances of C
paul@30 494
passed into B.g would instead provide access to A.f when self.f is resolved at
paul@30 495
compile-time (since the method resolution order is C, A, B instead of just B).
paul@30 496
paul@30 497
One solution might be to specialise methods for each instance type, but this
paul@30 498
could be costly. Another less ambitious solution might only involve the
paul@30 499
optimisation of such internal method calls if an unambiguous target can be
paul@30 500
resolved.
paul@30 501
paul@29 502
Optimising Function Invocations
paul@29 503
-------------------------------
paul@29 504
paul@29 505
Where an attribute value is itself regarded as constant and is a function,
paul@29 506
knowledge about the parameters of the function can be employed to optimise the
paul@29 507
preparation of the invocation frame.