micropython

Annotated README.txt

79:1f879e94c49f
2008-05-05 Paul Boddie Introduced parameters to certain methods which permit the retrieval and/or inspection of previous instructions in alternative sequences; this helps in the production of substitutable instruction sequences related to temporary storage access. Added methods which capture generated code for use with such temporary storage instruction sequences, changing the existing methods to employ sequences instead of single instructions.
paul@15 1
Namespace Definition
paul@15 2
====================
paul@15 3
paul@15 4
Module attributes are defined either at the module level or by global
paul@15 5
statements.
paul@15 6
paul@15 7
Class attributes are defined only within class statements.
paul@15 8
paul@15 9
Instance attributes are defined only by assignments to attributes of self
paul@15 10
within __init__ methods.
paul@15 11
paul@42 12
Potential Restrictions
paul@42 13
----------------------
paul@42 14
paul@42 15
Names of classes and functions could be restricted to only refer to those
paul@42 16
objects within the same namespace. If redefinition were to occur, or if
paul@42 17
multiple possibilities were present, these restrictions could be moderated as
paul@42 18
follows:
paul@42 19
paul@42 20
  * Classes assigned to the same name could provide the union of their
paul@42 21
    attributes. This would, however, cause a potential collision of attribute
paul@42 22
    definitions such as methods.
paul@42 23
paul@42 24
  * Functions, if they share compatible signatures, could share parameter list
paul@42 25
    definitions.
paul@42 26
paul@11 27
Data Structures
paul@11 28
===============
paul@11 29
paul@45 30
The fundamental "value type" is a pair of references: one pointing to the
paul@45 31
referenced object represented by the interchangeable value; one referring to
paul@45 32
the context of the referenced object, typically the object through which the
paul@45 33
referenced object was acquired as an attribute.A
paul@45 34
paul@45 35
Value Layout
paul@45 36
------------
paul@45 37
paul@45 38
    0           1
paul@45 39
    object      context
paul@45 40
    reference   reference
paul@45 41
paul@52 42
Acquiring Values
paul@52 43
----------------
paul@52 44
paul@52 45
Values are acquired through name lookups and attribute access, yielding
paul@52 46
the appropriate object reference together with a context reference as
paul@52 47
indicated in the following table:
paul@52 48
paul@52 49
    Type of Access          Context     Notes
paul@52 50
    --------------          -------     -----
paul@52 51
paul@52 52
    Local name              Preserved   Functions provide no context
paul@52 53
paul@52 54
    Global name             Preserved   Modules provide no context
paul@52 55
paul@52 56
    Class-originating       Accessor    Methods acquire the context of their
paul@52 57
    attribute                 -or-      accessor if an instance...
paul@52 58
                            Preserved   or retain the original context if the
paul@52 59
                                        accessor is a class
paul@52 60
paul@52 61
    Instance-originating    Preserved   Methods retain their original context
paul@52 62
    attribute
paul@52 63
paul@53 64
There may be some scope for simplifying the above, to the detriment of Python
paul@52 65
compatibility, since the unbound vs. bound methods situation can be confusing.
paul@52 66
paul@45 67
Objects
paul@45 68
-------
paul@45 69
paul@11 70
Since classes, functions and instances are all "objects", each must support
paul@11 71
certain features and operations in the same way.
paul@11 72
paul@11 73
The __class__ Attribute
paul@11 74
-----------------------
paul@11 75
paul@11 76
All objects support the __class__ attribute:
paul@11 77
paul@11 78
Class: refers to the type class (type.__class__ also refers to the type class)
paul@11 79
Function: refers to the function class
paul@11 80
Instance: refers to the class instantiated to make the object
paul@11 81
paul@11 82
Invocation
paul@11 83
----------
paul@11 84
paul@11 85
The following actions need to be supported:
paul@11 86
paul@11 87
Class: create instance, call __init__ with instance, return object
paul@11 88
Function: call function body, return result
paul@11 89
Instance: call __call__ method, return result
paul@11 90
paul@11 91
Structure Layout
paul@11 92
----------------
paul@11 93
paul@11 94
A suitable structure layout might be something like this:
paul@11 95
paul@59 96
    Identifier  Address     Type        Object      ...
paul@59 97
paul@11 98
    0           1           2           3           4
paul@11 99
    classcode   invocation  __class__   attribute   ...
paul@11 100
                reference   reference   reference
paul@11 101
paul@11 102
Here, the classcode refers to the attribute lookup table for the object. Since
paul@11 103
classes and instances share the same classcode, they might resemble the
paul@11 104
following:
paul@11 105
paul@11 106
Class C:
paul@11 107
paul@11 108
    0           1           2           3           4
paul@11 109
    code for C  __new__     class type  attribute   ...
paul@11 110
                reference   reference   reference
paul@11 111
paul@11 112
Instance of C:
paul@11 113
paul@11 114
    0           1           2           3           4
paul@11 115
    code for C  C.__call__  class C     attribute   ...
paul@11 116
                reference   reference   reference
paul@11 117
                (if exists)
paul@11 118
paul@11 119
The __new__ reference would lead to code consisting of the following
paul@11 120
instructions:
paul@11 121
paul@11 122
    create instance for C
paul@11 123
    call C.__init__(instance, ...)
paul@11 124
    return instance
paul@11 125
paul@11 126
If C has a __call__ attribute, the invocation "slot" of C instances would
paul@11 127
refer to the same thing as C.__call__.
paul@11 128
paul@11 129
For functions, the same general layout applies:
paul@11 130
paul@11 131
Function f:
paul@11 132
paul@11 133
    0           1           2           3           4
paul@11 134
    code for    code        class       attribute   ...
paul@11 135
    function    reference   function    reference
paul@11 136
                            reference
paul@11 137
paul@37 138
Here, the code reference would lead to code for the function. Note that the
paul@37 139
function locals are completely distinct from this structure and are not
paul@37 140
comparable to attributes.
paul@37 141
paul@38 142
For modules, there is no meaningful invocation reference:
paul@37 143
paul@37 144
Module m:
paul@37 145
paul@37 146
    0           1           2           3           4
paul@38 147
    code for m  (unused)    module type attribute   ...
paul@38 148
                            reference   (global)
paul@37 149
                                        reference
paul@11 150
paul@38 151
Both classes and modules have code in their definitions, but this would be
paul@38 152
generated in order and not referenced externally.
paul@38 153
paul@11 154
Invocation Operation
paul@11 155
--------------------
paul@11 156
paul@11 157
Consequently, regardless of the object an invocation is always done as
paul@11 158
follows:
paul@11 159
paul@11 160
    get invocation reference (at object+1)
paul@11 161
    jump to reference
paul@11 162
paul@11 163
Additional preparation is necessary before the above code: positional
paul@11 164
arguments must be saved to the parameter stack, and keyword arguments must be
paul@11 165
resolved and saved to the appropriate position in the parameter stack.
paul@11 166
paul@11 167
Attribute Operations
paul@11 168
--------------------
paul@11 169
paul@53 170
Attribute access needs to go through the attribute lookup table. Some
paul@53 171
optimisations are possible and are described in the appropriate section.
paul@53 172
paul@53 173
One important aspect of attribute access is the appropriate setting of the
paul@53 174
context in the acquired attribute value. From the table describing the
paul@53 175
acquisition of values, it is clear that the principal exception is that where
paul@53 176
a class-originating attribute is accessed on an instance. Consequently, the
paul@53 177
following algorithm could be employed once an attribute has been located:
paul@53 178
paul@53 179
 1. If the attribute's context is a special value, indicating that it should
paul@53 180
    be replaced upon instance access, then proceed to the next step;
paul@53 181
    otherwise, acquire both the context and the object as they are.
paul@53 182
paul@53 183
 2. If the accessor is an instance, use that as the value's context, acquiring
paul@53 184
    only the object from the attribute.
paul@53 185
paul@53 186
Where accesses can be determined ahead of time (as discussed in the
paul@53 187
optimisations section), the above algorithm may not necessarily be employed in
paul@53 188
the generated code for some accesses.
paul@21 189
paul@21 190
Instruction Evaluation Model
paul@21 191
============================
paul@21 192
paul@21 193
Programs use a value stack where evaluated instructions may save their
paul@21 194
results. A value stack pointer indicates the top of this stack. In addition, a
paul@21 195
separate stack is used to record the invocation frames. All stack pointers
paul@21 196
refer to the next address to be used by the stack, not the address of the
paul@21 197
uppermost element.
paul@21 198
paul@21 199
    Frame Stack     Value Stack
paul@21 200
    -----------     -----------     Address of Callable
paul@21 201
                                    -------------------
paul@21 202
    previous        ...
paul@21 203
    current ------> callable -----> identifier
paul@21 204
                    arg1            reference to code
paul@21 205
                    arg2
paul@21 206
                    arg3
paul@21 207
                    local4
paul@21 208
                    local5
paul@21 209
                    ...
paul@21 210
paul@21 211
Loading local names is a matter of performing frame-relative accesses to the
paul@21 212
value stack.
paul@21 213
paul@21 214
Invocations and Argument Evaluation
paul@21 215
-----------------------------------
paul@21 216
paul@21 217
When preparing for an invocation, the caller first sets the invocation frame
paul@21 218
pointer. Then, positional arguments are added to the stack such that the first
paul@21 219
argument positions are filled. A number of stack locations for the remaining
paul@21 220
arguments specified in the program are then reserved. The names of keyword
paul@21 221
arguments are used (in the form of table index values) to consult the
paul@21 222
parameter table and to find the location in which such arguments are to be
paul@21 223
stored.
paul@21 224
paul@21 225
    fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
paul@21 226
paul@21 227
    Value Stack
paul@21 228
    -----------
paul@21 229
paul@21 230
    ...             ...             ...             ...
paul@21 231
    fn              fn              fn              fn
paul@21 232
    a               a               a               a
paul@21 233
    b               b               b               b
paul@21 234
    ___             ___             ___         --> 3
paul@21 235
    ___         --> 1               1           |   1
paul@21 236
    ___         |   ___         --> 2           |   2
paul@21 237
    1 -----------   2 -----------   3 -----------
paul@21 238
paul@21 239
Conceptually, the frame can be considered as a collection of attributes, as
paul@21 240
seen in other kinds of structures:
paul@21 241
paul@21 242
Frame for invocation of fn:
paul@21 243
paul@21 244
    0           1           2           3           4           5
paul@21 245
    code        a           b           c           d           e
paul@21 246
    reference
paul@21 247
paul@21 248
However, where arguments are specified positionally, such "attributes" are not
paul@21 249
set using a comparable approach to that employed with other structures.
paul@21 250
Keyword arguments are set using an attribute-like mechanism, though, where the
paul@21 251
position of each argument discovered using the parameter table.
paul@21 252
paul@67 253
Where the target of the invocation is known, the above procedure can be
paul@67 254
optimised slightly by attempting to add keyword argument values directly to
paul@67 255
the stack:
paul@67 256
paul@67 257
    Value Stack
paul@67 258
    -----------
paul@67 259
paul@67 260
    ...             ...             ...             ...             ...
paul@67 261
    fn              fn              fn              fn              fn
paul@67 262
    a               a               a               a               a
paul@67 263
    b               b               b               b               b
paul@67 264
                    ___             ___             ___         --> 3
paul@67 265
                    1               1               1           |   1
paul@67 266
                                    2               1           |   2
paul@67 267
                                                    3 -----------
paul@67 268
paul@67 269
    (reserve for    (add in-place)  (add in-place)  (evaluate)      (store by
paul@67 270
    parameter c)                                                    index)
paul@67 271
paul@67 272
Method Invocations
paul@67 273
------------------
paul@67 274
paul@45 275
Method invocations incorporate an implicit first argument which is obtained
paul@45 276
from the context of the method:
paul@45 277
paul@45 278
    method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
paul@45 279
paul@45 280
    Value Stack
paul@45 281
    -----------
paul@45 282
paul@45 283
    ...
paul@45 284
    method
paul@45 285
    context of method
paul@45 286
    a
paul@45 287
    b
paul@45 288
    3
paul@45 289
    1
paul@45 290
    2
paul@45 291
paul@45 292
Although it could be possible to permit any object to be provided as the first
paul@45 293
argument, in order to optimise instance attribute access in methods, we should
paul@45 294
seek to restrict the object type.
paul@45 295
paul@58 296
Verifying Supplied Arguments
paul@58 297
----------------------------
paul@58 298
paul@58 299
In order to ensure a correct invocation, it is also necessary to check the
paul@58 300
number of supplied arguments. If the target of the invocation is known at
paul@58 301
compile-time, no additional instructions need to be emitted; otherwise, the
paul@58 302
generated code must test for the following situations:
paul@58 303
paul@58 304
 1. That the number of supplied arguments is equal to the number of expected
paul@58 305
    parameters.
paul@58 306
paul@58 307
 2. That no keyword argument overwrites an existing positional parameter.
paul@58 308
paul@66 309
Default Arguments
paul@66 310
-----------------
paul@66 311
paul@66 312
Some arguments may have default values which are used if no value is provided
paul@66 313
in an invocation. Such defaults are initialised when the function itself is
paul@66 314
initialised, and are used to fill in any invocation frames that are known at
paul@66 315
compile-time.
paul@66 316
paul@21 317
Tuples, Frames and Allocation
paul@21 318
-----------------------------
paul@21 319
paul@21 320
Using the approach where arguments are treated like attributes in some kind of
paul@21 321
structure, we could choose to allocate frames in places other than a stack.
paul@21 322
This would produce something somewhat similar to a plain tuple object.
paul@23 323
paul@23 324
Optimisations
paul@23 325
=============
paul@23 326
paul@29 327
Some optimisations around constant objects might be possible; these depend on
paul@29 328
the following:
paul@29 329
paul@29 330
  * Reliable tracking of assignments: where assignment operations occur, the
paul@29 331
    target of the assignment should be determined if any hope of optimisation
paul@29 332
    is to be maintained. Where no guarantees can be made about the target of
paul@29 333
    an assignment, no assignment-related information should be written to
paul@29 334
    potential targets.
paul@29 335
paul@29 336
  * Objects acting as "containers" of attributes must be regarded as "safe":
paul@29 337
    where assignments are recorded as occurring on an attribute, it must be
paul@29 338
    guaranteed that no other unforeseen ways exist to assign to such
paul@29 339
    attributes.
paul@29 340
paul@29 341
The discussion below presents certain rules which must be imposed to uphold
paul@29 342
the above requirements.
paul@29 343
paul@30 344
Safe Containers
paul@30 345
---------------
paul@28 346
paul@23 347
Where attributes of modules, classes and instances are only set once and are
paul@23 348
effectively constant, it should be possible to circumvent the attribute lookup
paul@28 349
mechanism and to directly reference the attribute value. This technique may
paul@30 350
only be considered applicable for the following "container" objects, subject
paul@30 351
to the noted constraints:
paul@28 352
paul@30 353
 1. For modules, "safety" is enforced by ensuring that assignments to module
paul@30 354
    attributes are only permitted within the module itself either at the
paul@30 355
    top-level or via names declared as globals. Thus, the following would not
paul@30 356
    be permitted:
paul@28 357
paul@28 358
    another_module.this_module.attr = value
paul@28 359
paul@29 360
    In the above, this_module is a reference to the current module.
paul@28 361
paul@30 362
 2. For classes, "safety" is enforced by ensuring that assignments to class
paul@30 363
    attributes are only permitted within the class definition, outside
paul@30 364
    methods. This would mean that classes would be "sealed" at definition time
paul@30 365
    (like functions).
paul@28 366
paul@28 367
Unlike the property of function locals that they may only sensibly be accessed
paul@28 368
within the function in which they reside, these cases demand additional
paul@28 369
controls or assumptions on or about access to the stored data. Meanwhile, it
paul@28 370
would be difficult to detect eligible attributes on arbitrary instances due to
paul@28 371
the need for some kind of type inference or abstract execution.
paul@28 372
paul@30 373
Constant Attributes
paul@30 374
-------------------
paul@30 375
paul@30 376
When accessed via "safe containers", as described above, any attribute with
paul@30 377
only one recorded assignment on it can be considered a constant attribute and
paul@30 378
this eligible for optimisation, the consequence of which would be the
paul@30 379
replacement of a LoadAttrIndex instruction (which needs to look up an
paul@30 380
attribute using the run-time details of the "container" and the compile-time
paul@30 381
details of the attribute) with a LoadAttr instruction.
paul@30 382
paul@30 383
However, some restrictions exist on assignment operations which may be
paul@30 384
regarded to cause only one assignment in the lifetime of a program:
paul@30 385
paul@30 386
 1. For module attributes, only assignments at the top-level outside loop
paul@30 387
    statements can be reliably assumed to cause only a single assignment.
paul@30 388
paul@30 389
 2. For class attributes, only assignments at the top-level within class
paul@30 390
    definitions and outside loop statements can be reliably assumed to cause
paul@30 391
    only a single assignment.
paul@30 392
paul@30 393
All assignments satisfying the "safe container" requirements, but not the
paul@30 394
requirements immediately above, should each be recorded as causing at least
paul@30 395
one assignment.
paul@28 396
paul@29 397
Additional Controls
paul@29 398
-------------------
paul@29 399
paul@29 400
For the above cases for "container" objects, the following controls would need
paul@29 401
to apply:
paul@29 402
paul@29 403
 1. Modules would need to be immutable after initialisation. However, during
paul@29 404
    initialisation, there remains a possibility of another module attempting
paul@29 405
    to access the original module. For example, if ppp/__init__.py contained
paul@29 406
    the following...
paul@29 407
paul@29 408
    x = 1
paul@29 409
    import ppp.qqq
paul@29 410
    print x
paul@29 411
paul@29 412
    ...and if ppp/qqq.py contained the following...
paul@29 413
paul@29 414
    import ppp
paul@29 415
    ppp.x = 2
paul@29 416
paul@29 417
    ...then the value 2 would be printed. Since modules are objects which are
paul@29 418
    registered globally in a program, it would be possible to set attributes
paul@29 419
    in the above way.
paul@29 420
paul@29 421
 2. Classes would need to be immutable after initialisation. However, since
paul@29 422
    classes are objects, any reference to a class after initialisation could
paul@29 423
    be used to set attributes on the class.
paul@29 424
paul@29 425
Solutions:
paul@29 426
paul@29 427
 1. Insist on global scope for module attribute assignments.
paul@29 428
paul@29 429
 2. Insist on local scope within classes.
paul@29 430
paul@29 431
Both of the above measures need to be enforced at run-time, since an arbitrary
paul@29 432
attribute assignment could be attempted on any kind of object, yet to uphold
paul@29 433
the properties of "safe containers", attempts to change attributes of such
paul@29 434
objects should be denied. Since foreseen attribute assignment operations have
paul@29 435
certain properties detectable at compile-time, it could be appropriate to
paul@29 436
generate special instructions (or modified instructions) during the
paul@29 437
initialisation of modules and classes for such foreseen assignments, whilst
paul@29 438
employing normal attribute assignment operations in all other cases. Indeed,
paul@29 439
the StoreAttr instruction, which is used to set attributes in "safe
paul@29 440
containers" would be used exclusively for this purpose; the StoreAttrIndex
paul@29 441
instruction would be used exclusively for all other attribute assignments.
paul@29 442
paul@43 443
To ensure the "sealing" of modules and classes, entries in the attribute
paul@43 444
lookup table would encode whether a class or module is being accessed, so
paul@43 445
that the StoreAttrIndex instruction could reject such accesses.
paul@43 446
paul@28 447
Constant Attribute Values
paul@28 448
-------------------------
paul@28 449
paul@29 450
Where an attribute value is itself regarded as constant, is a "safe container"
paul@29 451
and is used in an operation accessing its own attributes, the value can be
paul@29 452
directly inspected for optimisations or employed in the generated code. For
paul@29 453
the attribute values themselves, only objects of a constant nature may be
paul@28 454
considered suitable for this particular optimisation:
paul@28 455
paul@28 456
  * Classes
paul@28 457
  * Modules
paul@28 458
  * Instances defined as constant literals
paul@28 459
paul@28 460
This is because arbitrary objects (such as most instances) have no
paul@28 461
well-defined form before run-time and cannot be investigated further at
paul@28 462
compile-time or have a representation inserted into the generated code.
paul@29 463
paul@29 464
Class Attributes and Access via Instances
paul@29 465
-----------------------------------------
paul@29 466
paul@29 467
Unlike module attributes, class attributes can be accessed in a number of
paul@29 468
different ways:
paul@29 469
paul@29 470
  * Using the class itself:
paul@29 471
paul@29 472
    C.x = 123
paul@29 473
    cls = C; cls.x = 234
paul@29 474
paul@29 475
  * Using a subclass of the class (for reading attributes):
paul@29 476
paul@29 477
    class D(C):
paul@29 478
        pass
paul@29 479
    D.x # setting D.x would populate D, not C
paul@29 480
paul@29 481
  * Using instances of the class or a subclass of the class (for reading
paul@29 482
    attributes):
paul@29 483
paul@29 484
    c = C()
paul@29 485
    c.x # setting c.x would populate c, not C
paul@29 486
paul@29 487
Since assignments are only achieved using direct references to the class, and
paul@29 488
since class attributes should be defined only within the class initialisation
paul@29 489
process, the properties of class attributes should be consistent with those
paul@29 490
desired.
paul@29 491
paul@29 492
Method Access via Instances
paul@29 493
---------------------------
paul@29 494
paul@29 495
It is desirable to optimise method access, even though most method calls are
paul@29 496
likely to occur via instances. It is possible, given the properties of methods
paul@29 497
as class attributes to investigate the kind of instance that the self
paul@29 498
parameter/local refers to within each method: it should be an instance either
paul@29 499
of the class in which the method is defined or a compatible class, although
paul@29 500
situations exist where this might not be the case:
paul@29 501
paul@29 502
  * Explicit invocation of a method:
paul@29 503
paul@29 504
    d = D() # D is not related to C
paul@29 505
    C.f(d) # calling f(self) in C
paul@29 506
paul@30 507
If blatant usage of incompatible instances were somehow disallowed, it would
paul@30 508
still be necessary to investigate the properties of an instance's class and
paul@30 509
its relationship with other classes. Consider the following example:
paul@30 510
paul@30 511
    class A:
paul@30 512
        def f(self): ...
paul@30 513
paul@30 514
    class B:
paul@30 515
        def f(self): ...
paul@30 516
        def g(self):
paul@30 517
            self.f()
paul@30 518
paul@30 519
    class C(A, B):
paul@30 520
        pass
paul@30 521
paul@30 522
Here, instances of B passed into the method B.g could be assumed to provide
paul@30 523
access to B.f when self.f is resolved at compile-time. However, instances of C
paul@30 524
passed into B.g would instead provide access to A.f when self.f is resolved at
paul@30 525
compile-time (since the method resolution order is C, A, B instead of just B).
paul@30 526
paul@30 527
One solution might be to specialise methods for each instance type, but this
paul@30 528
could be costly. Another less ambitious solution might only involve the
paul@30 529
optimisation of such internal method calls if an unambiguous target can be
paul@30 530
resolved.
paul@30 531
paul@29 532
Optimising Function Invocations
paul@29 533
-------------------------------
paul@29 534
paul@29 535
Where an attribute value is itself regarded as constant and is a function,
paul@29 536
knowledge about the parameters of the function can be employed to optimise the
paul@29 537
preparation of the invocation frame.