micropython (annotate README.txt in 24df65a47aa8)

Namespace Definition

paul@15

2

====================

Module attributes are defined either at the module level or by global

paul@15

5

statements.

Class attributes are defined only within class statements.

Instance attributes are defined only by assignments to attributes of self

paul@15

10

within __init__ methods.

Potential Restrictions

paul@42

13

----------------------

Names of classes and functions could be restricted to only refer to those

paul@42

16

objects within the same namespace. If redefinition were to occur, or if

paul@42

17

multiple possibilities were present, these restrictions could be moderated as

paul@42

18

follows:

  * Classes assigned to the same name could provide the union of their

paul@42

21

    attributes. This would, however, cause a potential collision of attribute

paul@42

22

    definitions such as methods.

  * Functions, if they share compatible signatures, could share parameter list

paul@42

25

    definitions.

Data Structures

paul@11

28

===============

The fundamental "value type" is a pair of references: one pointing to the

paul@45

31

referenced object represented by the interchangeable value; one referring to

paul@45

32

the context of the referenced object, typically the object through which the

paul@45

33

referenced object was acquired as an attribute.A

Value Layout

paul@45

36

------------

    0           1

paul@45

39

    object      context

paul@45

40

    reference   reference

Acquiring Values

paul@52

43

----------------

Values are acquired through name lookups and attribute access, yielding

paul@52

46

the appropriate object reference together with a context reference as

paul@52

47

indicated in the following table:

    Type of Access          Context     Notes

paul@52

50

    --------------          -------     -----

    Local name              Preserved   Functions provide no context

    Global name             Preserved   Modules provide no context

    Class-originating       Accessor    Methods acquire the context of their

paul@52

57

    attribute                 -or-      accessor if an instance...

paul@52

58

                            Preserved   or retain the original context if the

paul@52

59

                                        accessor is a class

    Instance-originating    Preserved   Methods retain their original context

paul@52

62

    attribute

There may be some scope for simplifying the above, to the detriment of Python

paul@52

65

compatibility, since the unbound vs. bound methods situation can be confusing.

Objects

paul@45

68

-------

Since classes, functions and instances are all "objects", each must support

paul@11

71

certain features and operations in the same way.

The __class__ Attribute

paul@11

74

-----------------------

All objects support the __class__ attribute:

Class: refers to the type class (type.__class__ also refers to the type class)

paul@11

79

Function: refers to the function class

paul@11

80

Instance: refers to the class instantiated to make the object

Invocation

paul@11

83

----------

The following actions need to be supported:

Class: create instance, call __init__ with instance, return object

paul@11

88

Function: call function body, return result

paul@11

89

Instance: call __call__ method, return result

Structure Layout

paul@11

92

----------------

A suitable structure layout might be something like this:

    Identifier  Address     Type        Object      ...

    0           1           2           3           4

paul@11

99

    classcode   invocation  __class__   attribute   ...

paul@11

100

                reference   reference   reference

Here, the classcode refers to the attribute lookup table for the object. Since

paul@11

103

classes and instances share the same classcode, they might resemble the

paul@11

104

following:

Class C:

    0           1           2           3           4

paul@11

109

    code for C  __new__     class type  attribute   ...

paul@11

110

                reference   reference   reference

Instance of C:

    0           1           2           3           4

paul@11

115

    code for C  C.__call__  class C     attribute   ...

paul@11

116

                reference   reference   reference

paul@11

117

                (if exists)

The __new__ reference would lead to code consisting of the following

paul@11

120

instructions:

    create instance for C

paul@11

123

    call C.__init__(instance, ...)

paul@11

124

    return instance

If C has a __call__ attribute, the invocation "slot" of C instances would

paul@11

127

refer to the same thing as C.__call__.

For functions, the same general layout applies:

Function f:

    0           1           2           3           4

paul@11

134

    code for    code        class       attribute   ...

paul@11

135

    function    reference   function    reference

paul@11

136

                            reference

Here, the code reference would lead to code for the function. Note that the

paul@37

139

function locals are completely distinct from this structure and are not

paul@37

140

comparable to attributes.

For modules, there is no meaningful invocation reference:

Module m:

    0           1           2           3           4

paul@38

147

    code for m  (unused)    module type attribute   ...

paul@38

148

                            reference   (global)

paul@37

149

                                        reference

Both classes and modules have code in their definitions, but this would be

paul@38

152

generated in order and not referenced externally.

Invocation Operation

paul@11

155

--------------------

Consequently, regardless of the object an invocation is always done as

paul@11

158

follows:

    get invocation reference (at object+1)

paul@11

161

    jump to reference

Additional preparation is necessary before the above code: positional

paul@11

164

arguments must be saved to the parameter stack, and keyword arguments must be

paul@11

165

resolved and saved to the appropriate position in the parameter stack.

Attribute Operations

paul@11

168

--------------------

Attribute access needs to go through the attribute lookup table. Some

paul@53

171

optimisations are possible and are described in the appropriate section.

One important aspect of attribute access is the appropriate setting of the

paul@53

174

context in the acquired attribute value. From the table describing the

paul@53

175

acquisition of values, it is clear that the principal exception is that where

paul@53

176

a class-originating attribute is accessed on an instance. Consequently, the

paul@53

177

following algorithm could be employed once an attribute has been located:

 1. If the attribute's context is a special value, indicating that it should

paul@53

180

    be replaced upon instance access, then proceed to the next step;

paul@53

181

    otherwise, acquire both the context and the object as they are.

 2. If the accessor is an instance, use that as the value's context, acquiring

paul@53

184

    only the object from the attribute.

Where accesses can be determined ahead of time (as discussed in the

paul@53

187

optimisations section), the above algorithm may not necessarily be employed in

paul@53

188

the generated code for some accesses.

Instruction Evaluation Model

paul@21

191

============================

Programs use a value stack where evaluated instructions may save their

paul@21

194

results. A value stack pointer indicates the top of this stack. In addition, a

paul@21

195

separate stack is used to record the invocation frames. All stack pointers

paul@21

196

refer to the next address to be used by the stack, not the address of the

paul@21

197

uppermost element.

    Frame Stack     Value Stack

paul@21

200

    -----------     -----------     Address of Callable

paul@21

201

                                    -------------------

paul@21

202

    previous        ...

paul@21

203

    current ------> callable -----> identifier

paul@21

204

                    arg1            reference to code

paul@21

205

                    arg2

paul@21

206

                    arg3

paul@21

207

                    local4

paul@21

208

                    local5

paul@21

209

...

Loading local names is a matter of performing frame-relative accesses to the

paul@21

212

value stack.

Invocations and Argument Evaluation

paul@21

215

-----------------------------------

When preparing for an invocation, the caller first sets the invocation frame

paul@21

218

pointer. Then, positional arguments are added to the stack such that the first

paul@21

219

argument positions are filled. A number of stack locations for the remaining

paul@21

220

arguments specified in the program are then reserved. The names of keyword

paul@21

221

arguments are used (in the form of table index values) to consult the

paul@21

222

parameter table and to find the location in which such arguments are to be

paul@21

223

stored.

    fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)

    Value Stack

paul@21

228

    -----------

    ...             ...             ...             ...

paul@21

231

    fn              fn              fn              fn

paul@21

232

    a               a               a               a

paul@21

233

    b               b               b               b

paul@21

234

    ___             ___             ___         --> 3

paul@21

235

    ___         --> 1               1           |   1

paul@21

236

    ___         |   ___         --> 2           |   2

paul@21

237

    1 -----------   2 -----------   3 -----------

Conceptually, the frame can be considered as a collection of attributes, as

paul@21

240

seen in other kinds of structures:

Frame for invocation of fn:

    0           1           2           3           4           5

paul@21

245

    code        a           b           c           d           e

paul@21

246

    reference

However, where arguments are specified positionally, such "attributes" are not

paul@21

249

set using a comparable approach to that employed with other structures.

paul@21

250

Keyword arguments are set using an attribute-like mechanism, though, where the

paul@21

251

position of each argument discovered using the parameter table.

Method invocations incorporate an implicit first argument which is obtained

paul@45

254

from the context of the method:

    method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)

    Value Stack

paul@45

259

    -----------

...

paul@45

262

    method

paul@45

263

    context of method

Although it could be possible to permit any object to be provided as the first

paul@45

271

argument, in order to optimise instance attribute access in methods, we should

paul@45

272

seek to restrict the object type.

Verifying Supplied Arguments

paul@58

275

----------------------------

In order to ensure a correct invocation, it is also necessary to check the

paul@58

278

number of supplied arguments. If the target of the invocation is known at

paul@58

279

compile-time, no additional instructions need to be emitted; otherwise, the

paul@58

280

generated code must test for the following situations:

 1. That the number of supplied arguments is equal to the number of expected

paul@58

283

    parameters.

 2. That no keyword argument overwrites an existing positional parameter.

Tuples, Frames and Allocation

paul@21

288

-----------------------------

Using the approach where arguments are treated like attributes in some kind of

paul@21

291

structure, we could choose to allocate frames in places other than a stack.

paul@21

292

This would produce something somewhat similar to a plain tuple object.

Optimisations

paul@23

295

=============

Some optimisations around constant objects might be possible; these depend on

paul@29

298

the following:

  * Reliable tracking of assignments: where assignment operations occur, the

paul@29

301

    target of the assignment should be determined if any hope of optimisation

paul@29

302

    is to be maintained. Where no guarantees can be made about the target of

paul@29

303

    an assignment, no assignment-related information should be written to

paul@29

304

    potential targets.

  * Objects acting as "containers" of attributes must be regarded as "safe":

paul@29

307

    where assignments are recorded as occurring on an attribute, it must be

paul@29

308

    guaranteed that no other unforeseen ways exist to assign to such

paul@29

309

    attributes.

The discussion below presents certain rules which must be imposed to uphold

paul@29

312

the above requirements.

Safe Containers

paul@30

315

---------------

Where attributes of modules, classes and instances are only set once and are

paul@23

318

effectively constant, it should be possible to circumvent the attribute lookup

paul@28

319

mechanism and to directly reference the attribute value. This technique may

paul@30

320

only be considered applicable for the following "container" objects, subject

paul@30

321

to the noted constraints:

 1. For modules, "safety" is enforced by ensuring that assignments to module

paul@30

324

    attributes are only permitted within the module itself either at the

paul@30

325

    top-level or via names declared as globals. Thus, the following would not

paul@30

326

    be permitted:

    another_module.this_module.attr = value

    In the above, this_module is a reference to the current module.

 2. For classes, "safety" is enforced by ensuring that assignments to class

paul@30

333

    attributes are only permitted within the class definition, outside

paul@30

334

    methods. This would mean that classes would be "sealed" at definition time

paul@30

335

    (like functions).

Unlike the property of function locals that they may only sensibly be accessed

paul@28

338

within the function in which they reside, these cases demand additional

paul@28

339

controls or assumptions on or about access to the stored data. Meanwhile, it

paul@28

340

would be difficult to detect eligible attributes on arbitrary instances due to

paul@28

341

the need for some kind of type inference or abstract execution.

Constant Attributes

paul@30

344

-------------------

When accessed via "safe containers", as described above, any attribute with

paul@30

347

only one recorded assignment on it can be considered a constant attribute and

paul@30

348

this eligible for optimisation, the consequence of which would be the

paul@30

349

replacement of a LoadAttrIndex instruction (which needs to look up an

paul@30

350

attribute using the run-time details of the "container" and the compile-time

paul@30

351

details of the attribute) with a LoadAttr instruction.

However, some restrictions exist on assignment operations which may be

paul@30

354

regarded to cause only one assignment in the lifetime of a program:

 1. For module attributes, only assignments at the top-level outside loop

paul@30

357

    statements can be reliably assumed to cause only a single assignment.

 2. For class attributes, only assignments at the top-level within class

paul@30

360

    definitions and outside loop statements can be reliably assumed to cause

paul@30

361

    only a single assignment.

All assignments satisfying the "safe container" requirements, but not the

paul@30

364

requirements immediately above, should each be recorded as causing at least

paul@30

365

one assignment.

Additional Controls

paul@29

368

-------------------

For the above cases for "container" objects, the following controls would need

paul@29

371

to apply:

 1. Modules would need to be immutable after initialisation. However, during

paul@29

374

    initialisation, there remains a possibility of another module attempting

paul@29

375

    to access the original module. For example, if ppp/__init__.py contained

paul@29

376

    the following...

    x = 1

paul@29

379

    import ppp.qqq

paul@29

380

    print x

    ...and if ppp/qqq.py contained the following...

    import ppp

paul@29

385

    ppp.x = 2

    ...then the value 2 would be printed. Since modules are objects which are

paul@29

388

    registered globally in a program, it would be possible to set attributes

paul@29

389

    in the above way.

 2. Classes would need to be immutable after initialisation. However, since

paul@29

392

    classes are objects, any reference to a class after initialisation could

paul@29

393

    be used to set attributes on the class.

Solutions:

 1. Insist on global scope for module attribute assignments.

 2. Insist on local scope within classes.

Both of the above measures need to be enforced at run-time, since an arbitrary

paul@29

402

attribute assignment could be attempted on any kind of object, yet to uphold

paul@29

403

the properties of "safe containers", attempts to change attributes of such

paul@29

404

objects should be denied. Since foreseen attribute assignment operations have

paul@29

405

certain properties detectable at compile-time, it could be appropriate to

paul@29

406

generate special instructions (or modified instructions) during the

paul@29

407

initialisation of modules and classes for such foreseen assignments, whilst

paul@29

408

employing normal attribute assignment operations in all other cases. Indeed,

paul@29

409

the StoreAttr instruction, which is used to set attributes in "safe

paul@29

410

containers" would be used exclusively for this purpose; the StoreAttrIndex

paul@29

411

instruction would be used exclusively for all other attribute assignments.

To ensure the "sealing" of modules and classes, entries in the attribute

paul@43

414

lookup table would encode whether a class or module is being accessed, so

paul@43

415

that the StoreAttrIndex instruction could reject such accesses.

Constant Attribute Values

paul@28

418

-------------------------

Where an attribute value is itself regarded as constant, is a "safe container"

paul@29

421

and is used in an operation accessing its own attributes, the value can be

paul@29

422

directly inspected for optimisations or employed in the generated code. For

paul@29

423

the attribute values themselves, only objects of a constant nature may be

paul@28

424

considered suitable for this particular optimisation:

  * Classes

paul@28

427

  * Modules

paul@28

428

  * Instances defined as constant literals

This is because arbitrary objects (such as most instances) have no

paul@28

431

well-defined form before run-time and cannot be investigated further at

paul@28

432

compile-time or have a representation inserted into the generated code.

Class Attributes and Access via Instances

paul@29

435

-----------------------------------------

Unlike module attributes, class attributes can be accessed in a number of

paul@29

438

different ways:

  * Using the class itself:

    C.x = 123

paul@29

443

    cls = C; cls.x = 234

  * Using a subclass of the class (for reading attributes):

    class D(C):

paul@29

448

        pass

paul@29

449

    D.x # setting D.x would populate D, not C

  * Using instances of the class or a subclass of the class (for reading

paul@29

452

    attributes):

    c = C()

paul@29

455

    c.x # setting c.x would populate c, not C

Since assignments are only achieved using direct references to the class, and

paul@29

458

since class attributes should be defined only within the class initialisation

paul@29

459

process, the properties of class attributes should be consistent with those

paul@29

460

desired.

Method Access via Instances

paul@29

463

---------------------------

It is desirable to optimise method access, even though most method calls are

paul@29

466

likely to occur via instances. It is possible, given the properties of methods

paul@29

467

as class attributes to investigate the kind of instance that the self

paul@29

468

parameter/local refers to within each method: it should be an instance either

paul@29

469

of the class in which the method is defined or a compatible class, although

paul@29

470

situations exist where this might not be the case:

  * Explicit invocation of a method:

    d = D() # D is not related to C

paul@29

475

    C.f(d) # calling f(self) in C

If blatant usage of incompatible instances were somehow disallowed, it would

paul@30

478

still be necessary to investigate the properties of an instance's class and

paul@30

479

its relationship with other classes. Consider the following example:

    class A:

paul@30

482

        def f(self): ...

    class B:

paul@30

485

        def f(self): ...

paul@30

486

        def g(self):

paul@30

487

            self.f()

    class C(A, B):

paul@30

490

        pass

Here, instances of B passed into the method B.g could be assumed to provide

paul@30

493

access to B.f when self.f is resolved at compile-time. However, instances of C

paul@30

494

passed into B.g would instead provide access to A.f when self.f is resolved at

paul@30

495

compile-time (since the method resolution order is C, A, B instead of just B).

One solution might be to specialise methods for each instance type, but this

paul@30

498

could be costly. Another less ambitious solution might only involve the

paul@30

499

optimisation of such internal method calls if an unambiguous target can be

paul@30

500

resolved.

Optimising Function Invocations

paul@29

503

-------------------------------

Where an attribute value is itself regarded as constant and is a function,

paul@29

506

knowledge about the parameters of the function can be employed to optimise the

paul@29

507

preparation of the invocation frame.

micropython

Annotated README.txt