1 Namespace Definition
2 ====================
3
4 Module attributes are defined either at the module level or by global
5 statements.
6
7 Class attributes are defined only within class statements.
8
9 Instance attributes are defined only by assignments to attributes of self
10 within __init__ methods.
11
12 Potential Restrictions
13 ----------------------
14
15 Names of classes and functions could be restricted to only refer to those
16 objects within the same namespace. If redefinition were to occur, or if
17 multiple possibilities were present, these restrictions could be moderated as
18 follows:
19
20 * Classes assigned to the same name could provide the union of their
21 attributes. This would, however, cause a potential collision of attribute
22 definitions such as methods.
23
24 * Functions, if they share compatible signatures, could share parameter list
25 definitions.
26
27 Data Structures
28 ===============
29
30 The fundamental "value type" is a pair of references: one pointing to the
31 referenced object represented by the interchangeable value; one referring to
32 the context of the referenced object, typically the object through which the
33 referenced object was acquired as an attribute.
34
35 Value Layout
36 ------------
37
38 0 1
39 object context
40 reference reference
41
42 Acquiring Values
43 ----------------
44
45 Values are acquired through name lookups and attribute access, yielding
46 the appropriate object reference together with a context reference as
47 indicated in the following table:
48
49 Type of Access Context Notes
50 -------------- ------- -----
51
52 Local name Preserved Functions provide no context
53
54 Global name Preserved Modules provide no context
55
56 Class-originating Accessor Methods acquire the context of their
57 attribute -or- accessor if an instance...
58 Preserved or retain the original context if the
59 accessor is a class
60
61 Instance-originating Preserved Methods retain their original context
62 attribute
63
64 There may be some scope for simplifying the above, to the detriment of Python
65 compatibility, since the unbound vs. bound methods situation can be confusing.
66
67 Manipulating Values
68 -------------------
69
70 According to the table describing value acquisition, different instructions
71 must implement different operations when acquiring values:
72
73 Instruction Purpose Context Operations
74 ----------- ------- ------------------
75
76 LoadConst Load class, function, Combine null context with loaded
77 module, constant object
78
79 LoadAddress Load attribute from Classes, functions and modules
80 known object stored as cause the loaded attribute to be
81 an attribute retrieved unchanged...
82 whereas constants (representing
83 instances) cause the constant to
84 override the attribute's own
85 context
86
87 LoadAttr Load attribute from Combine the instance as context
88 instance stored as an with the object from the attribute
89 attribute
90
91 LoadAttrIndex Load attribute from Classes, functions and modules as
92 unknown object stored the unknown object accessor cause
93 as an attribute the loaded attribute to be
94 retrieved unchanged...
95 whereas instances cause the
96 accessor to override the
97 attribute's own context
98
99 Note that of the above context operations, only the decision logic in
100 connection with LoadAttrIndex is performed at run-time.
101
102 Objects
103 -------
104
105 Since classes, functions and instances are all "objects", each must support
106 certain features and operations in the same way.
107
108 The __class__ Attribute
109 -----------------------
110
111 All objects support the __class__ attribute:
112
113 Class: refers to the type class (type.__class__ also refers to the type class)
114 Function: refers to the function class
115 Instance: refers to the class instantiated to make the object
116
117 Invocation
118 ----------
119
120 The following actions need to be supported:
121
122 Class: create instance, call __init__ with instance, return object
123 Function: call function body, return result
124 Instance: call __call__ method, return result
125
126 Structure Layout
127 ----------------
128
129 A suitable structure layout might be something like this:
130
131 Identifier Address Type Object ...
132
133 0 1 2 3 4
134 classcode invocation __class__ attribute ...
135 reference reference reference
136
137 Here, the classcode refers to the attribute lookup table for the object. Since
138 classes and instances share the same classcode, they might resemble the
139 following:
140
141 Class C:
142
143 0 1 2 3 4
144 code for C __new__ class type attribute ...
145 reference reference reference
146
147 Instance of C:
148
149 0 1 2 3 4
150 code for C C.__call__ class C attribute ...
151 reference reference reference
152 (if exists)
153
154 The __new__ reference would lead to code consisting of the following
155 instructions:
156
157 create instance for C
158 call C.__init__(instance, ...)
159 return instance
160
161 If C has a __call__ attribute, the invocation "slot" of C instances would
162 refer to the same thing as C.__call__.
163
164 For functions, the same general layout applies:
165
166 Function f:
167
168 0 1 2 3 4
169 code for code class attribute ...
170 function reference function reference
171 reference
172
173 Here, the code reference would lead to code for the function. Note that the
174 function locals are completely distinct from this structure and are not
175 comparable to attributes.
176
177 For modules, there is no meaningful invocation reference:
178
179 Module m:
180
181 0 1 2 3 4
182 code for m (unused) module type attribute ...
183 reference (global)
184 reference
185
186 Both classes and modules have code in their definitions, but this would be
187 generated in order and not referenced externally.
188
189 Invocation Operation
190 --------------------
191
192 Consequently, regardless of the object an invocation is always done as
193 follows:
194
195 get invocation reference (at object+1)
196 jump to reference
197
198 Additional preparation is necessary before the above code: positional
199 arguments must be saved to the parameter stack, and keyword arguments must be
200 resolved and saved to the appropriate position in the parameter stack.
201
202 Attribute Operations
203 --------------------
204
205 Attribute access needs to go through the attribute lookup table. Some
206 optimisations are possible and are described in the appropriate section.
207
208 One important aspect of attribute access is the appropriate setting of the
209 context in the acquired attribute value. From the table describing the
210 acquisition of values, it is clear that the principal exception is that where
211 a class-originating attribute is accessed on an instance. Consequently, the
212 following algorithm could be employed once an attribute has been located:
213
214 1. If the attribute's context is a special value, indicating that it should
215 be replaced upon instance access, then proceed to the next step;
216 otherwise, acquire both the context and the object as they are.
217
218 2. If the accessor is an instance, use that as the value's context, acquiring
219 only the object from the attribute.
220
221 Where accesses can be determined ahead of time (as discussed in the
222 optimisations section), the above algorithm may not necessarily be employed in
223 the generated code for some accesses.
224
225 Instruction Evaluation Model
226 ============================
227
228 Programs use a value stack where evaluated instructions may save their
229 results. A value stack pointer indicates the top of this stack. In addition, a
230 separate stack is used to record the invocation frames. All stack pointers
231 refer to the next address to be used by the stack, not the address of the
232 uppermost element.
233
234 Frame Stack Value Stack
235 ----------- ----------- Address of Callable
236 -------------------
237 previous ...
238 current ------> callable -----> identifier
239 arg1 reference to code
240 arg2
241 arg3
242 local4
243 local5
244 ...
245
246 Loading local names is a matter of performing frame-relative accesses to the
247 value stack.
248
249 Invocations and Argument Evaluation
250 -----------------------------------
251
252 When preparing for an invocation, the caller first sets the invocation frame
253 pointer. Then, positional arguments are added to the stack such that the first
254 argument positions are filled. A number of stack locations for the remaining
255 arguments specified in the program are then reserved. The names of keyword
256 arguments are used (in the form of table index values) to consult the
257 parameter table and to find the location in which such arguments are to be
258 stored.
259
260 fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
261
262 Value Stack
263 -----------
264
265 ... ... ... ...
266 fn fn fn fn
267 a a a a
268 b b b b
269 ___ ___ ___ --> 3
270 ___ --> 1 1 | 1
271 ___ | ___ --> 2 | 2
272 1 ----------- 2 ----------- 3 -----------
273
274 Conceptually, the frame can be considered as a collection of attributes, as
275 seen in other kinds of structures:
276
277 Frame for invocation of fn:
278
279 0 1 2 3 4 5
280 code a b c d e
281 reference
282
283 However, where arguments are specified positionally, such "attributes" are not
284 set using a comparable approach to that employed with other structures.
285 Keyword arguments are set using an attribute-like mechanism, though, where the
286 position of each argument discovered using the parameter table.
287
288 Where the target of the invocation is known, the above procedure can be
289 optimised slightly by attempting to add keyword argument values directly to
290 the stack:
291
292 Value Stack
293 -----------
294
295 ... ... ... ... ...
296 fn fn fn fn fn
297 a a a a a
298 b b b b b
299 ___ ___ ___ --> 3
300 1 1 1 | 1
301 2 1 | 2
302 3 -----------
303
304 (reserve for (add in-place) (add in-place) (evaluate) (store by
305 parameter c) index)
306
307 Method Invocations
308 ------------------
309
310 Method invocations incorporate an implicit first argument which is obtained
311 from the context of the method:
312
313 method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
314
315 Value Stack
316 -----------
317
318 ...
319 method
320 context of method
321 a
322 b
323 3
324 1
325 2
326
327 Although it could be possible to permit any object to be provided as the first
328 argument, in order to optimise instance attribute access in methods, we should
329 seek to restrict the object type.
330
331 Verifying Supplied Arguments
332 ----------------------------
333
334 In order to ensure a correct invocation, it is also necessary to check the
335 number of supplied arguments. If the target of the invocation is known at
336 compile-time, no additional instructions need to be emitted; otherwise, the
337 generated code must test for the following situations:
338
339 1. That the number of supplied arguments is equal to the number of expected
340 parameters.
341
342 2. That no keyword argument overwrites an existing positional parameter.
343
344 Default Arguments
345 -----------------
346
347 Some arguments may have default values which are used if no value is provided
348 in an invocation. Such defaults are initialised when the function itself is
349 initialised, and are used to fill in any invocation frames that are known at
350 compile-time.
351
352 Tuples, Frames and Allocation
353 -----------------------------
354
355 Using the approach where arguments are treated like attributes in some kind of
356 structure, we could choose to allocate frames in places other than a stack.
357 This would produce something somewhat similar to a plain tuple object.
358
359 Optimisations
360 =============
361
362 Some optimisations around constant objects might be possible; these depend on
363 the following:
364
365 * Reliable tracking of assignments: where assignment operations occur, the
366 target of the assignment should be determined if any hope of optimisation
367 is to be maintained. Where no guarantees can be made about the target of
368 an assignment, no assignment-related information should be written to
369 potential targets.
370
371 * Objects acting as "containers" of attributes must be regarded as "safe":
372 where assignments are recorded as occurring on an attribute, it must be
373 guaranteed that no other unforeseen ways exist to assign to such
374 attributes.
375
376 The discussion below presents certain rules which must be imposed to uphold
377 the above requirements.
378
379 Safe Containers
380 ---------------
381
382 Where attributes of modules, classes and instances are only set once and are
383 effectively constant, it should be possible to circumvent the attribute lookup
384 mechanism and to directly reference the attribute value. This technique may
385 only be considered applicable for the following "container" objects, subject
386 to the noted constraints:
387
388 1. For modules, "safety" is enforced by ensuring that assignments to module
389 attributes are only permitted within the module itself either at the
390 top-level or via names declared as globals. Thus, the following would not
391 be permitted:
392
393 another_module.this_module.attr = value
394
395 In the above, this_module is a reference to the current module.
396
397 2. For classes, "safety" is enforced by ensuring that assignments to class
398 attributes are only permitted within the class definition, outside
399 methods. This would mean that classes would be "sealed" at definition time
400 (like functions).
401
402 Unlike the property of function locals that they may only sensibly be accessed
403 within the function in which they reside, these cases demand additional
404 controls or assumptions on or about access to the stored data. Meanwhile, it
405 would be difficult to detect eligible attributes on arbitrary instances due to
406 the need for some kind of type inference or abstract execution.
407
408 Constant Attributes
409 -------------------
410
411 When accessed via "safe containers", as described above, any attribute with
412 only one recorded assignment on it can be considered a constant attribute and
413 this eligible for optimisation, the consequence of which would be the
414 replacement of a LoadAttrIndex instruction (which needs to look up an
415 attribute using the run-time details of the "container" and the compile-time
416 details of the attribute) with a LoadAttr instruction.
417
418 However, some restrictions exist on assignment operations which may be
419 regarded to cause only one assignment in the lifetime of a program:
420
421 1. For module attributes, only assignments at the top-level outside loop
422 statements can be reliably assumed to cause only a single assignment.
423
424 2. For class attributes, only assignments at the top-level within class
425 definitions and outside loop statements can be reliably assumed to cause
426 only a single assignment.
427
428 All assignments satisfying the "safe container" requirements, but not the
429 requirements immediately above, should each be recorded as causing at least
430 one assignment.
431
432 Additional Controls
433 -------------------
434
435 For the above cases for "container" objects, the following controls would need
436 to apply:
437
438 1. Modules would need to be immutable after initialisation. However, during
439 initialisation, there remains a possibility of another module attempting
440 to access the original module. For example, if ppp/__init__.py contained
441 the following...
442
443 x = 1
444 import ppp.qqq
445 print x
446
447 ...and if ppp/qqq.py contained the following...
448
449 import ppp
450 ppp.x = 2
451
452 ...then the value 2 would be printed. Since modules are objects which are
453 registered globally in a program, it would be possible to set attributes
454 in the above way.
455
456 2. Classes would need to be immutable after initialisation. However, since
457 classes are objects, any reference to a class after initialisation could
458 be used to set attributes on the class.
459
460 Solutions:
461
462 1. Insist on global scope for module attribute assignments.
463
464 2. Insist on local scope within classes.
465
466 Both of the above measures need to be enforced at run-time, since an arbitrary
467 attribute assignment could be attempted on any kind of object, yet to uphold
468 the properties of "safe containers", attempts to change attributes of such
469 objects should be denied. Since foreseen attribute assignment operations have
470 certain properties detectable at compile-time, it could be appropriate to
471 generate special instructions (or modified instructions) during the
472 initialisation of modules and classes for such foreseen assignments, whilst
473 employing normal attribute assignment operations in all other cases. Indeed,
474 the StoreAttr instruction, which is used to set attributes in "safe
475 containers" would be used exclusively for this purpose; the StoreAttrIndex
476 instruction would be used exclusively for all other attribute assignments.
477
478 To ensure the "sealing" of modules and classes, entries in the attribute
479 lookup table would encode whether a class or module is being accessed, so
480 that the StoreAttrIndex instruction could reject such accesses.
481
482 Constant Attribute Values
483 -------------------------
484
485 Where an attribute value is itself regarded as constant, is a "safe container"
486 and is used in an operation accessing its own attributes, the value can be
487 directly inspected for optimisations or employed in the generated code. For
488 the attribute values themselves, only objects of a constant nature may be
489 considered suitable for this particular optimisation:
490
491 * Classes
492 * Modules
493 * Instances defined as constant literals
494
495 This is because arbitrary objects (such as most instances) have no
496 well-defined form before run-time and cannot be investigated further at
497 compile-time or have a representation inserted into the generated code.
498
499 Class Attributes and Access via Instances
500 -----------------------------------------
501
502 Unlike module attributes, class attributes can be accessed in a number of
503 different ways:
504
505 * Using the class itself:
506
507 C.x = 123
508 cls = C; cls.x = 234
509
510 * Using a subclass of the class (for reading attributes):
511
512 class D(C):
513 pass
514 D.x # setting D.x would populate D, not C
515
516 * Using instances of the class or a subclass of the class (for reading
517 attributes):
518
519 c = C()
520 c.x # setting c.x would populate c, not C
521
522 Since assignments are only achieved using direct references to the class, and
523 since class attributes should be defined only within the class initialisation
524 process, the properties of class attributes should be consistent with those
525 desired.
526
527 Method Access via Instances
528 ---------------------------
529
530 It is desirable to optimise method access, even though most method calls are
531 likely to occur via instances. It is possible, given the properties of methods
532 as class attributes to investigate the kind of instance that the self
533 parameter/local refers to within each method: it should be an instance either
534 of the class in which the method is defined or a compatible class, although
535 situations exist where this might not be the case:
536
537 * Explicit invocation of a method:
538
539 d = D() # D is not related to C
540 C.f(d) # calling f(self) in C
541
542 If blatant usage of incompatible instances were somehow disallowed, it would
543 still be necessary to investigate the properties of an instance's class and
544 its relationship with other classes. Consider the following example:
545
546 class A:
547 def f(self): ...
548
549 class B:
550 def f(self): ...
551 def g(self):
552 self.f()
553
554 class C(A, B):
555 pass
556
557 Here, instances of B passed into the method B.g could be assumed to provide
558 access to B.f when self.f is resolved at compile-time. However, instances of C
559 passed into B.g would instead provide access to A.f when self.f is resolved at
560 compile-time (since the method resolution order is C, A, B instead of just B).
561
562 One solution might be to specialise methods for each instance type, but this
563 could be costly. Another less ambitious solution might only involve the
564 optimisation of such internal method calls if an unambiguous target can be
565 resolved.
566
567 Optimising Function Invocations
568 -------------------------------
569
570 Where an attribute value is itself regarded as constant and is a function,
571 knowledge about the parameters of the function can be employed to optimise the
572 preparation of the invocation frame.