1 Namespace Definition
2 ====================
3
4 Module attributes are defined either at the module level or by global
5 statements.
6
7 Class attributes are defined only within class statements.
8
9 Instance attributes are defined only by assignments to attributes of self
10 within __init__ methods.
11
12 Potential Restrictions
13 ----------------------
14
15 Names of classes and functions could be restricted to only refer to those
16 objects within the same namespace. If redefinition were to occur, or if
17 multiple possibilities were present, these restrictions could be moderated as
18 follows:
19
20 * Classes assigned to the same name could provide the union of their
21 attributes. This would, however, cause a potential collision of attribute
22 definitions such as methods.
23
24 * Functions, if they share compatible signatures, could share parameter list
25 definitions.
26
27 Data Structures
28 ===============
29
30 The fundamental "value type" is a pair of references: one pointing to the
31 referenced object represented by the interchangeable value; one referring to
32 the context of the referenced object, typically the object through which the
33 referenced object was acquired as an attribute.
34
35 Value Layout
36 ------------
37
38 0 1
39 object context
40 reference reference
41
42 Acquiring Values
43 ----------------
44
45 Values are acquired through name lookups and attribute access, yielding
46 the appropriate object reference together with a context reference as
47 indicated in the following table:
48
49 Type of Access Context Notes
50 -------------- ------- -----
51
52 Local name Preserved Functions provide no context
53
54 Global name Preserved Modules provide no context
55
56 Class-originating Accessor Class accessor preserves the stored
57 attribute -or- context; instance accessor overrides
58 Preserved the stored context if it is null or
59 belongs to the instance's class
60 hierarchy
61
62 Instance-originating Preserved Methods retain their original context
63 attribute
64
65 There may be some scope for simplifying the above, to the detriment of Python
66 compatibility, since the unbound vs. bound methods situation can be confusing.
67
68 Manipulating Values
69 -------------------
70
71 According to the table describing value acquisition, different instructions
72 must implement different operations when acquiring values:
73
74 Instruction Purpose Context Operations
75 ----------- ------- ------------------
76
77 LoadConst Load class, function, Combine null context with loaded
78 module, constant object
79
80 LoadAddress Load attribute from Classes, functions and modules
81 known object stored as cause the loaded attribute to be
82 an attribute retrieved unchanged; whereas
83 constants (representing instances)
84 cause the constant to override the
85 attribute's own context (since all
86 attributes should belong to the
87 constant's class hierarchy)
88
89 LoadAttr Load attribute from Attributes with null contexts or
90 instance stored as an contexts compatible with the
91 attribute instance cause loaded attributes
92 to combine the instance as context
93 with the object from the
94 attribute; other attributes have
95 their context preserved
96
97 LoadAttrIndex Load attribute from Classes, functions and modules as
98 unknown object stored the unknown object accessor cause
99 as an attribute the loaded attribute to be
100 retrieved unchanged; whereas
101 instances cause the LoadAttr rules
102 to apply
103
104 Consequently, a certain amount of run-time testing is required for both
105 LoadAttr and LoadAttrIndex.
106
107 Objects
108 -------
109
110 Since classes, functions and instances are all "objects", each must support
111 certain features and operations in the same way.
112
113 The __class__ Attribute
114 -----------------------
115
116 All objects support the __class__ attribute:
117
118 Class: refers to the type class (type.__class__ also refers to the type class)
119 Function: refers to the function class
120 Instance: refers to the class instantiated to make the object
121
122 Invocation
123 ----------
124
125 The following actions need to be supported:
126
127 Class: create instance, call __init__ with instance, return object
128 Function: call function body, return result
129 Instance: call __call__ method, return result
130
131 Structure Layout
132 ----------------
133
134 A suitable structure layout might be something like this:
135
136 Identifier Address Type Object ...
137
138 0 1 2 3 4
139 classcode invocation __class__ attribute ...
140 reference reference reference
141
142 Here, the classcode refers to the attribute lookup table for the object. Since
143 classes and instances share the same classcode, they might resemble the
144 following:
145
146 Class C:
147
148 0 1 2 3 4
149 code for C __new__ class type attribute ...
150 reference reference reference
151
152 Instance of C:
153
154 0 1 2 3 4
155 code for C C.__call__ class C attribute ...
156 reference reference reference
157 (if exists)
158
159 The __new__ reference would lead to code consisting of the following
160 instructions:
161
162 create instance for C
163 call C.__init__(instance, ...)
164 return instance
165
166 If C has a __call__ attribute, the invocation "slot" of C instances would
167 refer to the same thing as C.__call__.
168
169 For functions, the same general layout applies:
170
171 Function f:
172
173 0 1 2 3 4
174 code for code class attribute ...
175 function reference function reference
176 reference
177
178 Here, the code reference would lead to code for the function. Note that the
179 function locals are completely distinct from this structure and are not
180 comparable to attributes.
181
182 For modules, there is no meaningful invocation reference:
183
184 Module m:
185
186 0 1 2 3 4
187 code for m (unused) module type attribute ...
188 reference (global)
189 reference
190
191 Both classes and modules have code in their definitions, but this would be
192 generated in order and not referenced externally.
193
194 Invocation Operation
195 --------------------
196
197 Consequently, regardless of the object an invocation is always done as
198 follows:
199
200 get invocation reference (at object+1)
201 jump to reference
202
203 Additional preparation is necessary before the above code: positional
204 arguments must be saved to the parameter stack, and keyword arguments must be
205 resolved and saved to the appropriate position in the parameter stack.
206
207 Attribute Operations
208 --------------------
209
210 Attribute access needs to go through the attribute lookup table. Some
211 optimisations are possible and are described in the appropriate section.
212
213 One important aspect of attribute access is the appropriate setting of the
214 context in the acquired attribute value. From the table describing the
215 acquisition of values, it is clear that the principal exception is that where
216 a class-originating attribute is accessed on an instance. Consequently, the
217 following algorithm could be employed once an attribute has been located:
218
219 1. If the attribute's context is a special value, indicating that it should
220 be replaced upon instance access, then proceed to the next step;
221 otherwise, acquire both the context and the object as they are.
222
223 2. If the accessor is an instance, use that as the value's context, acquiring
224 only the object from the attribute.
225
226 Where accesses can be determined ahead of time (as discussed in the
227 optimisations section), the above algorithm may not necessarily be employed in
228 the generated code for some accesses.
229
230 Instruction Evaluation Model
231 ============================
232
233 Programs use a value stack where evaluated instructions may save their
234 results. A value stack pointer indicates the top of this stack. In addition, a
235 separate stack is used to record the invocation frames. All stack pointers
236 refer to the next address to be used by the stack, not the address of the
237 uppermost element.
238
239 Frame Stack Value Stack
240 ----------- ----------- Address of Callable
241 -------------------
242 previous ...
243 current ------> callable -----> identifier
244 arg1 reference to code
245 arg2
246 arg3
247 local4
248 local5
249 ...
250
251 Loading local names is a matter of performing frame-relative accesses to the
252 value stack.
253
254 Invocations and Argument Evaluation
255 -----------------------------------
256
257 When preparing for an invocation, the caller first sets the invocation frame
258 pointer. Then, positional arguments are added to the stack such that the first
259 argument positions are filled. A number of stack locations for the remaining
260 arguments specified in the program are then reserved. The names of keyword
261 arguments are used (in the form of table index values) to consult the
262 parameter table and to find the location in which such arguments are to be
263 stored.
264
265 fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
266
267 Value Stack
268 -----------
269
270 ... ... ... ...
271 fn fn fn fn
272 a a a a
273 b b b b
274 ___ ___ ___ --> 3
275 ___ --> 1 1 | 1
276 ___ | ___ --> 2 | 2
277 1 ----------- 2 ----------- 3 -----------
278
279 Conceptually, the frame can be considered as a collection of attributes, as
280 seen in other kinds of structures:
281
282 Frame for invocation of fn:
283
284 0 1 2 3 4 5
285 code a b c d e
286 reference
287
288 However, where arguments are specified positionally, such "attributes" are not
289 set using a comparable approach to that employed with other structures.
290 Keyword arguments are set using an attribute-like mechanism, though, where the
291 position of each argument discovered using the parameter table.
292
293 Where the target of the invocation is known, the above procedure can be
294 optimised slightly by attempting to add keyword argument values directly to
295 the stack:
296
297 Value Stack
298 -----------
299
300 ... ... ... ... ...
301 fn fn fn fn fn
302 a a a a a
303 b b b b b
304 ___ ___ ___ --> 3
305 1 1 1 | 1
306 2 1 | 2
307 3 -----------
308
309 (reserve for (add in-place) (add in-place) (evaluate) (store by
310 parameter c) index)
311
312 Method Invocations
313 ------------------
314
315 Method invocations incorporate an implicit first argument which is obtained
316 from the context of the method:
317
318 method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
319
320 Value Stack
321 -----------
322
323 ...
324 method
325 context of method
326 a
327 b
328 3
329 1
330 2
331
332 Although it could be possible to permit any object to be provided as the first
333 argument, in order to optimise instance attribute access in methods, we should
334 seek to restrict the object type.
335
336 Verifying Supplied Arguments
337 ----------------------------
338
339 In order to ensure a correct invocation, it is also necessary to check the
340 number of supplied arguments. If the target of the invocation is known at
341 compile-time, no additional instructions need to be emitted; otherwise, the
342 generated code must test for the following situations:
343
344 1. That the number of supplied arguments is equal to the number of expected
345 parameters.
346
347 2. That no keyword argument overwrites an existing positional parameter.
348
349 Default Arguments
350 -----------------
351
352 Some arguments may have default values which are used if no value is provided
353 in an invocation. Such defaults are initialised when the function itself is
354 initialised, and are used to fill in any invocation frames that are known at
355 compile-time.
356
357 Tuples, Frames and Allocation
358 -----------------------------
359
360 Using the approach where arguments are treated like attributes in some kind of
361 structure, we could choose to allocate frames in places other than a stack.
362 This would produce something somewhat similar to a plain tuple object.
363
364 Optimisations
365 =============
366
367 Some optimisations around constant objects might be possible; these depend on
368 the following:
369
370 * Reliable tracking of assignments: where assignment operations occur, the
371 target of the assignment should be determined if any hope of optimisation
372 is to be maintained. Where no guarantees can be made about the target of
373 an assignment, no assignment-related information should be written to
374 potential targets.
375
376 * Objects acting as "containers" of attributes must be regarded as "safe":
377 where assignments are recorded as occurring on an attribute, it must be
378 guaranteed that no other unforeseen ways exist to assign to such
379 attributes.
380
381 The discussion below presents certain rules which must be imposed to uphold
382 the above requirements.
383
384 Safe Containers
385 ---------------
386
387 Where attributes of modules, classes and instances are only set once and are
388 effectively constant, it should be possible to circumvent the attribute lookup
389 mechanism and to directly reference the attribute value. This technique may
390 only be considered applicable for the following "container" objects, subject
391 to the noted constraints:
392
393 1. For modules, "safety" is enforced by ensuring that assignments to module
394 attributes are only permitted within the module itself either at the
395 top-level or via names declared as globals. Thus, the following would not
396 be permitted:
397
398 another_module.this_module.attr = value
399
400 In the above, this_module is a reference to the current module.
401
402 2. For classes, "safety" is enforced by ensuring that assignments to class
403 attributes are only permitted within the class definition, outside
404 methods. This would mean that classes would be "sealed" at definition time
405 (like functions).
406
407 Unlike the property of function locals that they may only sensibly be accessed
408 within the function in which they reside, these cases demand additional
409 controls or assumptions on or about access to the stored data. Meanwhile, it
410 would be difficult to detect eligible attributes on arbitrary instances due to
411 the need for some kind of type inference or abstract execution.
412
413 Constant Attributes
414 -------------------
415
416 When accessed via "safe containers", as described above, any attribute with
417 only one recorded assignment on it can be considered a constant attribute and
418 this eligible for optimisation, the consequence of which would be the
419 replacement of a LoadAttrIndex instruction (which needs to look up an
420 attribute using the run-time details of the "container" and the compile-time
421 details of the attribute) with a LoadAttr instruction.
422
423 However, some restrictions exist on assignment operations which may be
424 regarded to cause only one assignment in the lifetime of a program:
425
426 1. For module attributes, only assignments at the top-level outside loop
427 statements can be reliably assumed to cause only a single assignment.
428
429 2. For class attributes, only assignments at the top-level within class
430 definitions and outside loop statements can be reliably assumed to cause
431 only a single assignment.
432
433 All assignments satisfying the "safe container" requirements, but not the
434 requirements immediately above, should each be recorded as causing at least
435 one assignment.
436
437 Additional Controls
438 -------------------
439
440 For the above cases for "container" objects, the following controls would need
441 to apply:
442
443 1. Modules would need to be immutable after initialisation. However, during
444 initialisation, there remains a possibility of another module attempting
445 to access the original module. For example, if ppp/__init__.py contained
446 the following...
447
448 x = 1
449 import ppp.qqq
450 print x
451
452 ...and if ppp/qqq.py contained the following...
453
454 import ppp
455 ppp.x = 2
456
457 ...then the value 2 would be printed. Since modules are objects which are
458 registered globally in a program, it would be possible to set attributes
459 in the above way.
460
461 2. Classes would need to be immutable after initialisation. However, since
462 classes are objects, any reference to a class after initialisation could
463 be used to set attributes on the class.
464
465 Solutions:
466
467 1. Insist on global scope for module attribute assignments.
468
469 2. Insist on local scope within classes.
470
471 Both of the above measures need to be enforced at run-time, since an arbitrary
472 attribute assignment could be attempted on any kind of object, yet to uphold
473 the properties of "safe containers", attempts to change attributes of such
474 objects should be denied. Since foreseen attribute assignment operations have
475 certain properties detectable at compile-time, it could be appropriate to
476 generate special instructions (or modified instructions) during the
477 initialisation of modules and classes for such foreseen assignments, whilst
478 employing normal attribute assignment operations in all other cases. Indeed,
479 the StoreAttr instruction, which is used to set attributes in "safe
480 containers" would be used exclusively for this purpose; the StoreAttrIndex
481 instruction would be used exclusively for all other attribute assignments.
482
483 To ensure the "sealing" of modules and classes, entries in the attribute
484 lookup table would encode whether a class or module is being accessed, so
485 that the StoreAttrIndex instruction could reject such accesses.
486
487 Constant Attribute Values
488 -------------------------
489
490 Where an attribute value is itself regarded as constant, is a "safe container"
491 and is used in an operation accessing its own attributes, the value can be
492 directly inspected for optimisations or employed in the generated code. For
493 the attribute values themselves, only objects of a constant nature may be
494 considered suitable for this particular optimisation:
495
496 * Classes
497 * Modules
498 * Instances defined as constant literals
499
500 This is because arbitrary objects (such as most instances) have no
501 well-defined form before run-time and cannot be investigated further at
502 compile-time or have a representation inserted into the generated code.
503
504 Class Attributes and Access via Instances
505 -----------------------------------------
506
507 Unlike module attributes, class attributes can be accessed in a number of
508 different ways:
509
510 * Using the class itself:
511
512 C.x = 123
513 cls = C; cls.x = 234
514
515 * Using a subclass of the class (for reading attributes):
516
517 class D(C):
518 pass
519 D.x # setting D.x would populate D, not C
520
521 * Using instances of the class or a subclass of the class (for reading
522 attributes):
523
524 c = C()
525 c.x # setting c.x would populate c, not C
526
527 Since assignments are only achieved using direct references to the class, and
528 since class attributes should be defined only within the class initialisation
529 process, the properties of class attributes should be consistent with those
530 desired.
531
532 Method Access via Instances
533 ---------------------------
534
535 It is desirable to optimise method access, even though most method calls are
536 likely to occur via instances. It is possible, given the properties of methods
537 as class attributes to investigate the kind of instance that the self
538 parameter/local refers to within each method: it should be an instance either
539 of the class in which the method is defined or a compatible class, although
540 situations exist where this might not be the case:
541
542 * Explicit invocation of a method:
543
544 d = D() # D is not related to C
545 C.f(d) # calling f(self) in C
546
547 If blatant usage of incompatible instances were somehow disallowed, it would
548 still be necessary to investigate the properties of an instance's class and
549 its relationship with other classes. Consider the following example:
550
551 class A:
552 def f(self): ...
553
554 class B:
555 def f(self): ...
556 def g(self):
557 self.f()
558
559 class C(A, B):
560 pass
561
562 Here, instances of B passed into the method B.g could be assumed to provide
563 access to B.f when self.f is resolved at compile-time. However, instances of C
564 passed into B.g would instead provide access to A.f when self.f is resolved at
565 compile-time (since the method resolution order is C, A, B instead of just B).
566
567 One solution might be to specialise methods for each instance type, but this
568 could be costly. Another less ambitious solution might only involve the
569 optimisation of such internal method calls if an unambiguous target can be
570 resolved.
571
572 Optimising Function Invocations
573 -------------------------------
574
575 Where an attribute value is itself regarded as constant and is a function,
576 knowledge about the parameters of the function can be employed to optimise the
577 preparation of the invocation frame.