1 Namespace Definition
2 ====================
3
4 Module attributes are defined either at the module level or by global
5 statements.
6
7 Class attributes are defined only within class statements.
8
9 Instance attributes are defined only by assignments to attributes of self
10 within __init__ methods.
11
12 Potential Restrictions
13 ----------------------
14
15 Names of classes and functions could be restricted to only refer to those
16 objects within the same namespace. If redefinition were to occur, or if
17 multiple possibilities were present, these restrictions could be moderated as
18 follows:
19
20 * Classes assigned to the same name could provide the union of their
21 attributes. This would, however, cause a potential collision of attribute
22 definitions such as methods.
23
24 * Functions, if they share compatible signatures, could share parameter list
25 definitions.
26
27 Data Structures
28 ===============
29
30 The fundamental "value type" is a pair of references: one pointing to the
31 referenced object represented by the interchangeable value; one referring to
32 the context of the referenced object, typically the object through which the
33 referenced object was acquired as an attribute.A
34
35 Value Layout
36 ------------
37
38 0 1
39 object context
40 reference reference
41
42 Acquiring Values
43 ----------------
44
45 Values are acquired through name lookups and attribute access, yielding
46 the appropriate object reference together with a context reference as
47 indicated in the following table:
48
49 Type of Access Context Notes
50 -------------- ------- -----
51
52 Local name Preserved Functions provide no context
53
54 Global name Preserved Modules provide no context
55
56 Class-originating Accessor Methods acquire the context of their
57 attribute -or- accessor if an instance...
58 Preserved or retain the original context if the
59 accessor is a class
60
61 Instance-originating Preserved Methods retain their original context
62 attribute
63
64 There may be some scope for simplifying the above, to the detriment of Python
65 compatibility, since the unbound vs. bound methods situation can be confusing.
66
67 Objects
68 -------
69
70 Since classes, functions and instances are all "objects", each must support
71 certain features and operations in the same way.
72
73 The __class__ Attribute
74 -----------------------
75
76 All objects support the __class__ attribute:
77
78 Class: refers to the type class (type.__class__ also refers to the type class)
79 Function: refers to the function class
80 Instance: refers to the class instantiated to make the object
81
82 Invocation
83 ----------
84
85 The following actions need to be supported:
86
87 Class: create instance, call __init__ with instance, return object
88 Function: call function body, return result
89 Instance: call __call__ method, return result
90
91 Structure Layout
92 ----------------
93
94 A suitable structure layout might be something like this:
95
96 Identifier Address Type Object ...
97
98 0 1 2 3 4
99 classcode invocation __class__ attribute ...
100 reference reference reference
101
102 Here, the classcode refers to the attribute lookup table for the object. Since
103 classes and instances share the same classcode, they might resemble the
104 following:
105
106 Class C:
107
108 0 1 2 3 4
109 code for C __new__ class type attribute ...
110 reference reference reference
111
112 Instance of C:
113
114 0 1 2 3 4
115 code for C C.__call__ class C attribute ...
116 reference reference reference
117 (if exists)
118
119 The __new__ reference would lead to code consisting of the following
120 instructions:
121
122 create instance for C
123 call C.__init__(instance, ...)
124 return instance
125
126 If C has a __call__ attribute, the invocation "slot" of C instances would
127 refer to the same thing as C.__call__.
128
129 For functions, the same general layout applies:
130
131 Function f:
132
133 0 1 2 3 4
134 code for code class attribute ...
135 function reference function reference
136 reference
137
138 Here, the code reference would lead to code for the function. Note that the
139 function locals are completely distinct from this structure and are not
140 comparable to attributes.
141
142 For modules, there is no meaningful invocation reference:
143
144 Module m:
145
146 0 1 2 3 4
147 code for m (unused) module type attribute ...
148 reference (global)
149 reference
150
151 Both classes and modules have code in their definitions, but this would be
152 generated in order and not referenced externally.
153
154 Invocation Operation
155 --------------------
156
157 Consequently, regardless of the object an invocation is always done as
158 follows:
159
160 get invocation reference (at object+1)
161 jump to reference
162
163 Additional preparation is necessary before the above code: positional
164 arguments must be saved to the parameter stack, and keyword arguments must be
165 resolved and saved to the appropriate position in the parameter stack.
166
167 Attribute Operations
168 --------------------
169
170 Attribute access needs to go through the attribute lookup table. Some
171 optimisations are possible and are described in the appropriate section.
172
173 One important aspect of attribute access is the appropriate setting of the
174 context in the acquired attribute value. From the table describing the
175 acquisition of values, it is clear that the principal exception is that where
176 a class-originating attribute is accessed on an instance. Consequently, the
177 following algorithm could be employed once an attribute has been located:
178
179 1. If the attribute's context is a special value, indicating that it should
180 be replaced upon instance access, then proceed to the next step;
181 otherwise, acquire both the context and the object as they are.
182
183 2. If the accessor is an instance, use that as the value's context, acquiring
184 only the object from the attribute.
185
186 Where accesses can be determined ahead of time (as discussed in the
187 optimisations section), the above algorithm may not necessarily be employed in
188 the generated code for some accesses.
189
190 Instruction Evaluation Model
191 ============================
192
193 Programs use a value stack where evaluated instructions may save their
194 results. A value stack pointer indicates the top of this stack. In addition, a
195 separate stack is used to record the invocation frames. All stack pointers
196 refer to the next address to be used by the stack, not the address of the
197 uppermost element.
198
199 Frame Stack Value Stack
200 ----------- ----------- Address of Callable
201 -------------------
202 previous ...
203 current ------> callable -----> identifier
204 arg1 reference to code
205 arg2
206 arg3
207 local4
208 local5
209 ...
210
211 Loading local names is a matter of performing frame-relative accesses to the
212 value stack.
213
214 Invocations and Argument Evaluation
215 -----------------------------------
216
217 When preparing for an invocation, the caller first sets the invocation frame
218 pointer. Then, positional arguments are added to the stack such that the first
219 argument positions are filled. A number of stack locations for the remaining
220 arguments specified in the program are then reserved. The names of keyword
221 arguments are used (in the form of table index values) to consult the
222 parameter table and to find the location in which such arguments are to be
223 stored.
224
225 fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
226
227 Value Stack
228 -----------
229
230 ... ... ... ...
231 fn fn fn fn
232 a a a a
233 b b b b
234 ___ ___ ___ --> 3
235 ___ --> 1 1 | 1
236 ___ | ___ --> 2 | 2
237 1 ----------- 2 ----------- 3 -----------
238
239 Conceptually, the frame can be considered as a collection of attributes, as
240 seen in other kinds of structures:
241
242 Frame for invocation of fn:
243
244 0 1 2 3 4 5
245 code a b c d e
246 reference
247
248 However, where arguments are specified positionally, such "attributes" are not
249 set using a comparable approach to that employed with other structures.
250 Keyword arguments are set using an attribute-like mechanism, though, where the
251 position of each argument discovered using the parameter table.
252
253 Where the target of the invocation is known, the above procedure can be
254 optimised slightly by attempting to add keyword argument values directly to
255 the stack:
256
257 Value Stack
258 -----------
259
260 ... ... ... ... ...
261 fn fn fn fn fn
262 a a a a a
263 b b b b b
264 ___ ___ ___ --> 3
265 1 1 1 | 1
266 2 1 | 2
267 3 -----------
268
269 (reserve for (add in-place) (add in-place) (evaluate) (store by
270 parameter c) index)
271
272 Method Invocations
273 ------------------
274
275 Method invocations incorporate an implicit first argument which is obtained
276 from the context of the method:
277
278 method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
279
280 Value Stack
281 -----------
282
283 ...
284 method
285 context of method
286 a
287 b
288 3
289 1
290 2
291
292 Although it could be possible to permit any object to be provided as the first
293 argument, in order to optimise instance attribute access in methods, we should
294 seek to restrict the object type.
295
296 Verifying Supplied Arguments
297 ----------------------------
298
299 In order to ensure a correct invocation, it is also necessary to check the
300 number of supplied arguments. If the target of the invocation is known at
301 compile-time, no additional instructions need to be emitted; otherwise, the
302 generated code must test for the following situations:
303
304 1. That the number of supplied arguments is equal to the number of expected
305 parameters.
306
307 2. That no keyword argument overwrites an existing positional parameter.
308
309 Default Arguments
310 -----------------
311
312 Some arguments may have default values which are used if no value is provided
313 in an invocation. Such defaults are initialised when the function itself is
314 initialised, and are used to fill in any invocation frames that are known at
315 compile-time.
316
317 Tuples, Frames and Allocation
318 -----------------------------
319
320 Using the approach where arguments are treated like attributes in some kind of
321 structure, we could choose to allocate frames in places other than a stack.
322 This would produce something somewhat similar to a plain tuple object.
323
324 Optimisations
325 =============
326
327 Some optimisations around constant objects might be possible; these depend on
328 the following:
329
330 * Reliable tracking of assignments: where assignment operations occur, the
331 target of the assignment should be determined if any hope of optimisation
332 is to be maintained. Where no guarantees can be made about the target of
333 an assignment, no assignment-related information should be written to
334 potential targets.
335
336 * Objects acting as "containers" of attributes must be regarded as "safe":
337 where assignments are recorded as occurring on an attribute, it must be
338 guaranteed that no other unforeseen ways exist to assign to such
339 attributes.
340
341 The discussion below presents certain rules which must be imposed to uphold
342 the above requirements.
343
344 Safe Containers
345 ---------------
346
347 Where attributes of modules, classes and instances are only set once and are
348 effectively constant, it should be possible to circumvent the attribute lookup
349 mechanism and to directly reference the attribute value. This technique may
350 only be considered applicable for the following "container" objects, subject
351 to the noted constraints:
352
353 1. For modules, "safety" is enforced by ensuring that assignments to module
354 attributes are only permitted within the module itself either at the
355 top-level or via names declared as globals. Thus, the following would not
356 be permitted:
357
358 another_module.this_module.attr = value
359
360 In the above, this_module is a reference to the current module.
361
362 2. For classes, "safety" is enforced by ensuring that assignments to class
363 attributes are only permitted within the class definition, outside
364 methods. This would mean that classes would be "sealed" at definition time
365 (like functions).
366
367 Unlike the property of function locals that they may only sensibly be accessed
368 within the function in which they reside, these cases demand additional
369 controls or assumptions on or about access to the stored data. Meanwhile, it
370 would be difficult to detect eligible attributes on arbitrary instances due to
371 the need for some kind of type inference or abstract execution.
372
373 Constant Attributes
374 -------------------
375
376 When accessed via "safe containers", as described above, any attribute with
377 only one recorded assignment on it can be considered a constant attribute and
378 this eligible for optimisation, the consequence of which would be the
379 replacement of a LoadAttrIndex instruction (which needs to look up an
380 attribute using the run-time details of the "container" and the compile-time
381 details of the attribute) with a LoadAttr instruction.
382
383 However, some restrictions exist on assignment operations which may be
384 regarded to cause only one assignment in the lifetime of a program:
385
386 1. For module attributes, only assignments at the top-level outside loop
387 statements can be reliably assumed to cause only a single assignment.
388
389 2. For class attributes, only assignments at the top-level within class
390 definitions and outside loop statements can be reliably assumed to cause
391 only a single assignment.
392
393 All assignments satisfying the "safe container" requirements, but not the
394 requirements immediately above, should each be recorded as causing at least
395 one assignment.
396
397 Additional Controls
398 -------------------
399
400 For the above cases for "container" objects, the following controls would need
401 to apply:
402
403 1. Modules would need to be immutable after initialisation. However, during
404 initialisation, there remains a possibility of another module attempting
405 to access the original module. For example, if ppp/__init__.py contained
406 the following...
407
408 x = 1
409 import ppp.qqq
410 print x
411
412 ...and if ppp/qqq.py contained the following...
413
414 import ppp
415 ppp.x = 2
416
417 ...then the value 2 would be printed. Since modules are objects which are
418 registered globally in a program, it would be possible to set attributes
419 in the above way.
420
421 2. Classes would need to be immutable after initialisation. However, since
422 classes are objects, any reference to a class after initialisation could
423 be used to set attributes on the class.
424
425 Solutions:
426
427 1. Insist on global scope for module attribute assignments.
428
429 2. Insist on local scope within classes.
430
431 Both of the above measures need to be enforced at run-time, since an arbitrary
432 attribute assignment could be attempted on any kind of object, yet to uphold
433 the properties of "safe containers", attempts to change attributes of such
434 objects should be denied. Since foreseen attribute assignment operations have
435 certain properties detectable at compile-time, it could be appropriate to
436 generate special instructions (or modified instructions) during the
437 initialisation of modules and classes for such foreseen assignments, whilst
438 employing normal attribute assignment operations in all other cases. Indeed,
439 the StoreAttr instruction, which is used to set attributes in "safe
440 containers" would be used exclusively for this purpose; the StoreAttrIndex
441 instruction would be used exclusively for all other attribute assignments.
442
443 To ensure the "sealing" of modules and classes, entries in the attribute
444 lookup table would encode whether a class or module is being accessed, so
445 that the StoreAttrIndex instruction could reject such accesses.
446
447 Constant Attribute Values
448 -------------------------
449
450 Where an attribute value is itself regarded as constant, is a "safe container"
451 and is used in an operation accessing its own attributes, the value can be
452 directly inspected for optimisations or employed in the generated code. For
453 the attribute values themselves, only objects of a constant nature may be
454 considered suitable for this particular optimisation:
455
456 * Classes
457 * Modules
458 * Instances defined as constant literals
459
460 This is because arbitrary objects (such as most instances) have no
461 well-defined form before run-time and cannot be investigated further at
462 compile-time or have a representation inserted into the generated code.
463
464 Class Attributes and Access via Instances
465 -----------------------------------------
466
467 Unlike module attributes, class attributes can be accessed in a number of
468 different ways:
469
470 * Using the class itself:
471
472 C.x = 123
473 cls = C; cls.x = 234
474
475 * Using a subclass of the class (for reading attributes):
476
477 class D(C):
478 pass
479 D.x # setting D.x would populate D, not C
480
481 * Using instances of the class or a subclass of the class (for reading
482 attributes):
483
484 c = C()
485 c.x # setting c.x would populate c, not C
486
487 Since assignments are only achieved using direct references to the class, and
488 since class attributes should be defined only within the class initialisation
489 process, the properties of class attributes should be consistent with those
490 desired.
491
492 Method Access via Instances
493 ---------------------------
494
495 It is desirable to optimise method access, even though most method calls are
496 likely to occur via instances. It is possible, given the properties of methods
497 as class attributes to investigate the kind of instance that the self
498 parameter/local refers to within each method: it should be an instance either
499 of the class in which the method is defined or a compatible class, although
500 situations exist where this might not be the case:
501
502 * Explicit invocation of a method:
503
504 d = D() # D is not related to C
505 C.f(d) # calling f(self) in C
506
507 If blatant usage of incompatible instances were somehow disallowed, it would
508 still be necessary to investigate the properties of an instance's class and
509 its relationship with other classes. Consider the following example:
510
511 class A:
512 def f(self): ...
513
514 class B:
515 def f(self): ...
516 def g(self):
517 self.f()
518
519 class C(A, B):
520 pass
521
522 Here, instances of B passed into the method B.g could be assumed to provide
523 access to B.f when self.f is resolved at compile-time. However, instances of C
524 passed into B.g would instead provide access to A.f when self.f is resolved at
525 compile-time (since the method resolution order is C, A, B instead of just B).
526
527 One solution might be to specialise methods for each instance type, but this
528 could be costly. Another less ambitious solution might only involve the
529 optimisation of such internal method calls if an unambiguous target can be
530 resolved.
531
532 Optimising Function Invocations
533 -------------------------------
534
535 Where an attribute value is itself regarded as constant and is a function,
536 knowledge about the parameters of the function can be employed to optimise the
537 preparation of the invocation frame.