paul@810 | 1 | = Inspecting Programs = |
paul@810 | 2 | |
paul@861 | 3 | A program's source code is inspected module by module. As |
paul@861 | 4 | [[../Imports|imports]] of modules are encountered, other modules are added to |
paul@861 | 5 | the program. The inspection process for each module involves top-to-bottom |
paul@861 | 6 | traversal of the code using depth-first processing of the abstract syntax |
paul@861 | 7 | tree, with a stack of namespaces being employed to record definitions and |
paul@861 | 8 | names within namespaces are they are visited. Inspection continues until all |
paul@861 | 9 | modules in the program have been imported and inspected. |
paul@810 | 10 | |
paul@810 | 11 | <<TableOfContents(2,3)>> |
paul@810 | 12 | |
paul@810 | 13 | == The Inspection Process == |
paul@810 | 14 | |
paul@810 | 15 | The inspection process is focused on two primary tasks: |
paul@810 | 16 | |
paul@810 | 17 | * Populating namespaces |
paul@810 | 18 | * Identifying the names defined by each namespace |
paul@810 | 19 | * Recording relationships between namespaces |
paul@810 | 20 | * Describing the operations in each namespace |
paul@861 | 21 | * Identifying the names in each namespace and the attributes used by those |
paul@861 | 22 | names |
paul@810 | 23 | * Recording details of assignments and invocations |
paul@810 | 24 | |
paul@861 | 25 | The results of inspection are written out to [[../Cache|cache]] files, one for |
paul@861 | 26 | each module in the program. |
paul@810 | 27 | |
paul@810 | 28 | === Program Units for Inspection === |
paul@810 | 29 | |
paul@861 | 30 | The `inspector` module performs much of the inspection work, relying on the |
paul@861 | 31 | `common` module for certain tasks, with the `modules` module providing the |
paul@861 | 32 | relevant module abstractions including those writing to and reading from cache |
paul@861 | 33 | files, and the `resolving` module completing each module's inspection. The |
paul@861 | 34 | `importer` module coordinates inspection across the whole program. |
paul@810 | 35 | |
paul@810 | 36 | === Name Identification === |
paul@810 | 37 | |
paul@810 | 38 | Names can be introduced to a namespace via several different mechanisms: |
paul@810 | 39 | |
paul@810 | 40 | * Assignments to local names |
paul@810 | 41 | * References to global names |
paul@810 | 42 | * References to built-in names (defined in the built-in modules) |
paul@810 | 43 | * Importing of modules |
paul@810 | 44 | * Importing of names from modules |
paul@810 | 45 | |
paul@861 | 46 | The origins of names are discovered by considering local, global and built-in |
paul@861 | 47 | namespaces. Where a name is '''local''', it is defined in the same namespace. |
paul@861 | 48 | Where a name is '''global''', it is defined in the same module at the module |
paul@861 | 49 | level. Where a name is '''built-in''', it is defined in a module in the |
paul@861 | 50 | `__builtins__` package and exposed via the `__builtins__` namespace. |
paul@810 | 51 | |
paul@861 | 52 | Initial tentative identification of names will sort names into two categories: |
paul@861 | 53 | locals and external names. Global variables employed in function (or method) |
paul@861 | 54 | namespaces may not be defined when a function body is inspected, either within |
paul@861 | 55 | a module or being imported from another module, and so it is not initially |
paul@861 | 56 | possible to more specifically determine the nature of a name. |
paul@810 | 57 | |
paul@861 | 58 | These categories are later refined to distinguish between globals and built-in |
paul@861 | 59 | names as external names. The built-in origin of a name is only suggested when |
paul@861 | 60 | no locals or globals can provide the name, and the final identification of |
paul@861 | 61 | such names, plus other external names introduced as locals or globals via |
paul@861 | 62 | imports, will occur at the end of the inspection activity. Names that are |
paul@861 | 63 | unrecognised by then may be treated like unrecognised built-ins. |
paul@810 | 64 | |
paul@810 | 65 | === Name Restrictions and Resolution === |
paul@810 | 66 | |
paul@861 | 67 | '''Static''' objects, forming the fundamental [[../Structure|structure]] of |
paul@861 | 68 | the program, expose their names through the general structure hierarchy. |
paul@861 | 69 | Classes, which are defined as part of this structure, depend on well-defined |
paul@861 | 70 | names for any base classes they employ. For example: |
paul@810 | 71 | |
paul@810 | 72 | {{{#!python numbers=disable |
paul@810 | 73 | class C(A): # depends on A being a name that can be resolved (and being a class) |
paul@810 | 74 | ... |
paul@810 | 75 | |
paul@810 | 76 | class D(A.B.C): # depends on A.B.C being resolvable (and being a class) |
paul@810 | 77 | ... |
paul@810 | 78 | }}} |
paul@810 | 79 | |
paul@861 | 80 | Base classes must be identifiable and unambiguous. Since base classes may be |
paul@861 | 81 | imported, their identification may not necessarily occur immediately, but it |
paul@861 | 82 | must be possible once all information about a program is known and when all |
paul@861 | 83 | such dependencies are resolved. |
paul@810 | 84 | |
paul@810 | 85 | === Attribute Identification === |
paul@810 | 86 | |
paul@810 | 87 | Attributes are introduced to namespaces as follows: |
paul@810 | 88 | |
paul@861 | 89 | * For classes and modules, the definition of names defines attributes in |
paul@861 | 90 | those namespaces |
paul@861 | 91 | * Functions do not generally support attributes, although |
paul@861 | 92 | [[../Representations#Special_Members|representation-specific attributes]] |
paul@861 | 93 | do exist on functions |
paul@861 | 94 | * For instances, assignments to attributes of the special `self` name, |
paul@861 | 95 | performed in methods, define attributes in all instances of the method's |
paul@861 | 96 | class |
paul@810 | 97 | |
paul@861 | 98 | Attributes are only supported on program objects if they have been defined or |
paul@861 | 99 | '''bound''' as described above. Any attempt to access or set an attribute on |
paul@861 | 100 | an object using a name that was not determined through the above process is |
paul@861 | 101 | considered an invalid operation. Thus, augmenting the attributes available on |
paul@861 | 102 | an object (so-called "monkeypatching") is not possible. |
paul@810 | 103 | |
paul@861 | 104 | When an attribute is accessed, the location of its use is recorded and |
paul@861 | 105 | ultimately associated with assignments of the name involved, just as is done |
paul@861 | 106 | for accesses of the plain name itself, but the attribute details are |
paul@861 | 107 | subsequently collected together for each assignment or '''version''' of the |
paul@861 | 108 | name. This is discussed below. |
paul@810 | 109 | |
paul@810 | 110 | === Inherited Attributes === |
paul@810 | 111 | |
paul@861 | 112 | In order to support inheritance, a process of propagation makes class and |
paul@861 | 113 | instance attributes available to any given class from its ancestor classes |
paul@861 | 114 | according to a depth-first traversal of a class's base class hierarchy. Thus, |
paul@861 | 115 | for each class, given the definition of its base classes, a complete |
paul@861 | 116 | collection of class and instance attribute names can be determined. The actual |
paul@861 | 117 | mechanism of propagation occurs during the consolidation phase of inspection, |
paul@861 | 118 | principally because class bases are not generally immediately identifiable |
paul@861 | 119 | upon completing the inspection of any given class. |
paul@810 | 120 | |
paul@810 | 121 | === Name Assignments and Accesses === |
paul@810 | 122 | |
paul@861 | 123 | Each assignment of a name defines a '''version''' of the name within a |
paul@861 | 124 | namespace. The location of this definition consists of the following: |
paul@810 | 125 | |
paul@810 | 126 | * The namespace in which the assignment appears |
paul@810 | 127 | * The name itself |
paul@810 | 128 | * The version or assignment instance number |
paul@810 | 129 | |
paul@861 | 130 | When a name is used, the location of its use is recorded and is ultimately |
paul@861 | 131 | associated with the assignments that may be providing the name at that |
paul@861 | 132 | location. This permits information about the type of the name to become |
paul@861 | 133 | available for each usage location. The location of each name usage consists of |
paul@861 | 134 | the following: |
paul@810 | 135 | |
paul@810 | 136 | * The namespace in which the name appears |
paul@810 | 137 | * The name itself |
paul@810 | 138 | * The access instance number |
paul@810 | 139 | |
paul@861 | 140 | Name accesses are described as special cases of attribute accesses: where |
paul@861 | 141 | attributes would be indicated, none are specified. |
paul@810 | 142 | |
paul@810 | 143 | === Attribute Accesses === |
paul@810 | 144 | |
paul@861 | 145 | Attribute '''accesses''' are operations involving attributes where those |
paul@861 | 146 | attributes are obtained or set in conjunction with an '''accessor''' |
paul@861 | 147 | expression. They are recorded in terms of location tuples, each describing an |
paul@861 | 148 | access as follows: |
paul@810 | 149 | |
paul@810 | 150 | * The namespace in which the access occurs |
paul@810 | 151 | * Any named accessor or an anonymous accessor |
paul@810 | 152 | * The attribute (or chain of attributes) involved (omitted for name accesses) |
paul@810 | 153 | * The access instance number |
paul@810 | 154 | |
paul@861 | 155 | As with name accesses, the access instance number distinguishes between |
paul@861 | 156 | different accesses employing the same details. |
paul@810 | 157 | |
paul@810 | 158 | Consider the following example: |
paul@810 | 159 | |
paul@810 | 160 | {{{{#!table |
paul@810 | 161 | <style="vertical-align: top"> |
paul@810 | 162 | {{{#!python numbers=disable |
paul@810 | 163 | |
paul@810 | 164 | def f(): |
paul@810 | 165 | p = ... # p version 0 |
paul@810 | 166 | p.a # p.a access 0 |
paul@810 | 167 | if fn().a: # anonymous a access 0 |
paul@810 | 168 | q = ... # q version 0 |
paul@810 | 169 | q.a # q.a access 0 |
paul@810 | 170 | p # p access 0 |
paul@810 | 171 | else: |
paul@810 | 172 | q = ... # q version 1 |
paul@810 | 173 | q.a # q.a access 1 |
paul@810 | 174 | q.b # q.b access 0 |
paul@810 | 175 | p # p access 1 |
paul@810 | 176 | }}} |
paul@810 | 177 | || |
paul@810 | 178 | {{{#!graphviz |
paul@810 | 179 | //format=svg |
paul@810 | 180 | //transform=notugly |
paul@810 | 181 | digraph accesses { |
paul@1001 | 182 | node [shape=box,fontsize="12.0",fontname="sans-serif",tooltip="Names and accesses"]; |
paul@810 | 183 | edge [tooltip="Names and accesses"]; |
paul@810 | 184 | rankdir=TB; |
paul@810 | 185 | |
paul@810 | 186 | p [label="p = ...",style=filled,fillcolor=gold]; |
paul@810 | 187 | |
paul@810 | 188 | subgraph { |
paul@810 | 189 | rank=same; |
paul@810 | 190 | pa0expr [label="p.a",style=filled,fillcolor=cyan]; |
paul@810 | 191 | pa0 [label="p.a #0",style=filled,fillcolor=red]; |
paul@810 | 192 | } |
paul@810 | 193 | |
paul@810 | 194 | subgraph { |
paul@810 | 195 | rank=same; |
paul@810 | 196 | if [label="if",shape=ellipse]; |
paul@810 | 197 | fnaexpr [label="fn().a",style=filled,fillcolor=cyan]; |
paul@810 | 198 | fna [label="{}.a #0",style=filled,fillcolor=red]; |
paul@810 | 199 | } |
paul@810 | 200 | |
paul@810 | 201 | q0 [label="q = ...",style=filled,fillcolor=gold]; |
paul@810 | 202 | |
paul@810 | 203 | subgraph { |
paul@810 | 204 | rank=same; |
paul@810 | 205 | qa0expr [label="q.a",style=filled,fillcolor=cyan]; |
paul@810 | 206 | qa0 [label="q.a #0",style=filled,fillcolor=red]; |
paul@810 | 207 | } |
paul@810 | 208 | |
paul@810 | 209 | subgraph { |
paul@810 | 210 | rank=same; |
paul@810 | 211 | p0expr [label="p",style=filled,fillcolor=cyan]; |
paul@810 | 212 | p0 [label="p.{} #0",style=filled,fillcolor=red]; |
paul@810 | 213 | } |
paul@810 | 214 | |
paul@810 | 215 | else [label="else",shape=ellipse]; |
paul@810 | 216 | q1 [label="q = ...",style=filled,fillcolor=gold]; |
paul@810 | 217 | |
paul@810 | 218 | subgraph { |
paul@810 | 219 | rank=same; |
paul@810 | 220 | qa1expr [label="q.a #1",style=filled,fillcolor=cyan]; |
paul@810 | 221 | qa1 [label="q.a #1",style=filled,fillcolor=red]; |
paul@810 | 222 | } |
paul@810 | 223 | |
paul@810 | 224 | subgraph { |
paul@810 | 225 | rank=same; |
paul@810 | 226 | qb0expr [label="q.b",style=filled,fillcolor=cyan]; |
paul@810 | 227 | qb0 [label="q.b #0",style=filled,fillcolor=red]; |
paul@810 | 228 | } |
paul@810 | 229 | |
paul@810 | 230 | subgraph { |
paul@810 | 231 | rank=same; |
paul@810 | 232 | p1expr [label="p",style=filled,fillcolor=cyan]; |
paul@810 | 233 | p1 [label="p.{} #1",style=filled,fillcolor=red]; |
paul@810 | 234 | } |
paul@810 | 235 | |
paul@810 | 236 | p -> pa0expr -> if -> fnaexpr -> q0 -> qa0expr -> p0expr -> qb0expr; |
paul@810 | 237 | if -> else -> q1 -> qa1expr -> qb0expr; |
paul@810 | 238 | qb0expr -> p1expr; |
paul@810 | 239 | |
paul@810 | 240 | fnaexpr -> fna [dir=none,style=dashed]; |
paul@810 | 241 | pa0expr -> pa0 [dir=none,style=dashed]; |
paul@810 | 242 | qa0expr -> qa0 [dir=none,style=dashed]; |
paul@810 | 243 | p0expr -> p0 [dir=none,style=dashed]; |
paul@810 | 244 | qa1expr -> qa1 [dir=none,style=dashed]; |
paul@810 | 245 | qb0expr -> qb0 [dir=none,style=dashed]; |
paul@810 | 246 | p1expr -> p1 [dir=none,style=dashed]; |
paul@810 | 247 | |
paul@810 | 248 | p -> pa0 [dir=none]; |
paul@810 | 249 | p -> p0 [dir=none]; |
paul@810 | 250 | p -> p1 [dir=none]; |
paul@810 | 251 | |
paul@810 | 252 | q0 -> qa0 [dir=none]; |
paul@810 | 253 | q0 -> qb0 [dir=none]; |
paul@810 | 254 | |
paul@810 | 255 | q1 -> qa1 [dir=none]; |
paul@810 | 256 | q1 -> qb0 [dir=none]; |
paul@810 | 257 | } |
paul@810 | 258 | }}} |
paul@810 | 259 | }}}} |
paul@810 | 260 | |
paul@861 | 261 | Since the names involved may be provided by different name versions, accesses |
paul@861 | 262 | are counted independently. Meanwhile, a non-name or '''anonymous''' accessor |
paul@861 | 263 | may be involved in an access. Such anonymous accessors are independent and do |
paul@861 | 264 | not accumulate attribute usage because they potentially involve completely |
paul@861 | 265 | different objects. |
paul@810 | 266 | |
paul@810 | 267 | || '''Namespace''' || '''Name''' || '''Attribute''' || '''Access number''' || |
paul@810 | 268 | || `module.f` || `p` || `a` || 0 || |
paul@810 | 269 | || `module.f` || `p` || `{}` || 0 || |
paul@810 | 270 | || `module.f` || `p` || `{}` || 1 || |
paul@810 | 271 | || `module.f` || `q` || `a` || 0 || |
paul@810 | 272 | || `module.f` || `q` || `a` || 1 || |
paul@810 | 273 | || `module.f` || `q` || `b` || 0 || |
paul@810 | 274 | || `module.f` || `{}` || `a` || 0 || |
paul@810 | 275 | |
paul@810 | 276 | Accessors may be recorded using a similar location scheme. For example: |
paul@810 | 277 | |
paul@810 | 278 | || '''Namespace''' || '''Name''' || '''Attribute''' || '''Name version''' || |
paul@810 | 279 | || `module.f` || `p` || `{}` || 0 || |
paul@810 | 280 | || `module.f` || `q` || `{}` || 0 || |
paul@810 | 281 | || `module.f` || `q` || `{}` || 1 || |
paul@810 | 282 | |
paul@861 | 283 | Here, the attribute field is left empty and indicates that name definitions |
paul@861 | 284 | are being described. Although the data is superficially similar to name |
paul@861 | 285 | accesses, it should be remembered that accesses employ an access number |
paul@861 | 286 | whereas accessors employ a name version, with such identifiers being different |
paul@861 | 287 | things. |
paul@810 | 288 | |
paul@810 | 289 | === Names and Attribute Usage === |
paul@810 | 290 | |
paul@861 | 291 | Within each scope, the names employed by the program and the attributes used |
paul@861 | 292 | with those names are tracked. As the path of execution diverges and converges |
paul@861 | 293 | again through control-flow constructs, the tracking attempts to maintain the |
paul@861 | 294 | '''attribute usage''' associated with each name, or specifically each |
paul@861 | 295 | '''version''' of each name. |
paul@810 | 296 | |
paul@810 | 297 | {{{{#!table |
paul@810 | 298 | <style="vertical-align: top"> |
paul@810 | 299 | {{{#!python numbers=disable |
paul@810 | 300 | y = ... # y version 0 |
paul@810 | 301 | while cond0: |
paul@810 | 302 | if cond1: |
paul@810 | 303 | y.a1 |
paul@810 | 304 | elif cond2: |
paul@810 | 305 | y = ... # y version 1 |
paul@810 | 306 | y.a2 |
paul@810 | 307 | else: |
paul@810 | 308 | y.a3 |
paul@810 | 309 | |
paul@810 | 310 | # y version 0 is used with a1 or a3 or neither |
paul@810 | 311 | |
paul@810 | 312 | # y version 1 is used with a2 |
paul@810 | 313 | }}} |
paul@810 | 314 | || |
paul@810 | 315 | {{{#!graphviz |
paul@810 | 316 | //format=svg |
paul@810 | 317 | //transform=notugly |
paul@810 | 318 | digraph usage { |
paul@1001 | 319 | node [shape=box,fontsize="12.0",fontname="sans-serif",tooltip="Name and attribute tracking"]; |
paul@810 | 320 | edge [tooltip="Name and attribute tracking"]; |
paul@810 | 321 | rankdir=TB; |
paul@810 | 322 | |
paul@810 | 323 | sety0 [label="y = ...",style=filled,fillcolor=gold]; |
paul@810 | 324 | while [label="while cond0",shape=ellipse]; |
paul@810 | 325 | |
paul@810 | 326 | subgraph { |
paul@810 | 327 | rank=same; |
paul@810 | 328 | if [label="if cond1",shape=ellipse]; |
paul@810 | 329 | ya1 [label="y.a1",style=filled,fillcolor=cyan]; |
paul@810 | 330 | } |
paul@810 | 331 | |
paul@810 | 332 | subgraph { |
paul@810 | 333 | rank=same; |
paul@810 | 334 | elif [label="elif cond2",shape=ellipse]; |
paul@810 | 335 | sety1 [label="y = ...",style=filled,fillcolor=gold]; |
paul@810 | 336 | ya2 [label="y.a2",style=filled,fillcolor=green]; |
paul@810 | 337 | } |
paul@810 | 338 | |
paul@810 | 339 | subgraph { |
paul@810 | 340 | rank=same; |
paul@810 | 341 | ifelse [label="else",shape=ellipse]; |
paul@810 | 342 | ya3 [label="y.a3",style=filled,fillcolor=cyan]; |
paul@810 | 343 | } |
paul@810 | 344 | |
paul@810 | 345 | endif [label="end if",shape=ellipse]; |
paul@810 | 346 | |
paul@810 | 347 | whileelse [label="(else)",shape=ellipse]; |
paul@810 | 348 | |
paul@810 | 349 | endwhile [label="end while",shape=ellipse]; |
paul@810 | 350 | |
paul@810 | 351 | sety0 -> while -> if -> elif -> ifelse -> endif; |
paul@810 | 352 | if -> ya1 -> endif; |
paul@810 | 353 | elif -> sety1 -> ya2 -> endif; |
paul@810 | 354 | ifelse -> ya3 -> endif; |
paul@810 | 355 | endif -> while [style=dashed]; |
paul@810 | 356 | while -> whileelse -> endwhile; |
paul@810 | 357 | } |
paul@810 | 358 | }}} |
paul@810 | 359 | }}}} |
paul@810 | 360 | |
paul@861 | 361 | The outcome of such tracking should be an indication of the attribute usage |
paul@861 | 362 | with each name based on the shortest routes that names can take through the |
paul@861 | 363 | control-flow structure. Such shortest-route usage defines the minimal |
paul@861 | 364 | selection of attributes that can be considered used with a name, and thus such |
paul@861 | 365 | usage defines the broadest selection of object types that can be identified as |
paul@861 | 366 | supporting such attributes. In the above example, the following minimal |
paul@861 | 367 | selections of attributes apply: |
paul@810 | 368 | |
paul@810 | 369 | || '''Name Version''' || '''Minimal Set of Attributes''' || '''Types''' || |
paul@810 | 370 | || `y` (version 0) || ''empty set'' || ''any object'' || |
paul@810 | 371 | || `y` (version 1) || `a2` || ''only objects providing the single attribute'' || |
paul@810 | 372 | |
paul@861 | 373 | The assumption is made that any attribute used with a name is always provided |
paul@861 | 374 | by the object referenced by that name and that the correct execution of the |
paul@861 | 375 | code does not rely on an attribute being absent (and thus raising an |
paul@861 | 376 | exception). By defining usage for each name, the toolchain can determine |
paul@861 | 377 | whether any type can provide the attributes used with a name, producing a |
paul@861 | 378 | compile-time error if no type supports such usage. |
paul@810 | 379 | |
paul@861 | 380 | It is possible that certain routes taken by names might define attribute usage |
paul@861 | 381 | that cannot be supported by types that do support the shortest-route usage, |
paul@861 | 382 | yet it might not be appropriate to forbid usage of such types with such names: |
paul@861 | 383 | the program logic may intend that such types do not visit the regions of the |
paul@861 | 384 | code that employ the attributes that such types cannot support. However, as a |
paul@861 | 385 | consequence, run-time tests will be required to prevent attribute accesses |
paul@861 | 386 | that are inappropriate for such types from taking place. In the above example, |
paul@861 | 387 | the following maximal selections apply: |
paul@810 | 388 | |
paul@810 | 389 | || '''Name Version''' || '''Maximal Set of Attributes''' || '''Types''' || |
paul@810 | 390 | || `y` (version 0) || `a1`, `a3` || ''only objects providing both attributes'' || |
paul@810 | 391 | || `y` (version 1) || `a1`, `a2`, `a3` || ''only objects providing all three attributes'' || |
paul@810 | 392 | |
paul@861 | 393 | Tracking occurs separately in function or method namespaces and at a level |
paul@861 | 394 | combining the static namespaces in a module. This latter combination of |
paul@861 | 395 | namespaces permits the flow of global name details through class namespaces. |
paul@861 | 396 | Tracking employs special '''tracking names''' with which usage is associated, |
paul@861 | 397 | with globals and class attributes employing complete object paths to describe |
paul@861 | 398 | their names, whereas locals merely employ the plain names defined and used in |
paul@861 | 399 | local namespaces. Some examples follow: |
paul@810 | 400 | |
paul@810 | 401 | || '''Namespace''' || '''Name''' || '''Name Scope''' || '''Tracking Name''' || |
paul@810 | 402 | || `__main__` (module) || `x` || global || `__main__.x` || |
paul@810 | 403 | || `__main__.C` (class) || `x` || global || `__main__.x` || |
paul@810 | 404 | || `__main__.C` (class) || `y` || local (class attribute) || `__main__.C.y` || |
paul@810 | 405 | || `__main__.f` (function) || `y` || local || `y` || |
paul@810 | 406 | |
paul@810 | 407 | === Name Initialisation and Aliases === |
paul@810 | 408 | |
paul@861 | 409 | Each name version may be associated with a value upon assignment provided by |
paul@861 | 410 | an expression known as an '''initialiser'''. Such values may be used to inform |
paul@861 | 411 | the interpretation of operations on names, restricting the types involved to |
paul@861 | 412 | the initialised values. Some examples are as follows: |
paul@810 | 413 | |
paul@810 | 414 | {{{#!python numbers=disable |
paul@810 | 415 | x = 123 # initialiser is a constant, indicating a known type |
paul@810 | 416 | x = classname() # initialiser refers to a class, indicating instantiation |
paul@810 | 417 | }}} |
paul@810 | 418 | |
paul@861 | 419 | Names may also be assigned the values of other names, and such a name becomes |
paul@861 | 420 | an '''alias''' for such other names. Aliases can therefore be used to combine |
paul@861 | 421 | attribute usage observations across names. Other forms of aliases involve |
paul@861 | 422 | assigning attributes accessed via other names to any given name. Some examples |
paul@861 | 423 | are as follows: |
paul@810 | 424 | |
paul@810 | 425 | {{{#!python numbers=disable |
paul@810 | 426 | y = x # y |
paul@810 | 427 | y.p # p can be assumed to apply to x |
paul@810 | 428 | z = x.q # z can be restricted to the types deduced for x.q |
paul@810 | 429 | }}} |
paul@810 | 430 | |
paul@861 | 431 | Initialising values are retained for later resolution because their identities |
paul@861 | 432 | are not generally known until certain name resolution activities have taken |
paul@861 | 433 | place. |
paul@810 | 434 | |
paul@810 | 435 | === Invocations === |
paul@810 | 436 | |
paul@861 | 437 | During inspection only limited information is available about the nature of |
paul@861 | 438 | invocations. However, some information will already be available about global |
paul@861 | 439 | names and these may be defined by classes. Thus, invocations that cause the |
paul@861 | 440 | instantiation of classes may be determined even during the inspection phase. |
paul@810 | 441 | |
paul@861 | 442 | Information about class instantiation is most useful in combination with the |
paul@861 | 443 | initialisation of names. When an assignment occurs, any initialising |
paul@861 | 444 | expression that provides enough information can be evaluated to see if it |
paul@861 | 445 | describes instantiation. If so, the nature of the instantiation can be |
paul@861 | 446 | associated with the name and used in the deduction process to constrain any |
paul@861 | 447 | usage of that name and its attributes. Such restrictions on the types |
paul@861 | 448 | associated with names are applied in the deduction phase. |
paul@810 | 449 | |
paul@810 | 450 | === Literals and Constants === |
paul@810 | 451 | |
paul@861 | 452 | Literal values or '''literals''' are specific values that appear in the |
paul@861 | 453 | program, typically representing numbers, character strings, mappings or |
paul@861 | 454 | collections. Literal numbers and strings are effectively '''constants''', |
paul@861 | 455 | meaning that their values are unambiguously defined and can eventually be |
paul@861 | 456 | referenced using a unique identifier that applies throughout the program and |
paul@861 | 457 | which can refer to an initialised object in any generated program. Initially, |
paul@861 | 458 | they are recorded for the namespace in which they appear. For example: |
paul@810 | 459 | |
paul@810 | 460 | || '''Namespace''' || '''Identifier''' || '''Type''' || '''Encoding''' || '''Value''' || |
paul@810 | 461 | || `__main__` || `__main__.$c0` || `__builtins__.str.string` || `iso-8859-15` || `'\xc6\xd8\xc5'` || |
paul@810 | 462 | || `__main__` || `__main__.$c1` || `__builtins__.int.int` || || `123` || |
paul@810 | 463 | |
paul@861 | 464 | Since values may appear again in other namespaces, a mapping is generated from |
paul@861 | 465 | such local identifiers to common global identifiers for constants. Where such |
paul@861 | 466 | an identifier is derived from the value content, this can potentially occur |
paul@861 | 467 | immediately, although in practice (and in general) such a mapping will be |
paul@861 | 468 | generated after all modules have been inspected. |
paul@810 | 469 | |
paul@861 | 470 | Collections and mappings may also be initialised using literal syntax but may |
paul@861 | 471 | contain non-constant information such as names or expressions whose values are |
paul@861 | 472 | unknown before the program is run. Such values can be represented as |
paul@861 | 473 | instantiation operations of the appropriate type in any generated program, and |
paul@861 | 474 | the instances of the type concerned can be associated with any names to which |
paul@861 | 475 | such literals are assigned. Special names are used to refer to literal types |
paul@861 | 476 | for collections and mappings. For example: |
paul@810 | 477 | |
paul@810 | 478 | || '''Identifier''' || '''Type''' || |
paul@810 | 479 | || `$Ldict` || `__builtins__.dict.dict` || |
paul@810 | 480 | || `$Llist` || `__builtins__.list.list` || |
paul@810 | 481 | || `$Ltuple` || `__builtins__.tuple.tuple` || |
paul@810 | 482 | |
paul@861 | 483 | Such special names merely serve as convenient placeholders for the types of |
paul@861 | 484 | any literals mentioned in a module. |
paul@810 | 485 | |
paul@810 | 486 | == Consolidating Inspection Details == |
paul@810 | 487 | |
paul@861 | 488 | As briefly discussed above, certain activities occur after information has |
paul@861 | 489 | been collected from an input program. Within each module, names that are |
paul@861 | 490 | external to namespaces are recorded in a '''name reference''' collection. |
paul@861 | 491 | These references identify unrecognised names but do not generally define their |
paul@861 | 492 | origins. Examples of name references are as follows: |
paul@810 | 493 | |
paul@810 | 494 | || '''Name''' || '''Identity''' || |
paul@810 | 495 | || `__main__.repr` || `<deferred>:__builtins__.repr` || |
paul@810 | 496 | || `__main__.sys` || `<module>:sys` || |
paul@810 | 497 | |
paul@861 | 498 | In order to obtain the final identities, deferred references may need to be |
paul@861 | 499 | resolved, yielding concrete references: |
paul@810 | 500 | |
paul@810 | 501 | || '''Name''' || '''Identity''' || |
paul@810 | 502 | || `__main__.repr` || `<function>:__builtins__.identity.repr` || |
paul@810 | 503 | || `__main__.sys` || `<module>:sys` || |
paul@810 | 504 | |
paul@861 | 505 | This process of resolution cannot be performed immediately with only the |
paul@861 | 506 | knowledge available in a single module. Only with all program modules loaded |
paul@861 | 507 | can such resolution occur. |
paul@810 | 508 | |
paul@810 | 509 | === Identifying Deferred References === |
paul@810 | 510 | |
paul@861 | 511 | After all modules are loaded, and with all objects present in the program at |
paul@861 | 512 | this point, each deferred reference defined within a module should ultimately |
paul@861 | 513 | yield an identity. Any failure to do so indicates usage of a name not defined |
paul@861 | 514 | in the program. The process of identifying them by resolving what each of them |
paul@861 | 515 | points to, is illustrated by the following example: |
paul@810 | 516 | |
paul@810 | 517 | || '''Reference''' || '''Object Path to Search''' || |
paul@810 | 518 | || `<depends>:__builtins__.repr` || `__builtins__.repr` || |
paul@810 | 519 | || `<depends>:__builtins__.identity.repr` || `__builtins__.identity.repr` || |
paul@810 | 520 | || `<function>:__builtins__.identity.repr` || ''identification has occurred'' || |
paul@810 | 521 | |
paul@861 | 522 | Initially, the origin information from the reference is used to find the |
paul@861 | 523 | details of the referenced object. However, in this case, the details yield |
paul@861 | 524 | another deferred reference that has not yet been resolved. Consequently, the |
paul@861 | 525 | origin information must be used to find the details of the object referenced |
paul@861 | 526 | in this case, finally yielding a reference with concrete object information. |
paul@861 | 527 | Deferred references are mutated to concrete references, thus changing the |
paul@861 | 528 | nature of such references throughout the accumulated data. |
paul@810 | 529 | |
paul@810 | 530 | === Identifying Module Dependencies === |
paul@810 | 531 | |
paul@861 | 532 | During deferred reference resolution, relationships are catalogued between |
paul@861 | 533 | modules in which deferred references are found and those providing the |
paul@861 | 534 | corresponding referenced objects. In addition, module ordering dependencies |
paul@861 | 535 | are established on the basis that some kinds of objects must be initialised |
paul@861 | 536 | before being used in other parts of the program. As a result, the modules |
paul@861 | 537 | providing such objects must themselves be initialised before the modules |
paul@861 | 538 | referencing such objects. More information on this topic can be found in the |
paul@861 | 539 | [[../Imports#Import_Sequencing|import sequencing]] documentation. |
paul@810 | 540 | |
paul@810 | 541 | === Resolving Module Details === |
paul@810 | 542 | |
paul@861 | 543 | With all dependencies resolved, further work can be done with the details |
paul@861 | 544 | within each module. The `resolving` module provides the functionality to each |
paul@861 | 545 | module instance to perform such work. |
paul@810 | 546 | |
paul@810 | 547 | 1. Class bases can be identified. |
paul@861 | 548 | 1. Special names (referring to built-in objects employed by certain |
paul@861 | 549 | operations) are resolved. |
paul@861 | 550 | 1. Each access involving an external name is investigated to see if it |
paul@861 | 551 | provides an effectively constant object (as described below). |
paul@861 | 552 | 1. Invocation references are converted to provide instantiation information, |
paul@861 | 553 | if appropriate. |
paul@861 | 554 | 1. Initialisers are investigated to see if they provide concrete object |
paul@861 | 555 | details and can thus indicate the nature of a name. |
paul@810 | 556 | 1. Constant value literals are associated with the appropriate types. |
paul@810 | 557 | |
paul@861 | 558 | Thus, attribute details for each module and its classes are finalised. |
paul@861 | 559 | Meanwhile, modules that are still hidden are removed from the program. |
paul@810 | 560 | |
paul@810 | 561 | ==== Investigating Accesses ==== |
paul@810 | 562 | |
paul@861 | 563 | External names will not have been resolved during the inspection process, but |
paul@861 | 564 | with information about the whole program, it becomes possible to identify |
paul@861 | 565 | names and to determine whether attribute accesses involving those names can |
paul@861 | 566 | also be identified. Thus, name references from each namespace are investigated |
paul@861 | 567 | in turn as follows. |
paul@810 | 568 | |
paul@861 | 569 | Each name referencing a static object is considered a constant access of that |
paul@861 | 570 | object. However, given a number of accompanying attributes describing an |
paul@861 | 571 | access, each attribute is added in turn, and the resulting object path |
paul@861 | 572 | identified. If the path identifies a constant object, the next attribute in |
paul@861 | 573 | the access chain is added and the resulting object path identified, and so on. |
paul@861 | 574 | The maximal collection of attributes identifying a constant object is then |
paul@861 | 575 | recorded as a constant access, with any remaining attributes in the access |
paul@861 | 576 | chain retained for traversal in a later phase. |
paul@810 | 577 | |
paul@861 | 578 | Names yielding constant accesses do not need to have their identity deduced |
paul@861 | 579 | through the application of attribute usage. |
paul@810 | 580 | |
paul@810 | 581 | ==== Investigating Initialisers ==== |
paul@810 | 582 | |
paul@861 | 583 | During inspection, initialisers will have been recorded in terms of expression |
paul@861 | 584 | results such as names, invocations, readily-identifiable instances, and |
paul@861 | 585 | attribute accesses. With details of the whole program plus constant accesses |
paul@861 | 586 | having been made available, such results may be definitively identified and |
paul@861 | 587 | associated with initialised names. |
paul@810 | 588 | |
paul@861 | 589 | Alias information may also be refined at this point by identifying attribute |
paul@861 | 590 | accesses that are used to initialise names. |
paul@810 | 591 | |
paul@810 | 592 | === Finalising Program Details === |
paul@810 | 593 | |
paul@861 | 594 | With class contents and relationships identified, class attribute inheritance |
paul@861 | 595 | and instance attribute availability can be defined. Some classes may depend on |
paul@861 | 596 | external state for their initialisation, and so additional module dependencies |
paul@861 | 597 | may be defined at this stage. The `importer` module contains the functionality |
paul@861 | 598 | to conduct these whole-program activities. |
paul@810 | 599 | |
paul@810 | 600 | == Inspection Outcome == |
paul@810 | 601 | |
paul@861 | 602 | Once inspection is complete, the available knowledge concerns the whole |
paul@861 | 603 | program and not merely individual modules or parts of the program. However, |
paul@861 | 604 | only limited efforts have been made until this point, notably in the |
paul@861 | 605 | acquisition of statically-defined object details when referencing names or |
paul@861 | 606 | attributes, to integrate the different modules. The [[../Deduction|deduction]] |
paul@861 | 607 | process is concerned with such integration. |