1 = Deducing Program Behaviour = 2 3 With information from [[../Inspection|inspection]] available, the isolated 4 observations about operations in a program may now be combined with knowledge 5 about the program's structure to produce deductions about the nature of each 6 operation. 7 8 <<TableOfContents(2,3)>> 9 10 == The Deduction Process == 11 12 The deduction process takes observations made during the 13 [[../Inspection|inspection process]] and attempts to form deductions about the 14 behaviour of the program primarily in terms of the nature of the attribute 15 '''accesses''', with their corresponding '''accessors''', featuring in the 16 program. Where attributes are used in conjunction with names, accessors are 17 name versions; where attributes are used in conjunction with other 18 expressions, accessors are '''anonymous'''. 19 20 === Indexes === 21 22 For efficiency, indexes are defined to establish relationships between the 23 following things: 24 25 {{{#!table 26 '''Indexes''' || '''Details''' 27 == 28 `access_index` || Which accessors (name versions) are involved with access 29 .. operations 30 == 31 `location_index` || Which attribute usage patterns are supported by accessors 32 .. (name versions) 33 == 34 `attr_class_types`<<BR>>`attr_instance_types`<<BR>>`attr_module_types` 35 || Which types support which attribute names 36 == 37 `assigned_attrs` 38 || Which usage patterns involve attribute assignment 39 == 40 `reference_assignments` 41 || Which accesses involve assignments 42 == 43 `reference_invocations` 44 || Which accesses involve invocations 45 == 46 `alias_index` 47 || Which names are aliases for other names, accesses or invocations 48 }}} 49 50 The objective of deduction is to combine these indexes to establish new 51 relationships between the different participants of these basic index 52 relationships. 53 54 === Building Indexes === 55 56 The building of indexes from inspection data is approximately as follows: 57 58 {{{#!graphviz 59 //format=svg 60 //transform=notugly 61 digraph indexes { 62 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Indexes"]; 63 edge [tooltip="Indexes"]; 64 rankdir=LR; 65 66 all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"]; 67 attr_usage [label="attr_usage\n(attribute usage)"]; 68 location_index [label="location_index\n(usage by accessor)"]; 69 70 all_attrs [label="all_class_attrs | all_instance_attrs | all_module attrs | (all attributes)",shape=record]; 71 attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record]; 72 73 attr_accessors [label="attr_accessors\n(accessors by access)"]; 74 access_index [label="access_index\n(accessor locations by access location)"]; 75 76 all_attr_access_modifiers [label="all_attr_access_modifiers\n(operations/modifiers by access)"]; 77 reference_assignments [label="reference_assignments\n(assignment accesses)"]; 78 reference_invocations [label="reference_invocations\n(invocation accesses)"]; 79 assigned_attrs [label="assigned_attrs\n(assignment accesses by access)"]; 80 81 all_aliased_names [label="all_aliased_names\n(accesses by alias)"]; 82 alias_index [label="alias_index\n(access locations by accessor/alias location)"]; 83 84 init_usage_index [label="init_usage_index",shape=ellipse]; 85 init_attr_type_indexes [label="init_attr_type_indexes",shape=ellipse]; 86 init_accessors [label="init_accessors",shape=ellipse]; 87 init_accesses [label="init_accesses",shape=ellipse]; 88 init_aliases [label="init_aliases",shape=ellipse]; 89 90 all_attr_accesses -> init_usage_index; 91 attr_usage -> init_usage_index; 92 init_usage_index -> location_index; 93 94 all_attrs -> init_attr_type_indexes -> attr_types; 95 96 attr_accessors -> init_accessors -> access_index; 97 98 all_attr_access_modifiers -> init_accesses; 99 init_accesses -> reference_assignments; 100 init_accesses -> reference_invocations; 101 init_accesses -> assigned_attrs; 102 103 all_aliased_names -> init_aliases -> alias_index; 104 } 105 }}} 106 107 === Deriving Types from Indexes and Accesses === 108 109 The indexes are employed in deduction approximately as follows: 110 111 {{{#!graphviz 112 //format=svg 113 //transform=notugly 114 digraph deduction { 115 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Deduction"]; 116 edge [tooltip="Deduction"]; 117 rankdir=LR; 118 119 all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"]; 120 location_index [label="location_index\n(usage by accessor)"]; 121 122 attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record]; 123 124 all_initialised_names [label="all_initialised_names\n(types by accessor)"]; 125 126 access_index [label="access_index\n(accessor locations by access location)"]; 127 128 alias_index [label="alias_index\n(access locations by accessor/alias location)"]; 129 130 record_types_for_usage [label="record_types_for_usage",shape=ellipse]; 131 record_types_for_attribute [label="record_types_for_attribute",shape=ellipse]; 132 133 accessor_types [label="accessor_class_types | accessor_instance_types | accessor_module_types | (types by accessor)",shape=record]; 134 provider_types [label="provider_class_types | provider_instance_types | provider_module_types | (provider types by accessor)",shape=record]; 135 136 location_index -> record_types_for_usage; 137 attr_types -> record_types_for_usage; 138 all_initialised_names -> record_types_for_usage; 139 access_index -> record_types_for_usage; 140 alias_index -> record_types_for_usage; 141 142 record_types_for_usage -> provider_types; 143 record_types_for_usage -> accessor_types; 144 145 attr_types -> record_types_for_attribute; 146 all_attr_accesses -> record_types_for_attribute; 147 148 record_types_for_attribute -> provider_types; 149 record_types_for_attribute -> accessor_types; 150 } 151 }}} 152 153 === Converting Usage to Types === 154 155 A particularly important operation in the deduction process is that of 156 converting attribute usage information to a set of types supporting such 157 usage. Taking the mapping of attribute names to types, each attribute name 158 provided by a usage observation can be applied, progressively narrowing the 159 set of types available to support the complete attribute usage collection. 160 161 {{{#!graphviz 162 //format=svg 163 //transform=notugly 164 digraph usage_to_types { 165 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Usage to types"]; 166 edge [tooltip="Usage to types"]; 167 rankdir=LR; 168 169 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 170 171 subgraph { 172 rank=same; 173 attr_a [label="attribute a",style=filled,fillcolor=gold]; 174 attr_b [label="attribute b",style=filled,fillcolor=gold]; 175 attr_c [label="attribute c",style=filled,fillcolor=gold]; 176 } 177 178 index [label="types by attribute name",shape=ellipse]; 179 180 subgraph { 181 rank=same; 182 type_P [label="type P"]; 183 type_Q [label="type Q",style=filled,fillcolor=gold]; 184 type_R [label="type R"]; 185 type_S [label="type S"]; 186 } 187 188 deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=darkorange]; 189 190 usage -> attr_a; 191 usage -> attr_b; 192 usage -> attr_c; 193 194 attr_a -> attr_b -> attr_c [style=dashed,dir=none]; 195 attr_a -> index -> type_P [style=dashed,dir=none]; 196 type_P -> type_Q -> type_R -> type_S [style=dashed,dir=none]; 197 198 attr_a -> type_P; 199 attr_a -> type_Q; 200 attr_b -> type_Q; 201 attr_b -> type_R; 202 attr_c -> type_Q; 203 attr_c -> type_S; 204 205 type_Q -> deduction; 206 } 207 }}} 208 209 The types supporting attribute usage are the attribute '''providers'''. Where 210 providers are classes, the affected accessors in the program may also be 211 instances, since instances also support access to attributes of the 212 instantiated class (and its ancestors). 213 214 Indexes mapping attributes to types must combine consideration of class and 215 instance attributes in order to correctly identify instance providers. 216 Consider the following example: 217 218 {{{#!graphviz 219 //format=svg 220 //transform=notugly 221 digraph instance_providers { 222 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Instance providers"]; 223 edge [tooltip="Instance providers"]; 224 rankdir=LR; 225 226 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 227 228 subgraph { 229 rank=same; 230 combined [label="combined attributes",shape=ellipse]; 231 attr_a [label="attribute a",style=filled,fillcolor=gold]; 232 attr_c [label="attribute c",style=filled,fillcolor=gold]; 233 attr_b [label="attribute b",style=filled,fillcolor=gold]; 234 } 235 236 subgraph { 237 rank=same; 238 class_C [label="class C"]; 239 attr_Ca [label="attribute C.a",style=filled,fillcolor=gold]; 240 attr_Cc [label="attribute C.c",style=filled,fillcolor=gold]; 241 instance_C [label="instance of C"]; 242 attr_Ib [label="attribute (C).b",style=filled,fillcolor=gold]; 243 } 244 245 type_C [label="type C",style=filled,fillcolor=darkorange]; 246 247 combined -> attr_a -> attr_c -> attr_b [style=dashed,dir=none]; 248 class_C -> attr_Ca -> attr_Cc [style=dashed,dir=none]; 249 instance_C -> attr_Ib [style=dashed,dir=none]; 250 251 usage -> attr_a -> attr_Ca; 252 usage -> attr_b -> attr_Ib; 253 usage -> attr_c -> attr_Cc; 254 255 attr_Ca -> type_C; 256 attr_Cc -> type_C; 257 attr_Ib -> type_C; 258 } 259 }}} 260 261 To recognise support by instance accessors for the usage pattern concerned, 262 attribute details must be obtained from classes as well as instances. Note 263 that the identified type cannot support such usage purely by providing class 264 or instance attributes alone. 265 266 === Attribute Assignments === 267 268 Since attribute access operations may occur as part of assignments, the nature 269 of accesses is recorded during inspection. Combining the indexed information 270 for assignment accesses allows the deducer to determine the most pessimistic 271 effects on the program structure of such assignments. 272 273 Taking each attribute usage set employed by accessors involved in assignments, 274 the types are deduced for such accessors, and each individual attribute known 275 to be used in such assignments is then applied to the deduced types, 276 '''mutating''' them in such a way that deductions may no longer assume a fixed 277 identity for such attributes when obtained from these types. 278 279 {{{#!graphviz 280 //format=svg 281 //transform=notugly 282 digraph assignments { 283 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Attribute assignments"]; 284 edge [tooltip="Attribute assignments"]; 285 rankdir=LR; 286 287 subgraph { 288 rank=same; 289 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 290 assigned [label="b is assigned",style=filled,fillcolor=darkorange]; 291 } 292 293 deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=gold]; 294 295 subgraph { 296 rank=same; 297 type_Q [label="type Q",style=filled,fillcolor=gold]; 298 attr_Qa [label="attribute Q.a"]; 299 attr_Qb [label="attribute Q.b\n(mutated)",style=filled,fillcolor=darkorange]; 300 attr_Qc [label="attribute Q.c"]; 301 } 302 303 type_Q -> attr_Qa -> attr_Qb -> attr_Qc [style=dashed,dir=none]; 304 usage -> deduction -> type_Q; 305 assigned -> attr_Qb; 306 } 307 }}} 308 309 === Refining Type Deductions === 310 311 In the context of a specific access, the types may be resolved further: 312 313 * Any name whose initialisation could be determined during inspection can be 314 associated with its initialised type 315 * Any name referring to a constant object can be associated with the type of 316 that object 317 * Usage of `self` in methods can result in only compatible class and instance 318 types being retained from the types obtained from usage deductions 319 320 === Reference Identification === 321 322 The basic information about every accessor and accessed attribute in a program 323 can now be combined to produce specific '''references''' to identities 324 consistent with the inspection observations. Several different kinds of 325 accessors and access situations exist: 326 327 * Name-based accesses involving attribute usage 328 * Aliases to names, possibly accompanied by accesses 329 * Anonymous accesses involving individual attributes 330 * Constant or previously-identified names, possibly accompanied by accesses 331 332 === Aliases === 333 334 Names initialised using other names or attribute accesses, or using the 335 invocation of either of these things, are known as '''aliases'''. Information 336 about aliases originates during inspection when such names are initialised 337 with expression results within the program. During the resolution activity 338 finalising the inspected details, this initialisation information is used to 339 define relationships between aliases and other names and accesses. 340 341 During deduction, attribute accesses are investigated and type information 342 computed for both attribute accesses and accessors. The relationships defined 343 between accesses and aliases can then be employed to propagate such deduced 344 information to the aliases and to define appropriate type characteristics for 345 them. 346 347 Where aliases are used as the basis of attribute accesses, this propagation 348 refines the previously deduced information about the aliases and may also 349 refine the details of the accesses with which the alias is involved. 350 351 === Invocation Target Suitability === 352 353 Having identified accessed attributes, invocation information can be applied 354 in cases where such attributes immediately participate in an invocation, 355 comparing the specified argument details against the parameter details defined 356 for the identified attribute, which must be a callable object for this 357 technique to work. Where arguments do not appear to be suitable - either the 358 number of arguments is incorrect or any keyword argument refer to non-existent 359 parameters - the attribute providing the parameter details can be considered 360 unsuitable for the access. 361 362 === Classifying Accessors === 363 364 Each accessor, being a particular version of a name, will have type 365 information associated with it as a consequence of the deduction process. Such 366 information takes the following forms: 367 368 * Provider types, indicating which types may provide the attributes used by 369 the accessor 370 * Accessor types, indicating which types will actually appear as the accessor 371 372 This information can be processed in a number of ways to produce the following: 373 374 * All types (from all kinds of type) of providers able to provide attributes 375 via the accessor 376 * All types (from all kinds of type) of accessors compatible with the 377 accessor 378 * The most general types of accessors compatible with the accessor 379 380 Where many types may be associated with an accessor, identifying the most 381 general types involves eliminating those which are derived from others. Given 382 that descendant types may not remove support for attributes provided by their 383 ancestors, then where an ancestor type has been identified as a possible 384 accessor, it should follow that all of its descendants may also have been 385 identified as possible accessors. Since these descendants should be 386 compatible, identifying them individually is unnecessary: merely specifying 387 that the common ancestor or ''any'' descendant could provide an accessor is 388 sufficient. 389 390 ==== Defining Guards ==== 391 392 A '''guard''' is a constraint supported by a run-time test indicating the 393 compliance of an accessor with a particular type. Thus, where a single 394 accessor type has been identified, a guard may be established for an accessor 395 that tests the type of the accessor against a specific type. Where a single 396 ''general'' accessor type is identified, a guard is established that tests the 397 accessor against a "common" type: that is, an ancestor type with which other 398 descendant types may comply. 399 400 === Classifying Accesses === 401 402 At the point of classifying accesses, information is available about the 403 following: 404 405 * The accessors potentially involved in each access 406 * The types of accessors and the types providing attributes via those 407 accessors 408 * Any guards applying to the accessors 409 * Whether an access is constrained by certain program characteristics and is 410 thus guaranteed to be as deduced 411 * The possible attributes referenced by the access 412 413 This information can be processed in a number of ways to produce the 414 following: 415 416 * The types of accessors, both general and specific, applying to each access 417 * The attributes that can be provided by each access, consolidating existing 418 referenced attribute details 419 * The general types providing the attributes 420 421 Since more than one accessor may be involved, information from all accessors 422 must be combined to determine whether guard information still applies to the 423 access, or whether it is possible for an accessor to be used that has an 424 uncertain type at run-time. 425 426 ==== Defining Tests ==== 427 428 A '''test''' at the access level is a necessary check performed on an accessor 429 before an access to determine whether it belongs to a certain type or group of 430 types. 431 432 Where guards applying to accessors apply by extension to an access, it may not 433 be enough to assume that the the attributes exposed by the accessor are the 434 same as those considered acceptable through deduction. Therefore, attributes 435 are obtained for the access using the applicable guard types as accessors. If 436 this set of attributes does not include potentially accessible attributes 437 (perhaps because the guard types are broadly defined and do not support all 438 attribute usage), a test must be generated. 439 440 Where a single attribute provider can be identified, a test for a specific 441 type is generated. Where a single general type can be identified as a 442 provider, a test against a "common" type is generated. Tests involving the 443 built-in `object` are not generated since it is the root of all classes and 444 such tests would merely identify an accessor as a class. In all cases where a 445 single, specific type cannot be tested for, the general attribute validation 446 mechanism must be used instead. 447 448 == Preparing Access Descriptions == 449 450 The accumulated deduced knowledge is directed towards producing access 451 descriptions or plans which characterise each access in terms of the 452 following: 453 454 * The initial accessor, being the starting object for attribute accesses, 455 unless a static accessor has been identified 456 * Details of any test required on the initial accessor 457 * Details of any type employed by the test 458 * Any identified static accessor (to be used as the initial accessor) 459 * Attributes needing to be traversed from the accessor that yield unambiguous 460 objects 461 * Access modes for each of the unambiguously-traversed attributes 462 * Remaining attributes needing to be tested and traversed (after having 463 traversed the above attributes) 464 * Details of the context 465 * Any test to apply to the context (to ensure its validity) 466 * The method of obtaining the first attribute 467 * The method of obtaining the final attribute 468 * Any identified static final attribute 469 * The kinds of objects providing the final attribute 470 471 === Characterising the Accessor === 472 473 For a given access location, the referenced attribute details established 474 during deduction are used to determine... 475 476 * Whether the initial accessor is static, originating from a constant access 477 or involving an identifiable static object 478 * Whether the initial accessor is dynamic but has a known, deduced identity 479 480 Some useful information about the accessor and about the actual provider of 481 the first accessed attribute can be defined: 482 483 || || '''Class''' || '''Instance''' || '''Module''' || 484 || '''Accessor''' || Class accessor || Instance accessor || Module accessor || 485 || '''Provider''' || Class provider || Instance provider || || 486 487 Depending on which of these criteria are satisfied, some properties of the 488 access operation can be established: 489 490 * Object-relative accesses occur with class accessors or module accessors or 491 when attributes are provided by instances 492 * Class-relative accesses occur with instance accessors when attributes are 493 provided by classes 494 495 Object-relative accesses merely involve obtaining attributes directly from 496 accessors. Class-relative accesses involve obtaining the class of each 497 accessor and then obtaining an attribute from that class. 498 499 === Defining the First Access Method === 500 501 For dynamic or unidentified accessors, the above access properties define the 502 access method on the first attribute in the chain of attributes provided. 503 However, any redefinition of the accessor to a static accessor may influence 504 the first method. For static accessors, the first method is always 505 object-relative since classes and modules do not offer transparent mechanisms 506 to expose the attributes on their own classes. 507 508 Static and identified, dynamic accessors should already be known to support 509 the specified attributes. Other accessors require an access method to be used 510 that also checks whether the accessor supports a given attribute. 511 512 === Redefining the Accessor === 513 514 With knowledge about the initial accessor, the attributes involved in the 515 access operation are then considered in the context of the accessor. For each 516 attribute name in the chain described for an access, an attempt is made to 517 obtain the details of the attribute of the given name from the accessor, 518 determining whether these details can be used as an accessor to obtain 519 subsequent attribute details. 520 521 Where more than one attribute identity is obtained, traversal is terminated: 522 all remaining attributes must be traversed at run-time. If an attribute 523 identified during traversal is static, the first access method changes 524 accordingly. 525 526 === Defining the Final Access Method === 527 528 An access can also involve an assignment to the accessed attribute or the 529 invocation of the accessed attribute. Such operations change the nature of the 530 access method used on the final attribute in a chain of attribute names. 531 532 === Identifying Context Information === 533 534 Final attribute accesses involving callables need to yield context information 535 that can subsequently be used to invoke those callables. Where the nature of 536 an accessed attribute is not known, a simplistic attempt can be made to look 537 up all attributes stored using the attribute name in the program. 538 539 Of particular interest are the following situations: 540 541 * Where class attributes are being accessed via instances, whether the 542 attributes are all methods that can be bound upon access 543 * Where class attributes may be accessed via instances, whether any 544 attributes could be methods 545 546 Such considerations dictate whether the context information originates from 547 the attribute or from the accessor and whether any run-time test is required 548 to determine this. 549 550 == Preparing Instruction Plans == 551 552 Instruction plans are sequences of program operations, encoded in a 553 higher-level form than actual program instructions, that describe the steps 554 needed to access attributes. Such sequences are produced from the details 555 provided by individual access plans. 556 557 === Original Accessor === 558 559 The expression providing the object from which the first attribute is obtained 560 (or the only attribute if only one is specified) is known as the original 561 accessor. Where this accessor can be identified, the expression describing it 562 in the program can potentially be replaced with a static reference, and 563 subsequent mentions of the accessor can potentially be replaced with such 564 references or names used as expressions. 565 566 || '''Access Plan Information''' || '''Original Accessor''' || 567 || Static accessor identified || Identified accessor || 568 || Named accessor access, not invocation || Indicated name || 569 || Named accessor invocation, accessor known to provide the attribute || Indicated name || 570 || Named accessor invocation, accessor not known to provide the attribute || Accessor expression || 571 || Other accessors || Accessor expression || 572 573 By using names or static references, the need to store the result of 574 evaluating an accessor expression is eliminated because such labels can be 575 inserted in the instruction sequence as required. 576 577 === First Access Operation === 578 579 Although it may be possible to convert accesses into single instructions that 580 obtain attributes directly, many accesses will involve access operations that 581 must consult an accessor object and obtain an attribute from that object, at 582 least for the first attribute in a chain of attributes. This occurs when the 583 access plan indicates the following situations: 584 585 * Final method is an access (meaning that an attribute cannot be directly 586 obtained) 587 * Final method is an assignment (requiring the object whose attribute will be 588 updated) 589 * Attributes (identified or otherwise) need traversing 590 591 === Accessor Nature === 592 593 Attribute assignments involve a single '''target accessor''' and potentially 594 many other accessors, depending on how many distinct expressions are evaluated 595 to yield the value to be set in the assignment. Such a target accessor will 596 usually be derived from the evaluation of an expression, and in some cases the 597 expression will be the result of an opaque operation such as the invocation of 598 a function. In such cases, the target accessor is stored in a temporary 599 variable for subsequent use in access operations. 600 601 === Context === 602 603 === Access Tests === 604 605 === Traversing Identified Attributes === 606 607 === Traversing Unidentified Attributes === 608 609 === Final Access Operation === 610 611 === Context Testing === 612 613 == Deduction Products == 614 615 The deduction process should produce a complete catalogue of accessor and 616 access references that may then be consulted by the 617 [[../Translation|translation]] process needing to know the nature of any 618 operation within the program. Central to the translation process's 619 understanding of references is the '''attribute access plan''' for each 620 reference which characterises each access and provides the basis for the 621 formulation of the '''instruction plan''' used to replicate it in the final 622 program.