1 = Deducing Program Behaviour = 2 3 With information from [[../Inspection|inspection]] available, the isolated observations about operations in a program may now be combined with knowledge about the program's structure to produce deductions about the nature of each operation. 4 5 <<TableOfContents(2,3)>> 6 7 == The Deduction Process == 8 9 The deduction process takes observations made during the [[../Inspection|inspection process]] and attempts to form deductions about the behaviour of the program primarily in terms of the nature of the attribute '''accesses''', with their corresponding '''accessors''', featuring in the program. Where attributes are used in conjunction with names, accessors are name versions; where attributes are used in conjunction with other expressions, accessors are '''anonymous'''. 10 11 === Indexes === 12 13 For efficiency, indexes are defined to establish relationships between the following things: 14 15 {{{#!table 16 '''Indexes''' || '''Details''' 17 == 18 `access_index` || Which accessors (name versions) are involved with access operations 19 == 20 `location_index` || Which attribute usage patterns are supported by accessors (name versions) 21 == 22 `attr_class_types`<<BR>>`attr_instance_types`<<BR>>`attr_module_types` 23 || Which types support which attribute names 24 == 25 `assigned_attrs` 26 || Which usage patterns involve attribute assignment 27 == 28 `reference_assignments` 29 || Which accesses involve assignments 30 == 31 `reference_invocations` 32 || Which accesses involve invocations 33 == 34 `alias_index` 35 || Which names are aliases for other names, accesses or invocations 36 }}} 37 38 The objective of deduction is to combine these indexes to establish new relationships between the different participants of these basic index relationships. 39 40 === Building Indexes === 41 42 The building of indexes from inspection data is approximately as follows: 43 44 {{{#!graphviz 45 //format=svg 46 //transform=notugly 47 digraph indexes { 48 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Indexes"]; 49 edge [tooltip="Indexes"]; 50 rankdir=LR; 51 52 all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"]; 53 attr_usage [label="attr_usage\n(attribute usage)"]; 54 location_index [label="location_index\n(usage by accessor)"]; 55 56 all_attrs [label="all_class_attrs | all_instance_attrs | all_module attrs | (all attributes)",shape=record]; 57 attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record]; 58 59 attr_accessors [label="attr_accessors\n(accessors by access)"]; 60 access_index [label="access_index\n(accessor locations by access location)"]; 61 62 all_attr_access_modifiers [label="all_attr_access_modifiers\n(operations/modifiers by access)"]; 63 reference_assignments [label="reference_assignments\n(assignment accesses)"]; 64 reference_invocations [label="reference_invocations\n(invocation accesses)"]; 65 assigned_attrs [label="assigned_attrs\n(assignment accesses by access)"]; 66 67 all_aliased_names [label="all_aliased_names\n(accesses by alias)"]; 68 alias_index [label="alias_index\n(access locations by accessor/alias location)"]; 69 70 init_usage_index [label="init_usage_index",shape=ellipse]; 71 init_attr_type_indexes [label="init_attr_type_indexes",shape=ellipse]; 72 init_accessors [label="init_accessors",shape=ellipse]; 73 init_accesses [label="init_accesses",shape=ellipse]; 74 init_aliases [label="init_aliases",shape=ellipse]; 75 76 all_attr_accesses -> init_usage_index; 77 attr_usage -> init_usage_index; 78 init_usage_index -> location_index; 79 80 all_attrs -> init_attr_type_indexes -> attr_types; 81 82 attr_accessors -> init_accessors -> access_index; 83 84 all_attr_access_modifiers -> init_accesses; 85 init_accesses -> reference_assignments; 86 init_accesses -> reference_invocations; 87 init_accesses -> assigned_attrs; 88 89 all_aliased_names -> init_aliases -> alias_index; 90 } 91 }}} 92 93 === Deriving Types from Indexes and Accesses === 94 95 The indexes are employed in deduction approximately as follows: 96 97 {{{#!graphviz 98 //format=svg 99 //transform=notugly 100 digraph deduction { 101 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Deduction"]; 102 edge [tooltip="Deduction"]; 103 rankdir=LR; 104 105 all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"]; 106 location_index [label="location_index\n(usage by accessor)"]; 107 108 attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record]; 109 110 all_initialised_names [label="all_initialised_names\n(types by accessor)"]; 111 112 access_index [label="access_index\n(accessor locations by access location)"]; 113 114 alias_index [label="alias_index\n(access locations by accessor/alias location)"]; 115 116 record_types_for_usage [label="record_types_for_usage",shape=ellipse]; 117 record_types_for_attribute [label="record_types_for_attribute",shape=ellipse]; 118 119 accessor_types [label="accessor_class_types | accessor_instance_types | accessor_module_types | (types by accessor)",shape=record]; 120 provider_types [label="provider_class_types | provider_instance_types | provider_module_types | (provider types by accessor)",shape=record]; 121 122 location_index -> record_types_for_usage; 123 attr_types -> record_types_for_usage; 124 all_initialised_names -> record_types_for_usage; 125 access_index -> record_types_for_usage; 126 alias_index -> record_types_for_usage; 127 128 record_types_for_usage -> provider_types; 129 record_types_for_usage -> accessor_types; 130 131 attr_types -> record_types_for_attribute; 132 all_attr_accesses -> record_types_for_attribute; 133 134 record_types_for_attribute -> provider_types; 135 record_types_for_attribute -> accessor_types; 136 } 137 }}} 138 139 === Converting Usage to Types === 140 141 A particularly important operation in the deduction process is that of converting attribute usage information to a set of types supporting such usage. Taking the mapping of attribute names to types, each attribute name provided by a usage observation can be applied, progressively narrowing the set of types available to support the complete attribute usage collection. 142 143 {{{#!graphviz 144 //format=svg 145 //transform=notugly 146 digraph usage_to_types { 147 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Usage to types"]; 148 edge [tooltip="Usage to types"]; 149 rankdir=LR; 150 151 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 152 153 subgraph { 154 rank=same; 155 attr_a [label="attribute a",style=filled,fillcolor=gold]; 156 attr_b [label="attribute b",style=filled,fillcolor=gold]; 157 attr_c [label="attribute c",style=filled,fillcolor=gold]; 158 } 159 160 index [label="types by attribute name",shape=ellipse]; 161 162 subgraph { 163 rank=same; 164 type_P [label="type P"]; 165 type_Q [label="type Q",style=filled,fillcolor=gold]; 166 type_R [label="type R"]; 167 type_S [label="type S"]; 168 } 169 170 deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=darkorange]; 171 172 usage -> attr_a; 173 usage -> attr_b; 174 usage -> attr_c; 175 176 attr_a -> attr_b -> attr_c [style=dashed,dir=none]; 177 attr_a -> index -> type_P [style=dashed,dir=none]; 178 type_P -> type_Q -> type_R -> type_S [style=dashed,dir=none]; 179 180 attr_a -> type_P; 181 attr_a -> type_Q; 182 attr_b -> type_Q; 183 attr_b -> type_R; 184 attr_c -> type_Q; 185 attr_c -> type_S; 186 187 type_Q -> deduction; 188 } 189 }}} 190 191 The types supporting attribute usage are the attribute '''providers'''. Where providers are classes, the affected accessors in the program may also be instances, since instances also support access to attributes of the instantiated class (and its ancestors). 192 193 Indexes mapping attributes to types must combine consideration of class and instance attributes in order to correctly identify instance providers. Consider the following example: 194 195 {{{#!graphviz 196 //format=svg 197 //transform=notugly 198 digraph instance_providers { 199 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Instance providers"]; 200 edge [tooltip="Instance providers"]; 201 rankdir=LR; 202 203 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 204 205 subgraph { 206 rank=same; 207 combined [label="combined attributes",shape=ellipse]; 208 attr_a [label="attribute a",style=filled,fillcolor=gold]; 209 attr_c [label="attribute c",style=filled,fillcolor=gold]; 210 attr_b [label="attribute b",style=filled,fillcolor=gold]; 211 } 212 213 subgraph { 214 rank=same; 215 class_C [label="class C"]; 216 attr_Ca [label="attribute C.a",style=filled,fillcolor=gold]; 217 attr_Cc [label="attribute C.c",style=filled,fillcolor=gold]; 218 instance_C [label="instance of C"]; 219 attr_Ib [label="attribute (C).b",style=filled,fillcolor=gold]; 220 } 221 222 type_C [label="type C",style=filled,fillcolor=darkorange]; 223 224 combined -> attr_a -> attr_c -> attr_b [style=dashed,dir=none]; 225 class_C -> attr_Ca -> attr_Cc [style=dashed,dir=none]; 226 instance_C -> attr_Ib [style=dashed,dir=none]; 227 228 usage -> attr_a -> attr_Ca; 229 usage -> attr_b -> attr_Ib; 230 usage -> attr_c -> attr_Cc; 231 232 attr_Ca -> type_C; 233 attr_Cc -> type_C; 234 attr_Ib -> type_C; 235 } 236 }}} 237 238 To recognise support by instance accessors for the usage pattern concerned, attribute details must be obtained from classes as well as instances. Note that the identified type cannot support such usage purely by providing class or instance attributes alone. 239 240 === Attribute Assignments === 241 242 Since attribute access operations may occur as part of assignments, the nature of accesses is recorded during inspection. Combining the indexed information for assignment accesses allows the deducer to determine the most pessimistic effects on the program structure of such assignments. 243 244 Taking each attribute usage set employed by accessors involved in assignments, the types are deduced for such accessors, and each individual attribute known to be used in such assignments is then applied to the deduced types, '''mutating''' them in such a way that deductions may no longer assume a fixed identity for such attributes when obtained from these types. 245 246 {{{#!graphviz 247 //format=svg 248 //transform=notugly 249 digraph assignments { 250 node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Attribute assignments"]; 251 edge [tooltip="Attribute assignments"]; 252 rankdir=LR; 253 254 subgraph { 255 rank=same; 256 usage [label="(a, b, c)",style=filled,fillcolor=darkorange]; 257 assigned [label="b is assigned",style=filled,fillcolor=darkorange]; 258 } 259 260 deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=gold]; 261 262 subgraph { 263 rank=same; 264 type_Q [label="type Q",style=filled,fillcolor=gold]; 265 attr_Qa [label="attribute Q.a"]; 266 attr_Qb [label="attribute Q.b\n(mutated)",style=filled,fillcolor=darkorange]; 267 attr_Qc [label="attribute Q.c"]; 268 } 269 270 type_Q -> attr_Qa -> attr_Qb -> attr_Qc [style=dashed,dir=none]; 271 usage -> deduction -> type_Q; 272 assigned -> attr_Qb; 273 } 274 }}} 275 276 === Refining Type Deductions === 277 278 In the context of a specific access, the types may be resolved further: 279 280 * Any name whose initialisation could be determined during inspection can be associated with its initialised type 281 * Any name referring to a constant object can be associated with the type of that object 282 * Usage of `self` in methods can result in only compatible class and instance types being retained from the types obtained from usage deductions 283 284 === Reference Identification === 285 286 The basic information about every accessor and accessed attribute in a program can now be combined to produce specific '''references''' to identities consistent with the inspection observations. Several different kinds of accessors and access situations exist: 287 288 * Name-based accesses involving attribute usage 289 * Aliases to names, possibly accompanied by accesses 290 * Anonymous accesses involving individual attributes 291 * Constant or previously-identified names, possibly accompanied by accesses 292 293 === Aliases === 294 295 Names initialised using other names or attribute accesses, or using the invocation of either of these things, are known as '''aliases'''. Information about aliases originates during inspection when such names are initialised with expression results within the program. During the resolution activity finalising the inspected details, this initialisation information is used to define relationships between aliases and other names and accesses. 296 297 During deduction, attribute accesses are investigated and type information computed for both attribute accesses and accessors. The relationships defined between accesses and aliases can then be employed to propagate such deduced information to the aliases and to define appropriate type characteristics for them. 298 299 Where aliases are used as the basis of attribute accesses, this propagation refines the previously deduced information about the aliases and may also refine the details of the accesses with which the alias is involved. 300 301 === Invocation Target Suitability === 302 303 Having identified accessed attributes, invocation information can be applied in cases where such attributes immediately participate in an invocation, comparing the specified argument details against the parameter details defined for the identified attribute, which must be a callable object for this technique to work. Where arguments do not appear to be suitable - either the number of arguments is incorrect or any keyword argument refer to non-existent parameters - the attribute providing the parameter details can be considered unsuitable for the access. 304 305 === Classifying Accessors === 306 307 Each accessor, being a particular version of a name, will have type information associated with it as a consequence of the deduction process. Such information takes the following forms: 308 309 * Provider types, indicating which types may provide the attributes used by the accessor 310 * Accessor types, indicating which types will actually appear as the accessor 311 312 This information can be processed in a number of ways to produce the following: 313 314 * All types (from all kinds of type) of providers able to provide attributes via the accessor 315 * All types (from all kinds of type) of accessors compatible with the accessor 316 * The most general types of accessors compatible with the accessor 317 318 Where many types may be associated with an accessor, identifying the most general types involves eliminating those which are derived from others. Given that descendant types may not remove support for attributes provided by their ancestors, then where an ancestor type has been identified as a possible accessor, it should follow that all of its descendants may also have been identified as possible accessors. Since these descendants should be compatible, identifying them individually is unnecessary: merely specifying that the common ancestor or ''any'' descendant could provide an accessor is sufficient. 319 320 ==== Defining Guards ==== 321 322 A '''guard''' is a constraint supported by a run-time test indicating the compliance of an accessor with a particular type. Thus, where a single accessor type has been identified, a guard may be established for an accessor that tests the type of the accessor against a specific type. Where a single ''general'' accessor type is identified, a guard is established that tests the accessor against a "common" type: that is, an ancestor type with which other descendant types may comply. 323 324 === Classifying Accesses === 325 326 At the point of classifying accesses, information is available about the following: 327 328 * The accessors potentially involved in each access 329 * The types of accessors and the types providing attributes via those accessors 330 * Any guards applying to the accessors 331 * Whether an access is constrained by certain program characteristics and is thus guaranteed to be as deduced 332 * The possible attributes referenced by the access 333 334 This information can be processed in a number of ways to produce the following: 335 336 * The types of accessors, both general and specific, applying to each access 337 * The attributes that can be provided by each access, consolidating existing referenced attribute details 338 * The general types providing the attributes 339 340 Since more than one accessor may be involved, information from all accessors must be combined to determine whether guard information still applies to the access, or whether it is possible for an accessor to be used that has an uncertain type at run-time. 341 342 ==== Defining Tests ==== 343 344 A '''test''' at the access level is a necessary check performed on an accessor before an access to determine whether it belongs to a certain type or group of types. 345 346 Where guards applying to accessors apply by extension to an access, it may not be enough to assume that the the attributes exposed by the accessor are the same as those considered acceptable through deduction. Therefore, attributes are obtained for the access using the applicable guard types as accessors. If this set of attributes does not include potentially accessible attributes (perhaps because the guard types are broadly defined and do not support all attribute usage), a test must be generated. 347 348 Where a single attribute provider can be identified, a test for a specific type is generated. Where a single general type can be identified as a provider, a test against a "common" type is generated. Tests involving the built-in `object` are not generated since it is the root of all classes and such tests would merely identify an accessor as a class. In all cases where a single, specific type cannot be tested for, the general attribute validation mechanism must be used instead. 349 350 == Preparing Access Descriptions == 351 352 The accumulated deduced knowledge is directed towards producing access descriptions or plans which characterise each access in terms of the following: 353 354 * The initial accessor, being the starting object for attribute accesses, unless a static accessor has been identified 355 * Details of any test required on the initial accessor 356 * Details of any type employed by the test 357 * Any identified static accessor (to be used as the initial accessor) 358 * Attributes needing to be traversed from the accessor that yield unambiguous objects 359 * Access modes for each of the unambiguously-traversed attributes 360 * Remaining attributes needing to be tested and traversed (after having traversed the above attributes) 361 * Details of the context 362 * Any test to apply to the context (to ensure its validity) 363 * The method of obtaining the first attribute 364 * The method of obtaining the final attribute 365 * Any identified static final attribute 366 * The kinds of objects providing the final attribute 367 368 === Characterising the Accessor === 369 370 For a given access location, the referenced attribute details established during deduction are used to determine... 371 372 * Whether the initial accessor is static, originating from a constant access or involving an identifiable static object 373 * Whether the initial accessor is dynamic but has a known, deduced identity 374 375 Some useful information about the accessor and about the actual provider of the first accessed attribute can be defined: 376 377 || || '''Class''' || '''Instance''' || '''Module''' || 378 || '''Accessor''' || Class accessor || Instance accessor || Module accessor || 379 || '''Provider''' || Class provider || Instance provider || || 380 381 Depending on which of these criteria are satisfied, some properties of the access operation can be established: 382 383 * Object-relative accesses occur with class accessors or module accessors or when attributes are provided by instances 384 * Class-relative accesses occur with instance accessors when attributes are provided by classes 385 386 Object-relative accesses merely involve obtaining attributes directly from accessors. Class-relative accesses involve obtaining the class of each accessor and then obtaining an attribute from that class. 387 388 === Defining the First Access Method === 389 390 For dynamic or unidentified accessors, the above access properties define the access method on the first attribute in the chain of attributes provided. However, any redefinition of the accessor to a static accessor may influence the first method. For static accessors, the first method is always object-relative since classes and modules do not offer transparent mechanisms to expose the attributes on their own classes. 391 392 Static and identified, dynamic accessors should already be known to support the specified attributes. Other accessors require an access method to be used that also checks whether the accessor supports a given attribute. 393 394 === Redefining the Accessor === 395 396 With knowledge about the initial accessor, the attributes involved in the access operation are then considered in the context of the accessor. For each attribute name in the chain described for an access, an attempt is made to obtain the details of the attribute of the given name from the accessor, determining whether these details can be used as an accessor to obtain subsequent attribute details. 397 398 Where more than one attribute identity is obtained, traversal is terminated: all remaining attributes must be traversed at run-time. If an attribute identified during traversal is static, the first access method changes accordingly. 399 400 === Defining the Final Access Method === 401 402 An access can also involve an assignment to the accessed attribute or the invocation of the accessed attribute. Such operations change the nature of the access method used on the final attribute in a chain of attribute names. 403 404 === Identifying Context Information === 405 406 Final attribute accesses involving callables need to yield context information that can subsequently be used to invoke those callables. Where the nature of an accessed attribute is not known, a simplistic attempt can be made to look up all attributes stored using the attribute name in the program. 407 408 Of particular interest are the following situations: 409 410 * Where class attributes are being accessed via instances, whether the attributes are all methods that can be bound upon access 411 * Where class attributes may be accessed via instances, whether any attributes could be methods 412 413 Such considerations dictate whether the context information originates from the attribute or from the accessor and whether any run-time test is required to determine this. 414 415 == Preparing Instruction Plans == 416 417 Instruction plans are sequences of program operations, encoded in a higher-level form than actual program instructions, that describe the steps needed to access attributes. Such sequences are produced from the details provided by individual access plans. 418 419 === Original Accessor === 420 421 The expression providing the object from which the first attribute is obtained (or the only attribute if only one is specified) is known as the original accessor. Where this accessor can be identified, the expression describing it in the program can potentially be replaced with a static reference, and subsequent mentions of the accessor can potentially be replaced with such references or names used as expressions. 422 423 || '''Access Plan Information''' || '''Original Accessor''' || 424 || Static accessor identified || Identified accessor || 425 || Named accessor access, not invocation || Indicated name || 426 || Named accessor invocation, accessor known to provide the attribute || Indicated name || 427 || Named accessor invocation, accessor not known to provide the attribute || Accessor expression || 428 || Other accessors || Accessor expression || 429 430 By using names or static references, the need to store the result of evaluating an accessor expression is eliminated because such labels can be inserted in the instruction sequence as required. 431 432 === First Access Operation === 433 434 Although it may be possible to convert accesses into single instructions that obtain attributes directly, many accesses will involve access operations that must consult an accessor object and obtain an attribute from that object, at least for the first attribute in a chain of attributes. This occurs when the access plan indicates the following situations: 435 436 * Final method is an access (meaning that an attribute cannot be directly obtained) 437 * Final method is an assignment (requiring the object whose attribute will be updated) 438 * Attributes (identified or otherwise) need traversing 439 440 === Accessor Nature === 441 442 Attribute assignments involve a single '''target accessor''' and potentially many other accessors, depending on how many distinct expressions are evaluated to yield the value to be set in the assignment. Such a target accessor will usually be derived from the evaluation of an expression, and in some cases the expression will be the result of an opaque operation such as the invocation of a function. In such cases, the target accessor is stored in a temporary variable for subsequent use in access operations. 443 444 === Context === 445 446 === Access Tests === 447 448 === Traversing Identified Attributes === 449 450 === Traversing Unidentified Attributes === 451 452 === Final Access Operation === 453 454 === Context Testing === 455 456 == Deduction Products == 457 458 The deduction process should produce a complete catalogue of accessor and access references that may then be consulted by the [[../Translation|translation]] process needing to know the nature of any operation within the program. Central to the translation process's understanding of references is the '''attribute access plan''' for each reference which characterises each access and provides the basis for the formulation of the '''instruction plan''' used to replicate it in the final program.