Lichen

docs/wiki/Deduction

861:f745d151a441
2018-08-17 Paul Boddie Wrapped text in the documentation for more convenient "offline" editing.
     1 = Deducing Program Behaviour =     2      3 With information from [[../Inspection|inspection]] available, the isolated     4 observations about operations in a program may now be combined with knowledge     5 about the program's structure to produce deductions about the nature of each     6 operation.     7      8 <<TableOfContents(2,3)>>     9     10 == The Deduction Process ==    11     12 The deduction process takes observations made during the    13 [[../Inspection|inspection process]] and attempts to form deductions about the    14 behaviour of the program primarily in terms of the nature of the attribute    15 '''accesses''', with their corresponding '''accessors''', featuring in the    16 program. Where attributes are used in conjunction with names, accessors are    17 name versions; where attributes are used in conjunction with other    18 expressions, accessors are '''anonymous'''.    19     20 === Indexes ===    21     22 For efficiency, indexes are defined to establish relationships between the    23 following things:    24     25 {{{#!table    26 '''Indexes''' || '''Details'''    27 ==    28 `access_index` || Which accessors (name versions) are involved with access    29                .. operations    30 ==    31 `location_index` || Which attribute usage patterns are supported by accessors    32                  .. (name versions)    33 ==    34 `attr_class_types`<<BR>>`attr_instance_types`<<BR>>`attr_module_types`    35 || Which types support which attribute names    36 ==    37 `assigned_attrs`    38 || Which usage patterns involve attribute assignment    39 ==    40 `reference_assignments`    41 || Which accesses involve assignments    42 ==    43 `reference_invocations`    44 || Which accesses involve invocations    45 ==    46 `alias_index`    47 || Which names are aliases for other names, accesses or invocations    48 }}}    49     50 The objective of deduction is to combine these indexes to establish new    51 relationships between the different participants of these basic index    52 relationships.    53     54 === Building Indexes ===    55     56 The building of indexes from inspection data is approximately as follows:    57     58 {{{#!graphviz    59 //format=svg    60 //transform=notugly    61 digraph indexes {    62   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Indexes"];    63   edge [tooltip="Indexes"];    64   rankdir=LR;    65     66   all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"];    67   attr_usage [label="attr_usage\n(attribute usage)"];    68   location_index [label="location_index\n(usage by accessor)"];    69     70   all_attrs [label="all_class_attrs | all_instance_attrs | all_module attrs | (all attributes)",shape=record];    71   attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record];    72     73   attr_accessors [label="attr_accessors\n(accessors by access)"];    74   access_index [label="access_index\n(accessor locations by access location)"];    75     76   all_attr_access_modifiers [label="all_attr_access_modifiers\n(operations/modifiers by access)"];    77   reference_assignments [label="reference_assignments\n(assignment accesses)"];    78   reference_invocations [label="reference_invocations\n(invocation accesses)"];    79   assigned_attrs [label="assigned_attrs\n(assignment accesses by access)"];    80     81   all_aliased_names [label="all_aliased_names\n(accesses by alias)"];    82   alias_index [label="alias_index\n(access locations by accessor/alias location)"];    83     84   init_usage_index [label="init_usage_index",shape=ellipse];    85   init_attr_type_indexes [label="init_attr_type_indexes",shape=ellipse];    86   init_accessors [label="init_accessors",shape=ellipse];    87   init_accesses [label="init_accesses",shape=ellipse];    88   init_aliases [label="init_aliases",shape=ellipse];    89     90   all_attr_accesses -> init_usage_index;    91   attr_usage -> init_usage_index;    92   init_usage_index -> location_index;    93     94   all_attrs -> init_attr_type_indexes -> attr_types;    95     96   attr_accessors -> init_accessors -> access_index;    97     98   all_attr_access_modifiers -> init_accesses;    99   init_accesses -> reference_assignments;   100   init_accesses -> reference_invocations;   101   init_accesses -> assigned_attrs;   102    103   all_aliased_names -> init_aliases -> alias_index;   104 }   105 }}}   106    107 === Deriving Types from Indexes and Accesses ===   108    109 The indexes are employed in deduction approximately as follows:   110    111 {{{#!graphviz   112 //format=svg   113 //transform=notugly   114 digraph deduction {   115   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Deduction"];   116   edge [tooltip="Deduction"];   117   rankdir=LR;   118    119   all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"];   120   location_index [label="location_index\n(usage by accessor)"];   121    122   attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record];   123    124   all_initialised_names [label="all_initialised_names\n(types by accessor)"];   125    126   access_index [label="access_index\n(accessor locations by access location)"];   127    128   alias_index [label="alias_index\n(access locations by accessor/alias location)"];   129    130   record_types_for_usage [label="record_types_for_usage",shape=ellipse];   131   record_types_for_attribute [label="record_types_for_attribute",shape=ellipse];   132    133   accessor_types [label="accessor_class_types | accessor_instance_types | accessor_module_types | (types by accessor)",shape=record];   134   provider_types [label="provider_class_types | provider_instance_types | provider_module_types | (provider types by accessor)",shape=record];   135    136   location_index -> record_types_for_usage;   137   attr_types -> record_types_for_usage;   138   all_initialised_names -> record_types_for_usage;   139   access_index -> record_types_for_usage;   140   alias_index -> record_types_for_usage;    141    142   record_types_for_usage -> provider_types;   143   record_types_for_usage -> accessor_types;   144    145   attr_types -> record_types_for_attribute;   146   all_attr_accesses -> record_types_for_attribute;   147    148   record_types_for_attribute -> provider_types;   149   record_types_for_attribute -> accessor_types;   150 }   151 }}}   152    153 === Converting Usage to Types ===   154    155 A particularly important operation in the deduction process is that of   156 converting attribute usage information to a set of types supporting such   157 usage. Taking the mapping of attribute names to types, each attribute name   158 provided by a usage observation can be applied, progressively narrowing the   159 set of types available to support the complete attribute usage collection.   160    161 {{{#!graphviz   162 //format=svg   163 //transform=notugly   164 digraph usage_to_types {   165   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Usage to types"];   166   edge [tooltip="Usage to types"];   167   rankdir=LR;   168    169   usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   170    171   subgraph {   172     rank=same;   173     attr_a [label="attribute a",style=filled,fillcolor=gold];   174     attr_b [label="attribute b",style=filled,fillcolor=gold];   175     attr_c [label="attribute c",style=filled,fillcolor=gold];   176   }   177    178   index [label="types by attribute name",shape=ellipse];   179    180   subgraph {   181     rank=same;   182     type_P [label="type P"];   183     type_Q [label="type Q",style=filled,fillcolor=gold];   184     type_R [label="type R"];   185     type_S [label="type S"];   186   }   187    188   deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=darkorange];   189    190   usage -> attr_a;   191   usage -> attr_b;   192   usage -> attr_c;   193    194   attr_a -> attr_b -> attr_c [style=dashed,dir=none];   195   attr_a -> index -> type_P [style=dashed,dir=none];   196   type_P -> type_Q -> type_R -> type_S [style=dashed,dir=none];   197    198   attr_a -> type_P;   199   attr_a -> type_Q;   200   attr_b -> type_Q;   201   attr_b -> type_R;   202   attr_c -> type_Q;   203   attr_c -> type_S;   204    205   type_Q -> deduction;   206 }   207 }}}   208    209 The types supporting attribute usage are the attribute '''providers'''. Where   210 providers are classes, the affected accessors in the program may also be   211 instances, since instances also support access to attributes of the   212 instantiated class (and its ancestors).   213    214 Indexes mapping attributes to types must combine consideration of class and   215 instance attributes in order to correctly identify instance providers.   216 Consider the following example:   217    218 {{{#!graphviz   219 //format=svg   220 //transform=notugly   221 digraph instance_providers {   222   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Instance providers"];   223   edge [tooltip="Instance providers"];   224   rankdir=LR;   225    226   usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   227    228   subgraph {   229     rank=same;   230     combined [label="combined attributes",shape=ellipse];   231     attr_a [label="attribute a",style=filled,fillcolor=gold];   232     attr_c [label="attribute c",style=filled,fillcolor=gold];   233     attr_b [label="attribute b",style=filled,fillcolor=gold];   234   }   235    236   subgraph {   237     rank=same;   238     class_C [label="class C"];   239     attr_Ca [label="attribute C.a",style=filled,fillcolor=gold];   240     attr_Cc [label="attribute C.c",style=filled,fillcolor=gold];   241     instance_C [label="instance of C"];   242     attr_Ib [label="attribute (C).b",style=filled,fillcolor=gold];   243   }   244    245   type_C [label="type C",style=filled,fillcolor=darkorange];   246    247   combined -> attr_a -> attr_c -> attr_b [style=dashed,dir=none];   248   class_C -> attr_Ca -> attr_Cc [style=dashed,dir=none];   249   instance_C -> attr_Ib [style=dashed,dir=none];   250    251   usage -> attr_a -> attr_Ca;   252   usage -> attr_b -> attr_Ib;   253   usage -> attr_c -> attr_Cc;   254    255   attr_Ca -> type_C;   256   attr_Cc -> type_C;   257   attr_Ib -> type_C;   258 }   259 }}}   260    261 To recognise support by instance accessors for the usage pattern concerned,   262 attribute details must be obtained from classes as well as instances. Note   263 that the identified type cannot support such usage purely by providing class   264 or instance attributes alone.   265    266 === Attribute Assignments ===   267    268 Since attribute access operations may occur as part of assignments, the nature   269 of accesses is recorded during inspection. Combining the indexed information   270 for assignment accesses allows the deducer to determine the most pessimistic   271 effects on the program structure of such assignments.   272    273 Taking each attribute usage set employed by accessors involved in assignments,   274 the types are deduced for such accessors, and each individual attribute known   275 to be used in such assignments is then applied to the deduced types,   276 '''mutating''' them in such a way that deductions may no longer assume a fixed   277 identity for such attributes when obtained from these types.   278    279 {{{#!graphviz   280 //format=svg   281 //transform=notugly   282 digraph assignments {   283   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Attribute assignments"];   284   edge [tooltip="Attribute assignments"];   285   rankdir=LR;   286    287   subgraph {   288     rank=same;   289     usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   290     assigned [label="b is assigned",style=filled,fillcolor=darkorange];   291   }   292    293   deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=gold];   294    295   subgraph {   296     rank=same;   297     type_Q [label="type Q",style=filled,fillcolor=gold];   298     attr_Qa [label="attribute Q.a"];   299     attr_Qb [label="attribute Q.b\n(mutated)",style=filled,fillcolor=darkorange];   300     attr_Qc [label="attribute Q.c"];   301   }   302    303   type_Q -> attr_Qa -> attr_Qb -> attr_Qc [style=dashed,dir=none];   304   usage -> deduction -> type_Q;   305   assigned -> attr_Qb;   306 }   307 }}}   308    309 === Refining Type Deductions ===   310    311 In the context of a specific access, the types may be resolved further:   312    313  * Any name whose initialisation could be determined during inspection can be   314    associated with its initialised type   315  * Any name referring to a constant object can be associated with the type of   316    that object   317  * Usage of `self` in methods can result in only compatible class and instance   318    types being retained from the types obtained from usage deductions   319    320 === Reference Identification ===   321    322 The basic information about every accessor and accessed attribute in a program   323 can now be combined to produce specific '''references''' to identities   324 consistent with the inspection observations. Several different kinds of   325 accessors and access situations exist:   326    327  * Name-based accesses involving attribute usage   328  * Aliases to names, possibly accompanied by accesses   329  * Anonymous accesses involving individual attributes   330  * Constant or previously-identified names, possibly accompanied by accesses   331    332 === Aliases ===   333    334 Names initialised using other names or attribute accesses, or using the   335 invocation of either of these things, are known as '''aliases'''. Information   336 about aliases originates during inspection when such names are initialised   337 with expression results within the program. During the resolution activity   338 finalising the inspected details, this initialisation information is used to   339 define relationships between aliases and other names and accesses.   340    341 During deduction, attribute accesses are investigated and type information   342 computed for both attribute accesses and accessors. The relationships defined   343 between accesses and aliases can then be employed to propagate such deduced   344 information to the aliases and to define appropriate type characteristics for   345 them.   346    347 Where aliases are used as the basis of attribute accesses, this propagation   348 refines the previously deduced information about the aliases and may also   349 refine the details of the accesses with which the alias is involved.   350    351 === Invocation Target Suitability ===   352    353 Having identified accessed attributes, invocation information can be applied   354 in cases where such attributes immediately participate in an invocation,   355 comparing the specified argument details against the parameter details defined   356 for the identified attribute, which must be a callable object for this   357 technique to work. Where arguments do not appear to be suitable - either the   358 number of arguments is incorrect or any keyword argument refer to non-existent   359 parameters - the attribute providing the parameter details can be considered   360 unsuitable for the access.   361    362 === Classifying Accessors ===   363    364 Each accessor, being a particular version of a name, will have type   365 information associated with it as a consequence of the deduction process. Such   366 information takes the following forms:   367    368  * Provider types, indicating which types may provide the attributes used by   369    the accessor   370  * Accessor types, indicating which types will actually appear as the accessor   371    372 This information can be processed in a number of ways to produce the following:   373    374  * All types (from all kinds of type) of providers able to provide attributes   375    via the accessor   376  * All types (from all kinds of type) of accessors compatible with the   377    accessor   378  * The most general types of accessors compatible with the accessor   379    380 Where many types may be associated with an accessor, identifying the most   381 general types involves eliminating those which are derived from others. Given   382 that descendant types may not remove support for attributes provided by their   383 ancestors, then where an ancestor type has been identified as a possible   384 accessor, it should follow that all of its descendants may also have been   385 identified as possible accessors. Since these descendants should be   386 compatible, identifying them individually is unnecessary: merely specifying   387 that the common ancestor or ''any'' descendant could provide an accessor is   388 sufficient.   389    390 ==== Defining Guards ====   391    392 A '''guard''' is a constraint supported by a run-time test indicating the   393 compliance of an accessor with a particular type. Thus, where a single   394 accessor type has been identified, a guard may be established for an accessor   395 that tests the type of the accessor against a specific type. Where a single   396 ''general'' accessor type is identified, a guard is established that tests the   397 accessor against a "common" type: that is, an ancestor type with which other   398 descendant types may comply.   399    400 === Classifying Accesses ===   401    402 At the point of classifying accesses, information is available about the   403 following:   404    405  * The accessors potentially involved in each access   406  * The types of accessors and the types providing attributes via those   407    accessors   408  * Any guards applying to the accessors   409  * Whether an access is constrained by certain program characteristics and is   410    thus guaranteed to be as deduced   411  * The possible attributes referenced by the access   412    413 This information can be processed in a number of ways to produce the   414 following:   415    416  * The types of accessors, both general and specific, applying to each access   417  * The attributes that can be provided by each access, consolidating existing   418    referenced attribute details   419  * The general types providing the attributes   420    421 Since more than one accessor may be involved, information from all accessors   422 must be combined to determine whether guard information still applies to the   423 access, or whether it is possible for an accessor to be used that has an   424 uncertain type at run-time.   425    426 ==== Defining Tests ====   427    428 A '''test''' at the access level is a necessary check performed on an accessor   429 before an access to determine whether it belongs to a certain type or group of   430 types.   431    432 Where guards applying to accessors apply by extension to an access, it may not   433 be enough to assume that the the attributes exposed by the accessor are the   434 same as those considered acceptable through deduction. Therefore, attributes   435 are obtained for the access using the applicable guard types as accessors. If   436 this set of attributes does not include potentially accessible attributes   437 (perhaps because the guard types are broadly defined and do not support all   438 attribute usage), a test must be generated.   439    440 Where a single attribute provider can be identified, a test for a specific   441 type is generated. Where a single general type can be identified as a   442 provider, a test against a "common" type is generated. Tests involving the   443 built-in `object` are not generated since it is the root of all classes and   444 such tests would merely identify an accessor as a class. In all cases where a   445 single, specific type cannot be tested for, the general attribute validation   446 mechanism must be used instead.   447    448 == Preparing Access Descriptions ==   449    450 The accumulated deduced knowledge is directed towards producing access   451 descriptions or plans which characterise each access in terms of the   452 following:   453    454  * The initial accessor, being the starting object for attribute accesses,   455    unless a static accessor has been identified   456  * Details of any test required on the initial accessor   457  * Details of any type employed by the test   458  * Any identified static accessor (to be used as the initial accessor)   459  * Attributes needing to be traversed from the accessor that yield unambiguous   460    objects   461  * Access modes for each of the unambiguously-traversed attributes   462  * Remaining attributes needing to be tested and traversed (after having   463    traversed the above attributes)   464  * Details of the context   465  * Any test to apply to the context (to ensure its validity)   466  * The method of obtaining the first attribute   467  * The method of obtaining the final attribute   468  * Any identified static final attribute   469  * The kinds of objects providing the final attribute   470    471 === Characterising the Accessor ===   472    473 For a given access location, the referenced attribute details established   474 during deduction are used to determine...   475    476  * Whether the initial accessor is static, originating from a constant access   477    or involving an identifiable static object   478  * Whether the initial accessor is dynamic but has a known, deduced identity   479    480 Some useful information about the accessor and about the actual provider of   481 the first accessed attribute can be defined:   482    483 || || '''Class''' || '''Instance''' || '''Module''' ||   484 || '''Accessor''' || Class accessor || Instance accessor || Module accessor ||   485 || '''Provider''' || Class provider || Instance provider || ||   486    487 Depending on which of these criteria are satisfied, some properties of the   488 access operation can be established:   489    490  * Object-relative accesses occur with class accessors or module accessors or   491    when attributes are provided by instances   492  * Class-relative accesses occur with instance accessors when attributes are   493    provided by classes   494    495 Object-relative accesses merely involve obtaining attributes directly from   496 accessors. Class-relative accesses involve obtaining the class of each   497 accessor and then obtaining an attribute from that class.   498    499 === Defining the First Access Method ===   500    501 For dynamic or unidentified accessors, the above access properties define the   502 access method on the first attribute in the chain of attributes provided.   503 However, any redefinition of the accessor to a static accessor may influence   504 the first method. For static accessors, the first method is always   505 object-relative since classes and modules do not offer transparent mechanisms   506 to expose the attributes on their own classes.   507    508 Static and identified, dynamic accessors should already be known to support   509 the specified attributes. Other accessors require an access method to be used   510 that also checks whether the accessor supports a given attribute.   511    512 === Redefining the Accessor ===   513    514 With knowledge about the initial accessor, the attributes involved in the   515 access operation are then considered in the context of the accessor. For each   516 attribute name in the chain described for an access, an attempt is made to   517 obtain the details of the attribute of the given name from the accessor,   518 determining whether these details can be used as an accessor to obtain   519 subsequent attribute details.   520    521 Where more than one attribute identity is obtained, traversal is terminated:   522 all remaining attributes must be traversed at run-time. If an attribute   523 identified during traversal is static, the first access method changes   524 accordingly.   525    526 === Defining the Final Access Method ===   527    528 An access can also involve an assignment to the accessed attribute or the   529 invocation of the accessed attribute. Such operations change the nature of the   530 access method used on the final attribute in a chain of attribute names.   531    532 === Identifying Context Information ===   533    534 Final attribute accesses involving callables need to yield context information   535 that can subsequently be used to invoke those callables. Where the nature of   536 an accessed attribute is not known, a simplistic attempt can be made to look   537 up all attributes stored using the attribute name in the program.   538    539 Of particular interest are the following situations:   540    541  * Where class attributes are being accessed via instances, whether the   542    attributes are all methods that can be bound upon access   543  * Where class attributes may be accessed via instances, whether any   544    attributes could be methods   545    546 Such considerations dictate whether the context information originates from   547 the attribute or from the accessor and whether any run-time test is required   548 to determine this.   549    550 == Preparing Instruction Plans ==   551    552 Instruction plans are sequences of program operations, encoded in a   553 higher-level form than actual program instructions, that describe the steps   554 needed to access attributes. Such sequences are produced from the details   555 provided by individual access plans.   556    557 === Original Accessor ===   558    559 The expression providing the object from which the first attribute is obtained   560 (or the only attribute if only one is specified) is known as the original   561 accessor. Where this accessor can be identified, the expression describing it   562 in the program can potentially be replaced with a static reference, and   563 subsequent mentions of the accessor can potentially be replaced with such   564 references or names used as expressions.   565    566 || '''Access Plan Information''' || '''Original Accessor''' ||   567 || Static accessor identified || Identified accessor ||   568 || Named accessor access, not invocation || Indicated name ||   569 || Named accessor invocation, accessor known to provide the attribute || Indicated name ||   570 || Named accessor invocation, accessor not known to provide the attribute || Accessor expression ||   571 || Other accessors || Accessor expression ||   572    573 By using names or static references, the need to store the result of   574 evaluating an accessor expression is eliminated because such labels can be   575 inserted in the instruction sequence as required.   576    577 === First Access Operation ===   578    579 Although it may be possible to convert accesses into single instructions that   580 obtain attributes directly, many accesses will involve access operations that   581 must consult an accessor object and obtain an attribute from that object, at   582 least for the first attribute in a chain of attributes. This occurs when the   583 access plan indicates the following situations:   584    585  * Final method is an access (meaning that an attribute cannot be directly   586    obtained)   587  * Final method is an assignment (requiring the object whose attribute will be   588    updated)   589  * Attributes (identified or otherwise) need traversing   590    591 === Accessor Nature ===   592    593 Attribute assignments involve a single '''target accessor''' and potentially   594 many other accessors, depending on how many distinct expressions are evaluated   595 to yield the value to be set in the assignment. Such a target accessor will   596 usually be derived from the evaluation of an expression, and in some cases the   597 expression will be the result of an opaque operation such as the invocation of   598 a function. In such cases, the target accessor is stored in a temporary   599 variable for subsequent use in access operations.   600    601 === Context ===   602    603 === Access Tests ===   604    605 === Traversing Identified Attributes ===   606    607 === Traversing Unidentified Attributes ===   608    609 === Final Access Operation ===   610    611 === Context Testing ===   612    613 == Deduction Products ==   614    615 The deduction process should produce a complete catalogue of accessor and   616 access references that may then be consulted by the   617 [[../Translation|translation]] process needing to know the nature of any   618 operation within the program. Central to the translation process's   619 understanding of references is the '''attribute access plan''' for each   620 reference which characterises each access and provides the basis for the   621 formulation of the '''instruction plan''' used to replicate it in the final   622 program.