Lichen (file docs/wiki/Deduction at 9adc6e8a6fed)

     1 = Deducing Program Behaviour =     2      3 With information from [[../Inspection|inspection]] available, the isolated observations about operations in a program may now be combined with knowledge about the program's structure to produce deductions about the nature of each operation.     4      5 <<TableOfContents(2,3)>>     6      7 == The Deduction Process ==     8      9 The deduction process takes observations made during the [[../Inspection|inspection process]] and attempts to form deductions about the behaviour of the program primarily in terms of the nature of the attribute '''accesses''', with their corresponding '''accessors''', featuring in the program. Where attributes are used in conjunction with names, accessors are name versions; where attributes are used in conjunction with other expressions, accessors are '''anonymous'''.    10     11 === Indexes ===    12     13 For efficiency, indexes are defined to establish relationships between the following things:    14     15 {{{#!table    16 '''Indexes''' || '''Details'''    17 ==    18 `access_index` || Which accessors (name versions) are involved with access operations    19 ==    20 `location_index` || Which attribute usage patterns are supported by accessors (name versions)    21 ==    22 `attr_class_types`<<BR>>`attr_instance_types`<<BR>>`attr_module_types`    23 || Which types support which attribute names    24 ==    25 `assigned_attrs`    26 || Which usage patterns involve attribute assignment    27 ==    28 `reference_assignments`    29 || Which accesses involve assignments    30 ==    31 `reference_invocations`    32 || Which accesses involve invocations    33 ==    34 `alias_index`    35 || Which names are aliases for other names, accesses or invocations    36 }}}    37     38 The objective of deduction is to combine these indexes to establish new relationships between the different participants of these basic index relationships.    39     40 === Building Indexes ===    41     42 The building of indexes from inspection data is approximately as follows:    43     44 {{{#!graphviz    45 //format=svg    46 //transform=notugly    47 digraph indexes {    48   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Indexes"];    49   edge [tooltip="Indexes"];    50   rankdir=LR;    51     52   all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"];    53   attr_usage [label="attr_usage\n(attribute usage)"];    54   location_index [label="location_index\n(usage by accessor)"];    55     56   all_attrs [label="all_class_attrs | all_instance_attrs | all_module attrs | (all attributes)",shape=record];    57   attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record];    58     59   attr_accessors [label="attr_accessors\n(accessors by access)"];    60   access_index [label="access_index\n(accessor locations by access location)"];    61     62   all_attr_access_modifiers [label="all_attr_access_modifiers\n(operations/modifiers by access)"];    63   reference_assignments [label="reference_assignments\n(assignment accesses)"];    64   reference_invocations [label="reference_invocations\n(invocation accesses)"];    65   assigned_attrs [label="assigned_attrs\n(assignment accesses by access)"];    66     67   all_aliased_names [label="all_aliased_names\n(accesses by alias)"];    68   alias_index [label="alias_index\n(access locations by accessor/alias location)"];    69     70   init_usage_index [label="init_usage_index",shape=ellipse];    71   init_attr_type_indexes [label="init_attr_type_indexes",shape=ellipse];    72   init_accessors [label="init_accessors",shape=ellipse];    73   init_accesses [label="init_accesses",shape=ellipse];    74   init_aliases [label="init_aliases",shape=ellipse];    75     76   all_attr_accesses -> init_usage_index;    77   attr_usage -> init_usage_index;    78   init_usage_index -> location_index;    79     80   all_attrs -> init_attr_type_indexes -> attr_types;    81     82   attr_accessors -> init_accessors -> access_index;    83     84   all_attr_access_modifiers -> init_accesses;    85   init_accesses -> reference_assignments;    86   init_accesses -> reference_invocations;    87   init_accesses -> assigned_attrs;    88     89   all_aliased_names -> init_aliases -> alias_index;    90 }    91 }}}    92     93 === Deriving Types from Indexes and Accesses ===    94     95 The indexes are employed in deduction approximately as follows:    96     97 {{{#!graphviz    98 //format=svg    99 //transform=notugly   100 digraph deduction {   101   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Deduction"];   102   edge [tooltip="Deduction"];   103   rankdir=LR;   104    105   all_attr_accesses [label="all_attr_accesses\n(anonymous accesses)"];   106   location_index [label="location_index\n(usage by accessor)"];   107    108   attr_types [label="attr_class_types | attr_instance_types | attr_module_types | (types by attribute name)",shape=record];   109    110   all_initialised_names [label="all_initialised_names\n(types by accessor)"];   111    112   access_index [label="access_index\n(accessor locations by access location)"];   113    114   alias_index [label="alias_index\n(access locations by accessor/alias location)"];   115    116   record_types_for_usage [label="record_types_for_usage",shape=ellipse];   117   record_types_for_attribute [label="record_types_for_attribute",shape=ellipse];   118    119   accessor_types [label="accessor_class_types | accessor_instance_types | accessor_module_types | (types by accessor)",shape=record];   120   provider_types [label="provider_class_types | provider_instance_types | provider_module_types | (provider types by accessor)",shape=record];   121    122   location_index -> record_types_for_usage;   123   attr_types -> record_types_for_usage;   124   all_initialised_names -> record_types_for_usage;   125   access_index -> record_types_for_usage;   126   alias_index -> record_types_for_usage;    127    128   record_types_for_usage -> provider_types;   129   record_types_for_usage -> accessor_types;   130    131   attr_types -> record_types_for_attribute;   132   all_attr_accesses -> record_types_for_attribute;   133    134   record_types_for_attribute -> provider_types;   135   record_types_for_attribute -> accessor_types;   136 }   137 }}}   138    139 === Converting Usage to Types ===   140    141 A particularly important operation in the deduction process is that of converting attribute usage information to a set of types supporting such usage. Taking the mapping of attribute names to types, each attribute name provided by a usage observation can be applied, progressively narrowing the set of types available to support the complete attribute usage collection.   142    143 {{{#!graphviz   144 //format=svg   145 //transform=notugly   146 digraph usage_to_types {   147   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Usage to types"];   148   edge [tooltip="Usage to types"];   149   rankdir=LR;   150    151   usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   152    153   subgraph {   154     rank=same;   155     attr_a [label="attribute a",style=filled,fillcolor=gold];   156     attr_b [label="attribute b",style=filled,fillcolor=gold];   157     attr_c [label="attribute c",style=filled,fillcolor=gold];   158   }   159    160   index [label="types by attribute name",shape=ellipse];   161    162   subgraph {   163     rank=same;   164     type_P [label="type P"];   165     type_Q [label="type Q",style=filled,fillcolor=gold];   166     type_R [label="type R"];   167     type_S [label="type S"];   168   }   169    170   deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=darkorange];   171    172   usage -> attr_a;   173   usage -> attr_b;   174   usage -> attr_c;   175    176   attr_a -> attr_b -> attr_c [style=dashed,dir=none];   177   attr_a -> index -> type_P [style=dashed,dir=none];   178   type_P -> type_Q -> type_R -> type_S [style=dashed,dir=none];   179    180   attr_a -> type_P;   181   attr_a -> type_Q;   182   attr_b -> type_Q;   183   attr_b -> type_R;   184   attr_c -> type_Q;   185   attr_c -> type_S;   186    187   type_Q -> deduction;   188 }   189 }}}   190    191 The types supporting attribute usage are the attribute '''providers'''. Where providers are classes, the affected accessors in the program may also be instances, since instances also support access to attributes of the instantiated class (and its ancestors).   192    193 Indexes mapping attributes to types must combine consideration of class and instance attributes in order to correctly identify instance providers. Consider the following example:   194    195 {{{#!graphviz   196 //format=svg   197 //transform=notugly   198 digraph instance_providers {   199   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Instance providers"];   200   edge [tooltip="Instance providers"];   201   rankdir=LR;   202    203   usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   204    205   subgraph {   206     rank=same;   207     combined [label="combined attributes",shape=ellipse];   208     attr_a [label="attribute a",style=filled,fillcolor=gold];   209     attr_c [label="attribute c",style=filled,fillcolor=gold];   210     attr_b [label="attribute b",style=filled,fillcolor=gold];   211   }   212    213   subgraph {   214     rank=same;   215     class_C [label="class C"];   216     attr_Ca [label="attribute C.a",style=filled,fillcolor=gold];   217     attr_Cc [label="attribute C.c",style=filled,fillcolor=gold];   218     instance_C [label="instance of C"];   219     attr_Ib [label="attribute (C).b",style=filled,fillcolor=gold];   220   }   221    222   type_C [label="type C",style=filled,fillcolor=darkorange];   223    224   combined -> attr_a -> attr_c -> attr_b [style=dashed,dir=none];   225   class_C -> attr_Ca -> attr_Cc [style=dashed,dir=none];   226   instance_C -> attr_Ib [style=dashed,dir=none];   227    228   usage -> attr_a -> attr_Ca;   229   usage -> attr_b -> attr_Ib;   230   usage -> attr_c -> attr_Cc;   231    232   attr_Ca -> type_C;   233   attr_Cc -> type_C;   234   attr_Ib -> type_C;   235 }   236 }}}   237    238 To recognise support by instance accessors for the usage pattern concerned, attribute details must be obtained from classes as well as instances. Note that the identified type cannot support such usage purely by providing class or instance attributes alone.   239    240 === Attribute Assignments ===   241    242 Since attribute access operations may occur as part of assignments, the nature of accesses is recorded during inspection. Combining the indexed information for assignment accesses allows the deducer to determine the most pessimistic effects on the program structure of such assignments.   243    244 Taking each attribute usage set employed by accessors involved in assignments, the types are deduced for such accessors, and each individual attribute known to be used in such assignments is then applied to the deduced types, '''mutating''' them in such a way that deductions may no longer assume a fixed identity for such attributes when obtained from these types.   245    246 {{{#!graphviz   247 //format=svg   248 //transform=notugly   249 digraph assignments {   250   node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Attribute assignments"];   251   edge [tooltip="Attribute assignments"];   252   rankdir=LR;   253    254   subgraph {   255     rank=same;   256     usage [label="(a, b, c)",style=filled,fillcolor=darkorange];   257     assigned [label="b is assigned",style=filled,fillcolor=darkorange];   258   }   259    260   deduction [label="(a, b, c) needs type Q",style=filled,fillcolor=gold];   261    262   subgraph {   263     rank=same;   264     type_Q [label="type Q",style=filled,fillcolor=gold];   265     attr_Qa [label="attribute Q.a"];   266     attr_Qb [label="attribute Q.b\n(mutated)",style=filled,fillcolor=darkorange];   267     attr_Qc [label="attribute Q.c"];   268   }   269    270   type_Q -> attr_Qa -> attr_Qb -> attr_Qc [style=dashed,dir=none];   271   usage -> deduction -> type_Q;   272   assigned -> attr_Qb;   273 }   274 }}}   275    276 === Refining Type Deductions ===   277    278 In the context of a specific access, the types may be resolved further:   279    280  * Any name whose initialisation could be determined during inspection can be associated with its initialised type   281  * Any name referring to a constant object can be associated with the type of that object   282  * Usage of `self` in methods can result in only compatible class and instance types being retained from the types obtained from usage deductions   283    284 === Reference Identification ===   285    286 The basic information about every accessor and accessed attribute in a program can now be combined to produce specific '''references''' to identities consistent with the inspection observations. Several different kinds of accessors and access situations exist:   287    288  * Name-based accesses involving attribute usage   289  * Aliases to names, possibly accompanied by accesses   290  * Anonymous accesses involving individual attributes   291  * Constant or previously-identified names, possibly accompanied by accesses   292    293 === Aliases ===   294    295 Names initialised using other names or attribute accesses, or using the invocation of either of these things, are known as '''aliases'''. Information about aliases originates during inspection when such names are initialised with expression results within the program. During the resolution activity finalising the inspected details, this initialisation information is used to define relationships between aliases and other names and accesses.   296    297 During deduction, attribute accesses are investigated and type information computed for both attribute accesses and accessors. The relationships defined between accesses and aliases can then be employed to propagate such deduced information to the aliases and to define appropriate type characteristics for them.   298    299 Where aliases are used as the basis of attribute accesses, this propagation refines the previously deduced information about the aliases and may also refine the details of the accesses with which the alias is involved.   300    301 === Invocation Target Suitability ===   302    303 Having identified accessed attributes, invocation information can be applied in cases where such attributes immediately participate in an invocation, comparing the specified argument details against the parameter details defined for the identified attribute, which must be a callable object for this technique to work. Where arguments do not appear to be suitable - either the number of arguments is incorrect or any keyword argument refer to non-existent parameters - the attribute providing the parameter details can be considered unsuitable for the access.   304    305 === Classifying Accessors ===   306    307 Each accessor, being a particular version of a name, will have type information associated with it as a consequence of the deduction process. Such information takes the following forms:   308    309  * Provider types, indicating which types may provide the attributes used by the accessor   310  * Accessor types, indicating which types will actually appear as the accessor   311    312 This information can be processed in a number of ways to produce the following:   313    314  * All types (from all kinds of type) of providers able to provide attributes via the accessor   315  * All types (from all kinds of type) of accessors compatible with the accessor   316  * The most general types of accessors compatible with the accessor   317    318 Where many types may be associated with an accessor, identifying the most general types involves eliminating those which are derived from others. Given that descendant types may not remove support for attributes provided by their ancestors, then where an ancestor type has been identified as a possible accessor, it should follow that all of its descendants may also have been identified as possible accessors. Since these descendants should be compatible, identifying them individually is unnecessary: merely specifying that the common ancestor or ''any'' descendant could provide an accessor is sufficient.   319    320 ==== Defining Guards ====   321    322 A '''guard''' is a constraint supported by a run-time test indicating the compliance of an accessor with a particular type. Thus, where a single accessor type has been identified, a guard may be established for an accessor that tests the type of the accessor against a specific type. Where a single ''general'' accessor type is identified, a guard is established that tests the accessor against a "common" type: that is, an ancestor type with which other descendant types may comply.   323    324 === Classifying Accesses ===   325    326 At the point of classifying accesses, information is available about the following:   327    328  * The accessors potentially involved in each access   329  * The types of accessors and the types providing attributes via those accessors   330  * Any guards applying to the accessors   331  * Whether an access is constrained by certain program characteristics and is thus guaranteed to be as deduced   332  * The possible attributes referenced by the access   333    334 This information can be processed in a number of ways to produce the following:   335    336  * The types of accessors, both general and specific, applying to each access   337  * The attributes that can be provided by each access, consolidating existing referenced attribute details   338  * The general types providing the attributes   339    340 Since more than one accessor may be involved, information from all accessors must be combined to determine whether guard information still applies to the access, or whether it is possible for an accessor to be used that has an uncertain type at run-time.   341    342 ==== Defining Tests ====   343    344 A '''test''' at the access level is a necessary check performed on an accessor before an access to determine whether it belongs to a certain type or group of types.   345    346 Where guards applying to accessors apply by extension to an access, it may not be enough to assume that the the attributes exposed by the accessor are the same as those considered acceptable through deduction. Therefore, attributes are obtained for the access using the applicable guard types as accessors. If this set of attributes does not include potentially accessible attributes (perhaps because the guard types are broadly defined and do not support all attribute usage), a test must be generated.   347    348 Where a single attribute provider can be identified, a test for a specific type is generated. Where a single general type can be identified as a provider, a test against a "common" type is generated. Tests involving the built-in `object` are not generated since it is the root of all classes and such tests would merely identify an accessor as a class. In all cases where a single, specific type cannot be tested for, the general attribute validation mechanism must be used instead.   349    350 == Preparing Access Descriptions ==   351    352 The accumulated deduced knowledge is directed towards producing access descriptions or plans which characterise each access in terms of the following:   353    354  * The initial accessor, being the starting object for attribute accesses, unless a static accessor has been identified   355  * Details of any test required on the initial accessor   356  * Details of any type employed by the test   357  * Any identified static accessor (to be used as the initial accessor)   358  * Attributes needing to be traversed from the accessor that yield unambiguous objects   359  * Access modes for each of the unambiguously-traversed attributes   360  * Remaining attributes needing to be tested and traversed (after having traversed the above attributes)   361  * Details of the context   362  * Any test to apply to the context (to ensure its validity)   363  * The method of obtaining the first attribute   364  * The method of obtaining the final attribute   365  * Any identified static final attribute   366  * The kinds of objects providing the final attribute   367    368 === Characterising the Accessor ===   369    370 For a given access location, the referenced attribute details established during deduction are used to determine...   371    372  * Whether the initial accessor is static, originating from a constant access or involving an identifiable static object   373  * Whether the initial accessor is dynamic but has a known, deduced identity   374    375 Some useful information about the accessor and about the actual provider of the first accessed attribute can be defined:   376    377 || || '''Class''' || '''Instance''' || '''Module''' ||   378 || '''Accessor''' || Class accessor || Instance accessor || Module accessor ||   379 || '''Provider''' || Class provider || Instance provider || ||   380    381 Depending on which of these criteria are satisfied, some properties of the access operation can be established:   382    383  * Object-relative accesses occur with class accessors or module accessors or when attributes are provided by instances   384  * Class-relative accesses occur with instance accessors when attributes are provided by classes   385    386 Object-relative accesses merely involve obtaining attributes directly from accessors. Class-relative accesses involve obtaining the class of each accessor and then obtaining an attribute from that class.   387    388 === Defining the First Access Method ===   389    390 For dynamic or unidentified accessors, the above access properties define the access method on the first attribute in the chain of attributes provided. However, any redefinition of the accessor to a static accessor may influence the first method. For static accessors, the first method is always object-relative since classes and modules do not offer transparent mechanisms to expose the attributes on their own classes.   391    392 Static and identified, dynamic accessors should already be known to support the specified attributes. Other accessors require an access method to be used that also checks whether the accessor supports a given attribute.   393    394 === Redefining the Accessor ===   395    396 With knowledge about the initial accessor, the attributes involved in the access operation are then considered in the context of the accessor. For each attribute name in the chain described for an access, an attempt is made to obtain the details of the attribute of the given name from the accessor, determining whether these details can be used as an accessor to obtain subsequent attribute details.   397    398 Where more than one attribute identity is obtained, traversal is terminated: all remaining attributes must be traversed at run-time. If an attribute identified during traversal is static, the first access method changes accordingly.   399    400 === Defining the Final Access Method ===   401    402 An access can also involve an assignment to the accessed attribute or the invocation of the accessed attribute. Such operations change the nature of the access method used on the final attribute in a chain of attribute names.   403    404 === Identifying Context Information ===   405    406 Final attribute accesses involving callables need to yield context information that can subsequently be used to invoke those callables. Where the nature of an accessed attribute is not known, a simplistic attempt can be made to look up all attributes stored using the attribute name in the program.   407    408 Of particular interest are the following situations:   409    410  * Where class attributes are being accessed via instances, whether the attributes are all methods that can be bound upon access   411  * Where class attributes may be accessed via instances, whether any attributes could be methods   412    413 Such considerations dictate whether the context information originates from the attribute or from the accessor and whether any run-time test is required to determine this.   414    415 == Preparing Instruction Plans ==   416    417 Instruction plans are sequences of program operations, encoded in a higher-level form than actual program instructions, that describe the steps needed to access attributes. Such sequences are produced from the details provided by individual access plans.   418    419 === Original Accessor ===   420    421 The expression providing the object from which the first attribute is obtained (or the only attribute if only one is specified) is known as the original accessor. Where this accessor can be identified, the expression describing it in the program can potentially be replaced with a static reference, and subsequent mentions of the accessor can potentially be replaced with such references or names used as expressions.   422    423 || '''Access Plan Information''' || '''Original Accessor''' ||   424 || Static accessor identified || Identified accessor ||   425 || Named accessor access, not invocation || Indicated name ||   426 || Named accessor invocation, accessor known to provide the attribute || Indicated name ||   427 || Named accessor invocation, accessor not known to provide the attribute || Accessor expression ||   428 || Other accessors || Accessor expression ||   429    430 By using names or static references, the need to store the result of evaluating an accessor expression is eliminated because such labels can be inserted in the instruction sequence as required.   431    432 === First Access Operation ===   433    434 Although it may be possible to convert accesses into single instructions that obtain attributes directly, many accesses will involve access operations that must consult an accessor object and obtain an attribute from that object, at least for the first attribute in a chain of attributes. This occurs when the access plan indicates the following situations:   435    436  * Final method is an access (meaning that an attribute cannot be directly obtained)   437  * Final method is an assignment (requiring the object whose attribute will be updated)   438  * Attributes (identified or otherwise) need traversing   439    440 === Accessor Nature ===   441    442 Attribute assignments involve a single '''target accessor''' and potentially many other accessors, depending on how many distinct expressions are evaluated to yield the value to be set in the assignment. Such a target accessor will usually be derived from the evaluation of an expression, and in some cases the expression will be the result of an opaque operation such as the invocation of a function. In such cases, the target accessor is stored in a temporary variable for subsequent use in access operations.   443    444 === Context ===   445    446 === Access Tests ===   447    448 === Traversing Identified Attributes ===   449    450 === Traversing Unidentified Attributes ===   451    452 === Final Access Operation ===   453    454 === Context Testing ===   455    456 == Deduction Products ==   457    458 The deduction process should produce a complete catalogue of accessor and access references that may then be consulted by the [[../Translation|translation]] process needing to know the nature of any operation within the program. Central to the translation process's understanding of references is the '''attribute access plan''' for each reference which characterises each access and provides the basis for the formulation of the '''instruction plan''' used to replicate it in the final program.