1 = Optimisation = 2 3 In the process of generating an output representation of a Lichen program, the 4 process of optimisation is concerned with positioning members within program 5 structures. Attributes are positioned in object structures, and '''object 6 tables''' are defined so that attributes can be located in such objects. 7 Similarly, parameters, whose positions are determined by their appearance in 8 function and other callable signatures, have positioning information defined 9 in '''parameter tables''' that can be used to position arguments in parameter 10 arrays when their names are used in argument lists. 11 12 Also performed during optimisation is the consolidation of constant 13 information, initially discussed [[../Inspection#Literals_and_Constants|in the 14 context of inspection]]. 15 16 <<TableOfContents(2,3)>> 17 18 == Populating Objects == 19 20 A program will have objects that are classes, modules and instances, and each 21 such object will provide a number of attributes. Each object's attributes are 22 stored in an array. For simplicity, each attribute having a given name will 23 always be positioned at the same place in any array provided by an object 24 featuring an attribute with that name. Even different object types will use 25 the same position. 26 27 Consider an attribute called `elephant` provided by a number of types: 28 29 || '''Type''' ||<-5> '''Attributes''' || 30 || class C || `__class__` || `ant` || `dog` ||<style="background-color: #ddd"> `elephant` || `cat` || 31 || class D || `__class__` || `ant` || `duck` ||<style="background-color: #ddd"> `elephant` || `horse` || 32 || class E || `__class__` || `ant` || `dog` || `giraffe` || `horse` || 33 || module M || `__class__` || `rhino` || `hippo` ||<style="background-color: #ddd"> `elephant` || `zebra` || 34 35 Where `elephant` is provided as an attribute, it will always appear in the 36 same position - in this case as the fourth attribute - in any object attribute 37 array. 38 39 Consequently, any operation involving an object of unknown identity that 40 employs `elephant` can employ the same position information to determine 41 whether the attribute is supported and to retrieve or modify its value. Where 42 an object does not support `elephant`, it may use the same attribute position 43 for another attribute. Determining whether objects support attributes is done 44 using tables, described below. 45 46 It should be noted that the positions of attributes in object structures are 47 the same as the positions of attribute identifiers in object tables. With 48 attribute positions allocated, table definition is trivial. 49 50 === Allocation Considerations === 51 52 Such common positioning of attributes demands a degree of coordination between 53 objects. If `elephant` is positioned in a given place in one object, then 54 given the constraint of it only ever being positioned in a single location for 55 all objects, it may not then be positioned in a different place in another 56 object. Thus, where two attributes co-exist in an object, their positions must 57 be different and will affect all objects supporting those attributes, 58 regardless of whether they support them both. For example: 59 60 || '''Type''' ||<-5> '''Attributes''' || 61 || class C || `__class__` ||<style="background-color: #ddd"> `ant` || `dog` ||<style="background-color: #ddd"> `elephant` || `cat` || 62 || class D || `__class__` ||<style="background-color: #ddd"> `ant` || `duck` ||<style="background-color: #ddd"> `elephant` || `horse` || 63 || class E || `__class__` ||<style="background-color: #ddd"> `ant` || `dog` || `giraffe` || `horse` || 64 || module M || `__class__` || `rhino` || `hippo` ||<style="background-color: #ddd"> `elephant` || `zebra` || 65 || module N || `__class__` || || ||<style="background-color: #ddd"> `elephant` || `zebra` || 66 67 Here, module N still positions `elephant` in the fourth position. If 68 `elephant` were moved to the second position, it would conflict with `ant` or 69 `rhino` in those objects supporting those attributes. Such coordination 70 therefore has an impact on allocation strategies. Care has to be taken to 71 allocate attributes in such a way that small objects (having few attributes) 72 do not have their attributes positioned far from the start of their attribute 73 arrays, because failing to do so effectively makes those objects large, 74 inefficiently-represented objects. 75 76 A reasonable solution is to consider the following when allocating attribute 77 positions: 78 79 * Attribute frequency (or ubiquity) 80 * Object size 81 * Object kind (whether the object is a class, module or instance) 82 83 More frequently-occurring (or ubiquitous) attributes may need to appear in 84 both large and small objects and should probably be allocated in lower 85 positions (`__class__` being an extreme example of needing to be allocated for 86 all objects). Attributes featured in small objects should also be given 87 priority for lower positions due to the desirability of keeping such objects 88 small. Meanwhile, classes and modules only appear once in a program, whereas 89 there may be countless instances allocated during the lifetime of a program's 90 execution. Therefore, attributes featured in instances should have priority 91 over other attributes for lower positions in object structures. 92 93 == Populating Signatures == 94 95 The positions of parameters in callable signatures are determined by the 96 definitions of those signatures in the source program. When callables are 97 invoked using an argument list, arguments that are not specified using 98 keywords are merely copied into the parameter array in the same position. 99 However, keyword arguments will need to have their positions looked up in the 100 appropriate parameter table for the callable involved. 101 102 Consequently, no allocation needs to be performed on the parameters 103 themselves: they have specific positions for each callable. However, just as 104 attribute names must yield positions when accessing attributes on objects, a 105 similar mechanism that takes parameter names and yields positions must be 106 provided. It is instead the positions of parameter details that must be 107 allocated in structures to be consulted as if parameter names were attribute 108 names, with attributes providing parameter position details. 109 110 Consider the following functions: 111 112 {{{#!python numbers=disable 113 def f(a, b, c, d): 114 ... 115 116 def g(p, q, r): 117 ... 118 119 def h(d, p, v): 120 ... 121 }}} 122 123 In order to find the position of each parameter using its name, the following 124 table could be provided: 125 126 || '''Table''' ||<-4> '''Parameters''' || 127 || function f || `a` at 1 || `b` at 2 || `c` at 3 || `d` at 4 || 128 || function g || `q` at 2 ||<style="background-color: #ddd"> `p` at 1 || `r` at 3 || || 129 || function h || `v` at 2 ||<style="background-color: #ddd"> `p` at 2 || || `d` at 1 || 130 131 Such a table behaves like an object structure (or an object table) with 132 parameters acting like attributes in such a structure. Here, the attributes 133 yield the positions of parameters within parameter arrays. Note how `p` always 134 resides in the same location but may yield different positions depending on 135 the actual callable involved (as is also observable with `d`). 136 137 === Allocation Considerations === 138 139 Parameter table allocation involves similar considerations to those 140 influencing object table allocation. In order to keep parameter tables small, 141 more frequently appearing parameters should be positioned earlier in arrays. A 142 specific consideration of importance is that of the number of tables 143 generated. There may be many callables with the same signature, and such 144 callables therefore do not need their own parameter tables since they will 145 merely be duplicates. An attempt is therefore made to identify distinct 146 signatures and to generate parameter tables only for these signatures, instead 147 of providing a one-to-one mapping between callables and tables. 148 149 == Populating Tables == 150 151 With names allocated to positions in each kind of table, the straightforward 152 task of generating such tables proceeds as follows. 153 154 === Object Tables and Attribute Codes === 155 156 Object tables consist of locations directly corresponding to structure member 157 locations. Each location in a table will correspond to a specific name, but 158 since potentially many names may have the same position, a table must provide 159 identifying details for the name that is actually supported. 160 161 In object tables, such identifying details take the form of '''attribute 162 codes'''. Each attribute name is mapped to a distinct identifier, and upon 163 consulting an object table, the lookup process must read the stored attribute 164 code and compare it to the code for the attribute it intends to access. If 165 these codes match, then it can be assumed that the object involved supports 166 the named attribute. Otherwise, a different attribute (or even no attribute at 167 all) resides at that location in the object structure, making the access 168 inappropriate. 169 170 A more comprehensive object table resembles the following: 171 172 || '''Type''' ||<-5> '''Attributes (Codes)''' || 173 || class C || `__class__` (1) || `ant` (2) || `dog` (4) || `elephant` (6) || `cat` (3) || 174 || class D || `__class__` (1) || `ant` (2) || `duck` (5) || `elephant` (6) || `horse` (9) || 175 || class E || `__class__` (1) || `ant` (2) || `dog` (4) || `giraffe` (7) || `horse` (9) || 176 || module M || `__class__` (1) || `rhino` (10) || `hippo` (8) || `elephant` (6) || `zebra` (11) || 177 || module N || `__class__` (1) || || || `elephant` (6) || `zebra` (11) || 178 179 The following attribute codes would be employed: 180 181 || '''Attribute Name''' || '''Attribute Code''' || 182 || `__class__` || 1 || 183 || `ant` || 2 || 184 || `cat` || 3 || 185 || `dog` || 4 || 186 || `duck` || 5 || 187 || `elephant` || 6 || 188 || `giraffe` || 7 || 189 || `hippo` || 8 || 190 || `horse` || 9 || 191 || `rhino` || 10 || 192 || `zebra` || 11 || 193 194 === Parameter Tables and Parameter Codes === 195 196 Parameter tables consist of locations yielding parameter position information. 197 Each location in a table will correspond to a particular name, but since 198 potentially many names may have the same position, a table must provide 199 identifying details for the name that is actually supported. 200 201 Just as with object tables, in parameter tables, such identifying details take 202 the form of '''parameter codes'''. Each parameter name is mapped to a distinct 203 identifier, and upon consulting a parameter table, the lookup process must 204 read the stored parameter code and compare it to the code for the parameter it 205 is attempting to position in a parameter list. If these codes match, then it 206 can be assumed that the signature supports the named parameter. Otherwise, a 207 different parameter (or even no parameter at all) resides at that location in 208 the parameter table, making any attempt to set such a parameter inappropriate. 209 210 Since parameter tables provide both identifying information and parameter 211 positions, a more comprehensive parameter table resembles the following: 212 213 || '''Table''' ||<-4> '''Parameters (Codes)''' || 214 || function f || `a` at 1 (1) || `b` at 2 (2) || `c` at 3 (3) || `d` at 4 (4) || 215 || function g || `q` at 2 (6) ||<style="background-color: #ddd"> `p` at 1 (5) || `r` at 3 (7) || || 216 || function h || `v` at 2 (8) ||<style="background-color: #ddd"> `p` at 2 (5) || || `d` at 1 (4) || 217 218 The following parameter codes would be employed: 219 220 || '''Parameter Name''' || '''Parameter Code''' || 221 || `a` || 1 || 222 || `b` || 2 || 223 || `c` || 3 || 224 || `d` || 4 || 225 || `p` || 5 || 226 || `q` || 6 || 227 || `r` || 7 || 228 || `v` || 8 || 229 230 == Consolidating Constants == 231 232 With knowledge about all constants used in a program, it becomes possible to 233 prepare a catalogue of distinct constants and to assign each of them a unique 234 name for convenient referencing in the generated program code. All constants 235 are inspected in turn and a content digest prepared for each of them, 236 establishing a mapping from values to such digests which act as global 237 identifiers. Since local names are associated with constants, a mapping is 238 also established from each local name to the corresponding global identifier. 239 240 Considering the [[../Inspection#Literals_and_Constants|previous example]], the 241 following mappings would be established, from values to global identifiers: 242 243 || '''Type''' || '''Encoding''' || '''Value''' || '''Global Identifier (Content Digest)''' || 244 || `__builtins__.str.string` || `iso-8859-15` || `'\xc6\xd8\xc5'` || `OeyoRfPI__XqwJcPgTDTItX9v__OM` || 245 || `__builtins__.int.int` || || `123` || `qPRjUheGUngBhhVaHVqYNbBu4m8` || 246 247 And from local names or identifiers to global identifiers: 248 249 || '''Identifier''' || '''Global Identifier (Content Digest)''' || 250 || `__main__.$c0` || `OeyoRfPI__XqwJcPgTDTItX9v__OM` || 251 || `__main__.$c1` || `qPRjUheGUngBhhVaHVqYNbBu4m8` || 252 253 == Optimisation Products == 254 255 The optimisation process should produce catalogues of attribute and parameter 256 codes plus positioning information for object tables, object structures and 257 parameter tables. It should also produce a catalogue of distinct constant 258 identifiers.