= Design Decisions =

The Lichen language design involves some different choices to those taken in Python's design. Many of these choices are motivated by the following criteria:

 * To simplify the language and to make what programs do easier to understand and to predict
 * To make analysis of programs easier, particularly [[../Deduction|deductions]] about the nature of the code
 * To simplify and otherwise reduce the [[../Representations|representations]] employed and the operations performed at run-time

Lichen is in many ways a restricted form of Python. In particular, restrictions on the attribute names supported by each object help to clearly define the object types in a program, allowing us to identify those objects when they are used. Consequently, optimisations that can be employed in a Lichen program become possible in situations where they would have been difficult or demanding to employ in a Python program.

Some design choices evoke memories of earlier forms of Python. Removing nested scopes simplifies the [[../Inspection|inspection]] of programs and run-time [[../Representations|representations]] and mechanisms. Other choices seek to remedy difficult or defective aspects of Python, notably the behaviour of Python's [[../Imports|import]] system.

<<TableOfContents(2,3)>>

== Attributes ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Objects have a fixed set of attribute names
|| Objects can gain and lose attributes at run-time
|| Having fixed sets of attributes helps identify object types
==
Instance attributes may not shadow class attributes
|| Instance attributes may shadow class attributes
|| Forbidding shadowing simplifies access operations
==
Attributes are simple members of object structures
|| Dynamic handling and computation of attributes is supported
|| Forbidding dynamic attributes simplifies access operations
}}}

=== Fixed Attribute Names ===

Attribute names are bound for classes through assignment in the class namespace, for modules in the module namespace, and for instances in methods through assignment to `self`. Class and instance attributes are propagated to descendant classes and instances of descendant classes respectively. Once bound, attributes can be modified, but new attributes cannot be bound by other means, such as the assignment of an attribute to an arbitrary object that would not already support such an attribute.

{{{#!python numbers=disable
class C:
    a = 123
    def __init__(self):
        self.x = 234

C.b = 456 # not allowed (b not bound in C)
C().y = 567 # not allowed (y not bound for C instances)
}}}

Permitting the addition of attributes to objects would then require that such addition attempts be associated with particular objects, leading to a potentially iterative process involving object type deduction and modification, also causing imprecise results.

=== No Shadowing ===

Instances may not define attributes that are provided by classes.

{{{#!python numbers=disable
class C:
    a = 123
    def shadow(self):
        self.a = 234 # not allowed (attribute shadows class attribute)
}}}

Permitting this would oblige instances to support attributes that, when missing, are provided by consulting their classes but, when not missing, may also be provided directly by the instances themselves.

=== No Dynamic Attributes ===

Instance attributes cannot be provided dynamically, such that any missing attribute would be supplied by a special method call to determine the attribute's presence and to retrieve its value.

{{{#!python numbers=disable
class C:
    def __getattr__(self, name): # not supported
        if name == "missing":
            return 123
}}}

Permitting this would require object types to potentially support any attribute, undermining attempts to use attributes to identify objects.

== Naming ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Names may be local, global or built-in: nested namespaces must be initialised explicitly
|| Names may also be non-local, permitting closures
|| Limited name scoping simplifies program inspection and run-time mechanisms
==
`self` is a reserved name and is optional in method parameter lists
|| `self` is a naming convention, but the first method parameter must always refer to the accessed object
|| Reserving `self` assists deduction; making it optional is a consequence of the method binding behaviour
==
Instance attributes can be initialised using `.name` parameter notation
|| [[https://stackoverflow.com/questions/1389180/automatically-initialize-instance-variables|Workarounds]] involving decorators and introspection are required for similar brevity
|| Initialiser notation eliminates duplication in program code and is convenient
}}}

=== Traditional Local, Global and Built-In Scopes Only ===

Namespaces reside within a hierarchy within modules: classes containing
classes or functions; functions containing other functions. Built-in names are exposed in all namespaces, global names are defined at the module level and are exposed in all namespaces within the module, and locals are confined to the namespace in which they are defined.

However, locals are not inherited by namespaces from surrounding or enclosing namespaces.

{{{#!python numbers=disable
def f(x):
    def g(y):
        return x + y # not permitted: x is not inherited from f in Lichen (it is in Python)
    return g

def h(x):
    def i(y, x=x): # x is initialised but held in the namespace of i
        return x + y # succeeds: x is defined
    return i
}}}

Needing to access outer namespaces in order to access any referenced names complicates the way in which such dynamic namespaces would need to be managed. Although the default initialisation technique demonstrated above could be automated, explicit initialisation makes programs easier to follow and avoids mistakes involving globals having the same name.

=== Reserved Self ===

The `self` name can be omitted in method signatures, but in methods it is always initialised to the instance on which the method is operating.

{{{#!python numbers=disable
class C:
    def f(y): # y is not the instance
        self.x = y # self is the instance
}}}

The assumption in methods is that `self` must always refer to an instance of the containing class or of a descendant class. This means that `self` cannot be initialised to another kind of value, which Python permits through the explicit invocation of a method with the inclusion of the affected instance as the first argument. Consequently, `self` becomes optional in the signature because it is not assigned in the same way as the other parameters.
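
For comparison, the following sketch shows the Python behaviour that Lichen rules out. It is written for Python 3, which is an assumption (Python 2 is stricter and checks the type of the first argument given to an unbound method), and the class and method names are purely illustrative.

{{{#!python numbers=disable
# Plain Python 3, not Lichen.

class Counter:
    def describe(self):
        # Nothing here guarantees that self is a Counter.
        return type(self).__name__

c = Counter()
via_instance = c.describe()             # self is bound to c
via_class = Counter.describe(c)         # self is supplied explicitly
not_an_instance = Counter.describe(42)  # self is an int: accepted by Python 3
}}}

Because the last call is accepted, an analysis of such a Python program cannot assume that `self` in `describe` is a `Counter`. Reserving `self` removes that uncertainty.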

=== Instance Attribute Initialisers ===

In parameter lists, a special notation can be used to indicate that the given name is an instance attribute that will be assigned the argument value corresponding to the parameter concerned.

{{{#!python numbers=disable
class C:
    def f(self, .a, .b, c): # .a and .b indicate instance attributes
        self.c = c # a traditional assignment using a parameter
}}}

To use the notation, such dot-qualified parameters must appear only in the parameter lists of methods, not plain functions. The qualified parameters are represented as locals having the same name, and assignments to the corresponding instance attributes are inserted into the generated code.

{{{#!python numbers=disable
class C:
    def f1(self, .a, .b): # equivalent to f2, below
        pass

    def f2(self, a, b):
        self.a = a
        self.b = b

    def g(self, .a, .b, a): # not permitted: a appears twice
        pass
}}}

Naturally, `self`, being a reserved name in methods, can also be omitted from such parameter lists. Moreover, such initialising parameters can have default values.

{{{#!python numbers=disable
class C:
    def __init__(.a=1, .b=2):
        pass

c1 = C()
c2 = C(3, 4)
print c1.a, c1.b # 1 2
print c2.a, c2.b # 3 4
}}}

== Inheritance and Binding ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Class attributes are propagated to class hierarchy members during initialisation: rebinding class attributes does not affect descendant class attributes
|| Class attributes are propagated live to class hierarchy members and must be looked up by the run-time system if not provided by a given class
|| Initialisation-time propagation simplifies access operations and attribute table storage
==
Unbound methods must be bound using a special function taking an instance
|| Unbound methods may be called using an instance as first argument
|| Forbidding instances as first arguments simplifies the invocation mechanism
==
Functions assigned to class attributes do not become unbound methods
|| Functions assigned to class attributes become unbound methods
|| Removing method assignment simplifies deduction: methods are always defined in place
==
Base classes must be well-defined
|| Base classes may be expressions
|| Well-defined base classes are required to establish a well-defined hierarchy of types
==
Classes may not be defined in functions
|| Classes may be defined in any kind of namespace
|| Forbidding classes in functions prevents the definition of countless class variants that are awkward to analyse
}}}

=== Inherited Class Attributes ===

Class attributes that are changed for a class do not change for that class's descendants.

{{{#!python numbers=disable
class C:
    a = 123

class D(C):
    pass

C.a = 456
print D.a # remains 123 in Lichen, becomes 456 in Python
}}}

Permitting this requires indirection for all class attributes, requiring them to be treated differently from other kinds of attributes. Meanwhile, class attribute rebinding and the accessing of inherited attributes changed in this way are relatively rare.

=== Unbound Methods ===

Methods are defined on classes but are only available via instances: they are instance methods. Consequently, acquiring a method directly from a class and then invoking it should fail because the method will be unbound: the "context" of the method is not an instance. Furthermore, the Python technique of supplying an instance as the first argument in an invocation to bind the method to an instance, thus setting the context of the method, is not supported. See [[#ReservedSelf|"Reserved Self"]] for more information.

{{{#!python numbers=disable
class C:
    def f(self, x):
        self.x = x
    def g(self):
        C.f(123) # not permitted: C is not an instance
        C.f(self, 123) # not permitted: self cannot be specified in the argument list
        get_using(C.f, self)(123) # binds C.f to self, then the result is called
}}}

Binding methods to instances occurs when acquiring methods via instances or explicitly using the `get_using` built-in. The built-in checks the compatibility of the supplied method and instance. If compatible, it provides the bound method as its result.

Normal functions are callable without any further preparation, whereas unbound methods need the binding step to be performed and are not immediately callable. Were functions to become unbound methods upon assignment to a class attribute, they would need to be invalidated by having the preparation mechanism enabled on them.
However, this invalidation would only be relevant to the specific case of assigning functions to classes and this would need to be tested for. Given the added complications, such functionality is arguably not worth supporting.

=== Assigning Functions to Class Attributes ===

Functions can be assigned to class attributes but do not become unbound methods as a result.

{{{#!python numbers=disable
class C:
    def f(self): # will be replaced
        return 234

def f(self):
    return self

C.f = f # makes C.f a function, not a method
C().f() # not permitted: f requires an explicit argument
C().f(123) # permitted: f has merely been exposed via C.f
}}}

Methods are identified as such by their definition location, they contribute information about attributes to the class hierarchy, and they employ certain structure details at run-time to permit the binding of methods. Since functions can be defined in arbitrary locations, no class hierarchy information is available, and a function could combine `self` with a range of attributes that are not compatible with any class to which the function might be assigned.

=== Well-Defined Base Classes ===

Base classes must be clearly identifiable as well-defined classes. This facilitates the cataloguing of program objects and further analysis on them.

{{{#!python numbers=disable
class C:
    x = 123

def f():
    return C

class D(f()): # not permitted: f could return anything
    pass
}}}

If base class identification could only be done reliably at run-time, class relationship information would be very limited without running the program or performing costly and potentially unreliable analysis. Indeed, programs employing such dynamic base classes are arguably resistant to analysis, which is contrary to the goals of a language like Lichen.
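
The analysis problem can be seen in a small sketch of what Python permits. It is plain Python 3 rather than Lichen, and the names `A`, `B` and `choose` are hypothetical, introduced only for illustration.

{{{#!python numbers=disable
# Plain Python 3, not Lichen.

class A:
    x = 123

class B:
    x = 456

def choose(flag):
    # The base class depends on a value known only at run-time.
    if flag:
        return A
    return B

class D(choose(True)):
    pass

class E(choose(False)):
    pass
}}}

Here, `D` and `E` are declared identically apart from an argument value, yet they end up with different base classes and different values for `x`. Without evaluating `choose`, the class hierarchy cannot be established.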

=== Class Definitions and Functions ===

Classes may not be defined in functions because functions provide dynamic namespaces, but Lichen relies on a static namespace hierarchy in order to clearly identify the principal objects in a program. If classes could be defined in functions, despite seemingly providing the same class over and over again on every invocation, a family of classes would, in fact, be defined.

{{{#!python numbers=disable
def f(x):
    class C: # not permitted: this describes one of potentially many classes
        y = x
    return C
}}}

Moreover, issues of namespace nesting also arise, since the motivation for defining classes in functions would surely be to take advantage of local state to parameterise such classes.

== Modules and Packages ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Modules are independent: package hierarchies are not traversed when importing
|| Modules exist in hierarchical namespaces: package roots must be imported before importing specific submodules
|| Eliminating module traversal permits more precise imports and reduces superfluous code
==
Only specific names can be imported from a module or package using the `from` statement
|| Importing "all" from a package or module is permitted
|| Eliminating "all" imports simplifies the task of determining where names in use have come from
==
Modules must be specified using absolute names
|| Imports can be absolute or relative
|| Using only absolute names simplifies the import mechanism
==
Modules are imported independently and their dependencies subsequently resolved
|| Modules are imported as import statements are encountered
|| Statically-initialised objects can be used declaratively, although an initialisation order may still need establishing
}}}

=== Independent Modules ===

The inclusion of modules in a program affects only
explicitly-named modules: they do not have relationships implied by their naming that would cause such related modules to be included in a program.

{{{#!python numbers=disable
from compiler import consts # defines consts
import compiler.ast # defines ast, not compiler

ast # is defined
compiler # is not defined
consts # is defined
}}}

Where modules should have relationships, they should be explicitly defined using `from` and `import` statements which target the exact modules required. In the above example, `compiler` is not imported merely because modules within the `compiler` package have been requested.

=== Specific Name Imports Only ===

Lichen, unlike Python, also does not support the special `__all__` module attribute.

{{{#!python numbers=disable
from compiler import * # not permitted
from compiler import ast, consts # permitted

interpreter # undefined in compiler (yet it might be thought to reside there) and in this module
}}}

The `__all__` attribute supports `from ... import *` statements in Python, but without identifying the module or package involved and then consulting `__all__` in that module or package to discover which names might be involved (which might require the inspection of yet other modules or packages), the names imported cannot be known. Consequently, some names used elsewhere in the module performing the import might be assumed to be imported names when, in fact, they are unknown in both the importing and imported modules. Such uncertainty hinders the inspection of individual modules.

=== Modules Imported Independently ===

When indicating an import using the `from` and `import` statements, the [[../Toolchain|toolchain]] does not attempt to immediately import other modules. Instead, the imports act as declarations of such other modules or names from other modules, resolved at a later stage.
This permits mutual imports to a greater extent than in Python.

{{{#!python numbers=disable
# Module M
from N import C # in Python: fails attempting to re-enter N

class D(C):
    y = 456

# Module N
from M import D # in Python: causes M to be entered, fails when re-entered from N

class C:
    x = 123

class E(D):
    z = 789

# Main program
import N
}}}

Such flexibility is not usually needed, and circular importing usually indicates issues with program organisation. However, declarative imports can help to decouple modules and avoid combining import declaration and module initialisation order concerns.

== Syntax and Control-Flow ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
If expressions and comprehensions are not supported
|| If expressions and comprehensions are supported
|| Omitting such syntactic features simplifies program inspection and translation
==
The `with` statement is not supported
|| The `with` statement offers a mechanism for resource allocation and deallocation using context managers
|| This syntactic feature can be satisfactorily emulated using existing constructs
==
Generators are not supported
|| Generators are supported
|| Omitting generator support simplifies run-time mechanisms
==
Only positional and keyword arguments are supported
|| Argument unpacking (using `*` and `**`) is supported
|| Omitting unpacking simplifies generic invocation handling
==
All parameters must be specified
|| Catch-all parameters (`*` and `**`) are supported
|| Omitting catch-all parameter population simplifies generic invocation handling
}}}

=== No If Expressions or Comprehensions ===

In order to support the classic [[WikiPedia:?:|ternary operator]], a construct was [[https://www.python.org/dev/peps/pep-0308/|added]] to the Python syntax that needed to avoid
problems with the existing grammar and notation. Unfortunately, it reorders the components from the traditional form:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

# In C: condition ? true_result : false_result
true_result if condition else false_result

# In C: (condition ? inner_true_result : inner_false_result) ? true_result : false_result
true_result if (inner_true_result if condition else inner_false_result) else false_result
}}}

Since if expressions may participate within expressions, they cannot be rewritten as if statements. Nor can they be rewritten as logical operator chains in general.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

a = 0 if x else 1 # x being true yields 0

# Here, x being true causes (x and 0) to complete, yielding 0.
# But this causes ((x and 0) or 1) to complete, yielding 1.

a = x and 0 or 1 # not valid
}}}

In any case, there would be more motivation to support the functionality if a better syntax could be adopted. Moreover, if expressions are not particularly important in Python: despite enhancement requests over many years, everybody managed to live without them.

List comprehensions and generator expressions are more complicated but share some characteristics of if expressions: their syntax contradicts the typical conventions established by the rest of the Python language; they create implicit state that is perhaps most appropriately modelled by a separate function or similar object. Since Lichen does not support generators at all, it will obviously not support generator expressions.

Meanwhile, list comprehensions quickly encourage barely-readable programs:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

x = [0, [1, 2, 0], 0, 0, [0, 3, 4]]
a = [z for y in x if y for z in y if z]
}}}

Supporting the creation of temporary functions to produce list comprehensions, while also hiding temporary names from the enclosing scope, adds complexity to the toolchain for situations where programmers would arguably be better creating their own functions and thus writing more readable programs.

=== No With Statement ===

The [[https://docs.python.org/2.7/reference/compound_stmts.html#the-with-statement|with statement]] introduced the concept of [[https://docs.python.org/2.7/reference/datamodel.html#context-managers|context managers]] in Python 2.5, with such objects supporting a [[https://docs.python.org/2.7/library/stdtypes.html#typecontextmanager|programming interface]] that aims to formalise certain conventions around resource management. For example:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

with db.connect(connection_args) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query, args)
}}}

Although this makes for readable code, it must be supported by objects which define the `__enter__` and `__exit__` special methods. Here, the `connect` method invoked in the first `with` statement must return such an object; similarly, the `cursor` method must also provide an object with such characteristics.

However, the "pre-with" solution is as follows:

{{{#!python numbers=disable
connection = db.connect(connection_args)
try:
    cursor = connection.cursor()
    try:
        cursor.execute(query, args)
    finally:
        cursor.close()
finally:
    connection.close()
}}}

Although this seems less readable, its behaviour is more obvious because magic methods are not being called implicitly.
Moreover, any parameterisation of resource deallocation or closure can be done in the `finally` clauses, where such parameterisation would seem natural. The alternative is to specify it through context manager initialisation arguments, which must then be propagated to the magic methods so that they can take into account contextual information that is readily available where the actual resource operations are being performed.

=== No Generators ===

[[https://www.python.org/dev/peps/pep-0255/|Generators]] were [[https://docs.python.org/release/2.3/whatsnew/section-generators.html|added]] to Python in the 2.2 release and became fully part of the language in the 2.3 release. They offer a convenient way of writing iterator-like objects, capturing execution state instead of obliging the programmer to manage such state explicitly.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

# Alternative form valid in Lichen.

class fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def next(self):
        result = self.b
        self.a, self.b = self.b, self.a + self.b
        return result

# Main program.

seq = fib()
i = 0
while i < 10:
    print seq.next()
    i += 1
}}}

However, generators make additional demands on the mechanisms provided to support program execution. The encapsulation of the above example generator in a separate class illustrates the need for state that persists outside the execution of the routine providing the generator's results. Generators may look like functions, but they do not necessarily behave like them, leading to potential misunderstandings about their operation even if the code is superficially tidy and concise.
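
One such difference from ordinary functions can be sketched as follows. This is plain Python 3 rather than Lichen, and the names `events` and `numbers` are hypothetical, chosen only for illustration.

{{{#!python numbers=disable
# Plain Python 3, not Lichen.

events = []

def numbers():
    events.append("body entered")
    yield 1
    events.append("resumed")
    yield 2

gen = numbers()              # looks like a call, but no body code runs yet
before_first = list(events)
first = next(gen)            # runs the body up to the first yield
after_first = list(events)
second = next(gen)           # resumes after the first yield
after_second = list(events)
}}}

Calling `numbers` records nothing in `events`: the body only starts when the first value is requested, and its state is then kept between requests. The run-time system has to provide that suspended state, which the class-based form makes explicit as instance attributes.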

=== Positional and Keyword Arguments Only ===

When invoking callables, only positional arguments and keyword arguments can be used. Python also supports `*` and `**` arguments which respectively unpack sequences and mappings into the argument list, filling the list with sequence items (using `*`) and keywords (using `**`).

{{{#!python numbers=disable
def f(a, b, c, d):
    return a + b + c + d

l = range(0, 4)
f(*l) # not permitted

m = {"c" : 10, "d" : 20}
f(2, 4, **m) # not permitted
}}}

While convenient, such "unpacking" arguments obscure the communication between callables and undermine the safety provided by function and method signatures. They also require run-time support for the unpacking operations.

=== Positional Parameters Only ===

Similarly, signatures may only contain named parameters that correspond to arguments. Python supports `*` and `**` in parameter lists, too, which respectively accumulate superfluous positional and keyword arguments.

{{{#!python numbers=disable
def f(a, b, *args, **kw): # not permitted
    return a + b + sum(args) + kw.get("c", 0) + kw.get("d", 0)

f(1, 2, 3, 4)
f(1, 2, c=3, d=4)
}}}

Such accumulation parameters can be useful for collecting arbitrary data and applying some of it within a callable. However, they can easily proliferate throughout a system and allow erroneous data to propagate far from its origin because such parameters permit the deferral of validation until the data needs to be accessed. Again, run-time support is required to marshal arguments into the appropriate parameter of this nature, but programmers could just write functions and methods that employ general sequence and mapping parameters explicitly instead.
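
A minimal sketch of that explicit style follows, reworking the example above so that the extra values arrive as an ordinary sequence and an ordinary mapping. It has been checked only as plain Python; whether it compiles unchanged under Lichen is an assumption, and the parameter names `args` and `kw` are kept purely for comparison with the example above.

{{{#!python numbers=disable
def f(a, b, args, kw):
    # args is an ordinary sequence and kw an ordinary mapping,
    # both passed explicitly by the caller.
    total = a + b
    for value in args:
        total += value
    if "c" in kw:
        total += kw["c"]
    if "d" in kw:
        total += kw["d"]
    return total

r1 = f(1, 2, [3, 4], {})
r2 = f(1, 2, [], {"c": 3, "d": 4})
}}}

The call sites now show exactly what is being passed, and the signature fully describes what the function accepts.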