= Design Decisions =

The Lichen language design makes some choices that differ from those taken in Python's design. Many of these choices are motivated by the following criteria:

 * To simplify the language and to make what programs do easier to understand and to predict
 * To make analysis of programs easier, particularly [[../Deduction|deductions]] about the nature of the code
 * To simplify and otherwise reduce the [[../Representations|representations]] employed and the operations performed at run-time

Lichen is in many ways a restricted form of Python. In particular, restrictions on the attribute names supported by each object help to clearly define the object types in a program, allowing us to identify those objects when they are used. Consequently, optimisations that can be employed in a Lichen program become possible in situations where they would have been difficult or demanding to employ in a Python program.

Some design choices evoke memories of earlier forms of Python. Removing nested scopes simplifies the [[../Inspection|inspection]] of programs and run-time [[../Representations|representations]] and mechanisms. Other choices seek to remedy difficult or defective aspects of Python, notably the behaviour of Python's [[../Imports|import]] system.

<<TableOfContents(2,3)>>

== Attributes ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Objects have a fixed set of attribute names
|| Objects can gain and lose attributes at run-time
|| Having fixed sets of attributes helps identify object types
==
Instance attributes may not shadow class attributes
|| Instance attributes may shadow class attributes
|| Forbidding shadowing simplifies access operations
==
Attributes are simple members of object structures
|| Dynamic handling and computation of attributes is supported
|| Forbidding dynamic attributes simplifies access operations
}}}

=== Fixed Attribute Names ===

Attribute names are bound for classes through assignment in the class namespace, for modules in the module namespace, and for instances in methods through assignment to `self`. Class and instance attributes are propagated to descendant classes and instances of descendant classes respectively. Once bound, attributes can be modified, but new attributes cannot be bound by other means, such as the assignment of an attribute to an arbitrary object that would not already support such an attribute.

{{{#!python numbers=disable
class C:
    a = 123
    def __init__(self):
        self.x = 234

C.b = 456 # not allowed (b not bound in C)
C().y = 567 # not allowed (y not bound for C instances)
}}}

Permitting the addition of attributes to objects would then require that such addition attempts be associated with particular objects, leading to a potentially iterative process involving object type deduction and modification, also causing imprecise results.

=== No Shadowing ===

Instances may not define attributes that are provided by classes.

{{{#!python numbers=disable
class C:
    a = 123
    def shadow(self):
        self.a = 234 # not allowed (attribute shadows class attribute)
}}}

Permitting this would oblige instances to support attributes that, when missing, are provided by consulting their classes but, when not missing, may also be provided directly by the instances themselves.

=== No Dynamic Attributes ===

Instance attributes cannot be provided dynamically, such that any missing attribute would be supplied by a special method call to determine the attribute's presence and to retrieve its value.

{{{#!python numbers=disable
class C:
    def __getattr__(self, name): # not supported
        if name == "missing":
            return 123
}}}

Permitting this would require object types to potentially support any attribute, undermining attempts to use attributes to identify objects.

== Naming ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Names may be local, global or built-in: nested namespaces must be initialised explicitly
|| Names may also be non-local, permitting closures
|| Limited name scoping simplifies program inspection and run-time mechanisms
==
`self` is a reserved name and is optional in method parameter lists
|| `self` is a naming convention, but the first method parameter must always refer to the accessed object
|| Reserving `self` assists deduction; making it optional is a consequence of the method binding behaviour
==
Instance attributes can be initialised using `.name` parameter notation
|| [[https://stackoverflow.com/questions/1389180/automatically-initialize-instance-variables|Workarounds]] involving decorators and introspection are required for similar brevity
|| Initialiser notation eliminates duplication in program code and is convenient
}}}

=== Traditional Local, Global and Built-In Scopes Only ===

Namespaces reside within a hierarchy within modules: classes containing
classes or functions; functions containing other functions. Built-in names are exposed in all namespaces; global names are defined at the module level and are exposed in all namespaces within the module; locals are confined to the namespace in which they are defined.

However, locals are not inherited by namespaces from surrounding or enclosing namespaces.

{{{#!python numbers=disable
def f(x):
    def g(y):
        return x + y # not permitted: x is not inherited from f in Lichen (it is in Python)
    return g

def h(x):
    def i(y, x=x): # x is initialised but held in the namespace of i
        return x + y # succeeds: x is defined
    return i
}}}

Needing to access outer namespaces in order to access any referenced names complicates the way in which such dynamic namespaces would need to be managed. Although the default initialisation technique demonstrated above could be automated, explicit initialisation makes programs easier to follow and avoids mistakes involving globals having the same name.

=== Reserved Self ===

The `self` name can be omitted in method signatures, but in methods it is always initialised to the instance on which the method is operating.

{{{#!python numbers=disable
class C:
    def f(y): # y is not the instance
        self.x = y # self is the instance
}}}

The assumption in methods is that `self` must always refer to an instance of the containing class or of a descendant class. This means that `self` cannot be initialised to another kind of value, which Python permits through the explicit invocation of a method with the inclusion of the affected instance as the first argument. Consequently, `self` becomes optional in the signature because it is not assigned in the same way as the other parameters.
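
The Python behaviour that Lichen rules out can be illustrated with a short sketch. It assumes Python 3, where a method obtained via its class is a plain function; the class `Unrelated` is a hypothetical name introduced only for this illustration.

```python
class C:
    def f(self, y):
        # In Python 3, self is simply whatever was supplied as the first argument.
        return type(self).__name__, y

class Unrelated:
    # Hypothetical class with no relationship to C.
    pass

# Explicit invocation via the class: self refers to an object that is
# neither an instance of C nor of a descendant of C.
result = C.f(Unrelated(), 123)
```

Because `self` may refer to any kind of object here, nothing can be deduced about it from the class in which the method is defined; reserving `self` in Lichen removes this possibility.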

=== Instance Attribute Initialisers ===

In parameter lists, a special notation can be used to indicate that the given name is an instance attribute that will be assigned the argument value corresponding to the parameter concerned.

{{{#!python numbers=disable
class C:
    def f(self, .a, .b, c): # .a and .b indicate instance attributes
        self.c = c # a traditional assignment using a parameter
}}}

To use the notation, such dot-qualified parameters must appear only in the parameter lists of methods, not plain functions. The qualified parameters are represented as locals having the same name, and assignments to the corresponding instance attributes are inserted into the generated code.

{{{#!python numbers=disable
class C:
    def f1(self, .a, .b): # equivalent to f2, below
        pass

    def f2(self, a, b):
        self.a = a
        self.b = b

    def g(self, .a, .b, a): # not permitted: a appears twice
        pass
}}}

Naturally, `self` can also be omitted from such parameter lists.
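
For comparison, the kind of Python workaround mentioned in the table above might look like the following sketch. It is Python 3 only and not valid in Lichen, since it relies on decorators, introspection and catch-all parameters; the decorator name `init_attrs` is hypothetical and introduced only for this illustration.

```python
import functools
import inspect

def init_attrs(*names):
    """Hypothetical decorator: assign the named arguments to attributes of self."""
    def decorate(method):
        signature = inspect.signature(method)

        @functools.wraps(method)
        def wrapper(self, *args, **kw):
            # Match the supplied arguments to the parameter names of the method.
            bound = signature.bind(self, *args, **kw)
            bound.apply_defaults()
            for name in names:
                setattr(self, name, bound.arguments[name])
            return method(self, *args, **kw)
        return wrapper
    return decorate

class C:
    @init_attrs("a", "b")          # plays the role of .a and .b
    def f(self, a, b, c):
        self.c = c                 # a traditional assignment using a parameter

obj = C()
obj.f(1, 2, c=3)
```

The sketch needs run-time introspection on every call to achieve what the `.name` notation expresses declaratively in the parameter list.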

== Inheritance and Binding ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Class attributes are propagated to class hierarchy members during initialisation: rebinding class attributes does not affect descendant class attributes
|| Class attributes are propagated live to class hierarchy members and must be looked up by the run-time system if not provided by a given class
|| Initialisation-time propagation simplifies access operations and attribute table storage
==
Unbound methods must be bound using a special function taking an instance
|| Unbound methods may be called using an instance as first argument
|| Forbidding instances as first arguments simplifies the invocation mechanism
==
Functions assigned to class attributes do not become unbound methods
|| Functions assigned to class attributes become unbound methods
|| Removing method assignment simplifies deduction: methods are always defined in place
==
Base classes must be well-defined
|| Base classes may be expressions
|| Well-defined base classes are required to establish a well-defined hierarchy of types
==
Classes may not be defined in functions
|| Classes may be defined in any kind of namespace
|| Forbidding classes in functions prevents the definition of countless class variants that are awkward to analyse
}}}

=== Inherited Class Attributes ===

Class attributes that are changed for a class do not change for that class's descendants.

{{{#!python numbers=disable
class C:
    a = 123

class D(C):
    pass

C.a = 456
print D.a # remains 123 in Lichen, becomes 456 in Python
}}}

Permitting live propagation would require indirection for all class attributes, requiring them to be treated differently from other kinds of attributes. Meanwhile, class attribute rebinding and the accessing of inherited attributes changed in this way are relatively rare.
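
The difference between the two propagation strategies can be modelled in ordinary Python 3 using dictionaries as attribute tables. This is only a sketch of the idea, not a description of Lichen's actual run-time structures; the helper `propagate` and the variable names are assumptions made for illustration.

```python
from collections import ChainMap

def propagate(attrs, base=None):
    """Build an attribute table, copying inherited entries from the base once."""
    table = dict(base) if base is not None else {}
    table.update(attrs)
    return table

# Initialisation-time propagation (the Lichen approach, modelled with copies).
C = propagate({"a": 123})
D = propagate({}, base=C)          # D receives its own entry for a
C["a"] = 456                       # rebinding C.a afterwards
lichen_result = D["a"]             # D is unaffected

# Live propagation (the Python approach, modelled with chained lookup).
C_live = {"a": 123}
D_live = ChainMap({}, C_live)      # D consults C on every access
C_live["a"] = 456
python_result = D_live["a"]        # D observes the change
```

With copied tables, every access is a single lookup in the object's own table, whereas the chained form needs the indirection described above on each access.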

=== Unbound Methods ===

Methods are defined on classes but are only available via instances: they are instance methods. Consequently, acquiring a method directly from a class and then invoking it should fail because the method will be unbound: the "context" of the method is not an instance. Furthermore, the Python technique of supplying an instance as the first argument in an invocation to bind the method to an instance, thus setting the context of the method, is not supported. See [[#ReservedSelf|"Reserved Self"]] for more information.

{{{#!python numbers=disable
class C:
    def f(self, x):
        self.x = x
    def g(self):
        C.f(123) # not permitted: C is not an instance
        C.f(self, 123) # not permitted: self cannot be specified in the argument list
        get_using(C.f, self)(123) # binds C.f to self, then the result is called
}}}

Binding methods to instances occurs when acquiring methods via instances or explicitly using the `get_using` built-in. The built-in checks the compatibility of the supplied method and instance. If compatible, it provides the bound method as its result.

Normal functions are callable without any further preparation, whereas unbound methods need the binding step to be performed and are not immediately callable. Were functions to become unbound methods upon assignment to a class attribute, they would need to be invalidated by having the preparation mechanism enabled on them. However, this invalidation would only be relevant to the specific case of assigning functions to classes and this would need to be tested for. Given the added complications, such functionality is arguably not worth supporting.

=== Assigning Functions to Class Attributes ===

Functions can be assigned to class attributes but do not become unbound methods as a result.

{{{#!python numbers=disable
class C:
    def f(self): # will be replaced
        return 234

def f(self):
    return self

C.f = f # makes C.f a function, not a method
C().f() # not permitted: f requires an explicit argument
C().f(123) # permitted: f has merely been exposed via C.f
}}}

Methods are identified as such by their definition location, they contribute information about attributes to the class hierarchy, and they employ certain structure details at run-time to permit the binding of methods. Since functions can be defined in arbitrary locations, no class hierarchy information is available, and a function could combine `self` with a range of attributes that are not compatible with any class to which the function might be assigned.

=== Well-Defined Base Classes ===

Base classes must be clearly identifiable as well-defined classes. This facilitates the cataloguing of program objects and further analysis on them.

{{{#!python numbers=disable
class C:
    x = 123

def f():
    return C

class D(f()): # not permitted: f could return anything
    pass
}}}

If base class identification could only be done reliably at run-time, class relationship information would be very limited without running the program or performing costly and potentially unreliable analysis. Indeed, programs employing such dynamic base classes are arguably resistant to analysis, which is contrary to the goals of a language like Lichen.

=== Class Definitions and Functions ===

Classes may not be defined in functions because functions provide dynamic namespaces, but Lichen relies on a static namespace hierarchy in order to clearly identify the principal objects in a program. If classes could be defined in functions, despite seemingly providing the same class over and over again on every invocation, a family of classes would, in fact, be defined.

{{{#!python numbers=disable
def f(x):
    class C: # not permitted: this describes one of potentially many classes
        y = x
    return C
}}}

Moreover, issues of namespace nesting also arise, since the motivation for defining classes in functions would surely be to take advantage of local state to parameterise such classes.

== Modules and Packages ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Modules are independent: package hierarchies are not traversed when importing
|| Modules exist in hierarchical namespaces: package roots must be imported before importing specific submodules
|| Eliminating module traversal permits more precise imports and reduces superfluous code
==
Only specific names can be imported from a module or package using the `from` statement
|| Importing "all" from a package or module is permitted
|| Eliminating "all" imports simplifies the task of determining where names in use have come from
==
Modules must be specified using absolute names
|| Imports can be absolute or relative
|| Using only absolute names simplifies the import mechanism
==
Modules are imported independently and their dependencies subsequently resolved
|| Modules are imported as import statements are encountered
|| Statically-initialised objects can be used declaratively, although an initialisation order may still need establishing
}}}

=== Independent Modules ===

The inclusion of modules in a program affects only explicitly-named modules: they do not have relationships implied by their naming that would cause such related modules to be included in a program.

{{{#!python numbers=disable
from compiler import consts # defines consts
import compiler.ast # defines ast, not compiler

ast # is defined
compiler # is not defined
consts # is defined
}}}

Where modules should have relationships, they should be explicitly defined using `from` and `import` statements which target the exact modules required. In the above example, `compiler` is not imported merely because modules within the `compiler` package have been requested.

=== Specific Name Imports Only ===

Lichen, unlike Python, also does not support the special `__all__` module attribute.

{{{#!python numbers=disable
from compiler import * # not permitted
from compiler import ast, consts # permitted

interpreter # undefined in compiler (yet it might be thought to reside there) and in this module
}}}

The `__all__` attribute supports `from ... import *` statements in Python, but without identifying the module or package involved and then consulting `__all__` in that module or package to discover which names might be involved (which might require the inspection of yet other modules or packages), the names imported cannot be known. Consequently, some names used elsewhere in the module performing the import might be assumed to be imported names when, in fact, they are unknown in both the importing and imported modules. Such uncertainty hinders the inspection of individual modules.

=== Modules Imported Independently ===

When indicating an import using the `from` and `import` statements, the [[../Toolchain|toolchain]] does not attempt to immediately import other modules. Instead, the imports act as declarations of such other modules or names from other modules, resolved at a later stage. This permits mutual imports to a greater extent than in Python.

{{{#!python numbers=disable
# Module M
from N import C # in Python: fails attempting to re-enter N

class D(C):
    y = 456

# Module N
from M import D # in Python: causes M to be entered, fails when re-entered from N

class C:
    x = 123

class E(D):
    z = 789

# Main program
import N
}}}

Such flexibility is not usually needed, and circular importing usually indicates issues with program organisation. However, declarative imports can help to decouple modules and avoid combining import declaration and module initialisation order concerns.

== Syntax and Control-Flow ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
If expressions and comprehensions are not supported
|| If expressions and comprehensions are supported
|| Omitting such syntactic features simplifies program inspection and translation
==
The `with` statement is not supported
|| The `with` statement offers a mechanism for resource allocation and deallocation using context managers
|| This syntactic feature can be satisfactorily emulated using existing constructs
==
Generators are not supported
|| Generators are supported
|| Omitting generator support simplifies run-time mechanisms
==
Only positional and keyword arguments are supported
|| Argument unpacking (using `*` and `**`) is supported
|| Omitting unpacking simplifies generic invocation handling
==
All parameters must be specified
|| Catch-all parameters (`*` and `**`) are supported
|| Omitting catch-all parameter population simplifies generic invocation handling
}}}

=== No If Expressions or Comprehensions ===

In order to support the classic [[WikiPedia:?:|ternary operator]], a construct was [[https://www.python.org/dev/peps/pep-0308/|added]] to the Python syntax that needed to avoid problems with the existing grammar and notation.
Unfortunately, it reorders the components from the traditional form:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

# In C: condition ? true_result : false_result
true_result if condition else false_result

# In C: (condition ? inner_true_result : inner_false_result) ? true_result : false_result
true_result if (inner_true_result if condition else inner_false_result) else false_result
}}}

Since if expressions may participate within expressions, they cannot be rewritten as if statements. Nor can they be rewritten as logical operator chains in general.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

a = 0 if x else 1 # x being true yields 0

# Here, x being true causes (x and 0) to complete, yielding 0.
# But this causes ((x and 0) or 1) to complete, yielding 1.

a = x and 0 or 1 # not a valid rewrite: always yields 1
}}}

In any case, there would be more motivation to support the functionality if a better syntax could be adopted. However, if expressions are not particularly important in Python: despite enhancement requests over many years, programmers managed to live without them.

List comprehensions and generator expressions are more complicated but share some characteristics of if expressions: their syntax contradicts the typical conventions established by the rest of the Python language; they create implicit state that is perhaps most appropriately modelled by a separate function or similar object. Since Lichen does not support generators at all, it will obviously not support generator expressions.

Meanwhile, list comprehensions quickly encourage barely-readable programs:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

x = [0, [1, 2, 0], 0, 0, [0, 3, 4]]
a = [z for y in x if y for z in y if z]
}}}

Supporting the creation of temporary functions to produce list comprehensions, while also hiding temporary names from the enclosing scope, adds complexity to the toolchain for situations where programmers would arguably be better creating their own functions and thus writing more readable programs.

=== No With Statement ===

The [[https://docs.python.org/2.7/reference/compound_stmts.html#the-with-statement|with statement]] introduced the concept of [[https://docs.python.org/2.7/reference/datamodel.html#context-managers|context managers]] in Python 2.5, with such objects supporting a [[https://docs.python.org/2.7/library/stdtypes.html#typecontextmanager|programming interface]] that aims to formalise certain conventions around resource management. For example:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

with db.connect(connection_args) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query, args)
}}}

Although this makes for readable code, it must be supported by objects which define the `__enter__` and `__exit__` special methods. Here, the `connect` method invoked in the first `with` statement must return such an object; similarly, the `cursor` method must also provide an object with such characteristics.

However, the "pre-with" solution is as follows:

{{{#!python numbers=disable
connection = db.connect(connection_args)
try:
    cursor = connection.cursor()
    try:
        cursor.execute(query, args)
    finally:
        cursor.close()
finally:
    connection.close()
}}}

Although this seems less readable, its behaviour is more obvious because magic methods are not being called implicitly.
Moreover, any parameterisation of resource deallocation or closure can be done in the `finally` clauses, where such parameterisation would seem natural. With context managers, it must instead be specified through initialisation arguments that are then propagated to the magic methods, even though the contextual information is readily available in the place where the actual resource operations are being performed.

=== No Generators ===

[[https://www.python.org/dev/peps/pep-0255/|Generators]] were [[https://docs.python.org/release/2.3/whatsnew/section-generators.html|added]] to Python in the 2.2 release and became fully part of the language in the 2.3 release. They offer a convenient way of writing iterator-like objects, capturing execution state instead of obliging the programmer to manage such state explicitly.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

# Alternative form valid in Lichen.

class fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def next(self):
        result = self.b
        self.a, self.b = self.b, self.a + self.b
        return result

# Main program.

seq = fib()
i = 0
while i < 10:
    print seq.next()
    i += 1
}}}

However, generators make additional demands on the mechanisms provided to support program execution. The encapsulation of the above example generator in a separate class illustrates the need for state that persists outside the execution of the routine providing the generator's results. Generators may look like functions, but they do not necessarily behave like them, leading to potential misunderstandings about their operation even if the code is superficially tidy and concise.

=== Positional and Keyword Arguments Only ===

When invoking callables, only positional arguments and keyword arguments can be used. Python also supports `*` and `**` arguments which respectively unpack sequences and mappings into the argument list, filling the list with sequence items (using `*`) and keywords (using `**`).

{{{#!python numbers=disable
def f(a, b, c, d):
    return a + b + c + d

l = range(0, 4)
f(*l) # not permitted

m = {"c" : 10, "d" : 20}
f(2, 4, **m) # not permitted
}}}

While convenient, such "unpacking" arguments obscure the communication between callables and undermine the safety provided by function and method signatures. They also require run-time support for the unpacking operations.

=== Positional Parameters Only ===

Similarly, signatures may only contain named parameters that correspond to arguments. Python supports `*` and `**` in parameter lists, too, which respectively accumulate superfluous positional and keyword arguments.

{{{#!python numbers=disable
def f(a, b, *args, **kw): # not permitted
    return a + b + sum(args) + kw.get("c", 0) + kw.get("d", 0)

f(1, 2, 3, 4)
f(1, 2, c=3, d=4)
}}}

Such accumulation parameters can be useful for collecting arbitrary data and applying some of it within a callable. However, they can easily proliferate throughout a system and allow erroneous data to propagate far from its origin because such parameters permit the deferral of validation until the data needs to be accessed. Again, run-time support is required to marshal arguments into the appropriate parameter of this nature, but programmers could just write functions and methods that employ general sequence and mapping parameters explicitly instead.
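
A minimal sketch of that explicit alternative follows, written as ordinary Python that avoids `*` and `**` entirely. The parameter names `args` and `kw` are kept from the example above for comparison, and the assumption that list and dictionary arguments behave in Lichen as they do here is not confirmed by this page.

```python
def f(a, b, args, kw):
    # args is an ordinary sequence and kw an ordinary mapping, both passed
    # explicitly by the caller rather than accumulated by the invocation mechanism.
    total = a + b
    for value in args:
        total += value
    return total + kw.get("c", 0) + kw.get("d", 0)

first = f(1, 2, [3, 4], {})
second = f(1, 2, [], {"c": 3, "d": 4})
```

Every argument appears at the call site in the position given by the signature, so no marshalling of superfluous arguments is needed at run-time.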