= Design Decisions =

The Lichen language design involves some different choices to those taken in Python's design. Many of these choices are motivated by the following criteria:

 * To simplify the language and to make what programs do easier to understand and to predict
 * To make analysis of programs easier, particularly [[../Deduction|deductions]] about the nature of the code
 * To simplify and otherwise reduce the [[../Representations|representations]] employed and the operations performed at run-time

Lichen is in many ways a restricted form of Python. In particular, restrictions on the attribute names supported by each object help to clearly define the object types in a program, allowing us to identify those objects when they are used. Consequently, optimisations that can be employed in a Lichen program become possible in situations where they would have been difficult or demanding to employ in a Python program.

Some design choices evoke memories of earlier forms of Python. Removing nested scopes simplifies the [[../Inspection|inspection]] of programs and run-time [[../Representations|representations]] and mechanisms. Other choices seek to remedy difficult or defective aspects of Python, notably the behaviour of Python's [[../Imports|import]] system.

<<TableOfContents(2,3)>>

== Attributes ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Objects have a fixed set of attribute names
|| Objects can gain and lose attributes at run-time
|| Having fixed sets of attributes helps identify object types
==
Instance attributes may not shadow class attributes
|| Instance attributes may shadow class attributes
|| Forbidding shadowing simplifies access operations
==
Attributes are simple members of object structures
|| Dynamic handling and computation of attributes is supported
|| Forbidding dynamic attributes simplifies access operations
}}}

=== Fixed Attribute Names ===

Attribute names are bound for classes through assignment in the class namespace, for modules in the module namespace, and for instances in methods through assignment to `self`. Class and instance attributes are propagated to descendant classes and instances of descendant classes respectively. Once bound, attributes can be modified, but new attributes cannot be bound by other means, such as the assignment of an attribute to an arbitrary object that would not already support such an attribute.

{{{#!python numbers=disable
class C:
    a = 123
    def __init__(self):
        self.x = 234

C.b = 456 # not allowed (b not bound in C)
C().y = 567 # not allowed (y not bound for C instances)
}}}

Permitting the addition of attributes to objects would then require that such addition attempts be associated with particular objects, leading to a potentially iterative process involving object type deduction and modification, also causing imprecise results.

=== No Shadowing ===

Instances may not define attributes that are provided by classes.

{{{#!python numbers=disable
class C:
    a = 123
    def shadow(self):
        self.a = 234 # not allowed (attribute shadows class attribute)
}}}

Permitting this would oblige instances to support attributes that, when missing, are provided by consulting their classes but, when not missing, may also be provided directly by the instances themselves.

=== No Dynamic Attributes ===

Instance attributes cannot be provided dynamically, such that any missing attribute would be supplied by a special method call to determine the attribute's presence and to retrieve its value.

{{{#!python numbers=disable
class C:
    def __getattr__(self, name): # not supported
        if name == "missing":
            return 123
}}}

Permitting this would require object types to potentially support any attribute, undermining attempts to use attributes to identify objects.

== Naming ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Names may be local, global or built-in: nested namespaces must be initialised explicitly
|| Names may also be non-local, permitting closures
|| Limited name scoping simplifies program inspection and run-time mechanisms
==
`self` is a reserved name and is optional in method parameter lists
|| `self` is a naming convention, but the first method parameter must always refer to the accessed object
|| Reserving `self` assists deduction; making it optional is a consequence of the method binding behaviour
}}}

=== Traditional Local, Global and Built-In Scopes Only ===

Namespaces reside within a hierarchy within modules: classes containing classes or functions; functions containing other functions. Built-in names are exposed in all namespaces, global names are defined at the module level and are exposed in all namespaces within the module, and locals are confined to the namespace in which they are defined.

However, locals are not inherited by namespaces from surrounding or enclosing namespaces.

{{{#!python numbers=disable
def f(x):
    def g(y):
        return x + y # not permitted: x is not inherited from f in Lichen (it is in Python)
    return g

def h(x):
    def i(y, x=x): # x is initialised but held in the namespace of i
        return x + y # succeeds: x is defined
    return i
}}}

Needing to access outer namespaces in order to access any referenced names complicates the way in which such dynamic namespaces would need to be managed. Although the default initialisation technique demonstrated above could be automated, explicit initialisation makes programs easier to follow and avoids mistakes involving globals having the same name.

=== Reserved Self ===

The `self` name can be omitted in method signatures, but in methods it is always initialised to the instance on which the method is operating.

{{{#!python numbers=disable
class C:
    def f(y): # y is not the instance
        self.x = y # self is the instance
}}}

The assumption in methods is that `self` must always be referring to an instance of the containing class or of a descendant class. This means that `self` cannot be initialised to another kind of value, which Python permits through the explicit invocation of a method with the inclusion of the affected instance as the first argument. Consequently, `self` becomes optional in the signature because it is not assigned in the same way as the other parameters.
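
The Python behaviour that Lichen rules out can be seen in a minimal sketch, written for Python 3 rather than Lichen. The class `C` and the method `describe` are hypothetical names chosen only for illustration. Because Python fills the first parameter like any other, a method obtained from the class can be handed a value that is not an instance at all:

```python
# Plain Python 3, not Lichen: "self" is only as trustworthy as the caller.

class C:
    def describe(self):
        # Report the type of whatever was supplied as self.
        return type(self).__name__

# Acquiring the method via an instance binds self to that instance.
via_instance = C().describe()

# Acquiring it via the class yields a plain function in Python 3, so the
# caller may supply any value in place of an instance.
via_class = C.describe(123)
```

Here `via_instance` is `"C"` whereas `via_class` is `"int"`, so an analysis of `describe` cannot assume anything about `self` in Python. Python 2 is stricter about calling unbound methods, but the first parameter is still assigned like any other.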

== Inheritance and Binding ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Class attributes are propagated to class hierarchy members during initialisation: rebinding class attributes does not affect descendant class attributes
|| Class attributes are propagated live to class hierarchy members and must be looked up by the run-time system if not provided by a given class
|| Initialisation-time propagation simplifies access operations and attribute table storage
==
Unbound methods must be bound using a special function taking an instance
|| Unbound methods may be called using an instance as first argument
|| Forbidding instances as first arguments simplifies the invocation mechanism
==
Functions assigned to class attributes do not become unbound methods
|| Functions assigned to class attributes become unbound methods
|| Removing method assignment simplifies deduction: methods are always defined in place
==
Base classes must be well-defined
|| Base classes may be expressions
|| Well-defined base classes are required to establish a well-defined hierarchy of types
==
Classes may not be defined in functions
|| Classes may be defined in any kind of namespace
|| Forbidding classes in functions prevents the definition of countless class variants that are awkward to analyse
}}}

=== Inherited Class Attributes ===

Class attributes that are changed for a class do not change for that class's descendants.

{{{#!python numbers=disable
class C:
    a = 123

class D(C):
    pass

C.a = 456
print D.a # remains 123 in Lichen, becomes 456 in Python
}}}

Permitting this requires indirection for all class attributes, requiring them to be treated differently from other kinds of attributes. Meanwhile, class attribute rebinding and the accessing of inherited attributes changed in this way is relatively rare.
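
The observable difference between the two propagation strategies can be emulated in ordinary Python 3. The following is only a sketch: the `snapshot` decorator is a hypothetical helper, not part of Lichen or of Python, and it says nothing about how the Lichen toolchain actually lays out attribute tables. It copies inherited attributes into a class once, when the class is created, so that later rebinding in a base class is not seen:

```python
def snapshot(cls):
    """Hypothetical helper: copy inherited class attributes into cls once."""
    for base in cls.__mro__[1:]:          # nearest base classes first
        for name, value in vars(base).items():
            # Skip special names and anything already provided by cls or
            # by a nearer base class.
            if not name.startswith("__") and name not in vars(cls):
                setattr(cls, name, value)
    return cls

class C:
    a = 123

@snapshot
class D(C):      # attributes copied when D is created
    pass

class E(C):      # ordinary Python: attributes looked up live
    pass

C.a = 456
copied = D.a     # still the value held when D was created
live = E.a       # follows the rebinding in C
```

After the rebinding, `copied` is still 123 whereas `live` is 456. The copied form needs no search through base classes on access, which is the simplification referred to above.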

=== Unbound Methods ===

Methods are defined on classes but are only available via instances: they are instance methods. Consequently, acquiring a method directly from a class and then invoking it should fail because the method will be unbound: the "context" of the method is not an instance. Furthermore, the Python technique of supplying an instance as the first argument in an invocation to bind the method to an instance, thus setting the context of the method, is not supported. See [[#ReservedSelf|"Reserved Self"]] for more information.

{{{#!python numbers=disable
class C:
    def f(self, x):
        self.x = x
    def g(self):
        C.f(123) # not permitted: C is not an instance
        C.f(self, 123) # not permitted: self cannot be specified in the argument list
        get_using(C.f, self)(123) # binds C.f to self, then the result is called
}}}

Binding methods to instances occurs when acquiring methods via instances or explicitly using the `get_using` built-in. The built-in checks the compatibility of the supplied method and instance. If compatible, it provides the bound method as its result.

Normal functions are callable without any further preparation, whereas unbound methods need the binding step to be performed and are not immediately callable. Were functions to become unbound methods upon assignment to a class attribute, they would need to be invalidated by having the preparation mechanism enabled on them. However, this invalidation would only be relevant to the specific case of assigning functions to classes and this would need to be tested for. Given the added complications, such functionality is arguably not worth supporting.

=== Assigning Functions to Class Attributes ===

Functions can be assigned to class attributes but do not become unbound methods as a result.

{{{#!python numbers=disable
class C:
    def f(self): # will be replaced
        return 234

def f(self):
    return self

C.f = f # makes C.f a function, not a method
C().f() # not permitted: f requires an explicit argument
C().f(123) # permitted: f has merely been exposed via C.f
}}}

Methods are identified as such by their definition location, they contribute information about attributes to the class hierarchy, and they employ certain structure details at run-time to permit the binding of methods. Since functions can be defined in arbitrary locations, no class hierarchy information is available, and a function could combine `self` with a range of attributes that are not compatible with any class to which the function might be assigned.

=== Well-Defined Base Classes ===

Base classes must be clearly identifiable as well-defined classes. This facilitates the cataloguing of program objects and further analysis on them.

{{{#!python numbers=disable
class C:
    x = 123

def f():
    return C

class D(f()): # not permitted: f could return anything
    pass
}}}

If base class identification could only be done reliably at run-time, class relationship information would be very limited without running the program or performing costly and potentially unreliable analysis. Indeed, programs employing such dynamic base classes are arguably resistant to analysis, which is contrary to the goals of a language like Lichen.

=== Class Definitions and Functions ===

Classes may not be defined in functions because functions provide dynamic namespaces, but Lichen relies on a static namespace hierarchy in order to clearly identify the principal objects in a program. If classes could be defined in functions, despite seemingly providing the same class over and over again on every invocation, a family of classes would, in fact, be defined.

{{{#!python numbers=disable
def f(x):
    class C: # not permitted: this describes one of potentially many classes
        y = x
    return C
}}}

Moreover, issues of namespace nesting also arise, since the motivation for defining classes in functions would surely be to take advantage of local state to parameterise such classes.

== Modules and Packages ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Modules are independent: package hierarchies are not traversed when importing
|| Modules exist in hierarchical namespaces: package roots must be imported before importing specific submodules
|| Eliminating module traversal permits more precise imports and reduces superfluous code
==
Only specific names can be imported from a module or package using the `from` statement
|| Importing "all" from a package or module is permitted
|| Eliminating "all" imports simplifies the task of determining where names in use have come from
==
Modules must be specified using absolute names
|| Imports can be absolute or relative
|| Using only absolute names simplifies the import mechanism
==
Modules are imported independently and their dependencies subsequently resolved
|| Modules are imported as import statements are encountered
|| Statically-initialised objects can be used declaratively, although an initialisation order may still need establishing
}}}

=== Independent Modules ===

The inclusion of modules in a program affects only explicitly-named modules: they do not have relationships implied by their naming that would cause such related modules to be included in a program.

{{{#!python numbers=disable
from compiler import consts # defines consts
import compiler.ast # defines ast, not compiler

ast # is defined
compiler # is not defined
consts # is defined
}}}

Where modules should have relationships, they should be explicitly defined using `from` and `import` statements which target the exact modules required. In the above example, `compiler` is not imported merely because modules within the `compiler` package have been requested.

=== Specific Name Imports Only ===

Lichen does not support `from ... import *` and, unlike Python, also does not support the special `__all__` module attribute.

{{{#!python numbers=disable
from compiler import * # not permitted
from compiler import ast, consts # permitted

interpreter # undefined in compiler (yet it might be thought to reside there) and in this module
}}}

The `__all__` attribute supports `from ... import *` statements in Python, but without identifying the module or package involved and then consulting `__all__` in that module or package to discover which names might be involved (which might require the inspection of yet other modules or packages), the names imported cannot be known. Consequently, some names used elsewhere in the module performing the import might be assumed to be imported names when, in fact, they are unknown in both the importing and imported modules. Such uncertainty hinders the inspection of individual modules.

=== Modules Imported Independently ===

When indicating an import using the `from` and `import` statements, the [[../Toolchain|toolchain]] does not attempt to immediately import other modules. Instead, the imports act as declarations of such other modules or names from other modules, resolved at a later stage. This permits mutual imports to a greater extent than in Python.

{{{#!python numbers=disable
# Module M
from N import C # in Python: fails attempting to re-enter N

class D(C):
    y = 456

# Module N
from M import D # in Python: causes M to be entered, fails when re-entered from N

class C:
    x = 123

class E(D):
    z = 789

# Main program
import N
}}}

Such flexibility is not usually needed, and circular importing usually indicates issues with program organisation. However, declarative imports can help to decouple modules and avoid combining import declaration and module initialisation order concerns.

== Syntax and Control-Flow ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
If expressions and comprehensions are not supported
|| If expressions and comprehensions are supported
|| Omitting such syntactic features simplifies program inspection and translation
==
The `with` statement is not supported
|| The `with` statement offers a mechanism for resource allocation and deallocation using context managers
|| This syntactic feature can be satisfactorily emulated using existing constructs
==
Generators are not supported
|| Generators are supported
|| Omitting generator support simplifies run-time mechanisms
==
Only positional and keyword arguments are supported
|| Argument unpacking (using `*` and `**`) is supported
|| Omitting unpacking simplifies generic invocation handling
==
All parameters must be specified
|| Catch-all parameters (`*` and `**`) are supported
|| Omitting catch-all parameter population simplifies generic invocation handling
}}}

=== No If Expressions or Comprehensions ===

In order to support the classic [[WikiPedia:?:|ternary operator]], a construct was [[https://www.python.org/dev/peps/pep-0308/|added]] to the Python syntax that needed to avoid problems with the existing grammar and notation.
Unfortunately, it reorders the components from the traditional form:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

# In C: condition ? true_result : false_result
true_result if condition else false_result

# In C: (condition ? inner_true_result : inner_false_result) ? true_result : false_result
true_result if (inner_true_result if condition else inner_false_result) else false_result
}}}

Since if expressions may participate within expressions, they cannot be rewritten as if statements. Nor can they be rewritten as logical operator chains in general.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

a = 0 if x else 1 # x being true yields 0

# Here, x being true causes (x and 0) to complete, yielding 0.
# But this causes ((x and 0) or 1) to complete, yielding 1.

a = x and 0 or 1 # not valid
}}}

In any case, there would be more motivation to support the functionality if a better syntax could be adopted instead. However, if expressions are not particularly important in Python: despite enhancement requests over many years, everybody managed to live without them until they were introduced.

List comprehensions and generator expressions are more complicated but share some characteristics of if expressions: their syntax contradicts the typical conventions established by the rest of the Python language; they create implicit state that is perhaps most appropriately modelled by a separate function or similar object. Since Lichen does not support generators at all, it will obviously not support generator expressions.

Meanwhile, list comprehensions quickly encourage barely-readable programs:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

x = [0, [1, 2, 0], 0, 0, [0, 3, 4]]
a = [z for y in x if y for z in y if z]
}}}

Supporting the creation of temporary functions to produce list comprehensions, while also hiding temporary names from the enclosing scope, adds complexity to the toolchain for situations where programmers would arguably be better creating their own functions and thus writing more readable programs.

=== No With Statement ===

The [[https://docs.python.org/2.7/reference/compound_stmts.html#the-with-statement|with statement]] introduced the concept of [[https://docs.python.org/2.7/reference/datamodel.html#context-managers|context managers]] in Python 2.5, with such objects supporting a [[https://docs.python.org/2.7/library/stdtypes.html#typecontextmanager|programming interface]] that aims to formalise certain conventions around resource management. For example:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

with db.connect(connection_args) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query, args)
}}}

Although this makes for readable code, it must be supported by objects which define the `__enter__` and `__exit__` special methods. Here, the `connect` method invoked in the first `with` statement must return such an object; similarly, the `cursor` method must also provide an object with such characteristics.

However, the "pre-with" solution is as follows:

{{{#!python numbers=disable
connection = db.connect(connection_args)
try:
    cursor = connection.cursor()
    try:
        cursor.execute(query, args)
    finally:
        cursor.close()
finally:
    connection.close()
}}}

Although this seems less readable, its behaviour is more obvious because magic methods are not being called implicitly.
Moreover, any parameterisation of the acts of resource deallocation or closure can be done in the `finally` clauses, where such parameterisation would seem natural. The alternative is to specify it through context manager initialisation arguments that must then be propagated to the magic methods, even though the relevant contextual information is readily available in the place where the actual resource operations are being performed.

=== No Generators ===

[[https://www.python.org/dev/peps/pep-0255/|Generators]] were [[https://docs.python.org/release/2.3/whatsnew/section-generators.html|added]] to Python in the 2.2 release and became fully part of the language in the 2.3 release. They offer a convenient way of writing iterator-like objects, capturing execution state instead of obliging the programmer to manage such state explicitly.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

# Alternative form valid in Lichen.

class fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def next(self):
        result = self.b
        self.a, self.b = self.b, self.a + self.b
        return result

# Main program.

seq = fib()
i = 0
while i < 10:
    print seq.next()
    i += 1
}}}

However, generators make additional demands on the mechanisms provided to support program execution. The encapsulation of the above example generator in a separate class illustrates the need for state that persists outside the execution of the routine providing the generator's results. Generators may look like functions, but they do not necessarily behave like them, leading to potential misunderstandings about their operation even if the code is superficially tidy and concise.
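
One such misunderstanding can be shown with a small sketch in Python 3, which is not valid Lichen. The names `numbers` and `log` are hypothetical and exist only for this illustration. Calling a generator function does not execute any of its body; execution starts only when the first value is requested:

```python
# Plain Python 3, not Lichen.

log = []

def numbers():
    log.append("started")   # runs only when the first value is requested
    yield 1
    yield 2

gen = numbers()             # produces a generator object; the body has not run
log_after_call = list(log)

first = next(gen)           # the body now runs up to the first yield
log_after_next = list(log)
```

Here `log_after_call` is empty and `log_after_next` contains `"started"`: the call alone does nothing visible, unlike a call to a normal function written with the same `def` syntax.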

=== Positional and Keyword Arguments Only ===

When invoking callables, only positional arguments and keyword arguments can be used. Python also supports `*` and `**` arguments which respectively unpack sequences and mappings into the argument list, filling the list with sequence items (using `*`) and keywords (using `**`).

{{{#!python numbers=disable
def f(a, b, c, d):
    return a + b + c + d

l = range(0, 4)
f(*l) # not permitted

m = {"c" : 10, "d" : 20}
f(2, 4, **m) # not permitted
}}}

While convenient, such "unpacking" arguments obscure the communication between callables and undermine the safety provided by function and method signatures. They also require run-time support for the unpacking operations.

=== Positional Parameters Only ===

Similarly, signatures may only contain named parameters that correspond to arguments. Python supports `*` and `**` in parameter lists, too, which respectively accumulate superfluous positional and keyword arguments.

{{{#!python numbers=disable
def f(a, b, *args, **kw): # not permitted
    return a + b + sum(args) + kw.get("c", 0) + kw.get("d", 0)

f(1, 2, 3, 4)
f(1, 2, c=3, d=4)
}}}

Such accumulation parameters can be useful for collecting arbitrary data and applying some of it within a callable. However, they can easily proliferate throughout a system and allow erroneous data to propagate far from its origin because such parameters permit the deferral of validation until the data needs to be accessed. Again, run-time support is required to marshal arguments into the appropriate parameter of this nature, but programmers could just write functions and methods that employ general sequence and mapping parameters explicitly instead.
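
The explicit alternative might look like the following sketch, which reworks the example above using an ordinary sequence parameter and an ordinary mapping parameter. The parameter names `extras` and `options` are hypothetical. The code is plain Python, and it is an assumption that the same form carries over to Lichen unchanged:

```python
def f(a, b, extras, options):
    # extras is an ordinary sequence and options an ordinary mapping:
    # the caller builds both explicitly, so nothing is accumulated implicitly.
    total = a + b
    for value in extras:
        total += value
    total += options.get("c", 0) + options.get("d", 0)
    return total

positional_total = f(1, 2, [3, 4], {})
keyword_total = f(1, 2, [], {"c": 3, "d": 4})
```

Both calls yield 10, matching the sums that the `*args` and `**kw` version computes for the same data. The signature now states exactly what the callable accepts, and every call site shows where the extra data comes from.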