= Design Decisions =

The Lichen language design involves some different choices to those taken in
Python's design. Many of these choices are motivated by the following
criteria:

 * To simplify the language and to make what programs do easier to understand
   and to predict
 * To make analysis of programs easier, particularly
   [[../Deduction|deductions]] about the nature of the code
 * To simplify and otherwise reduce the [[../Representations|representations]]
   employed and the operations performed at run-time

Lichen is in many ways a restricted form of Python. In particular,
restrictions on the attribute names supported by each object help to clearly
define the object types in a program, allowing us to identify those objects
when they are used. Consequently, optimisations that can be employed in a
Lichen program become possible in situations where they would have been
difficult or demanding to employ in a Python program.

Some design choices evoke memories of earlier forms of Python. Removing nested
scopes simplifies the [[../Inspection|inspection]] of programs and run-time
[[../Representations|representations]] and mechanisms. Other choices seek to
remedy difficult or defective aspects of Python, notably the behaviour of
Python's [[../Imports|import]] system.

<<TableOfContents(2,3)>>

== Attributes ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Objects have a fixed set of attribute names
|| Objects can gain and lose attributes at run-time
|| Having fixed sets of attributes helps identify object types
==
Instance attributes may not shadow class attributes
|| Instance attributes may shadow class attributes
|| Forbidding shadowing simplifies access operations
==
Attributes are simple members of object structures
|| Dynamic handling and computation of attributes is supported
|| Forbidding dynamic attributes simplifies access operations
}}}

=== Fixed Attribute Names ===

Attribute names are bound for classes through assignment in the class
namespace, for modules in the module namespace, and for instances in methods
through assignment to `self`. Class and instance attributes are propagated to
descendant classes and instances of descendant classes respectively. Once
bound, attributes can be modified, but new attributes cannot be bound by other
means, such as the assignment of an attribute to an arbitrary object that
would not already support such an attribute.

{{{#!python numbers=disable
class C:
    a = 123
    def __init__(self):
        self.x = 234

C.b = 456 # not allowed (b not bound in C)
C().y = 567 # not allowed (y not bound for C instances)
}}}

Permitting the addition of attributes to objects would then require that such
addition attempts be associated with particular objects, leading to a
potentially iterative process involving object type deduction and
modification, also causing imprecise results.

=== No Shadowing ===

Instances may not define attributes that are provided by classes.

{{{#!python numbers=disable
class C:
    a = 123
    def shadow(self):
        self.a = 234 # not allowed (attribute shadows class attribute)
}}}

Permitting this would oblige instances to support attributes that, when
missing, are provided by consulting their classes but, when not missing, may
also be provided directly by the instances themselves.

=== No Dynamic Attributes ===

Instance attributes cannot be provided dynamically, such that any missing
attribute would be supplied by a special method call to determine the
attribute's presence and to retrieve its value.

{{{#!python numbers=disable
class C:
    def __getattr__(self, name): # not supported
        if name == "missing":
            return 123
}}}

Permitting this would require object types to potentially support any
attribute, undermining attempts to use attributes to identify objects.

== Naming ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Names may be local, global or built-in: nested namespaces must be initialised
explicitly
|| Names may also be non-local, permitting closures
|| Limited name scoping simplifies program inspection and run-time mechanisms
==
`self` is a reserved name and is optional in method parameter lists
|| `self` is a naming convention, but the first method parameter must always
.. refer to the accessed object
|| Reserving `self` assists deduction; making it optional is a consequence of
.. the method binding behaviour
==
Instance attributes can be initialised using `.name` parameter notation
|| [[https://stackoverflow.com/questions/1389180/automatically-initialize-instance-variables|Workarounds]]
.. involving decorators and introspection are required for similar brevity
|| Initialiser notation eliminates duplication in program code and is convenient
}}}

=== Traditional Local, Global and Built-In Scopes Only ===

Namespaces reside within a hierarchy within modules: classes containing
classes or functions; functions containing other functions. Built-in names are
exposed in all namespaces; global names are defined at the module level and
are exposed in all namespaces within the module; locals are confined to the
namespace in which they are defined.

However, locals are not inherited by namespaces from surrounding or enclosing
namespaces.

{{{#!python numbers=disable
def f(x):
    def g(y):
        return x + y # not permitted: x is not inherited from f in Lichen (it is in Python)
    return g

def h(x):
    def i(y, x=x): # x is initialised but held in the namespace of i
        return x + y # succeeds: x is defined
    return i
}}}

Needing to access outer namespaces in order to access any referenced names
complicates the way in which such dynamic namespaces would need to be managed.
Although the default initialisation technique demonstrated above could be
automated, explicit initialisation makes programs easier to follow and avoids
mistakes involving globals having the same name.

=== Reserved Self ===

The `self` name can be omitted in method signatures, but in methods it is
always initialised to the instance on which the method is operating.

{{{#!python numbers=disable
class C:
    def f(y): # y is not the instance
        self.x = y # self is the instance
}}}

The assumption in methods is that `self` must always be referring to an
instance of the containing class or of a descendant class.
This means that
`self` cannot be initialised to another kind of value, which Python permits
through the explicit invocation of a method with the inclusion of the affected
instance as the first argument. Consequently, `self` becomes optional in the
signature because it is not assigned in the same way as the other parameters.

=== Instance Attribute Initialisers ===

In parameter lists, a special notation can be used to indicate that the given
name is an instance attribute that will be assigned the argument value
corresponding to the parameter concerned.

{{{#!python numbers=disable
class C:
    def f(self, .a, .b, c): # .a and .b indicate instance attributes
        self.c = c # a traditional assignment using a parameter
}}}

To use the notation, such dot-qualified parameters must appear only in the
parameter lists of methods, not plain functions. The qualified parameters are
represented as locals having the same name, and assignments to the
corresponding instance attributes are inserted into the generated code.

{{{#!python numbers=disable
class C:
    def f1(self, .a, .b): # equivalent to f2, below
        pass

    def f2(self, a, b):
        self.a = a
        self.b = b

    def g(self, .a, .b, a): # not permitted: a appears twice
        pass
}}}

Naturally, `self`, being a reserved name in methods, can also be omitted from
such parameter lists. Moreover, such initialising parameters can have default
values.

{{{#!python numbers=disable
class C:
    def __init__(.a=1, .b=2):
        pass

c1 = C()
c2 = C(3, 4)
print c1.a, c1.b # 1 2
print c2.a, c2.b # 3 4
}}}

== Inheritance and Binding ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Class attributes are propagated to class hierarchy members during
initialisation: rebinding class attributes does not affect descendant class
attributes
|| Class attributes are propagated live to class hierarchy members and must be
.. looked up by the run-time system if not provided by a given class
|| Initialisation-time propagation simplifies access operations and attribute
.. table storage
==
Unbound methods must be bound using a special function taking an instance
|| Unbound methods may be called using an instance as first argument
|| Forbidding instances as first arguments simplifies the invocation mechanism
==
Functions assigned to class attributes do not become unbound methods
|| Functions assigned to class attributes become unbound methods
|| Removing method assignment simplifies deduction: methods are always defined
.. in place
==
Base classes must be well-defined
|| Base classes may be expressions
|| Well-defined base classes are required to establish a well-defined
.. hierarchy of types
==
Classes may not be defined in functions
|| Classes may be defined in any kind of namespace
|| Forbidding classes in functions prevents the definition of countless class
.. variants that are awkward to analyse
}}}

=== Inherited Class Attributes ===

Class attributes that are changed for a class do not change for that class's
descendants.

{{{#!python numbers=disable
class C:
    a = 123

class D(C):
    pass

C.a = 456
print D.a # remains 123 in Lichen, becomes 456 in Python
}}}

Permitting such live propagation would require indirection for all class
attributes, requiring them to be treated differently from other kinds of
attributes. Meanwhile, class attribute rebinding and the accessing of
inherited attributes changed in this way are relatively rare.

=== Unbound Methods ===

Methods are defined on classes but are only available via instances: they are
instance methods. Consequently, acquiring a method directly from a class and
then invoking it should fail because the method will be unbound: the "context"
of the method is not an instance. Furthermore, the Python technique of
supplying an instance as the first argument in an invocation to bind the
method to an instance, thus setting the context of the method, is not
supported. See [[#Reserved Self|"Reserved Self"]] for more information.

{{{#!python numbers=disable
class C:
    def f(self, x):
        self.x = x
    def g(self):
        C.f(123) # not permitted: C is not an instance
        C.f(self, 123) # not permitted: self cannot be specified in the argument list
        get_using(C.f, self)(123) # binds C.f to self, then the result is called
}}}

Binding methods to instances occurs when acquiring methods via instances or
explicitly using the `get_using` built-in. The built-in checks the
compatibility of the supplied method and instance. If compatible, it provides
the bound method as its result.

Normal functions are callable without any further preparation, whereas unbound
methods need the binding step to be performed and are not immediately
callable. Were functions to become unbound methods upon assignment to a class
attribute, they would need to be invalidated by having the preparation
mechanism enabled on them.
However, this invalidation would only be relevant
to the specific case of assigning functions to classes and this would need to
be tested for. Given the added complications, such functionality is arguably
not worth supporting.

=== Assigning Functions to Class Attributes ===

Functions can be assigned to class attributes but do not become unbound
methods as a result.

{{{#!python numbers=disable
class C:
    def f(self): # will be replaced
        return 234

def f(self):
    return self

C.f = f # makes C.f a function, not a method
C().f() # not permitted: f requires an explicit argument
C().f(123) # permitted: f has merely been exposed via C.f
}}}

Methods are identified as such by their definition location, they contribute
information about attributes to the class hierarchy, and they employ certain
structure details at run-time to permit the binding of methods. Since
functions can be defined in arbitrary locations, no class hierarchy information
is available, and a function could combine `self` with a range of attributes
that are not compatible with any class to which the function might be
assigned.

=== Well-Defined Base Classes ===

Base classes must be clearly identifiable as well-defined classes. This
facilitates the cataloguing of program objects and further analysis on them.

{{{#!python numbers=disable
class C:
    x = 123

def f():
    return C

class D(f()): # not permitted: f could return anything
    pass
}}}

If base class identification could only be done reliably at run-time, class
relationship information would be very limited without running the program or
performing costly and potentially unreliable analysis. Indeed, programs
employing such dynamic base classes are arguably resistant to analysis, which
is contrary to the goals of a language like Lichen.
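
For contrast, the following is a minimal sketch in ordinary Python (not Lichen) of a base class chosen at run-time. The names `choose_base`, `Plain` and `Verbose` are hypothetical and exist only for illustration. The point is that nothing in the text of the `D` and `E` definitions reveals which class each one inherits from.

```python
# Ordinary Python, not Lichen: the base class depends on a run-time value.
# choose_base, Plain and Verbose are hypothetical names used for illustration.

class Plain:
    def describe(self):
        return "plain"

class Verbose:
    def describe(self):
        return "verbose"

def choose_base(flag):
    # The result is only known once the program runs.
    if flag:
        return Verbose
    return Plain

class D(choose_base(True)):
    pass

class E(choose_base(False)):
    pass

print(D().describe())
print(E().describe())
```

An analysis tool would have to evaluate `choose_base` to learn the hierarchy, which is the kind of work that Lichen's restriction avoids.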

=== Class Definitions and Functions ===

Classes may not be defined in functions because functions provide dynamic
namespaces, but Lichen relies on a static namespace hierarchy in order to
clearly identify the principal objects in a program. If classes could be
defined in functions, despite seemingly providing the same class over and over
again on every invocation, a family of classes would, in fact, be defined.

{{{#!python numbers=disable
def f(x):
    class C: # not permitted: this describes one of potentially many classes
        y = x
    return C
}}}

Moreover, issues of namespace nesting also arise, since the motivation for
defining classes in functions would surely be to take advantage of local state
to parameterise such classes.

== Modules and Packages ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
Modules are independent: package hierarchies are not traversed when importing
|| Modules exist in hierarchical namespaces: package roots must be imported
.. before importing specific submodules
|| Eliminating module traversal permits more precise imports and reduces
.. superfluous code
==
Only specific names can be imported from a module or package using the `from`
statement
|| Importing "all" from a package or module is permitted
|| Eliminating "all" imports simplifies the task of determining where names in
.. use have come from
==
Modules must be specified using absolute names
|| Imports can be absolute or relative
|| Using only absolute names simplifies the import mechanism
==
Modules are imported independently and their dependencies subsequently
resolved
|| Modules are imported as import statements are encountered
|| Statically-initialised objects can be used declaratively, although an
.. initialisation order may still need establishing
}}}

=== Independent Modules ===

The inclusion of modules in a program affects only explicitly-named modules:
they do not have relationships implied by their naming that would cause such
related modules to be included in a program.

{{{#!python numbers=disable
from compiler import consts # defines consts
import compiler.ast # defines ast, not compiler

ast # is defined
compiler # is not defined
consts # is defined
}}}

Where modules should have relationships, they should be explicitly defined
using `from` and `import` statements which target the exact modules required.
In the above example, `compiler` is not imported merely because modules
within the `compiler` package have been requested.

=== Specific Name Imports Only ===

Lichen, unlike Python, also does not support the special `__all__` module
attribute.

{{{#!python numbers=disable
from compiler import * # not permitted
from compiler import ast, consts # permitted

interpreter # undefined in compiler (yet it might be thought to reside there) and in this module
}}}

The `__all__` attribute supports `from ... import *` statements in Python, but
without identifying the module or package involved and then consulting
`__all__` in that module or package to discover which names might be involved
(which might require the inspection of yet other modules or packages), the
names imported cannot be known. Consequently, some names used elsewhere in the
module performing the import might be assumed to be imported names when, in
fact, they are unknown in both the importing and imported modules. Such
uncertainty hinders the inspection of individual modules.
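
The Python behaviour described above can be sketched as follows, in ordinary Python rather than Lichen. The module `demo_mod` and its attributes are hypothetical and are constructed in memory only to keep the example self-contained. The names arriving in the importing namespace are decided by the imported module's `__all__`, not by anything visible at the import site.

```python
# Ordinary Python, not Lichen. demo_mod and its attributes are hypothetical.
import sys
import types

# Build a module in memory and register it so that it can be imported.
demo_mod = types.ModuleType("demo_mod")
demo_mod.visible = 1
demo_mod.hidden = 2
demo_mod.__all__ = ["visible"]
sys.modules["demo_mod"] = demo_mod

# Perform the "all" import into a fresh namespace.
namespace = {}
exec("from demo_mod import *", namespace)

# Ignore entries such as __builtins__ that exec adds itself.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only `visible` is imported. A reader of the importing code cannot tell this without also inspecting `demo_mod`.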

=== Modules Imported Independently ===

When indicating an import using the `from` and `import` statements, the
[[../Toolchain|toolchain]] does not attempt to immediately import other
modules. Instead, the imports act as declarations of such other modules or
names from other modules, resolved at a later stage. This permits mutual
imports to a greater extent than in Python.

{{{#!python numbers=disable
# Module M
from N import C # in Python: fails attempting to re-enter N

class D(C):
    y = 456

# Module N
from M import D # in Python: causes M to be entered, fails when re-entered from N

class C:
    x = 123

class E(D):
    z = 789

# Main program
import N
}}}

Such flexibility is not usually needed, and circular importing usually
indicates issues with program organisation. However, declarative imports can
help to decouple modules and avoid combining import declaration and module
initialisation order concerns.

== Syntax and Control-Flow ==

{{{#!table
'''Lichen''' || '''Python''' || '''Rationale'''
==
If expressions and comprehensions are not supported
|| If expressions and comprehensions are supported
|| Omitting such syntactic features simplifies program inspection and
.. translation
==
The `with` statement is not supported
|| The `with` statement offers a mechanism for resource allocation and
.. deallocation using context managers
|| This syntactic feature can be satisfactorily emulated using existing
.. constructs
==
Generators are not supported
|| Generators are supported
|| Omitting generator support simplifies run-time mechanisms
==
Only positional and keyword arguments are supported
|| Argument unpacking (using `*` and `**`) is supported
|| Omitting unpacking simplifies generic invocation handling
==
All parameters must be specified
|| Catch-all parameters (`*` and `**`) are supported
|| Omitting catch-all parameter population simplifies generic invocation
.. handling
}}}

=== No If Expressions or Comprehensions ===

In order to support the classic [[WikiPedia:?:|ternary operator]], a construct
was [[https://www.python.org/dev/peps/pep-0308/|added]] to the Python syntax
that needed to avoid problems with the existing grammar and notation.
Unfortunately, it reorders the components from the traditional form:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

# In C: condition ? true_result : false_result
true_result if condition else false_result

# In C: (condition ? inner_true_result : inner_false_result) ? true_result : false_result
true_result if (inner_true_result if condition else inner_false_result) else false_result
}}}

Since if expressions may participate within expressions, they cannot be
rewritten as if statements. Nor can they be rewritten as logical operator
chains in general.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

a = 0 if x else 1 # x being true yields 0

# Here, x being true causes (x and 0) to complete, yielding 0.
# But this causes ((x and 0) or 1) to complete, yielding 1.

a = x and 0 or 1 # not valid
}}}

In any case, there would be more motivation to support the functionality
if a better syntax could be adopted instead.
However, if expressions are not
particularly important in Python and, despite enhancement requests over many
years, programmers managed to live without them.

List comprehensions and generator expressions are more complicated but share
some characteristics of if expressions: their syntax contradicts the typical
conventions established by the rest of the Python language; they create
implicit state that is perhaps most appropriately modelled by a separate
function or similar object. Since Lichen does not support generators at all,
it will obviously not support generator expressions.

Meanwhile, list comprehensions quickly encourage barely-readable programs:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

x = [0, [1, 2, 0], 0, 0, [0, 3, 4]]
a = [z for y in x if y for z in y if z]
}}}

Supporting the creation of temporary functions to produce list comprehensions,
while also hiding temporary names from the enclosing scope, adds complexity to
the toolchain for situations where programmers would arguably be better served
by creating their own functions and thus writing more readable programs.

=== No With Statement ===

The
[[https://docs.python.org/2.7/reference/compound_stmts.html#the-with-statement|with statement]]
introduced the concept of
[[https://docs.python.org/2.7/reference/datamodel.html#context-managers|context managers]]
in Python 2.5, with such objects supporting a
[[https://docs.python.org/2.7/library/stdtypes.html#typecontextmanager|programming interface]]
that aims to formalise certain conventions around resource
management. For example:

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

with db.connect(connection_args) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query, args)
}}}

Although this makes for readable code, it must be supported by objects which
define the `__enter__` and `__exit__` special methods. Here, the `connect`
method invoked in the first `with` statement must return such an object;
similarly, the `cursor` method must also provide an object with such
characteristics.

However, the "pre-with" solution is as follows:

{{{#!python numbers=disable
connection = db.connect(connection_args)
try:
    cursor = connection.cursor()
    try:
        cursor.execute(query, args)
    finally:
        cursor.close()
finally:
    connection.close()
}}}

Although this seems less readable, its behaviour is more obvious because magic
methods are not being called implicitly. Moreover, any parameterisation of
resource deallocation or closure can be done in the `finally` clauses, where
it seems natural. With a context manager, such parameterisation must instead
be supplied as initialisation arguments and then propagated to the magic
methods, even though the relevant contextual information is readily available
in the place where the actual resource operations are being performed.

=== No Generators ===

[[https://www.python.org/dev/peps/pep-0255/|Generators]] were
[[https://docs.python.org/release/2.3/whatsnew/section-generators.html|added]]
to Python in the 2.2 release and became fully part of the language in the 2.3
release. They offer a convenient way of writing iterator-like objects,
capturing execution state instead of obliging the programmer to manage such
state explicitly.

{{{#!python numbers=disable
# Not valid in Lichen, only in Python.

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

# Alternative form valid in Lichen.

class fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def next(self):
        result = self.b
        self.a, self.b = self.b, self.a + self.b
        return result

# Main program.

seq = fib()
i = 0
while i < 10:
    print seq.next()
    i += 1
}}}

However, generators make additional demands on the mechanisms provided to
support program execution. The encapsulation of the above example generator in
a separate class illustrates the need for state that persists outside the
execution of the routine providing the generator's results. Generators may
look like functions, but they do not necessarily behave like them, leading to
potential misunderstandings about their operation even if the code is
superficially tidy and concise.

=== Positional and Keyword Arguments Only ===

When invoking callables, only positional arguments and keyword arguments can
be used. Python also supports `*` and `**` arguments which respectively unpack
sequences and mappings into the argument list, filling the list with sequence
items (using `*`) and keywords (using `**`).

{{{#!python numbers=disable
def f(a, b, c, d):
    return a + b + c + d

l = range(0, 4)
f(*l) # not permitted

m = {"c" : 10, "d" : 20}
f(2, 4, **m) # not permitted
}}}

While convenient, such "unpacking" arguments obscure the communication between
callables and undermine the safety provided by function and method signatures.
They also require run-time support for the unpacking operations.

=== Positional Parameters Only ===

Similarly, signatures may only contain named parameters that correspond to
arguments. Python supports `*` and `**` in parameter lists, too, which
respectively accumulate superfluous positional and keyword arguments.

{{{#!python numbers=disable
def f(a, b, *args, **kw): # not permitted
    return a + b + sum(args) + kw.get("c", 0) + kw.get("d", 0)

f(1, 2, 3, 4)
f(1, 2, c=3, d=4)
}}}

Such accumulation parameters can be useful for collecting arbitrary data and
applying some of it within a callable. However, they can easily proliferate
throughout a system and allow erroneous data to propagate far from its origin
because such parameters permit the deferral of validation until the data needs
to be accessed. Again, run-time support is required to marshal arguments into
the appropriate parameter of this nature, but programmers could just write
functions and methods that employ general sequence and mapping parameters
explicitly instead.
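
As an illustration of that last suggestion, here is a minimal sketch of the earlier function rewritten with explicit sequence and mapping parameters. It is written as ordinary Python and has not been checked against the Lichen toolchain. The parameter names `extra` and `options` are hypothetical, as is the choice to reject unknown keys immediately.

```python
# A sketch only: explicit sequence and mapping parameters instead of
# *args and **kw. The names "extra" and "options" are hypothetical.

def f(a, b, extra, options):
    total = a + b

    # Positional extras arrive as an ordinary sequence.
    for value in extra:
        total += value

    # Validate the mapping here rather than deferring the check.
    for key in options:
        if key not in ("c", "d"):
            raise ValueError("unexpected option: %s" % key)
        total += options[key]

    return total

print(f(1, 2, [3, 4], {}))
print(f(1, 2, [], {"c": 3, "d": 4}))
```

Callers must now construct the sequence and mapping themselves, which keeps the signature complete and makes the extra data visible at the call site.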