1.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000
1.2 +++ b/docs/architecture.txt Sun Jan 07 23:44:39 2007 +0100
1.3 @@ -0,0 +1,77 @@
1.4 +Architecture Overview
1.5 +=====================
1.6 +
1.7 +The simplify modules take Python source code, obtain an abstract syntax tree
1.8 +(AST) using the standard library compiler module, produce a simplified node
1.9 +tree whose nodes correspond to primitive operations, and then perform a number
1.10 +of processes such as name resolution and type annotation on the simplified
1.11 +tree.
1.12 +
1.13 +The Simplified Nodes
1.14 +====================
1.15 +
1.16 +Unlike the AST nodes produced by the compiler module, which reflect syntactic
1.17 +constructs in Python source code, the simplified nodes more closely reflect
1.18 +the underlying operations performed when such constructs are executed in a
1.19 +running program. Thus, in some respects the simplified nodes resemble CPython
1.20 +interpreter bytecodes. However, compared with the bytecodes understood by the
1.21 +CPython interpreter, simplified nodes are lower-level and make explicit the
1.22 +complexity hidden within certain bytecodes (e.g. BINARY_ADD, which conceals
1.23 +the dispatch to __add__ and __radd__ methods). Consequently, even a
1.24 +small Python program, producing a small AST, can produce a much larger
1.25 +simplified node tree.
1.26 +
1.27 +The simplify module is responsible for the production of a simplified node
1.28 +tree, using the full range of nodes defined in the simplified module. In some
1.29 +cases, the nodes created in this process may be discarded in favour of others
1.30 +after further analysis of the program.
1.31 +
1.32 +Name Resolution
1.33 +===============
1.34 +
1.35 +The scope of names or identifiers in Python programs can be determined by
1.36 +relatively simple methods, and the process of name resolution is concerned
1.37 +with identifying the scope of every name access within a program, changing the
1.38 +simplified node involved where necessary. For example, the mention of a name
1.39 +in a program is initially represented by a LoadName simplified node; if name
1.40 +resolution finds that the name refers to a module global instead of a local
1.41 +name, the LoadName node is changed to a LoadAttr node which references the
1.42 +module.
1.43 +
1.44 +The fixnames module is responsible for the transformation of the simplified
1.45 +node tree and the replacement of nodes in order to indicate particular
1.46 +namespace operations.
1.47 +
1.48 +Type Annotation
1.49 +===============
1.50 +
1.51 +The principal motivation in developing this system is to discover the nature
1.52 +of data at each point in a program. One important aspect of doing so is to
1.53 +examine the data types employed at certain points (principally the definition
1.54 +and instantiation of such types) and to propagate such information throughout
1.55 +the program, in effect simulating the execution of the program but without
1.56 +using concrete values or objects associated with such types.
1.57 +In order to simulate the semantics of Python, the following concepts have been
1.58 +employed:
1.59 +
1.60 + * Attribute: a bundle containing a data type and its context.
1.61 + * Namespace: a mapping of names to attributes.
1.62 + * Accessor: a combination of an attribute and its origin - the owner of the
1.63 + namespace providing access to the attribute.
1.64 +
1.65 +Attributes
1.66 +----------
1.67 +
1.68 +The main purpose of the notion of an attribute is to support references to
1.69 +methods. Contrary to initial expectations (perhaps fuelled by experience with
1.70 +other programming languages [1] or by simple example programs), Python does
1.71 +not insist that methods be invoked immediately upon dereferencing objects
1.72 +providing such methods; instead the reference to a method may be stored in a
1.73 +variable and used in an invocation on a subsequent occasion. In order to
1.74 +permit this behaviour, the context of a method (typically the object from
1.75 +which the method was obtained) must be retained and supplied in the
1.76 +invocation. Thus, it becomes necessary to treat type information as a bundle
1.77 +containing the type of an attribute and any context information that may
1.78 +subsequently be useful.
1.79 +
1.80 +[1] http://java.sun.com/docs/white/delegates.html
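
Illustration (not part of the patch): the behaviour described under
"Attributes" can be observed in plain Python, and the idea of bundling a type
with its context can be sketched in a few lines. The Attribute and Counter
classes below, and the datatype and context names, are hypothetical and are
not taken from the simplify modules; they only show the shape of the idea.
The __self__ attribute is the spelling used by current Python versions.

```python
# Illustrative sketch only: Attribute, Counter, datatype and context are
# hypothetical names, not the definitions used by the simplify modules.

class Attribute:
    """A bundle of a data type and the context it was obtained from."""

    def __init__(self, datatype, context=None):
        self.datatype = datatype  # e.g. a function or class found in a namespace
        self.context = context    # e.g. the object the attribute was accessed on

    def __repr__(self):
        return "Attribute(%r, %r)" % (self.datatype, self.context)


class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


counter = Counter()

# The method reference is stored in a variable instead of being invoked
# immediately upon dereferencing the object.
stored = counter.increment

# Python keeps the context (the instance) alongside the underlying function.
assert stored.__self__ is counter

# Later invocations use the remembered context.
stored()
stored()

# The analogous bundle pairs a type with its context without making a call.
attr = Attribute(Counter.increment, context=counter)
```

An analysis that tracked only the type of stored would lose the fact that
invoking it affects counter; carrying the context in the bundle preserves
that link.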