Lichen

Annotated docs/wiki/Imports

905:709a26b53318
2019-06-02 Paul Boddie Merged changes from the default branch. trailing-data
paul@810 1
= Imports =
paul@810 2
paul@861 3
An '''import''' is a declaration of one or more names that are provided by
paul@861 4
another source file or '''module''':
paul@810 5
paul@810 6
  * `import` statements declare names that correspond to modules
paul@810 7
  * `from` statements declare names provided by modules
paul@810 8
paul@861 9
Imports occur either through explicit import operations initiated by the
paul@861 10
`from` and `import` statements, or through implicit import operations
paul@861 11
occurring to satisfy the requirements of another kind of operation.
paul@810 12
paul@810 13
<<TableOfContents(2,3)>>
paul@810 14
paul@810 15
== Packages and Submodules ==
paul@810 16
paul@861 17
A '''package''' is a collection of modules whose names are all prefixed by the
paul@861 18
package name. For example:
paul@810 19
paul@810 20
{{{
paul@810 21
compiler
paul@810 22
compiler.ast
paul@810 23
compiler.transformer
paul@810 24
}}}
paul@810 25
paul@861 26
Here, the `compiler` package is said to contain the `compiler.ast` and
paul@861 27
`compiler.transformer` modules.
paul@810 28
paul@810 29
=== Defining Packages ===
paul@810 30
paul@861 31
The package root or top-level module is defined in a file called `__init__.py`
paul@861 32
inside the directory bearing the package's name, and this file provides a
paul@861 33
namespace for the top-level module. However, a package does not expose its
paul@861 34
member modules ('''submodules''') as members of its top-level module. Instead,
paul@861 35
the hierarchical relationship between a package and its submodules exists
paul@861 36
purely in the naming of those modules, and where submodules are imported they
paul@861 37
must be referred to by their full names.
paul@810 38
paul@861 39
Thus, relationships between packages and modules must be explicitly defined in
paul@861 40
module namespaces. For example, in the `compiler` module, the following would
paul@861 41
define relationships to the submodules:
paul@810 42
paul@810 43
{{{
paul@810 44
from compiler.ast import ast
paul@810 45
from compiler.transformer import transformer
paul@810 46
}}}
paul@810 47
paul@861 48
Without such import statements, no attempt will be made upon importing
paul@861 49
`compiler` to access the submodules and automatically populate the package.
paul@810 50
paul@810 51
=== Accessing Submodules Directly ===
paul@810 52
paul@861 53
Importing submodules from packages will not cause the package itself to be
paul@861 54
imported. For example:
paul@810 55
paul@810 56
{{{
paul@810 57
import compiler.ast
paul@810 58
}}}
paul@810 59
paul@861 60
This initialises the name `ast` which refers to the `compiler.ast` module, but
paul@861 61
the `compiler` package and its top-level module will not be imported. Thus,
paul@861 62
submodules can be considered independent of their packages, although they may
paul@861 63
seek to import their package top-level module should they need to access
paul@861 64
objects provided by that module.
paul@810 65
paul@810 66
== Implicit Imports ==
paul@810 67
paul@810 68
The following kinds of operations cause implicit imports:
paul@810 69
paul@810 70
|| '''Operations''' ||<-2> '''Import names provided by...''' ||
paul@810 71
|| Augmented assignments ||<|5> `operator` || `operator.augmented` ||
paul@810 72
|| Binary operators || `operator.binary` ||
paul@810 73
|| Comparison operators || `operator.comparison` ||
paul@810 74
|| Slice operators || `operator.sequence`<<BR>>~-Subscript operators are converted to item method invocations-~ ||
paul@810 75
|| Unary operators || `operator.unary` ||
paul@810 76
|| Access to built-in name || `__builtins__` || (various modules in the [[../Builtins|built-ins]] package hierarchy) ||
paul@810 77
paul@861 78
Operator usage will cause a local name referring to an `operator` module
paul@861 79
function to be created, with the appropriate function being exposed by the
paul@861 80
`operator` module itself. However, the inspection process will seek to obtain
paul@861 81
a reference to the function in its actual definition location, ultimately
paul@861 82
referencing the function in one of the modules indicated above.
paul@810 83
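As a rough illustration of how operator syntax can be turned into calls to `operator` module functions, the sketch below uses the standard CPython `operator` module as a stand-in. The `BINARY` and `AUGMENTED` tables and the `lower` function are hypothetical names introduced here for illustration; they are not the toolchain's own structures, and the functions it actually selects may differ.

```python
import operator

# Hypothetical tables mapping operator syntax to function names, loosely
# mirroring the categories in the table above.
BINARY = {"+": "add", "-": "sub", "*": "mul"}
AUGMENTED = {"+=": "iadd", "-=": "isub", "*=": "imul"}

def lower(op, left, right):
    """Evaluate 'left op right' by calling the matching operator function."""
    name = BINARY.get(op) or AUGMENTED[op]
    return getattr(operator, name)(left, right)

# A binary operator and an augmented assignment, both expressed as calls.
result_binary = lower("+", 2, 3)
result_augmented = lower("+=", [1], [2])
```

The point of the sketch is only that the program text never names `operator` itself: the use of an operator is enough to require the function, which is why the import is implicit.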
paul@810 84
== Import Sequencing ==
paul@810 85
paul@861 86
In order to populate modules, other modules may themselves be required to
paul@861 87
provide names to a given module, and in turn these other modules may rely on
paul@861 88
yet more modules, and so on. One logical consequence of this is that circular
paul@861 89
imports become possible, but the resulting mutual dependencies may not be
paul@861 90
easily untangled without careful attention to the state of each of the
paul@861 91
participating modules. Consider the following situation:
paul@810 92
paul@810 93
{{{{#!table
paul@810 94
{{{#!graphviz
paul@810 95
//format=svg
paul@810 96
//transform=notugly
paul@810 97
digraph mutual {
paul@810 98
  node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Mutually-dependent modules"];
paul@810 99
  edge [tooltip="Mutually-dependent modules"];
paul@810 100
  rankdir=LR;
paul@810 101
paul@810 102
  subgraph {
paul@810 103
    rank=same;
paul@810 104
    moduleA [label="module A",shape=ellipse];
paul@810 105
    fromB [label="from B import C",style=filled,fillcolor=gold];
paul@810 106
    D [label="class D(C)"];
paul@810 107
  }
paul@810 108
paul@810 109
  subgraph {
paul@810 110
    rank=same;
paul@810 111
    moduleB [label="module B",shape=ellipse];
paul@810 112
    fromA [label="from A import D",style=filled,fillcolor=gold];
paul@810 113
    C [label="class C"];
paul@810 114
    E [label="class E(D)"];
paul@810 115
  }
paul@810 116
paul@810 117
  moduleA -> fromB -> D [dir=none,style=dashed];
paul@810 118
  moduleB -> fromA -> C -> E [dir=none,style=dashed];
paul@810 119
paul@810 120
  fromB -> fromA;
paul@810 121
  fromA -> fromB;
paul@810 122
}
paul@810 123
}}}
paul@810 124
||
paul@810 125
Module A:
paul@810 126
paul@810 127
{{{
paul@810 128
from B import C
paul@810 129
paul@810 130
class D(C):
paul@810 131
    ...
paul@810 132
}}}
paul@810 133
paul@810 134
Module B:
paul@810 135
paul@810 136
{{{
paul@810 137
from A import D
paul@810 138
paul@810 139
class C:
paul@810 140
    ...
paul@810 141
paul@810 142
class E(D):
paul@810 143
    ...
paul@810 144
}}}
paul@810 145
}}}}
paul@810 146
paul@861 147
If modules were loaded upon being encountered in import statements, module A
paul@861 148
would not be completely processed when attempting to import from module B, and
paul@861 149
thus the import within module B of module A would only yield some information
paul@861 150
about module A. Consequently, the details of class D might not be available,
paul@861 151
and this would then have an impact on whether module B could even be
paul@861 152
completely processed itself.
paul@810 153
paul@861 154
The approach taken to generally deal with such situations is to defer
paul@861 155
resolution until all modules have been populated. Names are then resolved,
paul@861 156
with any name whose reference kind is recorded as `<depends>` (instead
paul@861 157
of, for example, `<class>`) being resolved according to the recorded import
paul@861 158
dependencies.
paul@810 159
paul@861 160
Since the classes in one module may depend on those in other modules, it is
paul@861 161
not always possible to finalise the details of classes in a module context.
paul@861 162
And since modules may depend on each other, it is not always possible to
paul@861 163
finalise the details of classes until the details of all classes in a program
paul@861 164
are known.
paul@810 165
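The deferred approach can be sketched as two phases, here applied to the mutually-dependent modules A and B shown earlier. This is a minimal sketch under assumptions: the `modules` dictionary, the tuple layout of each reference and the `resolve` function are hypothetical, chosen only to show `<depends>` placeholders being followed after population is complete.

```python
# Phase 1: every module is populated. Imported names are recorded as
# ("<depends>", module, name) placeholders instead of being resolved.
modules = {
    "A": {"C": ("<depends>", "B", "C"), "D": ("<class>", "A.D")},
    "B": {
        "D": ("<depends>", "A", "D"),
        "C": ("<class>", "B.C"),
        "E": ("<class>", "B.E"),
    },
}

def resolve(module, name, seen=None):
    """Follow <depends> placeholders until a concrete reference is found."""
    if seen is None:
        seen = set()
    if (module, name) in seen:
        raise ValueError("unresolvable import of %s.%s" % (module, name))
    seen.add((module, name))
    ref = modules[module][name]
    if ref[0] == "<depends>":
        return resolve(ref[1], ref[2], seen)
    return ref

# Phase 2: resolution happens only once all modules have been populated.
resolved = {
    (module, name): resolve(module, name)
    for module, names in modules.items()
    for name in names
}
```

Because neither module needs the other to be fully processed during the first phase, the circular imports between A and B no longer prevent either from being populated.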
paul@810 166
=== Module Initialisation ===
paul@810 167
paul@861 168
Although static objects can be defined with interdependencies in a declarative
paul@861 169
fashion, the initialisation of objects in modules may require the availability
paul@861 170
of completely-initialised objects defined in other modules. Thus, an
paul@861 171
initialisation order needs to be established, with some modules being
paul@861 172
initialised before others, so that no module encounters uninitialised
paul@861 173
names when it is expecting those names to provide valid objects.
paul@810 174
paul@861 175
The most obvious example of a module requiring the initialisation of others
paul@861 176
before it is itself evaluated is, of course, the `__main__` module. Given that
paul@861 177
it may import instances defined as attributes on other modules, it clearly
paul@861 178
requires those modules to have been initialised and those instances to have
paul@861 179
been created. It would be absurd to consider running the body of the
paul@861 180
`__main__` module before such other modules. Similarly, such dependencies
paul@861 181
exist between other modules, and consequently, an appropriate initialisation
paul@861 182
ordering must be defined for them. In its entirety, then, a program must
paul@861 183
define a workable ordering for all of its modules, signalling a concrete error
paul@861 184
if no such ordering can be established.
paul@810 185
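One way to establish such an ordering is a depth-first traversal of the "requires" relationships, reporting an error when a cycle makes ordering impossible. The sketch below assumes a hypothetical `requires` mapping and the module names `settings` and `models`; it illustrates the general technique and is not a description of how the toolchain itself computes the order.

```python
def initialisation_order(requires):
    """Return module names ordered so each module follows those it requires.

    'requires' maps a module name to the names of the modules that must be
    initialised before it. ValueError is raised if no ordering exists.
    """
    order = []
    state = {}  # module name -> "visiting" or "done"

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(
                "no initialisation order: " + " -> ".join(path + [name]))
        state[name] = "visiting"
        for dependency in requires.get(name, ()):
            visit(dependency, path + [name])
        state[name] = "done"
        order.append(name)

    for name in sorted(requires):
        visit(name, [])
    return order

# Hypothetical program: __main__ needs both other modules initialised first.
order = initialisation_order({
    "__main__": ["settings", "models"],
    "models": ["settings"],
    "settings": [],
})
```

With these inputs, `__main__` is placed last, after the modules whose objects it expects to find initialised. Two modules that require each other at initialisation time would cause the error to be raised instead.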
paul@810 186
== Hidden Modules ==
paul@810 187
paul@861 188
Imports that do not obtain the imported module name itself, such as those
paul@861 189
initiated by the `from` statement and by implicit operations, keep the
paul@861 190
imported module '''hidden'''. Unless other operations expose hidden modules,
paul@861 191
they will remain hidden and may consequently be omitted from the final
paul@861 192
generated program: there would be no way of referencing such modules and they
paul@861 193
would therefore be unable to contribute their contents to the rest of the
paul@861 194
program.
paul@810 195
paul@861 196
However, where an object provided by a module is referenced, a module cannot
paul@861 197
remain hidden, since the provided object may depend on other parts of the
paul@861 198
module in order to function correctly. And since a provided object might
paul@861 199
reference or return other objects in the module, the general module contents
paul@861 200
must also be exposed.
paul@810 201
paul@861 202
Import dependencies are defined for namespaces, indicating the modules that are
paul@861 203
required by each namespace. By following dependency relationships, it is
paul@861 204
possible to determine the eventual target of an import and to potentially skip
paul@861 205
over modules that merely import and expose names. For example:
paul@810 206
paul@810 207
{{{{#!table
paul@810 208
{{{#!graphviz
paul@810 209
//format=svg
paul@810 210
//transform=notugly
paul@810 211
digraph imports {
paul@810 212
  node [shape=box,fontsize="13.0",fontname="Helvetica",tooltip="Import dependencies"];
paul@810 213
  edge [tooltip="Import dependencies"];
paul@810 214
  rankdir=LR;
paul@810 215
paul@810 216
  importer [label="from A import C",style=filled,fillcolor=darkorange];
paul@810 217
paul@810 218
  subgraph {
paul@810 219
    rank=same;
paul@810 220
    moduleA [label="module A",shape=ellipse];
paul@810 221
    fromB [label="from B import C",style=filled,fillcolor=gold];
paul@810 222
  }
paul@810 223
paul@810 224
  subgraph {
paul@810 225
    rank=same;
paul@810 226
    moduleB [label="module B",shape=ellipse];
paul@810 227
    C [label="class C",style=filled,fillcolor=darkorange];
paul@810 228
  }
paul@810 229
paul@810 230
  moduleA -> fromB [dir=none,style=dashed];
paul@810 231
  moduleB -> C [dir=none,style=dashed];
paul@810 232
paul@810 233
  importer -> fromB -> C;
paul@810 234
}
paul@810 235
}}}
paul@810 236
||
paul@810 237
{{{
paul@810 238
from A import C
paul@810 239
}}}
paul@810 240
paul@810 241
Module A:
paul@810 242
paul@810 243
{{{
paul@810 244
from B import C
paul@810 245
}}}
paul@810 246
paul@810 247
Module B:
paul@810 248
paul@810 249
{{{
paul@810 250
class C:
paul@810 251
    ...
paul@810 252
}}}
paul@810 253
}}}}
paul@810 254
paul@861 255
Here, A is never explicitly referenced, nor does it provide any referenced
paul@861 256
objects other than an imported name. Consequently, A is hidden and ultimately
paul@861 257
excluded from the final program. Such techniques are employed in the
paul@861 258
[[../Builtins|built-ins]] package hierarchy to reduce the amount of
paul@861 259
functionality employed by (and bundled in) a generated program.
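
Following dependency relationships to the eventual target of an import might look like the sketch below, which models the example above. The `defined` and `imported` records and the `eventual_target` function are hypothetical names used for illustration, and `__main__` stands in for whichever module contains the initial `from A import C`.

```python
# Hypothetical records: the names each module defines itself, and the names
# it merely imports from elsewhere (name -> (providing module, name)).
defined = {"__main__": set(), "A": set(), "B": {"C"}}
imported = {
    "__main__": {"C": ("A", "C")},
    "A": {"C": ("B", "C")},
    "B": {},
}

def eventual_target(module, name):
    """Follow import dependencies to the module that defines the name."""
    visited = set()
    while name not in defined[module]:
        if (module, name) in visited:
            raise ValueError("circular import of " + name)
        visited.add((module, name))
        module, name = imported[module][name]
    return module, name

# The import in the initial module leads through A to the class in B.
target = eventual_target("__main__", "C")
```

The traversal ends at the module that actually defines `C`, having passed through the module that only imports and exposes the name.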