# HG changeset patch
# User Paul Boddie <paul@boddie.org.uk>
# Date 1587420497 -7200
# Node ID a3349113185c57c2f0b9de0bf1f453e499522913
# Parent  559cd86cd5acdde5575bea0ae4e9d25d0136979d
Added the beginnings of the development documentation.

diff -r 559cd86cd5ac -r a3349113185c docs/wiki/Development
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/wiki/Development	Tue Apr 21 00:08:17 2020 +0200
@@ -0,0 +1,431 @@
+= Development =
+
+The current implementation of the [[idl|`idl`]] tool employs the stated
+[[Prerequisites|prerequisites]] with the software written in the C programming
+language. A Flex-based lexical analyser produces tokens that are consumed by a
+Bison-generated parser whose rules call program functions to build structures
+and generate output.
+
+<<TableOfContents(2)>>
+
+== Main Program ==
+
+The `main.c` file contains the `main` function whose responsibilities include
+the following:
+
+ * Processing program arguments and options using C library `getopt`
+   functionality, configuring the program behaviour and output.
+
+ * Input file access and output coordination.
+
+ * Parser initialisation and invocation using the `yyrestart` and
+   `yyparse` functions, and parser state management.
+
+Various helper functions are also defined.
+
+== Configuration ==
+
+The `config.c` and `config.h` files define configuration state for the program
+as it processes input files, with the latter defining the nature of the state
+and the former recording the actual state set by the main program when
+interpreting program arguments and options. Thus, this configuration can be
+considered to be a concise, validated form of those arguments and options.
+
+== Lexical Analysis ==
+
+The `idl.lex` file defines a scanner or lexical analyser for the language used
+to describe interfaces. For the most part, it merely matches sequences of
+characters and produces token identifiers, with no additional information
+being provided for tokens having a fixed form (typically keywords and
+symbols).
+
+Tokens with varying input forms (identifiers, numbers, and so on) also have a
+value associated with them indicating such things as the precise sequence of
+matching characters or the numeric equivalent of those characters. Such
+details are stored using members of the value type, referenced via the
+`yylval` variable. The token value type is itself defined in the parser
+definition.
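+
+As a rough illustration, and not a copy of the rules in `idl.lex`, scanner
+rules for tokens of varying form might populate `yylval` as follows. The token
+names `NUMBER` and `IDENTIFIER` and the patterns are assumptions made for this
+sketch; only the `num` and `str` members are taken from the parser definition.
+
+{{{
+  /* Hypothetical rules: NUMBER and IDENTIFIER are assumed token names. */
+  /* strtol requires <stdlib.h> to be included in the definitions section. */
+
+[0-9]+                  { yylval.num = strtol(yytext, NULL, 10); return NUMBER; }
+[A-Za-z_][A-Za-z0-9_]*  { yylval.str = yytext; return IDENTIFIER; }
+}}}
+
+Since the text referenced by `yytext` is only valid until the next token is
+scanned, any code retaining such a string would need to copy it first.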
+
+=== Operating Modes ===
+
+There are two distinct operating modes of the scanner:
+
+ * The default mode recognising most elements of the language.
+ * A comment mode consuming the text found within comment delimiters.
+
+When a comment start indicator (`/*`) is encountered in the default mode, the
+mode is switched to the comment mode and all input is consumed until the
+comment end indicator (`*/`) is encountered, at which point the default mode
+is selected again.
+
+A special indicator is used to declare the comment mode in the scanner file:
+
+{{{
+%x comment
+}}}
+
+All rules applicable in this mode are prefixed with `<comment>` in the file to
+distinguish them from rules applicable in the default mode.
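+
+The switching between modes can be sketched using the standard Flex `BEGIN`
+action, with `INITIAL` being the Flex name for the default mode. These rules
+are a minimal sketch under that assumption, and the actual rules in `idl.lex`
+may differ:
+
+{{{
+"/*"            { BEGIN(comment); }
+<comment>"*/"   { BEGIN(INITIAL); }
+<comment>.|\n   { /* discard comment text */ }
+}}}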
+
+== Parsing ==
+
+The `idl.y` file defines a parser for the interface description language. It
+starts by attempting to satisfy the `file` rule, matching statements until the
+end of input, invoking the `write_files` function to generate the configured
+output from the tool.
+
+As each rule is evaluated, tokens are consumed from the scanner and operations
+are performed to build up a structure describing the input. Where a rule
+cannot be evaluated successfully, the included `yyerror` function emits a
+message and parsing halts.
+
+(Error handling and reporting could be improved.)
+
+=== Token and Rule Result Values ===
+
+For interoperability with the scanner or lexical analyser, the parser defines
+the nature of the values associated with tokens using the `%union`
+declaration, these including a `str` (string) interpretation and a `num`
+(numeric) interpretation of a value.
+
+Since token values are propagated between parser rules, the `%union`
+declaration is augmented with other interpretations that are employed by the
+rules. Consequently, rules can obtain values for tokens that were produced by
+the scanner (such as numbers and strings) but then incorporate them into other
+kinds of values or structures, passing them on to other rules. These other
+rules can treat such propagated values in the same way as those produced
+directly by the scanner.
+
+For example, the `include` rule obtains a value associated with a header
+filename. This filename is associated with the `HEADER` token and its value is
+interpreted as a string. However, the rule needs to prepare a structure that
+incorporates the filename and that can be referenced by other structures. To
+achieve this, a `%union` member is defined for the structure type concerned:
+
+{{{
+%union {
+  long num;
+  char *str;
+  struct include inc;
+  ...
+}
+}}}
+
+The member name, `inc`, is then associated with the `include` rule:
+
+{{{
+%type <inc> include
+}}}
+
+With this, the rule can be considered to be working to populate a value of the
+indicated type (`struct include`), and where other rules reference the result
+of the rule, they will be able to recognise the type of this result.
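+
+A rule action populating such a value might then resemble the following
+sketch. The `INCLUDE` token name is hypothetical, the `filename` and `tail`
+member names are borrowed from the interface structure diagram, and the actual
+action in `idl.y` may well be written differently:
+
+{{{
+%token <str> HEADER
+
+/* Hypothetical action: $2 is the string value of the HEADER token, and $$ is
+   the struct include result of the rule. */
+
+include : INCLUDE HEADER
+            { $$.filename = $2; $$.tail = NULL; }
+        ;
+}}}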
+
+=== Value Copying ===
+
+When rules obtain values to store them, it is necessary to copy the obtained
+values because these values may not be allocated in any permanent sense: they
+may only be available at a particular point during the scanning of input, and
+any attempt to reference them later may yield an invalid value. Consequently,
+a `copy` function and a suite of convenience macros (defined in `parser.h`)
+allocate memory for new values.
+
+(Currently, the management of allocated memory is deficient in that such
+memory is not deallocated. However, since the program is intended to have a
+limited running time and handle limited numbers of input files, no effort has
+been directed towards tidying up this allocated memory.)
+
+=== Rules and Structures ===
+
+Generally, the structures built using the result values reflect the structure
+of the rules describing the interface description language. However, the form
+of rules, necessary as it is for parsing, is not entirely optimal for a
+generated structure. Consider the following rule:
+
+{{{
+attributes : attribute SEP attributes
+           | attribute
+           ;
+}}}
+
+Consider a pair of attributes:
+
+{{{
+first,second
+}}}
+
+With the `attribute` rule matching each identifier, and with the `attributes`
+rule incorporating a single attribute value within a structure referencing
+the remaining attributes, the following structure would emerge:
+
+######## A graph showing a generated structure derived from rules...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph rule_structure
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  attributes1 [label="{attributes | {<f> attribute |<a> attributes}}"];
+  first       [label="{attribute | { \"first\" | ... }}"];
+  attributes2 [label="{attributes | {<f> attribute |<a> attributes}}"];
+  second      [label="{attribute | { \"second\" | ... }}"];
+
+  attributes1:f -> first;
+  attributes1:a -> attributes2;
+  attributes2:f -> second;
+}
+}}}
+
+######## End of graph.
+
+A more natural structure would instead employ a linked list of attributes:
+
+######## A graph showing a generated structure employing lists...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph natural_structure
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  first  [label="{attribute | { \"first\" | ... |<a> tail }}"];
+  second [label="{attribute | { \"second\" | ... |<a> tail }}"];
+
+  first:a -> second;
+}
+}}}
+
+######## End of graph.
+
+To achieve this, a `tail` member is defined in structures, and instead of
+wrapping results in new structures at each level of the rule hierarchy,
+results are effectively combined by having one result reference another via
+the `tail` member, thus linking together collections of results.
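+
+One way of linking results in this fashion is sketched below. The `attr`
+member name is hypothetical, and although a `copy` function is mentioned
+above, the signature used here (a pointer and a size) is an assumption:
+
+{{{
+%type <attr> attribute attributes
+
+/* Hypothetical actions: the first attribute becomes the result, with its
+   tail member referencing a copy of the result for the remaining ones. */
+
+attributes : attribute SEP attributes
+               { $$ = $1; $$.tail = copy(&$3, sizeof($3)); }
+           | attribute
+               { $$ = $1; $$.tail = NULL; }
+           ;
+}}}
+
+Copying the result for the remaining attributes reflects the need, described
+under value copying, to preserve values beyond the point at which a rule
+obtains them.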
+
+== Interface Structure ==
+
+The `types.h` file defines the structural elements of interfaces prepared
+during the processing of the input files. A hierarchy of structure types is
+defined as follows:
+
+######## A graph showing the relationship between structure types...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph types
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  interface  [label="{interface | {name |<s> signatures |<a> attributes |<i> includes |<t> tail}}"];
+  signature  [label="{signature | {qualifier | operation |<p> parameters |<a> attributes |<t> tail}}"];
+  attribute  [label="{attribute | {attribute |<i> identifiers |<t> tail}}"];
+  parameter  [label="{parameter | {specifier | class |<i> identifiers |<t> tail}}"];
+  identifier [label="{identifier | {identifier |<t> tail}}"];
+  include    [label="{include | {filename |<t> tail}}"];
+
+  interface:s -> signature;
+  interface:a -> attribute;
+  interface:i -> include;
+  interface:t -> interface;
+
+  signature:p -> parameter;
+  signature:a -> attribute;
+  signature:t -> signature;
+
+  attribute:i -> identifier;
+  attribute:t -> attribute;
+
+  parameter:i -> identifier;
+  parameter:t -> parameter;
+
+  identifier:t -> identifier;
+
+  include:t -> include;
+}
+}}}
+
+######## End of graph.
+
+The nature of the hierarchy should reflect the conceptual form of the input.
+It should be noted that header file information (represented by `include`
+structures) is associated with interface information (represented by
+`interface` structures). This arrangement merely attempts to indicate the
+header file declarations that preceded specific interface declarations, but
+the two different types of information should arguably be grouped within a
+file-oriented structure.
+
+== Code Generation ==
+
+The `program.c` and `program.h` files define the functions that coordinate the
+generation of program code. It is in `program.c` that files are opened for
+writing (using the `get_output_file` function provided by `common.c`), and two
+principal functions are involved in initiating the population of these files:
+
+ * `begin_compound_output`
+ * `write_files`
+
+=== Compound Interfaces ===
+
+The `begin_compound_output` function is called by the main program when
+compound interface generation has been requested. It produces extra output
+that references and augments output produced for individual interfaces.
+
+Various details of individual interfaces are incorporated into the compound
+interface output. To achieve this, once the `begin_compound_output` function
+has been called, individual interface output is generated. During this
+activity, the `write_compound_output` function is called for each individual
+interface to insert details of that interface into the appropriate place
+within the compound interface output.
+
+The `end_compound_output` function ultimately closes the files involved,
+either through being invoked by the main program or upon a failure condition.
+
+The following diagram summarises the general function organisation involved.
+
+######## A graph showing the function organisation involved in generating
+######## compound interfaces...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph compound
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=box,fontname="Helvetica",style=filled,fillcolor=white];
+  rankdir=LR;
+
+  parser [shape=ellipse];
+
+  subgraph {
+    rank=same;
+    server [shape=folder,fillcolor="#77ff77",label="..._server.{c,cc,h}"];
+    interface [shape=folder,fillcolor="#77ff77",label="..._interface.h"];
+    interfaces [shape=folder,fillcolor="#77ff77",label="..._interfaces.h"];
+  }
+
+  subgraph {
+    rank=same;
+    main -> yyparse -> parser -> write_files -> write_interfaces;
+  }
+
+  main -> begin_compound_output;
+  main -> end_compound_output;
+
+  begin_compound_output -> write_handler_signature;
+  begin_compound_output -> write_dispatcher_signature;
+
+  write_handler_signature -> server;
+
+  write_dispatcher_signature -> server;
+
+  write_files -> write_compound_dispatch_include -> server;
+
+  write_interfaces -> write_compound_output;
+
+  write_compound_output -> write_dispatcher_cases -> server;
+  write_compound_output -> write_compound_interface;
+
+  write_compound_interface -> interface;
+  write_compound_output -> write_include -> interfaces;
+}
+}}}
+
+######## End of graph.
+
+=== Individual Interfaces ===
+
+The `write_files` function coordinates the generation of individual interface
+output.
+
+######## A graph showing the function organisation involved in generating
+######## individual interfaces...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph individual
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=box,fontname="Helvetica",style=filled,fillcolor=white];
+  rankdir=LR;
+
+  parser [shape=ellipse];
+
+  subgraph {
+    rank=same;
+    client [shape=folder,fillcolor="#77ff77",label="..._client.{c,cc,h}"];
+    server [shape=folder,fillcolor="#77ff77",label="..._server.{c,cc,h}"];
+    interface [shape=folder,fillcolor="#77ff77",label="..._interface.h"];
+  }
+
+  subgraph {
+    rank=same;
+    main -> yyparse -> parser -> write_files -> write_interfaces;
+  }
+
+  write_interfaces -> write_client_interface;
+  write_interfaces -> write_dispatcher;
+  write_interfaces -> write_dispatcher_signature;
+  write_interfaces -> write_functions;
+  write_interfaces -> write_handler_signature;
+  write_interfaces -> write_include;
+  write_interfaces -> write_interface_definition;
+  write_interfaces -> write_signatures;
+
+  write_client_interface -> client;
+  write_dispatcher -> server;
+  write_dispatcher_signature -> server;
+  write_functions -> client;
+  write_handler_signature -> server;
+
+  write_include -> client;
+  write_include -> server;
+
+  write_interface_definition -> client;
+  write_interface_definition -> interface;
+
+  write_signatures -> server;
+}
+}}}
+
+######## End of graph.
+
+== Includes and Headers ==
+
+ * `includes.c`
+
+== Interface Definitions ==
+
+ * `interface.c`
+
+== Templates and Output ==
+
+ * `templates.h`
+
+== Servers ==
+
+ * `server.c`
+
+== Dispatchers and Handlers ==
+
+ * `dispatch.c`
+
+== Parameters and Members ==
+
+ * `declaration.c`
+
+== Message Structures and Access ==
+
+ * `message.c`
+ * `structure.c`
+
+== Summaries ==
+
+ * `summary.c`