# HG changeset patch
# User Paul Boddie
# Date 1587420497 -7200
# Node ID a3349113185c57c2f0b9de0bf1f453e499522913
# Parent 559cd86cd5acdde5575bea0ae4e9d25d0136979d
Added the beginnings of the development documentation.

diff -r 559cd86cd5ac -r a3349113185c docs/wiki/Development
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/wiki/Development Tue Apr 21 00:08:17 2020 +0200
@@ -0,0 +1,431 @@
+= Development =
+
+The current implementation of the [[idl|`idl`]] tool employs the stated
+[[Prerequisites|prerequisites]], with the software written in the C programming
+language. A Flex-based lexical analyser produces tokens that are consumed by a
+Bison-generated parser whose rules call program functions to build
+structures and generate output.
+
+<>
+
+== Main Program ==
+
+The `main.c` file contains the `main` function whose responsibilities include
+the following:
+
+ * Processing program arguments and options using C library `getopt`
+   functionality, configuring the program behaviour and output.
+
+ * Input file access and output coordination.
+
+ * Parser initialisation and invocation using the `yyrestart` and
+   `yyparse` functions, and parser state management.
+
+Various helper functions are also defined.
+
+== Configuration ==
+
+The `config.c` and `config.h` files define configuration state for the program
+as it processes input files, with the latter defining the nature of the state
+and the former recording the actual state set by the main program when
+interpreting program arguments and options. Thus, this configuration can be
+considered to be a concise, validated form of those arguments and options.
+
+== Lexical Analysis ==
+
+The `idl.lex` file defines a scanner or lexical analyser for the language used
+to describe interfaces. For the most part, it merely matches sequences of
+characters and produces token identifiers, with no additional information
+being provided for tokens having a fixed form (typically keywords and
+symbols).
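
The difference between fixed-form tokens, which carry only an identifier, and tokens of varying form, which also carry a value, can be sketched in plain C. This is a hand-written illustration and not the Flex-generated scanner: the token names, the `classify` function, the `token_value` union and the choice of `interface` and `{` as fixed-form lexemes are all hypothetical, with only the `num` and `str` member names borrowed from the parser's `%union`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token identifiers; the real ones are generated by Bison. */
enum token { TOK_INTERFACE, TOK_LBRACE, TOK_NUMBER, TOK_IDENTIFIER };

/* Mirrors the num and str members of the parser's %union. */
union token_value
{
    long num;
    const char *str;
};

/* Classify a complete lexeme, setting *value only for varying-form tokens. */
enum token classify(const char *text, union token_value *value)
{
    assert(text != NULL && value != NULL && *text != '\0');

    /* Fixed-form tokens: the identifier alone says everything. */
    if (!strcmp(text, "interface"))
        return TOK_INTERFACE;
    if (!strcmp(text, "{"))
        return TOK_LBRACE;

    /* Varying-form tokens also record what was matched. */
    if (isdigit((unsigned char) text[0]))
    {
        value->num = strtol(text, NULL, 10);
        return TOK_NUMBER;
    }

    value->str = text;
    return TOK_IDENTIFIER;
}
```

In a real Flex scanner, the value would be stored through `yylval` and the lexeme would need copying before being retained, since the scanner's buffer is reused.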
+
+Tokens with varying input forms (identifiers, numbers, and so on) also have a
+value associated with them indicating such things as the precise sequence of
+matching characters or the numeric equivalent of those characters. Such
+details are stored using members of the value type, referenced via the
+`yylval` variable. The token value type is itself defined in the parser
+definition.
+
+=== Operating Modes ===
+
+There are two distinct operating modes of the scanner:
+
+ * The default mode recognising most elements of the language.
+ * A comment mode consuming the text found within comment delimiters.
+
+When a comment start indicator (`/*`) is encountered in the default mode, the
+mode is switched to the comment mode and all input is consumed until the
+comment end indicator (`*/`) is encountered, at which point the default mode
+is selected again.
+
+A special indicator is used to declare the comment mode in the scanner file:
+
+{{{
+%x comment
+}}}
+
+All rules applicable in this mode are prefixed with `<comment>` in the file to
+distinguish them from rules applicable in the default mode.
+
+== Parsing ==
+
+The `idl.y` file defines a parser for the interface description language. It
+starts by attempting to satisfy the `file` rule, matching statements until the
+end of input, invoking the `write_files` function to generate the configured
+output from the tool.
+
+As each rule is evaluated, tokens are consumed from the scanner and operations
+are performed to build up a structure describing the input. Where a rule
+cannot be evaluated successfully, the included `yyerror` function emits a
+message and parsing halts.
+
+(Error handling and reporting could be improved.)
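
One possible direction for richer error reporting is to include the input position in each message. The following is a minimal sketch under stated assumptions, not the contents of `idl.y`: the `struct position` type and the `format_error` and `report_error` functions are hypothetical names, and a `yyerror` function could delegate to something similar if the scanner tracked filenames and line numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record of where the scanner currently is. */
struct position
{
    const char *filename;
    int lineno;
};

/* Format a diagnostic of the form "file:line: message" into buf, returning
   the number of characters that the complete message requires. */
int format_error(char *buf, size_t size, const struct position *pos,
                 const char *message)
{
    assert(buf != NULL && pos != NULL && message != NULL);
    return snprintf(buf, size, "%s:%d: %s", pos->filename, pos->lineno,
                    message);
}

/* A yyerror-style function emitting the diagnostic on standard error. */
void report_error(const struct position *pos, const char *message)
{
    char buf[256];

    format_error(buf, sizeof(buf), pos, message);
    fprintf(stderr, "%s\n", buf);
}
```

Separating formatting from output keeps the message testable and lets the caller decide whether to halt after reporting.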
+
+=== Token and Rule Result Values ===
+
+For interoperability with the scanner or lexical analyser, the parser defines
+the nature of the values associated with tokens using the `%union`
+declaration, which includes a `str` (string) interpretation and a `num`
+(numeric) interpretation of a value.
+
+Since token values are propagated between parser rules, the `%union`
+declaration is augmented with other interpretations that are employed by the
+rules. Consequently, rules can obtain values for tokens that were produced by
+the scanner (such as numbers and strings) but then incorporate them into other
+kinds of values or structures, passing them on to other rules. These other
+rules can treat such propagated values in the same way as those produced
+directly by the scanner.
+
+For example, the `include` rule obtains a value associated with a header
+filename. This filename is associated with the `HEADER` token and its value is
+interpreted as a string. However, the rule needs to prepare a structure that
+incorporates the filename and that can be referenced by other structures. To
+achieve this, a `%union` member is defined for the structure type concerned:
+
+{{{
+%union {
+  long num;
+  char *str;
+  struct include inc;
+  ...
+}
+}}}
+
+The member name, `inc`, is then associated with the `include` rule:
+
+{{{
+%type <inc> include
+}}}
+
+With this, the rule can be considered to be working to populate a value of the
+indicated type (`struct include`), and where other rules reference the result
+of the rule, they will be able to recognise the type of this result.
+
+=== Value Copying ===
+
+When rules obtain values to store them, it is necessary to copy the obtained
+values because these values may not be allocated in any permanent sense: they
+may only be available at a particular point during the scanning of input, and
+any attempt to reference them later may yield an invalid value.
Consequently,
+a `copy` function and a suite of convenience macros (defined in `parser.h`)
+allocate memory for new values.
+
+(Currently, the management of allocated memory is deficient in that such
+memory is not deallocated. However, since the program is intended to have a
+limited running time and handle limited numbers of input files, no effort has
+been directed towards tidying up this allocated memory.)
+
+=== Rules and Structures ===
+
+Generally, the structures built using the result values reflect the structure
+of the rules describing the interface description language. However, the form
+of rules, necessary as it is for parsing, is not entirely optimal for a
+generated structure. Consider the following rule:
+
+{{{
+attributes : attribute SEP attributes
+           | attribute
+           ;
+}}}
+
+Consider a pair of attributes:
+
+{{{
+first,second
+}}}
+
+With the `attribute` rule matching each identifier, and with the `attributes`
+rule incorporating a single attribute value within a structure referencing
+other attributes, the following structure would emerge:
+
+######## A graph showing a generated structure derived from rules...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph rule_structure
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  attributes1 [label="{attributes | {<f> attribute | <a> attributes}}"];
+  first [label="{attribute | { \"first\" | ... }}"];
+  attributes2 [label="{attributes | {<f> attribute | <a> attributes}}"];
+  second [label="{attribute | { \"second\" | ... }}"];
+
+  attributes1:f -> first;
+  attributes1:a -> attributes2;
+  attributes2:f -> second;
+}
+}}}
+
+######## End of graph.
+
+A more natural structure would instead employ a linked list of attributes:
+
+######## A graph showing a generated structure employing lists...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph natural_structure
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  first [label="{attribute | { \"first\" | ... | <a> tail }}"];
+  second [label="{attribute | { \"second\" | ... | <a> tail }}"];
+
+  first:a -> second;
+}
+}}}
+
+To achieve this, a tail member is defined in structures, and instead of
+wrapping results in new structures at each level of the rule hierarchy,
+results are effectively combined by having one result reference another via
+the tail member, thus linking together collections of results.
+
+######## End of graph.
+
+== Interface Structure ==
+
+The `types.h` file defines the structural elements of interfaces prepared
+during the processing of the input files. A hierarchy of structure types is
+defined as follows:
+
+######## A graph showing the relationship between structure types...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph types
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=record,fontname="Helvetica"];
+
+  interface [label="{interface | {name | <s> signatures | <a> attributes | <i> includes | <t> tail}}"];
+  signature [label="{signature | {qualifier | operation | <p> parameters | <a> attributes | <t> tail}}"];
+  attribute [label="{attribute | {attribute | <i> identifiers | <t> tail}}"];
+  parameter [label="{parameter | {specifier | class | <i> identifiers | <t> tail}}"];
+  identifier [label="{identifier | {identifier | <t> tail}}"];
+  include [label="{include | {filename | <t> tail}}"];
+
+  interface:s -> signature;
+  interface:a -> attribute;
+  interface:i -> include;
+  interface:t -> interface;
+
+  signature:p -> parameter;
+  signature:a -> attribute;
+  signature:t -> signature;
+
+  attribute:i -> identifier;
+  attribute:t -> attribute;
+
+  parameter:i -> identifier;
+  parameter:t -> parameter;
+
+  identifier:t -> identifier;
+
+  include:t -> include;
+}
+}}}
+
+######## End of graph.
+
+The nature of the hierarchy should reflect the conceptual form of the input.
+It should be noted that header file information (represented by `include`
+structures) is associated with interface information (represented by
+`interface` structures). This arrangement merely attempts to indicate the
+header file declarations that preceded specific interface declarations, but
+the two different types of information should arguably be grouped within a
+file-oriented structure.
+
+== Code Generation ==
+
+The `program.c` and `program.h` files define the functions that coordinate the
+generation of program code. It is in `program.c` that files are opened for
+writing (using the `get_output_file` function provided by `common.c`), and two
+principal functions are involved in initiating the population of these files:
+
+ * `begin_compound_output`
+ * `write_files`
+
+=== Compound Interfaces ===
+
+The `begin_compound_output` function is called by the main program when
+compound interface generation has been requested. It produces extra output
+that references and augments output produced for individual interfaces.
+
+Various details of individual interfaces are incorporated into the compound
+interface output.
To achieve this, once the `begin_compound_output` function
+has been called, individual interface output is generated. During this
+activity, the `write_compound_output` function is called for each individual
+interface to insert details of that interface into the appropriate place
+within the compound interface output.
+
+The `end_compound_output` function ultimately closes the files involved,
+either through being invoked by the main program or upon a failure condition.
+
+The following diagram summarises the general function organisation involved.
+
+######## A graph showing the function organisation involved in generating
+######## compound interfaces...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph compound
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=box,fontname="Helvetica",style=filled,fillcolor=white];
+  rankdir=LR;
+
+  parser [shape=ellipse];
+
+  subgraph {
+    rank=same;
+    server [shape=folder,fillcolor="#77ff77",label="..._server.{c,cc,h}"];
+    interface [shape=folder,fillcolor="#77ff77",label="..._interface.h"];
+    interfaces [shape=folder,fillcolor="#77ff77",label="..._interfaces.h"];
+  }
+
+  subgraph {
+    rank=same;
+    main -> yyparse -> parser -> write_files -> write_interfaces;
+  }
+
+  main -> begin_compound_output;
+  main -> end_compound_output;
+
+  begin_compound_output -> write_handler_signature;
+  begin_compound_output -> write_dispatcher_signature;
+
+  write_handler_signature -> server;
+
+  write_dispatcher_signature -> server;
+
+  write_files -> write_compound_dispatch_include -> server;
+
+  write_interfaces -> write_compound_output;
+
+  write_compound_output -> write_dispatcher_cases -> server;
+  write_compound_output -> write_compound_interface;
+
+  write_compound_interface -> interface;
+  write_compound_output -> write_include -> interfaces;
+}
+}}}
+
+######## End of graph.
+
+=== Individual Interfaces ===
+
+The `write_files` function coordinates the generation of individual interface
+output.
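
How the generation functions walk the parsed structures is not spelled out here, but since interfaces are linked through their `tail` members, a traversal along the following lines seems plausible. This is a minimal sketch under that assumption: the reduced `struct interface` keeps only the `name` and `tail` members, and `write_each_interface` is a hypothetical stand-in rather than one of the functions in `program.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced, hypothetical form of the interface structure: only the name and
   tail members are retained for illustration. */
struct interface
{
    const char *name;
    struct interface *tail;
};

/* Visit each interface in the list, writing a placeholder line for it to the
   given stream, and return the number of interfaces visited. */
int write_each_interface(const struct interface *iface, FILE *fp)
{
    int count = 0;

    assert(fp != NULL);

    /* Follow tail members until the end of the list is reached. */
    for (; iface != NULL; iface = iface->tail)
    {
        fprintf(fp, "/* output for %s */\n", iface->name);
        count++;
    }

    return count;
}
```

The same loop shape would apply to the other list-like structures, such as signatures, parameters and includes, since each also carries a `tail` member.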
+
+######## A graph showing the function organisation involved in generating
+######## individual interfaces...
+
+{{{#!graphviz
+#format svg
+#transform notugly
+digraph individual
+{
+  graph [fontsize="15.0",fontname="Helvetica"];
+  node [shape=box,fontname="Helvetica",style=filled,fillcolor=white];
+  rankdir=LR;
+
+  parser [shape=ellipse];
+
+  subgraph {
+    rank=same;
+    client [shape=folder,fillcolor="#77ff77",label="..._client.{c,cc,h}"];
+    server [shape=folder,fillcolor="#77ff77",label="..._server.{c,cc,h}"];
+    interface [shape=folder,fillcolor="#77ff77",label="..._interface.h"];
+  }
+
+  subgraph {
+    rank=same;
+    main -> yyparse -> parser -> write_files -> write_interfaces;
+  }
+
+  write_interfaces -> write_client_interface;
+  write_interfaces -> write_dispatcher;
+  write_interfaces -> write_dispatcher_signature;
+  write_interfaces -> write_functions;
+  write_interfaces -> write_handler_signature;
+  write_interfaces -> write_include;
+  write_interfaces -> write_interface_definition;
+  write_interfaces -> write_signatures;
+
+  write_client_interface -> client;
+  write_dispatcher -> server;
+  write_dispatcher_signature -> server;
+  write_functions -> client;
+  write_handler_signature -> server;
+
+  write_include -> client;
+  write_include -> server;
+
+  write_interface_definition -> client;
+  write_interface_definition -> interface;
+
+  write_signatures -> server;
+}
+}}}
+
+######## End of graph.
+
+== Includes and Headers ==
+
+ * `includes.c`
+
+== Interface Definitions ==
+
+ * `interface.c`
+
+== Templates and Output ==
+
+ * `templates.h`
+
+== Servers ==
+
+ * `server.c`
+
+== Dispatchers and Handlers ==
+
+ * `dispatch.c`
+
+== Parameters and Members ==
+
+ * `declaration.c`
+
+== Message Structures and Access ==
+
+ * `message.c`
+ * `structure.c`
+
+== Summaries ==
+
+ * `summary.c`