The libxml2dom package provides a traditional DOM wrapper around the Python bindings for libxml2. In contrast to the libxml2 bindings, libxml2dom provides an API reminiscent of minidom, pxdom and other Python-based and Python-related XML toolkits. Performance is disappointing given the typically high speed of libxml2 processing, but this is to be expected, since large numbers of Python objects are instantiated at two levels of document tree representation: the libxml2 wrapper objects and the libxml2dom objects which wrap them. However, serialisation of documents is much faster than in many other toolkits because it can make direct use of libxml2.
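To give a flavour of the DOM style referred to here, the following sketch uses the standard library's xml.dom.minidom rather than libxml2dom itself. It assumes only that the two APIs are similar in spirit, as stated above; whether libxml2dom offers exactly these methods and attributes is not confirmed here.

```python
from xml.dom import minidom

# Parse a small document held in a string (minidom is a stand-in for
# libxml2dom here; only the general DOM style is being illustrated).
document = minidom.parseString(
    "<doc><item id='a'>first</item><item id='b'>second</item></doc>"
)

# Navigate using the familiar DOM methods and attributes.
items = document.getElementsByTagName("item")
ids = [item.getAttribute("id") for item in items]
first_text = items[0].firstChild.nodeValue

# Serialise the document element back to a string.
serialised = document.documentElement.toxml()

print(ids)
print(first_text)
print(serialised)
```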
The main libxml2dom package is relatively slow, even when compared to Python-only XML toolkits. However, previous experiments in source code analysis suggested that programs written in a slightly altered coding style could be transformed to use the underlying libxml2mod API directly. This API employs opaque handles which are exposed to Python but which can only be inspected through the functions in the API. One significant advantage of accessing the libxml2mod API directly is that neither the libxml2 wrapper objects nor the additional libxml2dom wrapper objects need to be instantiated, which reduces memory consumption and improves performance.
The libxml2macro approach involves running libxml2macro.py on the source file. A description of the process is given in the README.txt file within the source code distribution. However, what libxml2macro does is to take code like this...
for my_node in my_element.childNodes:
    if my_node.nodeType == TEXT_NODE:
        print my_node.nodeValue
...and to transform it into something more or less like this (in practice, the actual libxml2mod calls are provided in a library, although more aggressive transformations could result in something exactly like this):
for my_node in libxml2mod.children(my_element):
    if libxml2mod.type(my_node) == "text":
        print libxml2mod.xmlNodeGetContent(my_node)
The result is that developers can still write DOM-style code but not be penalised for the object-related overhead that such an approach typically incurs.
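The general idea of rewriting attribute accesses into function calls can be sketched with Python's own ast module. This is a toy illustration only, not how libxml2macro is implemented: the function names in the mapping are hypothetical placeholders rather than libxml2mod calls, and the sketch assumes Python 3.9 or later (for ast.unparse), hence the print function in the sample source.

```python
import ast

# Hypothetical mapping from DOM-style attribute names to plain function
# names. These are placeholders for illustration, not libxml2mod functions.
ATTRIBUTE_TO_FUNCTION = {
    "childNodes": "node_children",
    "nodeType": "node_type",
    "nodeValue": "node_value",
}


class AttributeToCall(ast.NodeTransformer):
    """Rewrite reads of obj.attr as func(obj) for the mapped attributes."""

    def visit_Attribute(self, node):
        self.generic_visit(node)  # rewrite nested attribute accesses first
        function_name = ATTRIBUTE_TO_FUNCTION.get(node.attr)
        if function_name is None or not isinstance(node.ctx, ast.Load):
            return node  # leave unmapped attributes and assignments alone
        call = ast.Call(
            func=ast.Name(id=function_name, ctx=ast.Load()),
            args=[node.value],
            keywords=[],
        )
        return ast.copy_location(call, node)


def rewrite(source):
    """Return source with mapped attribute reads turned into calls."""
    tree = AttributeToCall().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


SOURCE = """
for my_node in my_element.childNodes:
    if my_node.nodeType == TEXT_NODE:
        print(my_node.nodeValue)
"""

rewritten = rewrite(SOURCE)
print(rewritten)
```

The rewritten source no longer touches any node attributes, so the objects being passed around could be opaque handles rather than wrapper instances.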
For reasons of consistency, libxml2dom uses the same MIT-style licence as libxml2. See the file COPYING.txt in the docs directory within the source code distribution.
Given the availability of libxml2, libxml2dom only needs to reside on the PYTHONPATH and can be installed using the setup.py script provided:
python setup.py install
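After installation, a quick check along the following lines may help confirm that both packages are visible to the interpreter. This is a minimal sketch: the helper name is_importable is hypothetical, and the check says nothing about whether the installed versions are compatible.

```python
def is_importable(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# libxml2 is the bindings module; libxml2dom is the wrapper package.
for name in ("libxml2", "libxml2dom"):
    if is_importable(name):
        status = "found"
    else:
        status = "NOT found (check PYTHONPATH)"
    print("%s: %s" % (name, status))
```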
The following descriptions identify the dependencies and describe the installation issues relevant to each of them:
Building libxml2 from source and configuring the Python bindings can be done as follows:
cd libxml2-2.6.16
./configure --with-python=/usr/local/bin/python
make
If you intend to use a Python installation located outside /usr/local, specify the "prefix" accordingly. Install (possibly as root) in the usual way:
make install
Earlier releases of libxml2 in the 2.6 series may work, but some bugs were observed with the previously recommended 2.6.0, and these may not have been fixed until 2.6.16 or slightly earlier.
The patches directory in the source code distribution contains a patch against libxml2 2.5.7 which resolves an issue exposed by libxml2dom. Although it is recommended that later releases of libxml2 are used instead, the source code distribution of libxml2 2.5.7 can be patched as follows:
patch -p0 < libxml2dom/patches/libxml2/libxml.c.diff
The command should be run from the directory immediately above the libxml2-2.5.7 directory, and the stated path to the patch file should be adjusted accordingly.
Python releases from 2.2 onwards should be compatible with libxml2dom. The principal requirement of such releases is new-style class support, which permits the use of properties in the libxml2dom implementation and thus simplifies the code somewhat.
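The role of properties can be illustrated with a small sketch. The Node class below is hypothetical and far simpler than anything in libxml2dom; it only shows how a new-style class can expose DOM-style attributes such as nodeName and childNodes that are computed by methods on access.

```python
class Node(object):
    """Hypothetical wrapper class; not the libxml2dom implementation."""

    def __init__(self, name, children=None):
        self._name = name
        self._children = list(children or [])

    def _get_node_name(self):
        return self._name

    def _get_child_nodes(self):
        # Return a copy so that callers cannot alter the internal list.
        return list(self._children)

    # Read-only properties: attribute access calls the getter methods.
    nodeName = property(_get_node_name)
    childNodes = property(_get_child_nodes)


root = Node("doc", [Node("item"), Node("item")])
names = [child.nodeName for child in root.childNodes]
print(root.nodeName)
print(names)
```

Without properties, callers would need to write method calls such as root.getChildNodes(), or the class would have to intercept attribute access through __getattr__.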