wikipage: Data Presentation (Electronic Texts)

Specific tools for manipulating data into formats for publication are mentioned elsewhere (Querying and Analysis of Data); these include ANASTASIA, TuStep and EDITION. The first of these is a currently available dedicated tool and labours under the full title Analytical System Tools and SGML/XML Integration Applications. It differs from other SGML/XML publishing systems in that, in addition to recognising the primary document hierarchy expressed by the encoding, it can also ‘read’ the text according to its left-to-right sequence in the document stream: by column, by page, or indeed from any point in the text to any other point. This gets around the problem of multiply hierarchical or overlapping sections of text, which SGML/XML/TEI struggles to accommodate (see Data Preparation). The ANASTASIA SourceForge web page (http://anastasia.sourceforge.net/anaprocess.html) describes the programme as ‘an event-driven procedural environment for handling XML document collections’ and links to a page that highlights the difference between it and XSLT (Extensible Stylesheet Language Transformations), which at a casual glance appears to fulfil the same function.
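The idea of reading a text by page rather than by element can be illustrated with a minimal sketch. This is not ANASTASIA's own interface; it uses Python's standard SAX parser as a stand-in for an event-driven approach. The sample markup, the `pb` milestone element with its `n` attribute, and the names `PageCollector` and `text_by_page` are assumptions made for illustration only.

```python
import xml.sax

# Hypothetical sample: the first paragraph runs across a page break (<pb/>),
# so page boundaries cut across the <p> hierarchy.
SAMPLE = (
    '<text><pb n="1"/><p>The paragraph opens on page one '
    '<pb n="2"/>and closes on page two.</p>\n'
    '<p>A second paragraph sits wholly on page two.</p>\n</text>'
)


class PageCollector(xml.sax.ContentHandler):
    """Gather character data per page, following the document stream."""

    def __init__(self):
        super().__init__()
        self.pages = {}      # page number -> list of text chunks
        self.current = None  # page currently being read

    def startElement(self, name, attrs):
        # A page-break milestone switches the page, whatever element we are inside.
        if name == "pb":
            self.current = attrs.get("n")
            self.pages.setdefault(self.current, [])

    def characters(self, content):
        # Text is assigned to the current page, ignoring paragraph boundaries.
        if self.current is not None:
            self.pages[self.current].append(content)


def text_by_page(xml_string):
    handler = PageCollector()
    xml.sax.parseString(xml_string.encode("utf-8"), handler)
    # Join the chunks and normalise whitespace for each page.
    return {n: " ".join("".join(chunks).split())
            for n, chunks in handler.pages.items()}


if __name__ == "__main__":
    for number, text in text_by_page(SAMPLE).items():
        print(number, text)
```

Because the handler reacts to events in stream order, the text of a paragraph that straddles a page break is split between the two pages, which a purely hierarchical view of the same document would not yield directly.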

The use of ANASTASIA is recommended for large collections of data with complex structures, where there is a requirement to build real-time views of information that is scattered throughout that data and cuts across XML element boundaries. For scenarios where the hierarchical structure of the XML document presents no such difficulties, the most widely referenced XSLT processing systems appear to be SAXON (http://saxon.sourceforge.net/) and XALAN (http://xalan.apache.org/). These use the XSLT transformation vocabulary, in association with XPath (a language for addressing parts of XML documents), to rearrange, sort, combine and transform one XML document into another, or to output that content as HTML or plain text. XSLT can also be used in conjunction with Cascading Style Sheets (CSS), a W3C recommendation since 1996. CSS concentrates on specifying the format of HTML (and SGML/XML) documents as they appear in a browser, but lacks the sophisticated functionality of XSLT/XPath techniques for querying and manipulating data.
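The select, sort and transform pattern described above can be sketched without an XSLT processor. The example below is not XSLT and does not use SAXON or XALAN; it relies on the limited XPath subset in Python's standard `xml.etree.ElementTree` module to stand in for the same idea. The sample document, its element and attribute names, and the function `poems_to_html` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: poems listed out of chronological order.
SAMPLE = (
    "<poems>"
    "<poem year='1890'><title>Later Poem</title></poem>"
    "<poem year='1862'><title>Early Poem</title></poem>"
    "</poems>"
)


def poems_to_html(xml_string):
    """Select poems with a path expression, sort them, and emit an HTML list."""
    root = ET.fromstring(xml_string)
    # "./poem" addresses every <poem> child of the root element.
    poems = sorted(root.findall("./poem"), key=lambda p: int(p.get("year")))
    ul = ET.Element("ul")
    for poem in poems:
        li = ET.SubElement(ul, "li")
        li.text = f"{poem.findtext('title')} ({poem.get('year')})"
    return ET.tostring(ul, encoding="unicode")


if __name__ == "__main__":
    print(poems_to_html(SAMPLE))
```

In a real XSLT workflow the selection, sorting and output rules would live in a stylesheet applied by the processor rather than in program code; the sketch only shows the kind of rearrangement involved.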

The Versioning Machine (http://v-machine.org/index.php), conceived by Susan Schreibman and based at the University of Maryland, is a web-based system for displaying and comparing different versions of the same text. It supports the display of XML texts encoded according to the guidelines of the TEI and allows different witness texts to be displayed side-by-side (a diplomatic edition next to a manipulable image of the witness text, for example), in addition to features such as an enhanced typology of notes, synchronised scrolling and line matching functions. The system will accommodate separately encoded TEI documents of poems and prose and will display these files side-by-side, but in that case the synchronised scrolling and line matching features are unavailable. Alternatively, one can use the TEI's "critical apparatus tagset" (TEI.textcrit) to encode all the witnesses in one XML file, thereby cutting down on the amount of repetitive encoding required and enabling all the functionality of the Versioning Machine, but at the cost of added complexity at the initial encoding stage.
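To make the single-file approach concrete, the sketch below shows how one witness's text might be rebuilt from a line in which variant readings are recorded inline. It is loosely modelled on the TEI apparatus elements `app` and `rdg` with a `wit` attribute, but the sample line, the sigla "A" and "B", the omission of any XML namespace, and the function `witness_text` are all simplifying assumptions; it is not the Versioning Machine's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical verse line: shared text is encoded once, and only the point of
# variation is wrapped in <app>, with one <rdg> per witness.
SAMPLE = (
    "<l>The "
    "<app><rdg wit='A'>quick</rdg><rdg wit='B'>swift</rdg></app>"
    " fox</l>"
)


def witness_text(line, siglum):
    """Rebuild one witness's reading of a line from the shared encoding."""
    parts = [line.text or ""]
    for child in line:
        if child.tag == "app":
            for rdg in child.findall("rdg"):
                # A reading may be attested by several witnesses at once.
                if siglum in rdg.get("wit", "").split():
                    parts.append("".join(rdg.itertext()))
        # Text following the child element is common to all witnesses.
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    line = ET.fromstring(SAMPLE)
    for siglum in ("A", "B"):
        print(siglum, witness_text(line, siglum))
```

The common words appear once in the encoding, which is where the saving in repetitive markup comes from; the added complexity lies in deciding, at encoding time, exactly where each variation begins and ends.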
