Tuesday, October 27, 2009

Object Artefact Script

A couple of weeks ago, I attended a workshop at the Edinburgh eScience Institute on the relation of text in ancient (and other) documents to its context. The workshop dealt with the problems of reading difficult texts on difficult objects, and with ways in which technology can aid the process of interpretation and dissemination without getting in the way of it. The meeting was well summarized by Alejandro Giacometti on his blog, and the presentations are posted on the eSI wiki.

Kathryn Piquette discussed what would be required to digitally represent Egyptian hieroglyphic texts without divorcing them from their contexts as an integral part of monumental architecture. For example, the interpretation of the meaning of texts should be able to take into account the times of day (and/or year) when they could have been read, their relationship to their surroundings, and so on. The established epigraphical practice of separating the transcribed text from its context, while often necessary, does some violence to its meaning, and this must be recognized and accounted for. At the same time, digital 3D reconstructions are themselves interpretations, and it is important to disclose the evidence on which those interpretations are based.

Ségolène Tarte talked about the process of scholarly interpretation in reading the Vindolanda tablets and similar texts. As part of analysing the scholarly reading process, the eSAD project observed two experts reading a previously published tablet. In the course of their work, they came up with a new reading that completely changed their understanding of the text: the previous reading had hinged on the identification of a single word, which led to the (mistaken) recognition of the document as recording the sale of an ox, while the new reading hinged on the recognition of a particular letterform as an 'a'. Readings of difficult texts are produced by skipping around in search of recognizable pieces of text, constructing (multiple) partial mental models on top of them, and then somehow resolving those models into a single reading. This means that an Interpretation Support System (such as the one eSAD proposes to develop) must be sensitive to the different reading strategies scholars use and must be careful not to impose "spurious exactitude" on them.

Dot Porter gave an overview of a variety of projects that focus on representing text, transcription, and annotation alongside one another as a way into discussing the relationship between digital text and physical text. She cautioned against attempts to digitally replicate the experience of the codex, since there is a great deal of (necessary) data interpolation that goes on in any detailed digital reconstruction, and this elides the physical reality of the text. Digital representations may improve (or even make possible) the reading of difficult texts, such as the Vindolanda tablets or the Archimedes Palimpsest, so for purposes of interpretation, they may be superior to the physical reality. They can combine data, metadata, and other contextual information in ways that help a reader to work with documents. But they cannot satisfactorily replicate the physicality of the document, and it may be a bit dishonest to try.

I talked about the img2xml project I'm working on with colleagues from UNC Chapel Hill. I've got a post or two about that in the pipeline, so I won't say much here. It involves the generation of SVG tracings of text in manuscript documents as a foundation for linking and annotation. Since the technique involves linking to an XML-based representation of the text, it may prove superior to methods that rely simply on pointing at pixel coordinates in images of text.
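To make the linking idea concrete, here's a minimal sketch of the sort of output I have in mind. This is illustrative only, not actual img2xml code: the path data, the word_id, and the transcription.xml URI are all made up, and it assumes nothing beyond Python's standard xml.etree.ElementTree module.

    import xml.etree.ElementTree as ET

    SVG = "http://www.w3.org/2000/svg"
    XLINK = "http://www.w3.org/1999/xlink"
    ET.register_namespace("", SVG)
    ET.register_namespace("xlink", XLINK)

    def linked_tracing(path_d, word_id, tei_uri="transcription.xml"):
        """Wrap one traced letterform (an SVG path) in a link to a TEI element.

        path_d  -- the "d" attribute of the tracing, e.g. "M10 10 L40 10"
        word_id -- the xml:id of the corresponding element in the transcription
        """
        link = ET.Element("{%s}a" % SVG)
        link.set("{%s}href" % XLINK, "%s#%s" % (tei_uri, word_id))
        path = ET.SubElement(link, "{%s}path" % SVG)
        path.set("d", path_d)
        return link

    # ET.tostring(linked_tracing("M10 10 L40 10", "w42"), encoding="unicode")
    # produces an <a xlink:href="transcription.xml#w42"> wrapping the tracing.

The point is that the link target is an addressable piece of the text's XML structure, not a rectangle of pixels, so annotations can survive re-imaging or re-scaling of the page.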

Ryan Bauman talked about the use of digital images as scholarly evidence. He gave a fascinating overview of sophisticated techniques for imaging very difficult documents (e.g. carbonized, rolled-up scrolls from Herculaneum) and talked about the need to document the techniques used in generating the images. This is especially important because the images produced will not resemble the way the document looks in visible light. Ryan also talked about the difficulties involved in linking views of a document that were produced at different times (when the document may have been in different states) or with different imaging techniques. The Archimedes Palimpsest project is a good example of what's involved in referencing all of the images so that they can be linked to the transcription.
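As a rough illustration of what such documentation might record (my own sketch, not a schema from the Archimedes Palimpsest project; all field names here are hypothetical):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class ImageCapture:
        """Provenance for one view of a document (hypothetical schema)."""
        document_id: str   # which document, e.g. an inventory number
        file_uri: str      # where the resulting image lives
        capture_date: str  # ISO 8601 date; the document's state changes over time
        technique: str     # e.g. "multispectral", "raking light"
        wavelength_nm: Optional[int] = None  # illumination band, if applicable
        condition: str = ""  # state of the document at capture time

With records like these attached to every image, views made years apart or under different illumination can at least be lined up against the same document and the same transcription.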

Finally, Leif Isaksen talked about how some of the techniques discussed in the earlier presentations might be used in crowdsourcing the gathering of data about inscriptions. Inscriptions (both published and unpublished) are frequently encountered (both in museums and out in the open) by tourists who may be curious about their meaning, but lack the ability to interpret them. They may well, however, have sophisticated tools available for image capture, geo-referencing, and internet access (via digital cameras, smartphones, etc.). Can they be employed, in exchange for information about the texts they encounter, as data gatherers?
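Much of the needed data is already captured automatically by those devices. As a sketch of how cheap the geo-referencing side is (assuming a recent version of the Pillow imaging library; error handling omitted):

    from PIL import Image
    from PIL.ExifTags import GPSTAGS

    def _to_decimal(dms, ref):
        """Convert EXIF (degrees, minutes, seconds) rationals to signed decimal degrees."""
        deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
        return -deg if ref in ("S", "W") else deg

    def photo_location(path):
        """Return (latitude, longitude) from a photo's EXIF GPS data, or None."""
        exif = Image.open(path).getexif()
        gps_ifd = exif.get_ifd(0x8825)  # 0x8825 is the GPSInfo IFD pointer
        gps = dict((GPSTAGS.get(tag, tag), value) for tag, value in gps_ifd.items())
        if "GPSLatitude" not in gps or "GPSLongitude" not in gps:
            return None
        return (_to_decimal(gps["GPSLatitude"], gps.get("GPSLatitudeRef", "N")),
                _to_decimal(gps["GPSLongitude"], gps.get("GPSLongitudeRef", "E")))

A tourist's photo of an inscription could thus arrive already tagged with where (and when) it was taken; the hard problems are incentives and quality control, not capture.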

Some themes that emerged from the discussion included:

  • the importance of communicating the processes involved in generating digital representations of texts and their contexts (i.e. showing your work)

  • the need for standard ways of linking together image and textual data

  • the importance of disseminating data and code, not just results


This was a terrific workshop, and I hope to see follow-up on it. eSAD is holding a workshop next month on "Understanding image-based evidence," which I'm sorry I can't attend, but whose output I look forward to seeing.
